External Index and Query (EIQ) Indexed Adapters

EIQ Indexed Adapters are unique federated data adapters that (1) read, and externally clean, transform and standardize source data that needs some form of pre-processing, and/or data sources that need external SQL query processing, e.g., files and legacy systems, (2) build and maintain independent external pre-processed data INDEXES, usually through one of twelve different ways to capture changes in data sources, and (3) execute standard SQL queries against these indexes to isolate and read (a) previously cleaned, transformed, standardized and secure data from indexes, and/or (b) clean, transform, standardize and secure results data from sources.

EIQ Indexed Adapters perform structured and unstructured queries WITHOUT submitting complex queries to data sources, WITHOUT copying or moving source data, and WITHOUT transforming data schemas.

There are three main reasons to pre-process and/or index source data:

  1. Address any data quality and standards issues through pre-processing – cleaning, transforming and standardizing, e.g., people-related names, SSNs, email addresses, mailing addresses, phone numbers, organizations and products. Typically, this slow-changing data is relatively small in volume compared to the much larger volume of fast-growing transaction data, which tends to be machine-generated and is therefore substantially less likely to need pre-processing.
  2. There may be a need to make added-value indexes available, e.g., entities extracted from unstructured data, or to build and maintain pre-aggregated, pre-calculated and/or pre-joined transaction data in materialized views to greatly accelerate reporting, BI and analytics queries.
  3. The source system is less capable or incapable of supporting an external query, e.g., mainframe files, flat-file systems and email.
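As an illustration of the first reason, the following is a minimal sketch of pre-processing before indexing. It is not WhamTech's implementation: the function names, the US-style phone rule and the sample rows are all assumptions made for the example. It shows how differently formatted source values can be standardized so that they land under the same index key, while the source rows themselves stay untouched.

```python
import re
from typing import Optional

def standardize_phone(raw: str) -> Optional[str]:
    """Hypothetical rule: reduce a US-style number to 10 digits, else None."""
    digits = re.sub(r"\D", "", raw)          # drop everything except digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                  # strip a leading country code
    return digits if len(digits) == 10 else None

def standardize_name(raw: str) -> str:
    """Hypothetical rule: collapse whitespace and use title case."""
    return " ".join(raw.split()).title()

def build_phone_index(rows: dict) -> dict:
    """Map each standardized phone number to the ids of the rows holding it."""
    index = {}
    for row_id, row in rows.items():
        key = standardize_phone(row["phone"])
        if key is not None:                  # unparseable values are not indexed
            index.setdefault(key, set()).add(row_id)
    return index

# Illustrative source rows; the source data is read but never modified.
source_rows = {
    101: {"name": "  jane   DOE ", "phone": "(214) 555-0134"},
    102: {"name": "Jane Doe", "phone": "+1 214.555.0134"},
    103: {"name": "John Roe", "phone": "555-0199"},
}

phone_index = build_phone_index(source_rows)
clean_name = standardize_name(source_rows[101]["name"])
```

In this sketch, rows 101 and 102 end up under one index key even though their raw phone values differ, and row 103 is left out because its value cannot be standardized under the assumed rule.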

EIQ Indexed Adapters ARE NOT conventional federated data adapters, data marts, data warehouses, enterprise search, or extract, transform and load (ETL) tools, i.e., EIQ Indexed Adapters are NOT conventional technology.

EIQ Conventional Adapters ARE conventional federated data adapters that have no independent data pre-processing or indexes. They can be used in conjunction with EIQ Indexed Adapters, as EIQ Hybrid Adapters, on the same data source, or as stand-alone adapters on data sources that cannot or may not be independently read and indexed.

WhamTech’s index and query processing engine uses the open-source, widely-used PostgreSQL RDBMS for EIQ Indexed Adapters and, in some cases, EIQ Federation Servers™ (see below), and in addition to other tools and applications, enables data virtualization, data federation and virtual data integration. EIQ Indexed Adapters were built around an index and query processing engine to cover the full range of data management, master data management and other capabilities – from discovery to insights – on structured, semi-structured and unstructured data, regardless of data source, type, format, platform or location.

EIQ Indexed Adapters enable radically improved processes that allow almost any data source(s) to appear to a standard application, analyst or business user as very clean, standardized, high-performance individual databases or, when integrated through MDM, as a single database.

WhamTech’s EIQ Indexed Adapters (and EIQ Federation Servers) are Not Conventional.

EIQ Indexed Adapters take a very different approach to the major problems facing almost all organizations – large or small, as they are designed to answer one huge question: How to integrate, share and interoperate data and information in near real-time without…

  1. creating a huge additional infrastructure, e.g., massive data lakes and data warehouses, to address data and data source issues,
  2. overloading and 100% depending on existing systems to respond to external queries, e.g., queries by conventional federated data adapters and query engines to data sources that are 100% dependent on the data in sources and data source capabilities, and
  3. losing the ability to execute advanced uniform standard SQL, Python, R and GraphQL queries on any and all data sources, regardless of the underlying capabilities of specific data source systems?

Note: On each of the above points, EIQ Indexed Adapters can:

  1. plug-and-play in existing infrastructure, transparently reside between applications and data sources, and complement, leverage and be agnostic to existing systems,
  2. read, pre-process and query data in independent external indexes, in some cases without imposing any load on data sources,
  3. having read, pre-processed and indexed almost any type of data source and data, submit unstructured/search engine queries in combination with, or in addition to, structured SQL queries.

EIQ Adapters™ offer the advantages of the above-mentioned three conventional approaches to data integration (data lakes/warehouses, conventional adapters, and enterprise search), without the disadvantages associated with each of them. Not only do EIQ Adapters™ enable data and information (hereinafter collectively referred to as data) virtualization, federation, (virtual) integration and interoperability within and between organizations, but they also “turbocharge” existing data source access and add significant features, such as:

  • Advanced high performance query processing to support standard apps such as reporting, BI and analytics
  • Semi-automated data discovery, profiling and mapping to standard data views
  • Heuristic data mining
  • Added business objects and pre-joined, pre-aggregated and pre-calculated indexed views for direct queries and monitoring
  • Key performance indicators (KPIs) and key data monitoring and event processing using Business Process Management (BPM) software
  • Querying across and between multiple different types of data sources and organizations, including mainframes, databases, files, documents and e-mail
  • Link mapping and link analysis at a standard simple and complex entity-level (connecting seemingly unconnected data and information)

EIQ Federation Servers™

EIQ Adapters™ include a highly flexible middleware/sub-middleware product, called EIQ Federation Server™ that resides between applications/middleware and other EIQ Adapters™ and other EIQ Federation Servers™, to allow indefinite performance, scalability, load balancing, redundancy and failover. EIQ Adapters™ can apply business rules to enable the best combination of data for querying and pass back to applications. Applications and middleware access EIQ Adapters™ through standard ODBC, JDBC, REST APIs and a native driver. The following diagram, EID1, illustrates where EIQ Adapters™ reside in the application/data source architecture:


EID1: EIQ Adapters™ reside between applications/middleware and data sources.

EIQ Adapter™ Process In Detail

There are some similarities to conventional approaches, but the differences become clear as the process is broken down:

  • Automatically discover data sources using ODBC calls, if available, and/or other means.
  • Automatically profile data using raw indexes, which consist of trees and lists (see Technologies for more information). The trees are histograms of domain values and are used as data profiles. The lists do not need to be retained.
  • Develop data quality transforms, which cleanse, transform and standardize data read for building and maintaining indexes. The data profiles allow a “before” and “after” iteration process.
  • Read source data using one or more of twelve ways.
  • Clean, transform and standardize data used for indexes.
  • Build and maintain production indexes, usually in near real-time, using the same index schemas as data sources – no major schema transforms.
  • Discard data.
  • Map Standard Data Views to EIQ Indexes™. More than one standard data view can be mapped.
  • Applications/middleware access EIQ Products™ through standard drivers and/or Web services.
  • Present a single virtual indexed view of all data sources as if a single database to applications.
  • Execute advanced, high performance SQL queries for structured database queries and unstructured text search almost entirely in the EIQ Indexes™.
  • Generate a list of pointers, URLs, RDFs, etc., to raw results data in data sources.
  • Use pointers to retrieve raw results data from data sources through user-level access with appropriate authentication and security.
  • Clean, transform and standardize raw results data.
  • Consolidate, or otherwise post-process, results data from multiple data sources.
  • Present results to applications/middleware in any format.
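The core of the steps above (read and clean, index, discard, query the index, follow pointers back to the source, clean the results) can be sketched as follows. This is a simplified illustration under stated assumptions, not the EIQ engine: an in-memory SQLite table stands in for a data source, a plain dictionary stands in for an external index, and the `clean` transform, table name and column names are hypothetical.

```python
import sqlite3

def clean(value: str) -> str:
    """Hypothetical data quality transform: trim, collapse spaces, upper-case."""
    return " ".join(value.split()).upper()

# Stand-in for a data source; a real source could be a file or a legacy system.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT, name TEXT)")
source.executemany(
    "INSERT INTO customers (id, city, name) VALUES (?, ?, ?)",
    [(1, " dallas ", "Acme Corp"), (2, "Austin", "Bolt LLC"), (3, "DALLAS", "Cog  Inc")],
)

# Read, clean and index. Only cleaned keys and row pointers are kept;
# the values read from the source are discarded after indexing.
city_index = {}
for row_id, city in source.execute("SELECT id, city FROM customers"):
    city_index.setdefault(clean(city), []).append(row_id)

def query(city: str) -> list:
    """Resolve the query in the index, then fetch only the matching rows."""
    pointers = city_index.get(clean(city), [])
    if not pointers:
        return []                            # answered without touching the source
    placeholders = ",".join("?" * len(pointers))
    rows = source.execute(
        f"SELECT id, city, name FROM customers WHERE id IN ({placeholders}) ORDER BY id",
        pointers,
    ).fetchall()
    # Clean, transform and standardize the raw results before presenting them.
    return [(row_id, clean(c), clean(n)) for row_id, c, n in rows]

results = query("Dallas")
```

The point of the design in this sketch is that the search condition is evaluated against the external index, so the source only receives a simple retrieval by pointer, and the raw values it returns are standardized on the way out.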

The above process is illustrated in some detail in the following diagram EID2:

EID2: EIQ SuperAdapter process in detail.

EIQ Adapter Technology Key Points

  1. Virtual, private CLOUD-like access through multiple distributed universal and uniform INDEXES and query engines for structured queries and text search on ALL data sources, from mainframes to Web documents to email – seen as one consistent data source conforming to one or more standard data models.
  2. Data warehouse query and results quality, and performance, due to:
    • Cleansed, transformed and standardized indexes (and result sets)
    • Multiple indexes available, including fuzzy, pre-aggregated and pre-calculated, join and Link Indexes™
    • Complete control over query processing
    • Combined structured queries and text search
  3. Obtain results when data sources are unavailable through index inversion.
  4. Archive options available through (i) storing changed data in a separate database, (ii) date-time stamping indexes and/or (iii) embedding changed data in indexes.
  5. Leave source data in place in original format and index schemas as per original data source – only INDEXES mapped to standard data models.
  6. Continue to use legacy applications and data, but allow for:
    • Modern application access to legacy data
    • Legacy application access to modern databases
    • Use as a bridge/transition/migration tool from legacy to modern systems
  7. Scale through non-tiered and/or multi-tiered, independently maintained indexes with single-point for an application and/or multi-point access for middleware, including Web services and XML.
  8. High performance queries using PostgreSQL and variants, and scalable configuration and deployment options, e.g., Containers and Kubernetes.
  9. Near real-time updates immediately available to queries and monitoring for alerts/notifications – watch-lists, event-triggers, subscriptions and thresholds.
  10. Create almost any and all indexes, such as pre-aggregated, pre-calculated, composite, fuzzy, advanced text, Link Indexes™ and Embedded Value Indexes™, which allow data and metadata to be stored in indexes.
  11. Using a combination of indexes and the appropriate standard data model terms, create VIRTUAL data warehouses/data marts for multi-dimensional queries.
  12. The Link Indexes™ add-on option enables link mapping, which allows for more advanced queries such as complex joins, degrees of separation, physical, logical and ontology model discovery, merge, combination, validation and presentation, and link analysis. The combination of normal content indexes and Link Indexes™ externally manages obvious/direct links and reveals non-obvious/indirect links among disparate data and information across multiple disparate data sources.
  13. Other add-on options include open source or commercial categorization software, WhamTech’s eDiscovery tool, Teracase™, Intelligent Search’s fuzzy match, open source GATE entity extraction and WhamTech’s advanced text search.
  14. Significant benefits compared to conventional adapters in federated systems, data warehouse or enterprise search – combines the best aspects of these approaches and overcomes the worst.
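To make the degrees-of-separation idea from the link mapping point above concrete, here is a small sketch. It does not describe how Link Indexes™ are actually stored or queried; it assumes a link index can be modeled as an undirected graph of entity identifiers, and the identifiers and pairs below are invented for illustration. A breadth-first search then gives the smallest number of links between two entities held in different sources.

```python
from collections import defaultdict, deque
from typing import Optional

# Hypothetical links between entities found in different data sources.
link_pairs = [
    ("crm:person/17", "billing:account/903"),
    ("billing:account/903", "email:thread/5"),
    ("email:thread/5", "crm:person/42"),
    ("crm:person/88", "billing:account/120"),
]

# Assumed in-memory model of a link index: entity id -> directly linked ids.
link_index = defaultdict(set)
for a, b in link_pairs:
    link_index[a].add(b)
    link_index[b].add(a)

def degrees_of_separation(start: str, goal: str) -> Optional[int]:
    """Return the fewest links between two entities, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for neighbor in link_index.get(node, ()):
            if neighbor == goal:
                return depth + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return None

connected = degrees_of_separation("crm:person/17", "crm:person/42")
unconnected = degrees_of_separation("crm:person/17", "crm:person/88")
```

In this example the two CRM persons share no direct link, yet the search finds an indirect path through a billing account and an email thread, which is the kind of non-obvious connection the text refers to.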
