Many organizations, such as hospitals or other health care providers, have a recurring need to analyze their data, which may be stored at several locations across disparate resources (e.g., with the hospital example, separate databases for financial information, supply information and clinical information). A service provider such as the assignee of this invention can provide a service to such organizations whereby the service provider collects this data and then houses the data in a normalized data warehouse for improved customer access and analysis.
The data collection effort involves extracting desired data from the appropriate data sources within the organization. To improve the process of data extraction from client systems, the inventors herein disclose a technique for remotely managing the queries and connection strings that are used during the extraction process from client systems. By remotely managing these queries and connection strings, users on the client side of the system are relieved of much of the burden that past extraction systems have imposed upon them.
Thus, in accordance with an exemplary aspect of the disclosure, the inventors disclose a computer-implemented data extraction method comprising (1) receiving, at a first computer system, a query string and a connection string from a second computer system, wherein the second computer system is remote from the first computer system, (2) connecting to a data source within the first computer system based on the received connection string, (3) querying the data source based on the received query string, the query string defining the data sought to be extracted and a translation of the data sought to be extracted from a format of the data source to a format of a destination, (4) receiving data from the data source in response to the query, (5) translating the received data to the format of the destination based on the received query string, (6) assembling the translated data into a data structure, and (7) sending the data structure to the destination, wherein the method steps are performed by a processor resident within the first computer system.
In accordance with another exemplary aspect of the disclosure, the inventors disclose a computer program product for data extraction comprising a plurality of instructions that are executable by a processor to (1) receive, at a first computer system, a query string and a connection string from a second computer system, wherein the second computer system is remote from the first computer system, (2) connect to a data source within the first computer system based on the received connection string, (3) query the data source based on the received query string, the query string configured to define the data sought to be extracted and a translation of the data sought to be extracted from a format of the data source to a format of a destination, (4) receive data from the data source in response to the query, (5) translate the received data to the format of the destination based on the received query string, (6) assemble the translated data into a data structure, and (7) send the data structure to the destination, wherein the plurality of instructions are resident on a non-transitory computer-readable storage medium.
In accordance with yet another exemplary aspect of the disclosure, the inventors disclose an apparatus for data extraction comprising a processor resident on a first computer system, the processor configured to (1) receive a query string and a connection string from a second computer system, wherein the second computer system is remote from the first computer system, (2) connect to a data source within the first computer system based on the received connection string, (3) query the data source based on the received query string, the query string configured to define the data sought to be extracted and a translation of the data sought to be extracted from a format of the data source to a format of a destination, (4) receive data from the data source in response to the query, (5) translate the received data to the format of the destination based on the received query string, (6) assemble the translated data into a data structure, and (7) send the data structure to the destination.
In accordance with yet another exemplary aspect of the disclosure, the inventors further disclose a computer-implemented data extraction method comprising a client data extractor (CDE) module executing on a first computer system to perform a data extraction from a database of the first computer system, wherein the CDE module executing step comprises (1) the CDE module determining whether a data extraction is to be performed, and (2) in response to determining that a data extraction is to be performed (i) the CDE module sending a request to a second computer system, (ii) obtaining configuration data from the second computer system in response to the sent request, the configuration data comprising a query string and a connection string for use in the data extraction, (iii) connecting to the database using the connection string, (iv) extracting data from the connected database using the query string, wherein the extracting step includes translating the extracted data from a format of the database to a format of a destination during extraction at a query level based on data within the query string, (v) assembling the extracted data into a data structure, and (vi) sending the assembled data structure to a destination.
In accordance with still another exemplary aspect of the disclosure, the inventors further disclose a system for data extraction, the system comprising (1) a first computer system, and (2) a second computer system for communication with the first computer system via a network, wherein the first computer system comprises a data source and a processor, the processor configured to execute a client data extractor (CDE) module to perform a data extraction from the data source, wherein the second computer system comprises at least one server and a memory, wherein the memory is configured to store configuration data in association with a plurality of identifiers, the configuration data comprising a plurality of query strings and a plurality of connection strings, a plurality of the query strings being configured to define (1) the data sought to be extracted and (2) a translation of the data sought to be extracted from a format of a data source to a format of a destination, wherein the CDE module is configured to (1) determine whether a data extraction is to be performed, and (2) in response to a determination that a data extraction is to be performed, send a request to the second computer system, the request comprising a request for configuration data and an identifier, wherein the at least one server is configured to (1) receive the request for configuration data, and (2) in response to the received request, automatically (i) access the memory to identify the configuration data associated with the identifier within the received request, and (ii) communicate the identified configuration data to the first computer system, wherein the CDE module is further configured to (1) obtain the communicated configuration data from the second computer system, the communicated configuration data comprising a query string and a connection string for use in the data extraction, (2) connect to the data source based on the connection string, (3) extract data from the connected data source based on the query string, wherein the extracting operation is configured to translate the extracted data from a format of the data source to a format of a destination during extraction at a query level based on data within the query string, (4) assemble the extracted data into a data structure, and (5) send the assembled data structure to the second computer system, and wherein the at least one server is further configured to (1) receive the sent data structure, and (2) in response to the received data structure, automatically store the extracted data within the received data structure in the memory.
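The exchange between the CDE module and the second computer system described above might look like the following sketch. It is an illustration under several assumptions rather than the disclosed system: the network is replaced by direct method calls, the data source is an SQLite file, the decision to extract is a simple elapsed-interval check, and the names `ConfigServer`, `ClientDataExtractor`, `tick`, and `client-42` are all hypothetical.

```python
import json
import os
import sqlite3
import tempfile
from datetime import datetime, timedelta


class ConfigServer:
    """Stand-in for the second computer system (hypothetical, in-process)."""

    def __init__(self, config_by_identifier):
        self.config_by_identifier = config_by_identifier  # the "memory"
        self.warehouse = []                                # stored extractions

    def get_config(self, identifier):
        # Identify the configuration data associated with the identifier.
        return self.config_by_identifier[identifier]

    def receive(self, payload):
        # Store the extracted data carried in the received data structure.
        self.warehouse.extend(json.loads(payload)["records"])


class ClientDataExtractor:
    """Stand-in for the CDE module on the first computer system."""

    def __init__(self, identifier, server, interval):
        self.identifier = identifier
        self.server = server
        self.interval = interval
        self.last_run = None

    def extraction_due(self, now):
        # Assumed policy: run once, then again after each interval elapses.
        return self.last_run is None or now - self.last_run >= self.interval

    def tick(self, now):
        if not self.extraction_due(now):
            return False
        config = self.server.get_config(self.identifier)     # request, obtain
        conn = sqlite3.connect(config["connection_string"])  # connect
        try:
            cursor = conn.execute(config["query_string"])    # extract
            names = [col[0] for col in cursor.description]
            records = [dict(zip(names, row)) for row in cursor]
        finally:
            conn.close()
        self.server.receive(json.dumps({"records": records}))  # assemble, send
        self.last_run = now
        return True


# Hypothetical client database.
db_path = os.path.join(tempfile.mkdtemp(), "supply.db")
seed = sqlite3.connect(db_path)
seed.execute("CREATE TABLE items (sku TEXT, qty INTEGER)")
seed.execute("INSERT INTO items VALUES ('A-100', 12)")
seed.commit()
seed.close()

server = ConfigServer({
    "client-42": {
        "connection_string": db_path,
        "query_string": "SELECT sku AS item_code, qty AS quantity FROM items",
    }
})
cde = ClientDataExtractor("client-42", server, interval=timedelta(hours=24))
start = datetime(2024, 1, 1, 2, 0)
first = cde.tick(start)                        # due: nothing has run yet
second = cde.tick(start + timedelta(hours=1))  # not due: interval not elapsed
```

The client holds only its identifier; the query string and connection string are fetched at extraction time, so they can change between runs without any change to the client code.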
In accordance with still another exemplary aspect of the disclosure, the inventors further disclose a computer-implemented method for remotely managing a data extraction, the method comprising (1) storing a data structure in a memory of a first computer system, the data structure comprising a plurality of query strings and connection strings, each query string and connection string being associated with an identifier, a plurality of the query strings being configured to define (i) the data sought to be extracted and (ii) a translation of the data sought to be extracted from a format of the database to a format of a destination, (2) receiving a request for a query string and a connection string from a second computer system, the second computer system being remote from the first computer system, and the received request including an identifier, (3) accessing the data structure to identify the query string and the connection string associated with the identifier included in the received request, and (4) communicating the identified query string and connection string to the second computer system for use by the second computer system to extract data from a database within the second computer system, and wherein the method steps are performed by a processor resident within the first computer system.
In accordance with yet another exemplary aspect of the disclosure, the inventors disclose a computer program product for remotely managing a data extraction, the computer program product comprising a plurality of instructions that are executable by a processor to (1) receive, at a first computer system, a request for a query string and a connection string from a second computer system, the second computer system being remote from the first computer system, and the received request including an identifier, (2) access a data structure in a memory of the first computer system, the data structure comprising a plurality of query strings and connection strings, each query string and connection string being associated with an identifier, a plurality of the query strings being configured to define (i) the data sought to be extracted and (ii) a translation of the data sought to be extracted from a format of the database to a format of a destination, (3) identify the query string and the connection string within the accessed data structure that are associated with the identifier included in the received request, and (4) communicate the identified query string and connection string to the second computer system for use by the second computer system to extract data from a database within the second computer system, and wherein the plurality of instructions are resident on a non-transitory computer-readable storage medium.
Moreover, in accordance with yet another exemplary aspect of the disclosure, the inventors disclose an apparatus for remotely managing a data extraction, the apparatus comprising (1) a memory for storing a data structure, the data structure comprising a plurality of query strings and connection strings, each query string and connection string being associated with an identifier, a plurality of the query strings being configured to define (i) the data sought to be extracted and (ii) a translation of the data sought to be extracted from a format of the database to a format of a destination, and (2) a processor for communication with the memory, the processor configured to (i) receive a request for a query string and a connection string from a remote computer system, the received request including an identifier, (ii) access the data structure in the memory to identify the query string and the connection string associated with the identifier included in the received request, and (iii) communicate the identified query string and connection string to the remote computer system for use by the remote computer system to extract data from a database within the remote computer system.
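On the managing side, the data structure that associates query strings and connection strings with identifiers can be illustrated with a small keyed store. The sketch below is hypothetical throughout: `ExtractionConfig`, `ConfigStore`, `handle_request`, the identifiers, the response fields, and the example query and ODBC-style connection strings are invented for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionConfig:
    query_string: str
    connection_string: str


class ConfigStore:
    """Hypothetical in-memory data structure keyed by identifier."""

    def __init__(self):
        self._by_identifier = {}

    def put(self, identifier, config):
        # Storing under an existing identifier replaces the old entry, which
        # is how a query could be changed without touching the client.
        self._by_identifier[identifier] = config

    def handle_request(self, request):
        # (i) receive the request, (ii) identify the associated strings,
        # (iii) return them for communication to the remote system.
        config = self._by_identifier.get(request["identifier"])
        if config is None:
            return {"status": "unknown_identifier"}
        return {
            "status": "ok",
            "query_string": config.query_string,
            "connection_string": config.connection_string,
        }


store = ConfigStore()
store.put("hospital-a/finance", ExtractionConfig(
    query_string="SELECT acct_no AS account_number FROM ledger",
    connection_string="Driver={SQL Server};Server=fin-db;Database=ledger;",
))
before = store.handle_request({"identifier": "hospital-a/finance"})

# Remote update: the next request for the same identifier gets the new query.
store.put("hospital-a/finance", ExtractionConfig(
    query_string="SELECT acct_no AS account_number, bal AS balance FROM ledger",
    connection_string="Driver={SQL Server};Server=fin-db;Database=ledger;",
))
after = store.handle_request({"identifier": "hospital-a/finance"})
missing = store.handle_request({"identifier": "hospital-b/finance"})
```

How an unknown identifier should be handled is not specified in this section; returning a status field is simply one plausible choice for the sketch.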
Through the soft configuration techniques disclosed herein, customers can be insulated from the myriad of connections and queries that are needed to support desired data extractions. That is, hard configurations can be avoided, and embodiments of the disclosure can leverage existing hardware on the customer's computer system without requiring additional software beyond the extraction software described herein. As such, queries and connections can be managed remotely from the customer and data extractions can be updated with new queries and connection strings without the customer needing to reinstall new software or make similar changes.
Furthermore, by performing source-to-destination translation at the query level, embodiments of the disclosure can avoid the need for separate data mapping components and improve the efficiency and flexibility of extraction.
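As an illustration of translation at the query level, the query string itself can rename columns, remap coded values, reformat dates, and convert units, so that rows come back already in the destination format. The sketch below assumes an SQLite source; the table `visits`, its columns, and the destination field names are hypothetical.

```python
import sqlite3

# Hypothetical source table using the source system's own conventions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (pt_sex TEXT, adm_dt TEXT, wt_lb REAL)")
conn.execute("INSERT INTO visits VALUES ('F', '03/15/2024', 150.0)")

# The query string defines both what to extract and how to translate it:
# code remapping (CASE), date reformatting (substr), unit conversion, and
# destination column names (AS aliases).
query_string = """
SELECT
    CASE pt_sex WHEN 'F' THEN 'female' WHEN 'M' THEN 'male' ELSE 'unknown' END
        AS gender,
    substr(adm_dt, 7, 4) || '-' || substr(adm_dt, 1, 2) || '-' || substr(adm_dt, 4, 2)
        AS admitted_on,
    round(wt_lb * 0.45359237, 1) AS weight_kg
FROM visits
"""

cursor = conn.execute(query_string)
columns = [col[0] for col in cursor.description]
translated = [dict(zip(columns, row)) for row in cursor]
conn.close()
```

No mapping step runs after the query returns; changing the destination format means changing only the query string.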
These and other features and advantages of preferred embodiments of the present invention will be apparent to those having ordinary skill in the art upon review of the specification and drawings contained herein.