The background description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
The evolution of computing and networking technologies has made data collection, storage, and analysis increasingly easy to perform, and at a continuously larger scale. The ever-decreasing size of network-capable computing devices has increased the number of data sources gathering or creating data, the types of data available from these sources, and the overall amount of data available. Likewise, advancements in data communications and storage technologies have enabled entities interested in data from these sources to collect increasingly larger amounts of data in databases or other data stores. This exponential growth in digitized data generation and collection continues to be fueled by machine-generated data originating from devices such as sensors and probes that can monitor, measure and assess the health, behavior, state, environment and performance of many types of machines and man-made systems, as well as humans and many aspects of the natural world.
The collection of data on such a scale allows for analysis that can result in discoveries and advancements, across various fields of study, which were not previously possible. For example, a medical researcher can use medical information gathered by wearable devices or sensors outside of a hospital setting to analyze health or medical patterns across a population. In another example, an advertiser can use online behaviors of a population of users to determine product trends, interests and advertisement effectiveness within a population.
However, certain data-generating devices such as sensors or probes are capable of generating data flows that, while digital, reflect their analog nature (and, moreover, are often non-linear) or simply cannot be classified as symbolic and human-readable.
To explore, discover and extract pertinent information from these new data flows, an incremental process is required that allows for starting at a state where very little is known about the data, and provides for development towards a full data model at both the data consumption side and the data repository level. The complexity of this task requires methods far beyond simple numeric comparison and/or textual search. For example, rather than being processed before storage (which results in loss of information), signal data should be stored as it is, and signal processing techniques (e.g., FFT) should then be used to extract a relevant view of the signal. This process is recursive in nature and, as such, the meaning of the data (e.g., its classification, categorization, segmentation, etc.) cannot be decided a priori.
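The store-raw-then-derive approach described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory archive (`archive`), a hypothetical sensor name (`sensor-1`), and a synthetic 5 Hz test signal; the small radix-2 FFT is written out only to keep the example self-contained and is not drawn from any particular system. The raw samples are never modified; the dominant frequency is just one view derived from them on demand.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(raw, sample_rate_hz):
    """Derive one view (strongest non-DC frequency) without altering `raw`."""
    spectrum = fft(raw)
    half = len(raw) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate_hz / len(raw)


# Hypothetical raw archive: 64 samples of a 5 Hz sine sampled at 64 Hz,
# stored exactly as produced (an assumption made for illustration).
SAMPLE_RATE_HZ = 64
raw_signal = [math.sin(2 * math.pi * 5 * t / SAMPLE_RATE_HZ) for t in range(64)]
archive = {"sensor-1": raw_signal}

# The view is computed at query time; other views can be derived later
# from the same untouched samples.
view = dominant_frequency(archive["sensor-1"], SAMPLE_RATE_HZ)
```

Because the archive retains the original samples, a different view (for example, a different frequency band) could be extracted later without any loss of information.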
In addition, given the rapidly widening gap between digital data production capabilities and network bandwidth capacity (at any scale), it becomes imperative to store the source data close to its point of production and to distribute across the network only the data relevant to the task at hand.
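As a rough illustration of keeping source data near its point of production, the following sketch uses a hypothetical `EdgeNode` class, a hypothetical probe name, and an assumed relevance test (readings above 30.0). None of these names come from the description above; the sketch only shows that the full set of readings stays local while a task-specific subset is selected for transmission.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeNode:
    """Hypothetical node that keeps raw readings where they are produced."""
    name: str
    readings: list = field(default_factory=list)

    def record(self, timestamp, value):
        # Every reading is retained locally, unfiltered.
        self.readings.append((timestamp, value))

    def select(self, predicate):
        """Return only the readings relevant to the task; the rest stay local."""
        return [r for r in self.readings if predicate(r)]


node = EdgeNode("probe-7")
for t, v in enumerate([20.1, 20.3, 35.8, 20.2, 36.4]):
    node.record(t, v)

# Task at hand (an assumption for illustration): readings above 30.0.
to_send = node.select(lambda r: r[1] > 30.0)
```

Here only two of the five readings would cross the network, while all five remain available at the node for later, differently defined tasks.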
Unfortunately, existing data management solutions (e.g., relational databases, non-relational databases, data stores, and other data collection techniques) have traditionally required static, pre-defined database structures, rules and schema that are created for the database when the database is established. As such, users requesting the data are limited to data access according to a static schema (which may be outdated), from a database whose structure might be inefficient and costly. Additionally, updating the database structure, rules or schema in existing solutions requires re-starting the database from scratch.
Others have put forth efforts towards adaptive database systems. For example, U.S. Pat. No. 5,983,218 to Syeda-Mahmood is directed to modifying a relevance ranking of databases based on query and response patterns for the databases. However, Syeda-Mahmood lacks any discussion of a modification of the databases themselves.
United States pre-grant publication number 2011/0282872 to Oksman, et al (“Oksman”) is directed to updating a system to increase the effectiveness of future queries. However, in Oksman, the system's updating is performed based on usage of query results or other feedback to the query results, rather than based on the results themselves. Similarly, United States pre-grant publication number 2012/0296743 to Velipasaouglu, et al (“Velipasaouglu”) is directed to updating a database based on a query and a user's activity following a query response.
United States pre-grant publication number 2007/0294266 to Chowdhary, et al (“Chowdhary”) is directed to using time-variant data schemas for database management based on database modification requests. However, in Chowdhary, the system simply stores new versions of a schema along with older versions. Additionally, Chowdhary lacks any discussion regarding using query responses to generate new or updated versions of data schema.
All publications identified herein are incorporated by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Where a definition or use of a term in an incorporated reference is inconsistent or contrary to the definition of that term provided herein, the definition of that term provided herein applies and the definition of that term in the reference does not apply.
The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
In some embodiments, the numbers expressing quantities of ingredients, properties such as concentration, reaction conditions, and so forth, used to describe and claim certain embodiments of the invention are to be understood as being modified in some instances by the term “about.” Accordingly, in some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable. The numerical values presented in some embodiments of the invention may contain certain errors necessarily resulting from the standard deviation found in their respective testing measurements.
Unless the context dictates the contrary, all ranges set forth herein should be interpreted as being inclusive of their endpoints and open-ended ranges should be interpreted to include only commercially practical values. Similarly, all lists of values should be considered as inclusive of intermediate values unless the context indicates the contrary.
As used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
The recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g. “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.
Groupings of alternative elements or embodiments of the invention disclosed herein are not to be construed as limitations. Each group member can be referred to and claimed individually or in any combination with other members of the group or other elements found herein. One or more members of a group can be included in, or deleted from, a group for reasons of convenience and/or patentability. When any such inclusion or deletion occurs, the specification is herein deemed to contain the group as modified thus fulfilling the written description of all Markush groups used in the appended claims.
Thus, there is still a need for a system that can dynamically adapt the structure, schema and/or metadata of its data archives.