The present invention relates to systems, methods, and computer programs that ingest, archive, and retrieve data.
Storing and retrieving very large amounts of data, on the scale of terabytes and petabytes, raises fundamental problems. Rapidly ingesting large amounts of data into a queryable state is difficult. Organizing the data for retrieval, as conventional databases do, can create significant performance overhead, thereby limiting ingest rates.
Databases have also traditionally been limited in the amount of data they can store, and have had difficulty maintaining efficient searches of very large data collections or retrieving large amounts of data rapidly. Typically, aged data is removed from a database and archived to a separate system, such as an off-line tape library. Accessing the archived data can be slow.
In the case of tape archives, the correct tapes must be located, mounted, and read to on-line storage. Some of the recovered data may not be in a format compatible with the current database, and it may not be possible to merge the two data sources for unified queries. In very large systems, a significant amount of data may not be readily available for portions of the data lifecycle.
It would be desirable to be able to develop easily, by code generation, software code that ingests, queries, and retrieves data from storage systems. Heretofore, it has been necessary to create code templates that are replicated and then hand-edited to introduce parameter differences. This is tedious and time-consuming.
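By way of illustration only, the contrast between hand-edited template copies and code generation can be sketched as follows. This is a minimal sketch under stated assumptions and does not describe any particular implementation: the template text, the parameter names (record_name, delimiter, fields), and the two record specifications are hypothetical and chosen solely for the example. A single template is filled in from a list of parameter sets, so that each parameter difference is stated once as data rather than introduced by hand-editing a copy.

```python
from string import Template

# Hypothetical template for an ingest function. The placeholder names
# (record_name, delimiter, fields) are assumptions made for illustration.
INGEST_TEMPLATE = Template('''\
def ingest_${record_name}(line):
    """Parse one ${record_name} record into a dict."""
    values = line.rstrip("\\n").split(${delimiter})
    return dict(zip(${fields}, values))
''')


def generate_ingest_code(record_name, delimiter, fields):
    """Return Python source for one ingest function, built from the template."""
    return INGEST_TEMPLATE.substitute(
        record_name=record_name,
        delimiter=repr(delimiter),    # emitted as a string literal
        fields=repr(list(fields)),    # emitted as a list literal
    )


# Hypothetical parameter sets: one entry per record type, replacing
# one hand-edited copy of the template each.
SPECS = [
    {"record_name": "sensor", "delimiter": ",", "fields": ["ts", "id", "value"]},
    {"record_name": "log", "delimiter": "|", "fields": ["ts", "level", "msg"]},
]

# Compile the generated source so the functions can be called directly.
namespace = {}
for spec in SPECS:
    exec(generate_ingest_code(**spec), namespace)

sensor_row = namespace["ingest_sensor"]("1700000000,7,3.5\n")
log_row = namespace["ingest_log"]("1700000001|WARN|disk full\n")
```

In this sketch, supporting a further record type requires only another entry in SPECS; the template itself is not copied or edited.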
It would be desirable to have systems, methods, and computer programs that are able to ingest, correlate, archive, query, and retrieve very large amounts of complex data very quickly.