Some embodiments of the present disclosure are directed to an improved approach for implementing query-level access to external petabyte-scale distributed file systems.
With the explosion of online accessible data comes the need for huge data repositories. Such repositories are augmented daily, and many have aggregate sizes in the petabyte (and larger) range. Although a huge amount of data can be made accessible, it is often the case that only a portion of that data is needed for any particular application or analysis. It is also often the case that the particular application or analysis is conveniently implemented in a database engine (e.g., an Oracle™ system). Accordingly, it would be convenient to access such huge data (e.g., as stored in an external “big data appliance”) from within such a database engine.
Legacy approaches have partially addressed the function of query-level access to external data through use of a query language construction called “external tables”. Legacy implementations of external table constructions in a database engine query have provided the limited function of importing data from an external system and bringing it into storage locations within the database engine. While this technique is convenient for importing modestly-sized datasets, or even large datasets, from a location outside of a database engine into standard database engine tables, use with petabyte-sized data stores introduces new problems to be solved. Indeed, using legacy techniques, it can be impractical to import the petabyte or larger datasets that are common in big data appliances.
Certain legacy approaches have been touted. One approach is to build an application that reads from the big data appliance (e.g., a Hadoop file system file) and writes the contents of the file to a disk that is accessible to the database engine. The database engine can then map the file on disk to an external table. This technique imposes additional IO overhead that worsens as the size of the file on disk grows. In some cases, the IO overhead increases by a factor of three. That is, data is read from the external big data repository, then written to a local disk, and then read from that disk when it is processed by the database engine via a query (e.g., using the aforementioned external table construction). Very often big data appliance files are large, so this IO overhead is substantial. Similarly, big data appliance files are often so large (e.g., multiple petabytes and larger) that it is impractical to host an entire copy of the big data within the database engine (e.g., whose storage is usually smaller than petabytes).
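As an illustration only, the factor-of-three overhead described above can be sketched with simple byte accounting. The function names below are hypothetical. The sketch assumes that each pass (a read from the appliance, a write to local disk, and a read of the local copy by the database engine) moves the whole file once; treating the database engine's read of the local copy as the third pass is an assumption about how the factor of three arises.

```python
def staged_io_bytes(file_size_bytes: int) -> dict:
    """Bytes moved when a file is first copied to local disk, then queried.

    Assumption: each of the three passes touches every byte of the file once.
    """
    return {
        "read_from_appliance": file_size_bytes,      # staging application reads the file
        "write_to_local_disk": file_size_bytes,      # staging application writes a local copy
        "read_by_database_engine": file_size_bytes,  # external table scan of the local copy
    }


def direct_io_bytes(file_size_bytes: int) -> dict:
    """Bytes moved if the file could be read once, with no staging copy."""
    return {"read_from_appliance": file_size_bytes}


def overhead_factor(file_size_bytes: int) -> float:
    """Ratio of staged IO volume to single-pass IO volume."""
    if file_size_bytes <= 0:
        raise ValueError("file_size_bytes must be positive")
    staged = sum(staged_io_bytes(file_size_bytes).values())
    direct = sum(direct_io_bytes(file_size_bytes).values())
    return staged / direct


if __name__ == "__main__":
    one_petabyte = 10**15
    print(overhead_factor(one_petabyte))
```

Under these assumptions the ratio does not depend on file size, while the absolute number of extra bytes moved grows linearly with the file, which is consistent with the overhead becoming substantial for very large files.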
Another approach is to rely on data storage from within a user space (e.g., using a Linux File System in User Space). This technique spoofs files so they appear as local files. This approach requires operating system “root” administration privilege to install. This approach also incurs a substantial performance penalty, at least because it requires software layers between the user space and the database engine, which in turn incurs switching between software layers. Moreover, big data appliance files are often so large that it is expensive and/or impractical to replicate an entire copy of the big data. Further, legacy techniques often require buffer copies between buffers resident in one environment (e.g., within a Java virtual machine) and buffers resident in another environment (e.g., within a C implementation above the Java virtual machine).
Techniques are needed to provide database engine access to petabyte-scale files while avoiding the aforementioned impracticalities and performance impacts.