Hierarchical data encoding formats such as JSON and XML are often used to represent different forms of data. Data-intensive systems and applications rely heavily on such formats to function.
For example, key-value databases and Big Data analytics frameworks often use hierarchical data encoding formats to represent semi-structured data. Cloud-based and web technologies also use hierarchical data encoding formats such as JSON as the main encoding format in client-server communications as well as in microservices applications.
In the majority of scenarios where hierarchical data encoding formats are employed, they are used as the boundary between a data source and a language runtime such as a JavaScript/Node.js virtual machine. Interactions between the language runtime and data source can often become a performance bottleneck for applications that need to produce or consume significant amounts of hierarchically marked-up data.
If the hierarchically marked-up data resides in a data source that is external to the memory space of the language runtime, the language runtime must materialize the data in its language-private heap memory space before consuming the data.
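As a minimal illustration of this materialization step, the following Python sketch decodes a raw byte buffer with the standard json module. Python stands in here for any dynamic language runtime, and the payload, field names, and helper function are hypothetical. The entire document is converted into heap-allocated runtime objects even though the application reads only a single field.

```python
import json

# Hypothetical payload as it might arrive from an external data source
# (file, socket, database page): raw bytes, not yet runtime objects.
raw = b'{"id": 7, "user": {"name": "ada", "tags": ["a", "b", "c"]}, "total": 12.5}'

def count_materialized(value):
    """Count the containers and leaf values created for a decoded document
    (dictionary keys are not counted)."""
    if isinstance(value, dict):
        return 1 + sum(count_materialized(v) for v in value.values())
    if isinstance(value, list):
        return 1 + sum(count_materialized(v) for v in value)
    return 1

# json.loads decodes the whole document into dicts, lists, strings and
# numbers on the runtime heap before any field can be accessed.
doc = json.loads(raw)

# The application only needs one field...
total = doc["total"]

# ...but every value in the document was materialized anyway.
materialized = count_materialized(doc)
```

In this sketch the nested user object and its tags list are decoded and allocated even though they are never read.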
Additionally, encoding and decoding libraries in modern language runtimes often rely on general-purpose techniques that do not consider the structure of the data that they are manipulating. The adoption of such general-purpose libraries is mostly motivated by the fact that hierarchical data encoding formats are used in the context of dynamic languages such as JavaScript or Python, where it is not possible to know in advance the characteristics of the hierarchical data that will be processed by the application.
In other words, such applications do not use a pre-defined schema that could be used to speed up data access. However, the lack of a pre-defined schema does not necessarily imply that no form of structure emerges in the way hierarchical data is created or accessed at runtime. Very often, a hidden schema may exist at runtime for applications written in dynamic languages.
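The notion of a hidden schema can be illustrated with a short Python sketch. It is only an assumption-laden example, not a description of any particular optimization: the records, the shape helper, and the idea of counting shapes are all hypothetical. No schema is declared anywhere, yet most decoded records turn out to share the same set of fields and value types.

```python
import json
from collections import Counter

def shape(value):
    """Return a hashable description of a decoded value's structure:
    field names and value types, ignoring the concrete values."""
    if isinstance(value, dict):
        # Sort by key so that field order in the input does not matter.
        return tuple((k, shape(v)) for k, v in sorted(value.items()))
    if isinstance(value, list):
        # Describe a list by the distinct shapes of its elements.
        return ("list", tuple(sorted({shape(v) for v in value}, key=repr)))
    return type(value).__name__

# Hypothetical stream of records; the application declares no schema.
lines = [
    '{"id": 1, "name": "ada", "score": 9.5}',
    '{"name": "bob", "id": 2, "score": 7.0}',
    '{"id": 3, "name": "eve", "score": 8.25}',
    '{"id": 4, "error": "timeout"}',
]

# Count how often each shape is observed while the records are decoded.
shapes = Counter(shape(json.loads(line)) for line in lines)

# The most frequent shape acts as the hidden schema for this workload.
dominant, hits = shapes.most_common(1)[0]
```

Here three of the four records share the shape with fields id, name, and score, so a runtime that observed this pattern could in principle specialize its handling of that shape while still accepting the outlier.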
Based on the foregoing, an approach for generating optimizations for applications that present hidden schemas at runtime is desired.