User search-engine behavior can be collected, stored, and used to improve the search-engine user experience and product quality. Such behavior includes queries or other user requests, query results and search-engine results pages, and search-engine user interaction information. One format used for logging the user interaction information is the JavaScript Object Notation (JSON) format. For example, a Search Activity Log contains one row per page view, and each row may include the user requests, the search-engine response, and the user interactions in or associated with the search-engine website or application.
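By way of illustration only, the one-row-per-page-view layout described above can be sketched as follows. The field names (`request`, `response`, `interactions`) and the sample values are hypothetical and are not taken from any actual Search Activity Log schema; the sketch assumes each row is a single JSON object serialized on one line.

```python
import json

# Hypothetical page-view records. Every field name and value here is an
# illustrative assumption, not an actual Search Activity Log schema.
page_views = [
    {
        "request": {"query": "weather seattle"},
        "response": {"results": [{"position": 1, "url": "https://example.com/forecast"}]},
        "interactions": [{"type": "click", "position": 1}],
    },
    {
        "request": {"query": "json format"},
        "response": {"results": [{"position": 1, "url": "https://example.com/json"}]},
        "interactions": [],
    },
]


def to_log_rows(views):
    """Serialize each page view as one compact JSON row (one line per view)."""
    return [json.dumps(view, separators=(",", ":"), sort_keys=True) for view in views]


def from_log_rows(rows):
    """Parse log rows back into nested Python objects."""
    return [json.loads(row) for row in rows]


rows = to_log_rows(page_views)
restored = from_log_rows(rows)
```

Each row keeps the nested request, response, and interaction structure of a single page view, which is the structural information that a compression scheme would need to preserve.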
The logs used to store the user interaction information can reach dozens of petabytes in size, and their growth rate may continue to increase over time. The logs can therefore be compressed to reduce storage usage. Previous solutions for log compression, including general-purpose compression schemes such as the Microsoft Xpress Compression Algorithm [MS-XCA], are not tailored for hierarchical index log compression and preserve neither the structural information nor the global indexes of the log.