Organizations often need to store and manage large amounts of digital data. Such data may belong to different users and/or be stored in different locations on various storage systems. To provide storage resources to the organization's users, an organization may utilize a storage area network (SAN), any number of different storage appliances (e.g., filers), and/or other means. For example, an organization may purchase and configure various commercially available filer appliances to provide storage for the organization's users to access from across a network.
In addition to storing and providing access to digital data, organizations often need to track, record, and/or analyze patterns of access to the stored data. For example, an organization may need to track access to conform to government regulations, to implement charge-back based on actual usage, to identify owners of given content, or to perform various other functions.
To track data access patterns, organizations may implement a data access tracking system. For example, a traditional access tracking system may collect access event logs from different filers (e.g., using filer-specific APIs) and record the events in a traditional, general-purpose database, such as a managed relational database. Once the events are stored in the database, an administrator can retrieve them using the querying mechanisms that the database provides (e.g., those of a relational database management system, or RDBMS). For example, a traditional RDBMS may allow the administrator to query the data using a query language, such as SQL.
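As an illustration only, the traditional approach described above might be sketched as follows. The sketch assumes Python's built-in sqlite3 module as a stand-in for a general-purpose relational database; the table name access_events, its columns, the sample events, and both helper functions are hypothetical and are not taken from any particular filer or tracking product.

```python
import sqlite3

# Hypothetical access events, shaped as a tracking system might
# normalize them after collecting logs from different filers.
EVENTS = [
    ("2024-01-01T09:00:00", "alice", "/vol1/report.doc", "read"),
    ("2024-01-01T09:05:00", "bob", "/vol1/report.doc", "write"),
    ("2024-01-01T09:10:00", "alice", "/vol2/budget.xls", "read"),
    ("2024-01-01T09:15:00", "alice", "/vol1/report.doc", "read"),
]


def build_event_db(events):
    """Record the collected events in an in-memory relational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE access_events "
        "(ts TEXT, username TEXT, path TEXT, op TEXT)"
    )
    conn.executemany("INSERT INTO access_events VALUES (?, ?, ?, ?)", events)
    conn.commit()
    return conn


def accesses_per_user(conn):
    """Run an SQL aggregate of the kind an administrator might issue."""
    cursor = conn.execute(
        "SELECT username, COUNT(*) FROM access_events "
        "GROUP BY username ORDER BY username"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    conn = build_event_db(EVENTS)
    print(accesses_per_user(conn))
```

With only a handful of rows such a query is trivial. The sketch is meant to show the store-then-query pattern, not the behavior of a production system holding a large event history.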
One shortcoming of storing large volumes of data access events in a traditional database is performance. The amount of access event data that an organization tracks can grow very quickly over time. As the amount of data grows, performance limitations of traditional databases can make querying, indexing, and/or otherwise maintaining the data prohibitively expensive and/or slow.