The present embodiments relate to complex file structures and associated file storage. More specifically, the embodiments relate to utilizing the complexity of a storage array together with decomposition of complex files to manage the files.
A computer file is a self-contained collection of data or information available to an operating system and to computer programs. Different types of files store different types of information. Conventionally, a file has been treated as a raw sequence of bytes, characters, or records interpreted solely by a user application. A traditional file system likewise views files as raw byte sequences and splits them into blocks or extents, which are then stored on an underlying storage device. From the point of view of a user or an application, however, files can be considered containers of smaller logical objects.
Recently, file formats have grown in complexity. Examples of complex formats include, but are not limited to, compressed and uncompressed archives, compressed sets of assorted objects, tables, indexes, and integrated video and audio streams. The majority of modern files can be considered containers of smaller logical objects, as opposed to a single homogeneous stream of bytes. These internal objects exhibit diverse properties, access patterns, and performance requirements. Traditional file systems, however, do not look inside files to address these complexities. Rather, they view each file as a raw byte sequence that is split or otherwise separated into blocks or extents, which are then stored on an underlying data storage device. This approach simplifies file system design but misses significant opportunities for improving file system management efficiency.
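By way of illustration only, the contrast between the two views of a file may be sketched as follows. The sketch is not part of the embodiments; it assumes a ZIP archive as the complex file, a 4096-byte block size, and hypothetical helper names (build_sample_archive, block_view, object_view) chosen solely for this example.

```python
import io
import zipfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def build_sample_archive() -> bytes:
    """Create a small in-memory archive holding dissimilar objects."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("index.csv", "id,name\n1,alpha\n2,beta\n")
        zf.writestr("notes.txt", "plain text " * 500)
        zf.writestr("frame.bin", bytes(range(256)) * 40)
    return buf.getvalue()


def block_view(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Traditional view: split the raw byte sequence into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def object_view(data: bytes) -> list[tuple[str, int, int]]:
    """Container view: list each internal object with its sizes in bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in zf.infolist()
        ]


if __name__ == "__main__":
    archive = build_sample_archive()
    print("blocks:", [len(b) for b in block_view(archive)])
    for name, size, packed in object_view(archive):
        print(f"object {name}: {size} bytes ({packed} compressed)")
```

In the block view, a block boundary may fall in the middle of an internal object, and nothing in a block identifies which object its bytes belong to. The object view exposes the names and sizes of the internal objects, which is the kind of information a byte-oriented file system does not consult.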