The present invention relates to accessing data in a file.
Data files and file systems are abstractions used to organize and access data. A file is a set of data that can be accessed by a program or application through the file system of a particular operating system. Most conventional file systems have a naming system for files, and organize the named files in named folders or directories, each of which can, in turn, be included in higher-level directories. Thus, most file systems store data in files and organize the files in a directory hierarchy.
The data in a file are typically delivered to or created for an application in response to a request from the application. The application program interface (API) of some file systems allows applications to retrieve particular bytes or combinations of bytes of data in a file, even when the data are streamed over a network connection. For example, the application can use a seek function to find a particular location and then sequentially read data for some interval. Such APIs provide what is typically called random access, and allow retrieval of data from anywhere in a file. Other mechanisms for accessing data in a file, for example sequential access, normally are implemented on top of a random access mechanism.
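The seek-then-read pattern described above can be illustrated with a minimal sketch using Python's built-in file objects. The helper name read_range and the sample file contents are assumptions introduced only for illustration; they are not part of any particular file system API.

```python
import os
import tempfile

def read_range(path, offset, length):
    """Return `length` bytes starting at byte `offset` of the file at `path`.

    Hypothetical helper: seek to a location, then read sequentially.
    """
    with open(path, "rb") as f:
        f.seek(offset)          # jump directly to the requested location
        return f.read(length)   # read sequentially for the requested interval

# Build a small sample file (contents are illustrative only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789abcdefghij")
    sample_path = tmp.name

# Retrieve 5 bytes starting at byte offset 10 without reading the first 10.
chunk = read_range(sample_path, 10, 5)
os.remove(sample_path)
```

Only the requested interval is read; the bytes that precede the offset are never delivered to the caller.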
Whether the retrieved data are decipherable by the application can depend, however, on the particular application and the format of the file. Data that are stored in standardized formats such as ASCII text are readily accessible. For example, such data can be deciphered, or processed, by any word processor or text editor and, if streamed over a network, can be processed as they are received.
When data are delivered by streaming, some file formats may require that the file data be provided in their entirety before any of the data in the file can be processed by a particular application. For example, some viewers for Adobe portable document format (PDF) files may require access to data at the end of the PDF file in order to present the first page of the PDF document. In this case, access to the first page is delayed until the entire file, including data that are not needed to present the first page, is retrieved.
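As an illustration of why the end of a file may be needed first, the following sketch locates the "startxref" entry that a PDF file carries near its end, which gives the byte offset of the cross-reference data. The function name find_startxref, the tail size, and the fabricated byte string are assumptions for illustration; the byte string is not a valid PDF, and a real viewer does considerably more.

```python
import io
import os
import re

def find_startxref(stream, tail_size=1024):
    """Return the offset recorded after the last 'startxref' keyword, or None.

    Hypothetical helper: only the final `tail_size` bytes are examined.
    """
    stream.seek(0, os.SEEK_END)
    size = stream.tell()
    stream.seek(max(0, size - tail_size))   # jump to the tail of the file
    tail = stream.read()
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    return int(matches[-1]) if matches else None

# Fabricated stand-in for a PDF: a header, filler bytes, and a trailer.
fake_pdf = io.BytesIO(b"%PDF-1.4\n" + b"x" * 5000 + b"\nstartxref\n4321\n%%EOF\n")
xref_offset = find_startxref(fake_pdf)
```

With a seekable source the tail can be read directly. When the same bytes arrive as a sequential stream, the trailer is the last thing delivered, which is the delay described above.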
The application of encryption to a file also can require that the entire file be retrieved before any portion of it can be accessed. That is, to find objects in a file that is encrypted as a block of data, the entire file must first be decrypted. The application of compression to a file can also limit access to the data in the file, because the file must be decompressed from its beginning until the desired data have been decompressed and can be located. Thus, in general, the user must choose between random access to data in a file and compression or encryption of the data in the file.
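The limitation that compression places on access can be sketched with Python's standard zlib module. In this minimal example, which assumes the whole file was compressed as a single zlib stream, reaching a byte range deep in the file requires decompressing everything that precedes it. The function name read_from_compressed, the chunk size, and the sample data are assumptions for illustration.

```python
import zlib

def read_from_compressed(blob, offset, length, chunk_size=64):
    """Return (data, consumed): `length` decompressed bytes at `offset`, and
    the number of compressed bytes that had to be processed to reach them.

    Hypothetical helper: decompression always starts at the beginning.
    """
    decompressor = zlib.decompressobj()
    out = bytearray()
    consumed = 0
    needed = offset + length
    while len(out) < needed and consumed < len(blob):
        piece = blob[consumed:consumed + chunk_size]
        consumed += len(piece)
        out += decompressor.decompress(piece)  # no way to skip ahead
    return bytes(out[offset:needed]), consumed

# Illustrative data: 10,240 bytes compressed as one stream.
original = bytes(range(256)) * 40
blob = zlib.compress(original)

# Request 16 bytes near the end of the uncompressed data.
data, consumed = read_from_compressed(blob, 9000, 16)
```

Although only 16 bytes are requested, the sketch must produce and hold at least 9,016 decompressed bytes before it can return them.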
Delays in the delivery of requested portions of encrypted, compressed, and specialized files can be significant if the file is large relative to the requested portion of the file or if the requested portion is close to the end of the file. In these cases and without random access, the user must wait for the delivery of relatively large portions of unrequested data. Any such delay can be exacerbated if the file is being delivered over a network, where the rate of delivery is limited by hardware capabilities and network traffic.