Data streaming allows data to be obtained from storage on an as-needed basis. In data streaming, data is requested from a storage system, for example a file system or a database system. Chunks of data are obtained sequentially until the request is fulfilled. Typically, each chunk of data in the sequence includes a specified number of bytes. Thus, conventional data streaming typically fetches equal-sized chunks of data in order until sufficient data has been obtained to fulfill the request.
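The chunked fetching described above can be illustrated with a minimal sketch. It assumes an in-memory byte stream standing in for the storage system; the function name `stream_bytes` and the chunk size of 4 bytes are hypothetical choices for illustration only.

```python
import io


def stream_bytes(source, total_bytes, chunk_size=4):
    """Fetch chunks of chunk_size bytes, in order, until total_bytes are read.

    The final chunk may be shorter when total_bytes is not a multiple of
    chunk_size or when the storage runs out of data.
    """
    chunks = []
    remaining = total_bytes
    while remaining > 0:
        chunk = source.read(min(chunk_size, remaining))
        if not chunk:  # storage exhausted before the request was fulfilled
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return chunks


# Hypothetical storage holding 16 bytes; the request is for the first 10.
storage = io.BytesIO(b"abcdefghijklmnop")
chunks = stream_bytes(storage, total_bytes=10, chunk_size=4)
print(chunks)
```

In this sketch the request is expressed in bytes, which is why a fixed chunk size is sufficient to decide when to stop fetching.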
Character-based data includes encoded data used to represent characters. For example, the character-based data may be stored in mixed-byte encoding. Mixed-byte encoding utilizes a varying number of bytes to encode each character. However, other encoding schemes may be used. Such encoding schemes may vary the number of bytes used to encode a character or may use a fixed number of bytes to encode a character. Character-based data can be converted into characters, for example text.
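The difference between variable-width and fixed-width encodings can be seen in a short sketch. UTF-8 is used here only as a readily available stand-in for an encoding with a varying number of bytes per character, and UTF-32 as a stand-in for a fixed-width encoding; neither is named in the description above, and the sample characters are arbitrary.

```python
# Three sample characters: Latin "a", "e" with acute accent, and a CJK ideograph.
text = "a\u00e9\u6f22"

# Variable-width stand-in: UTF-8 uses 1 to 4 bytes per character.
variable_lengths = [len(ch.encode("utf-8")) for ch in text]

# Fixed-width stand-in: UTF-32 (little-endian, no byte-order mark) uses 4 bytes each.
fixed_lengths = [len(ch.encode("utf-32-le")) for ch in text]

print(variable_lengths)
print(fixed_lengths)

# Converting the character-based data back into characters recovers the text.
decoded = text.encode("utf-8").decode("utf-8")
```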
Data streaming may be desired for character-based data. FIG. 1 depicts a conventional method 10 for performing streaming of character-based data that may be encoded using an encoding having a variable number of bytes per character, such as in mixed-byte encoding. FIG. 2 depicts a conventional system 30 for performing streaming of character-based data. The system 30 includes an input stream reader 32, a client 34, and a storage system 40 used to store the data. Referring to FIGS. 1 and 2, a request for character-based data is provided to the input stream reader 32 from the client 34, via step 12. The request is from a user and is, therefore, typically for a fixed number of characters. Thus, for mixed-byte encoding, the character-based data needed to satisfy requests for the same number of characters may vary in length based upon the number of bytes used to represent the characters.
The input stream reader 32 fetches from the storage system 40 a sufficient amount of character-based data to satisfy the request, via step 14. The input stream reader 32 converts the character-based data that has been fetched into characters, via step 16. The number of characters sufficient to fulfill the request is provided to the client 34, via step 18. Thus, the fixed number of characters is output in step 18. Any remaining data is discarded, via step 20.
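Steps 14 through 20 of the conventional method 10 can be sketched as follows. This is an illustrative approximation only: the function name `conventional_read`, the in-memory storage, the UTF-8 encoding, and the sample document are all assumptions, and the sketch fetches the entire stored document as one way of obtaining "a sufficient amount" of data.

```python
import io


def conventional_read(storage, num_chars, encoding="utf-8"):
    """Approximate steps 14-20: fetch, convert, output, discard."""
    raw = storage.read()                   # step 14: fetch enough data (here, everything)
    chars = raw.decode(encoding)           # step 16: convert the data into characters
    result = chars[:num_chars]             # step 18: provide the requested characters
    discarded = len(chars) - len(result)   # step 20: the remainder is thrown away
    return result, len(raw), discarded


# Hypothetical document: 11 characters (13 bytes in UTF-8) repeated 1000 times.
document = ("h\u00e9llo w\u00f6rld" * 1000).encode("utf-8")
result, bytes_fetched, chars_discarded = conventional_read(io.BytesIO(document), 5)
print(result, bytes_fetched, chars_discarded)
```

In this sketch a request for 5 characters causes 13000 bytes to be fetched and converted, of which 10995 characters are discarded.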
Although the conventional method 10 and system 30 function, one of ordinary skill in the art will readily recognize that the method 10 and system 30 are inefficient. As discussed above, the request is for a fixed number of characters. However, for encoding schemes such as mixed-byte encoding, the same number of characters may correspond to differing numbers of bytes of character-based data. The exact amount of character-based data for the fixed number of characters in a particular request is unknown. As a result, an amount of data sufficient to satisfy any request, not just the request at hand, is fetched in step 14. Thus, a large amount of data, for example an entire document, is typically fetched in step 14. However, the request may be only for a small portion of the document. Consequently, a large amount of data may be unnecessarily fetched, converted, and then discarded. The method 10 and system 30 are, therefore, inefficient.
Other conventional methods for performing character-based data streaming may function in the same manner as conventional data streaming. In such conventional methods, a request is made and a fixed number of bytes is fetched and converted using the converter (input stream reader) 32. This process is repeated, fetching and converting sequential chunks of data, until the request is fulfilled. However, such a conventional method may not be capable of handling encoding schemes in which the number of bytes per character varies, such as mixed-byte encoding. This is because a chunk of the character-based data may not correspond to a whole number of characters, so that a chunk boundary may fall in the middle of a character.
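The failure mode described above can be reproduced with a small sketch. UTF-8 is again assumed as a stand-in for a variable-width encoding, and the 4-byte chunk size and the two sample characters are hypothetical values chosen so that a chunk boundary falls inside a character.

```python
# Two CJK characters, each occupying 3 bytes in UTF-8, for 6 bytes in total.
data = "\u6f22\u5b57".encode("utf-8")

# Split the data into fixed-size 4-byte chunks, as in byte-oriented streaming.
chunk_size = 4
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

# The first chunk holds one whole character plus the first byte of the next,
# so converting that chunk on its own fails.
try:
    chunks[0].decode("utf-8")
    split_detected = False
except UnicodeDecodeError:
    split_detected = True

print(len(chunks), split_detected)
```

Converting each chunk independently therefore either raises an error or, with a lenient converter, yields corrupted characters at the chunk boundary.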
Accordingly, what is needed is a method and system for performing streaming of character-based data that is more efficient and capable of utilizing encoding schemes having a variable number of bytes per character. The present invention addresses such a need.