1. Field of the Invention
The present invention relates to the field of data processing. More specifically, the present invention relates to the sending and receiving of data structures in a reduced-bandwidth form.
2. Background Information
Recently, with advances in the Internet and web-based applications, semi-structured data structures, such as Extensible Markup Language (XML) data, have become an industry-standard mechanism for transferring or storing data. Semi-structured data structures are favored over conventional fixed and/or application-specific data structures because of their extensibility, transparency, platform independence and manageability. These data structures allow two independently developed software programs to communicate with each other. However, transmission of these semi-structured data structures has at least two drawbacks: (a) the size of the data structure that has to be transferred and (b) the associated processing cost (especially on the receiver side).
Size: Semi-structured data structures, such as XML data structures, are typically very redundant when compared to conventional fixed, application-specific data structures. Many tag names and attribute names must be repeated over and over again. For example, it usually takes 100–300% more bytes to represent the same data in XML. In addition, duplicate attribute values are very common. Consider the example “Employees” XML data structure illustrated in FIG. 4a, in which the tag name “Employee” and the attribute names “Employee ID” and “Title” are repeated over and over again.
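By way of illustration only, the redundancy described above may be observed with a short Python sketch. FIG. 4a is not reproduced here, so the records, the attribute spelling "EmployeeID" (XML attribute names cannot contain a space) and the comma-delimited comparison format are all assumptions made for this example; the sketch is not part of the invention.

```python
import xml.etree.ElementTree as ET

# Hypothetical records standing in for the "Employees" data of FIG. 4a.
RECORDS = [
    ("1001", "Engineer"),
    ("1002", "Engineer"),
    ("1003", "Manager"),
    ("1004", "Engineer"),
]


def to_xml(records):
    """Serialize the records as XML; tag and attribute names repeat per record."""
    root = ET.Element("Employees")
    for emp_id, title in records:
        ET.SubElement(root, "Employee", {"EmployeeID": emp_id, "Title": title})
    return ET.tostring(root, encoding="unicode")


def to_delimited(records):
    """Serialize the same values in an assumed fixed, comma-delimited layout."""
    return "\n".join(f"{emp_id},{title}" for emp_id, title in records)


xml_text = to_xml(RECORDS)
flat_text = to_delimited(RECORDS)

xml_bytes = len(xml_text.encode("utf-8"))
flat_bytes = len(flat_text.encode("utf-8"))
overhead_pct = 100 * (xml_bytes - flat_bytes) / flat_bytes

# Duplicate attribute values: four records share only two distinct titles.
distinct_titles = {title for _, title in RECORDS}

print(f"XML: {xml_bytes} bytes, delimited: {flat_bytes} bytes")
print(f"Overhead: {overhead_pct:.0f}%")
print(f"Distinct titles: {len(distinct_titles)} of {len(RECORDS)} records")
```

The measured overhead depends entirely on the chosen names and values, so the figure printed for these sample records should not be read as representative.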
Processing Cost: Semi-structured data structures, such as XML, are also very expensive to parse. Typically, the data sender either builds the data structure directly, by concatenating a number of strings or feeding them into a stream, or builds an object hierarchy and then serializes it into a string or stream. On the receiver side, the receiver code must then scan the data string/stream sequentially, looking for space characters to tokenize it, and compare each tag name and attribute name with known keywords. Further, such parsing requires a lot of memory, especially if each token is stored as a separate string object.
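The two sender-side approaches and the receiver-side keyword comparison described above may be sketched as follows. This is a minimal illustration using Python's standard xml.etree.ElementTree module; the sample records, the function names and the comparison counter are hypothetical and are not taken from the original disclosure.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; values are assumed to need no XML escaping.
records = [("1001", "Engineer"), ("1002", "Manager")]

KNOWN_TAG = "Employee"
KNOWN_ATTRS = ("EmployeeID", "Title")


def build_by_concatenation(records):
    """Sender, approach 1: concatenate strings directly."""
    parts = ["<Employees>"]
    for emp_id, title in records:
        parts.append(f'<Employee EmployeeID="{emp_id}" Title="{title}"/>')
    parts.append("</Employees>")
    return "".join(parts)


def build_by_hierarchy(records):
    """Sender, approach 2: build an object hierarchy, then serialize it."""
    root = ET.Element("Employees")
    for emp_id, title in records:
        ET.SubElement(root, "Employee", {"EmployeeID": emp_id, "Title": title})
    return ET.tostring(root, encoding="unicode")


def receive(xml_text):
    """Receiver: parse the text and match each name against known keywords.

    Returns the extracted rows and a count of keyword checks performed
    (one per element tag, one per attribute name).
    """
    rows = []
    comparisons = 0
    for elem in ET.fromstring(xml_text).iter():
        comparisons += 1  # tag name checked against the known keyword
        if elem.tag != KNOWN_TAG:
            continue
        row = {}
        for name, value in elem.attrib.items():
            comparisons += 1  # attribute name checked against known keywords
            if name in KNOWN_ATTRS:
                row[name] = value
        rows.append(row)
    return rows, comparisons


rows_a, checks_a = receive(build_by_concatenation(records))
rows_b, checks_b = receive(build_by_hierarchy(records))
print(rows_a)
print(f"keyword checks: {checks_a}")
```

Both sender approaches yield text that the receiver decodes to the same rows, and in this sketch the number of keyword checks grows with the number of elements and attributes. Each tag name, attribute name and value is also materialized as a separate Python string, which corresponds to the memory concern noted above.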
These drawbacks are especially problematic for smaller devices with limited CPU power, a small amount of memory and lower data transmission speeds (such as wireless mobile phones and palm-sized personal digital assistants). In certain applications, such as Nippon Telegraph and Telephone (NTT) DoCoMo's i-mode, the operating cost can be significantly higher, as the application operator charges for the service on a per-packet basis.
Thus, a more efficient approach to transmitting such data structures is desired.