The ubiquity of computers in business, government, and private homes has made massive amounts of information available from network-connected sources, such as data sources accessible through communication networks like the Internet. In recent years, computer communication and search tools have become widely available to help users and software applications find and access information. Most computer communication and search tools implement a client-server architecture in which a user client computer communicates with a remote server computer over a communication network. To achieve better system performance and throughput in the client-server architecture, greater communication network bandwidth is needed as the number of client and server computers increases.
There are several basic approaches to increasing communication bandwidth, including increasing the rate of data signal transmission over the communication medium, increasing the number of communication channels and pathways to transmit data in parallel, using server farms to increase service capacity, and using efficient data compression methods to reduce the number of bits transmitted over the communication medium for a given piece of information.
Aside from data transmission, data format also affects communication bandwidth. A data format is a particular way of representing data. Data formatting generally takes place at a higher level of abstraction than pure data transmission, and data interpretation and processing are generally applied at the data format abstraction level. For example, whether data received at a server is character data or numerical data depends not merely on the bits received, but also on the encoding and format of those bits. As such, an efficient encoding and format can result in efficient data processing by reducing the need to transform data from one format to another before interpreting and processing the data.
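As an illustration of the point that the meaning of received bits depends on their encoding and format, the following minimal Python sketch interprets one and the same four-byte sequence as character data, as integers in two byte orders, and as a floating-point number. The byte values and variable names are chosen purely for illustration and do not come from any particular system.

```python
import struct

# Four bytes as they might arrive at a server (illustrative values only).
raw = bytes([0x41, 0x42, 0x43, 0x44])

# Interpreted as ASCII character data.
as_text = raw.decode("ascii")

# Interpreted as a 32-bit unsigned integer, big-endian and little-endian.
as_uint_be = struct.unpack(">I", raw)[0]
as_uint_le = struct.unpack("<I", raw)[0]

# Interpreted as a 32-bit IEEE 754 floating-point number (big-endian).
as_float = struct.unpack(">f", raw)[0]

print(as_text)      # ABCD
print(as_uint_be)   # 1094861636
print(as_uint_le)   # 1145258561
print(as_float)     # roughly 12.14
```

The bits are identical in every case; only the format the receiver assumes differs, which is why sender and receiver must agree on a format before the data can be processed.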
There are many data formats in use today, many of which depend on the application and the type of data. For example, audio data is formatted differently from video data. Similarly, numbers used in financial applications may be formatted to avoid round-off errors, unlike the numbers used in a telephone directory to represent telephone numbers. In recent years, new data formats, such as XML (Extensible Markup Language), have been introduced that provide flexibility for formatting various types of data. In XML, the user can define any data format that is suitable for the user's application. The simplicity and flexibility of markup languages like XML, however, can create certain inefficiencies and disadvantages. For example, XML has a single data type, which represents information as character strings. Other types of data must therefore be represented as character strings and converted at the receiving application program. Numerical data, for instance, must first be converted by the receiving application program from its character string representation in XML to its intended format before any further processing can be performed. This conversion adds overhead and reduces overall system throughput, while increasing system cost and complexity. Additionally, data formatting languages like XML generally transmit the markup tags that are used to format the data along with the data. The markup tags consume a considerable amount of communication bandwidth, often more than the bandwidth consumed by the data itself, adding to the communication overhead.
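The two costs described above, string-to-number conversion at the receiver and the bandwidth consumed by markup tags, can be made concrete with a small Python sketch. The order record, its element names, and the fixed-width binary layout used for comparison are all hypothetical; they are assumptions for illustration, not a format taken from any real application.

```python
import struct
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical order record; the element names are invented for this sketch.
xml_doc = "<order><quantity>12</quantity><price>19.99</price></order>"

# Every value arrives as a character string and must be converted
# by the receiving program before it can be used numerically.
root = ET.fromstring(xml_doc)
quantity_text = root.find("quantity").text
price_text = root.find("price").text
quantity = int(quantity_text)
price = Decimal(price_text)  # Decimal avoids binary round-off for currency

# Compare the bytes spent on markup with the bytes spent on the values.
total_bytes = len(xml_doc.encode("utf-8"))
payload_bytes = len(quantity_text) + len(price_text)
markup_bytes = total_bytes - payload_bytes

# For comparison only: an assumed fixed-width binary layout with a
# 16-bit quantity and a 32-bit price expressed in cents.
binary_record = struct.pack(">HI", quantity, int(price * 100))

print(total_bytes, payload_bytes, markup_bytes)  # 58 7 51
print(len(binary_record))                        # 6
```

In this small example the tags account for 51 of the 58 bytes transmitted, which is consistent with the observation that markup can consume more bandwidth than the data it describes. The binary layout is shown only as a point of reference for size.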
In complex business applications, many software modules are involved, each of which may process the same data at different times for different purposes. For example, in a purchasing application, a front-end server may receive orders from a customer and pass the ordering information to a business server, which may evaluate the customer's credit rating and, upon approval, pass the ordering information on to a shipping server where the customer's order is fulfilled. While the data is traversing multiple servers and multiple software modules from a front-end server to a back-end server, the data initially input may be transformed from one format to another repeatedly for use by different software modules. A data format that is usable by multiple software modules in a business application reduces or eliminates the need for format transformations across software modules and servers, and thus increases the overall efficiency of data processing and data communication between servers.