1. Field of the Invention
The present invention generally relates to a method and system for lazy data serialization in computer communications, and more specifically to the field of computer communications and distributed computing systems.
2. Description of the Related Art
With modern software development practices, software for a distributed computing environment is designed using a variety of layers, abstractions, and models. These layers of software cause much inefficiency in communications. One particular example is the serialization of computer objects when they need to be transmitted over a computer network to another computer. For all such transmissions, the object is converted into a network format.
The current state of the art used for converting computer objects for transmission on a network is illustrated in FIGS. 1 and 2. FIG. 1 shows in flow diagram 100 the steps taken by a computer in order to transmit objects over a computer network. The procedure is entered at step 101 during the course of execution of a program in a computer system. In step 103, the procedure takes an object, which is a representation inside a computer memory of some information, and converts the object to a canonical format.
The canonical format could be the rendition of all elements into a big-endian format, the rendition of an object into an Extensible Markup Language (XML) representation according to some defined schema, or the rendition of an object into a binary format that is agreed as a standard means of exchange. In step 105, the procedure establishes a network connection to an external computer. The network connection may be established to one or more than one external computers. In step 107, the object is transmitted. The sending procedure then terminates in step 109.
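The first canonical format mentioned above, a big-endian rendition of all elements, can be sketched as follows. This is a minimal illustration of step 103 only, not the procedure of FIG. 1 as implemented by any particular system. The class names CanonicalEncoder and Reading and the fields sensorId and timestamp are hypothetical and do not come from the description.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class CanonicalEncoder {
    // Hypothetical in-memory object; the name and fields are illustrative only.
    static final class Reading {
        final int sensorId;
        final long timestamp;

        Reading(int sensorId, long timestamp) {
            this.sensorId = sensorId;
            this.timestamp = timestamp;
        }
    }

    // Step 103 (sketch): render every element of the object in big-endian order,
    // regardless of the byte order of the machine running this code.
    static byte[] toCanonical(Reading r) {
        return ByteBuffer.allocate(Integer.BYTES + Long.BYTES)
                .order(ByteOrder.BIG_ENDIAN)
                .putInt(r.sensorId)    // 4 bytes, most significant byte first
                .putLong(r.timestamp)  // 8 bytes, most significant byte first
                .array();
    }

    public static void main(String[] args) {
        byte[] wire = toCanonical(new Reading(1, 2L));
        System.out.println(wire.length); // prints 12
    }
}
```

Note that the encoder always produces the canonical layout; nothing in its signature says where the bytes will go or who will read them.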
As an example, many current bindings of internal Java objects use XML to communicate with remote parties using Simple Object Access Protocol (SOAP). In order to convert a local representation, they may define a new converter. The code excerpt below shows a typical example of a software package which can be used to convert a local representation “LocalFormat” to an XML format. The converter factory acts as an indirection point for looking up the right converter subroutine for a specified format, and is a pattern commonly used by programmers in Java (and in other languages using the object-oriented paradigm).
public class SomeClass... {
    // Looks up a converter by the object's class name and renders the object as XML.
    public static String convert(Object obj) {
        String name = obj.getClass().getName();
        ConverterFactory factory = new ConverterFactory();
        Converter XML_converter = factory.newXMLConverter(name);
        String converted_xml = XML_converter.convert(obj);
        return converted_xml;
    }
}
The program fragment above corresponds to step 103 of FIG. 1. Different implementations and different class names may be used.
After the conversion to a canonical XML representation in the String format, the converted information is sent to the other party. Different pieces of software can be used for this type of transmission. A typical example in Java is a transfer using the socket interface. The sending party will use a routine with a code fragment like the one shown below. The call to new Socket( . . . ) corresponds to step 105, while the call to str.writeBytes( . . . ) corresponds to step 107.
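import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

public class SomeClass {
    public void send(String converted_xml, String host, int port) throws IOException {
        Socket s = new Socket(host, port);                                 // step 105
        DataOutputStream str = new DataOutputStream(s.getOutputStream());
        str.writeBytes(converted_xml);                                     // step 107
    }
}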
If the remote host were running the same environment as the local server, e.g., had the same operating environment or middleware, then the conversion to XML could have been replaced by a more efficient serialization process. However, when the call to convert to XML is invoked, there is no indication of which address the object is going to be transmitted to. Also, there is no indication that the conversion is being done in order to transmit the data on the network.
The conversion may have been done in order to store the object into a file, to print out the object to be viewed by a person, or to store the object into a database. Therefore, in the current state of the art, such optimizations are not feasible, since neither the purpose of the conversion to a canonical format nor the destination of the converted object is known when the object is converted to the canonical format.
FIG. 2 shows the steps in flow diagram 200 used in the current state of the art by the receiving computer. The procedure is entered in step 201, and in step 203 the receiving computer establishes a connection with the sending computer. In different implementations, the method to establish a network connection may differ. For example, one implementation may listen on a socket, another may connect to a socket, while others may use classes built atop the socket abstraction. In step 205, the object is received from the network connection. In step 207, the object is converted to the local format from the canonical format in which it was transmitted on the network. The procedure then terminates in step 209, and the receiving computer can perform further processing on the object that is received.
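Steps 205 and 207 can be sketched as follows, assuming a big-endian binary canonical format made of one 32-bit and one 64-bit integer. The class names CanonicalReceiver and Reading and the fields sensorId and timestamp are hypothetical. The sketch reads from any InputStream; in a deployment the stream would typically be obtained from the socket established in step 203.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class CanonicalReceiver {
    // Hypothetical local representation; the name and fields are illustrative only.
    static final class Reading {
        final int sensorId;
        final long timestamp;

        Reading(int sensorId, long timestamp) {
            this.sensorId = sensorId;
            this.timestamp = timestamp;
        }
    }

    // Step 205: receive the canonical bytes from the connection's input stream.
    // Step 207: convert them to the local format (DataInputStream reads big-endian).
    static Reading receive(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int sensorId = data.readInt();
            long timestamp = data.readLong();
            return new Reading(sensorId, timestamp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for bytes arriving over a network connection.
        byte[] wire = {0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 42};
        Reading r = receive(new ByteArrayInputStream(wire));
        System.out.println(r.sensorId + " " + r.timestamp); // prints 7 42
    }
}
```

As on the sending side, the conversion in step 207 runs unconditionally: the routine has no information about whether the sender already used a layout identical to the local one.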
In many cases, for example, when the applications are running on the same system using hypervisors, or when the object is being sent to a machine running the same hardware and software environment, such conversion may be bypassed. However, due to the layering of the software stack, the conversion routines are unaware of the destination of the object, and cannot readily bypass such conversion.
One example of such conversion to a network format is the use of the ntoh and hton macros in computer communications using the Internet Protocol Suite (TCP/IP) network transmission format. The hton macro converts a data item, such as an integer, floating point number, or character, from the host format to the network format (Big Endian format), if needed. When the local host is Little Endian, the macros are used to convert to the network format. However, if the local and remote hosts are both Little Endian, then the conversion at both ends can be eliminated. When the data is being formatted for transmission, the nature of the host at the other side of the communication is not known, and thus it is difficult to make such an optimization.
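The redundancy described above can be illustrated with a small sketch. Java does not expose the hton and ntoh macros, so the code below models a 32-bit conversion with Integer.reverseBytes and takes the host byte order as an explicit parameter. The class HostNetworkOrder and its methods are hypothetical names chosen for illustration.

```java
import java.nio.ByteOrder;

final class HostNetworkOrder {
    // Model of a 32-bit host-to-network conversion: a no-op on a big-endian
    // host, a byte swap on a little-endian host.
    static int hostToNetwork(int value, ByteOrder hostOrder) {
        return hostOrder == ByteOrder.BIG_ENDIAN ? value : Integer.reverseBytes(value);
    }

    // The network-to-host conversion is the same operation in reverse.
    static int networkToHost(int value, ByteOrder hostOrder) {
        return hostToNetwork(value, hostOrder);
    }

    // If both ends share a byte order, the two conversions cancel out and
    // could in principle be skipped, but only if the remote order is known.
    static boolean conversionAvoidable(ByteOrder local, ByteOrder remote) {
        return local == remote;
    }

    public static void main(String[] args) {
        int value = 0x0A0B0C0D;
        int wire = hostToNetwork(value, ByteOrder.LITTLE_ENDIAN);
        System.out.println(Integer.toHexString(wire)); // prints d0c0b0a
        int back = networkToHost(wire, ByteOrder.LITTLE_ENDIAN);
        System.out.println(back == value);             // prints true
    }
}
```

Between two Little Endian hosts the value is swapped once by the sender and once more by the receiver, which is exactly the work that could be eliminated if the sender knew the byte order of the remote host at formatting time.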
Another example of such conversion to a network format arises in services-oriented architecture and web services. In such communications, clients and servers usually communicate using XML and SOAP. The network format is an XML encoding of an object maintained in the local repository. When the object is being sent to a party which has characteristics similar to those of the sending party, such conversion may not be necessary. However, since the identity of the party to which the information is being sent is not known, it is not possible to make such an optimization easily.
Therefore, there is a need to develop a system that can delay the actual conversion of an object to the network format until the identity and nature of the remote party of a communication is known, and to determine whether the conversion can be bypassed.