Data object creation, management and access are currently moving from a single computer model to a distributed network model. Instead of limiting data objects to a single client computer, organizations and users are adopting a collaborative structure, where multiple users can work on multiple data objects from multiple locations. Data objects may be stored in one or more locations, and users can interact with those data objects using a web-based browser or other user interface. Using a single network interface, or portal, for data objects may help track which data object is the most recent version, while at the same time unifying user-data object interaction to a single virtual location.
Distributed collaboration is also seen as a way to achieve platform independence, since data objects created and accessed using this model may not depend upon a single computing platform or operating system. As a result, there are a number of different types of data objects now being created and used in the distributed collaboration setting, such as those generated from online web applications. The most common distributed collaboration setting is the web-based application model, but other intranet and networked models are available as well. One such networked model is Microsoft SharePoint® Services. Web applications like Microsoft SharePoint Services offer a network-based document management platform accessible through an Internet or intranet portal.
As web applications become more popular, it is important that the transmission and storage of web application data objects remain efficient. Non-web based software applications designed to work with traditional client-based or client-server based models should function equally well in the distributed collaboration environment. For example, data backup is just as crucial for web-based data objects as it is for client computer-based data objects. Present data backup software applications need to be able to handle the different types of data objects used in the web-based environment. To this end, some backup software applications, such as EMC Software's Backup Manager for SharePoint, monitor, manage and maintain backups of the data objects accessed in the distributed collaboration environment. The efficiency of these applications is directly related to their ability to pass large numbers of data objects from one software module to another module, or from one software application to another, regardless of whether the transmission is over a directly connected computer or a network.
An issue with current distributed collaboration systems is that the volume of data objects can overload the memory resources of the software applications involved. The reason may be that data objects are kept in temporary memory for longer periods of time, or that software applications have not been properly configured to handle web application data objects. This may also be due to the fact that in a distributed collaboration environment, users may be working on many different types of data objects. For example, in an online office productivity suite, users may be collaborating on a number of word processor, spreadsheet, presentation and scheduling data objects, all at the same time. As each web application transmits and accesses each data object, the applications' collective memory resources can become overloaded.
In order to efficiently distribute and manage such data objects, efforts have been made to streamline their transmission and storage. One such effort applies the method of serialization. One skilled in the art will appreciate that serializing is a way to simplify a data object by converting it into a string of data, then transmitting or “streaming” it to another application or storing it to disk. In other words, serializing a data object will “flatten” it into a more basic format for transmission to a destination. After the serialized data object arrives at the destination, it will be “unflattened” or deserialized back to its original form. One will appreciate that there are many serialization and deserialization techniques, including converting to binary or text-readable formats. Serializing a data object to a binary stream may speed transmission and enable higher volumes of transfer, since the data object has been converted to a more streamlined data structure comprised of ones and zeros rather than kept in its original format.
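The flatten and unflatten round trip described above can be illustrated with a minimal sketch. It assumes Python's standard `pickle` module as the binary serializer, which is a stand-in chosen for illustration and not a technique named in this description. The `document` object and its fields are hypothetical.

```python
import pickle

# Hypothetical data object standing in for a web application document.
document = {
    "title": "Q3 budget",
    "kind": "spreadsheet",
    "rows": [[1, 2], [3, 4]],
}

# Serialize ("flatten") the data object into a binary byte string
# that could be streamed to another application or written to disk.
flattened = pickle.dumps(document)

# Deserialize ("unflatten") the byte string at the destination,
# recovering an object equal to the original.
restored = pickle.loads(flattened)

print(isinstance(flattened, bytes))
print(restored == document)
```

Other serializers, including text-readable ones, follow the same two-step pattern of converting the object to a flat representation and rebuilding it at the destination.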
Current serialization techniques provide for serialization and deserialization using a single data object per stream. As such, data objects may be serialized, then streamed to a destination one data object at a time. For example, five data objects may be serialized into five flattened objects, which are then transmitted using five separate data streams. While serialization does lessen the impact on memory resources at the point where the data object is flattened, serialization by itself does not improve transmission time or load. Present distributed collaboration environments, as well as many distributed computing environments, involve interaction with multiple data objects. Unfortunately, present serialization and deserialization techniques require that each object be transmitted using its own binary stream, so multiple streams may be transmitting simultaneously. In other words, a hundred serialized data objects will require a hundred data streams. This places a toll on the transmission pipeline, and can result in an “out of memory” response from the associated software application.
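The one-object-per-stream pattern described above can be sketched as follows. This is an illustration under stated assumptions: it uses Python's standard `pickle` and `io.BytesIO` as stand-ins for the serializer and the binary stream, and the function name `serialize_one_per_stream` and the sample objects are hypothetical.

```python
import io
import pickle


def serialize_one_per_stream(objects):
    """Serialize each object into its own in-memory binary stream."""
    streams = []
    for obj in objects:
        stream = io.BytesIO()      # a separate stream for every object
        pickle.dump(obj, stream)   # flatten exactly one object into it
        stream.seek(0)             # rewind so the receiver can read it
        streams.append(stream)
    return streams


# Five hypothetical data objects.
objects = [{"id": i, "kind": "document"} for i in range(5)]

# Five objects produce five streams, all held open at the same time.
streams = serialize_one_per_stream(objects)

# The receiver must deserialize each stream individually.
restored = [pickle.load(stream) for stream in streams]

print(len(streams) == len(objects))
print(restored == objects)
```

In this sketch the number of open streams grows linearly with the number of data objects, which is the resource cost the passage attributes to present techniques.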
What is therefore needed is an improved way to stream serialized data objects. What is further needed is a way to improve web application and web application system performance by reducing memory requirements.