1. Field of the Invention
Embodiments of the invention described herein pertain to the field of computer software. More particularly, but not by way of limitation, one or more embodiments of the invention determine a blob chunk threshold and optional chunk transfer size to enable efficient communication with a database comprising binary large object (BLOB) data.
2. Description of the Related Art
Database systems generally provide the ability to permanently store and retrieve various types of data, including characters, numbers, dates, objects, images, video and other types of data such as binary large object data. As hard drive capacities and network bandwidth have continually increased, it has become increasingly common for database users to store and retrieve large data objects such as multimedia, including images, videos and audio. Storing and retrieving large data objects can inhibit the efficiency of a database management system (DBMS) and reduce overall system performance. This is because most database systems are not designed to handle widespread access to large data objects, and hence a significant burden is placed on the system when access to these large data objects becomes a regular part of day-to-day operations. This reduced performance during storage and retrieval of large objects is caused by a number of different factors.
When a large data object is retrieved by a client, the database reads the entire object with one read operation. This retrieval operation in turn causes the database to allocate a segment of memory on the database server into which the entire data object being retrieved is read. On a small scale such allocations can reduce performance, but do not necessarily cause a substantial drop in overall system performance. However, when such requests become more common and many users across the entire system request large objects, systems are required to handle many threads asking for similar functions, thus causing a significant reduction in system performance. If, for instance, 30 users initiate requests to retrieve different PDF objects where each object is approximately 100 MB in size, then the server allocates approximately 3 GB of memory. In many cases the occurrence of such an allocation requirement will impact system performance.
The impact of retrieving and storing large data objects on memory occurs when a database performs commands such as insert, update and select. Microsoft SQL Server, for instance, typically allocates 4 times the amount of memory of the object to be inserted. So in cases where a 50 MB object is to be inserted, the server allocates approximately 200 MB of memory to the insert task. Another performance-related problem that occurs when large data objects are transmitted between a client and server is that the transmission of such objects causes an increase in the number of network collisions, which in turn places a noticeable burden on the network and reduces overall system efficiency.
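The two figures above follow from simple multiplication, which can be sketched as below. This is only an illustrative estimate: the function name `estimated_server_memory` and its parameters are hypothetical, and the model assumes each object is held in memory in one piece, with an optional per-operation overhead factor such as the factor of 4 described above.

```python
MB = 1024 * 1024  # bytes per megabyte, as used in this sketch


def estimated_server_memory(object_size_bytes, concurrent_requests=1, overhead_factor=1):
    """Rough estimate of memory allocated when each object is handled in one piece.

    Hypothetical helper: multiplies the object size by the number of
    simultaneous requests and by a per-operation overhead factor.
    """
    return object_size_bytes * concurrent_requests * overhead_factor


# 30 users each retrieving a different object of roughly 100 MB
retrieval_bytes = estimated_server_memory(100 * MB, concurrent_requests=30)

# One insert of a 50 MB object with an assumed overhead factor of 4
insert_bytes = estimated_server_memory(50 * MB, overhead_factor=4)

print(retrieval_bytes // MB)  # 3000, i.e. approximately 3 GB
print(insert_bytes // MB)     # 200
```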
To alleviate the burdens placed on a system when utilizing large blocks of data, a technique known as “blob chunking” is used to read smaller blocks of data from a BLOB field until the entire BLOB field is read. Blob chunking may also be used to write a series of smaller blocks of data to a BLOB field until the entire BLOB field is written. To date there has been no way to intelligently determine the size of the blocks into which a read or a write of a BLOB field is broken, as current attempts at blob chunking merely attempt to allow a system to operate without returning, for example, an “out of memory” error. In addition, the threshold at which blob chunking is utilized may not be the best size for transferring data so as to minimize network utilization. The blob chunk size and transfer chunk size may differ, as the memory of a database system and the network throughput of a system comprising the database each have their own specific characteristics and utilizations that generally vary in time independently of one another.
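The general idea of blob chunking described above can be illustrated with a minimal sketch. It is not taken from any particular database product: an in-memory `io.BytesIO` stream stands in for the BLOB field, the function names `read_blob_in_chunks` and `write_blob_in_chunks` are hypothetical, and the chunk size of 4096 bytes is an arbitrary value chosen for illustration rather than a determined threshold.

```python
import io
from typing import BinaryIO, Iterator


def read_blob_in_chunks(blob: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive blocks of at most chunk_size bytes until the BLOB is exhausted."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = blob.read(chunk_size)
        if not chunk:  # an empty read means the entire BLOB has been read
            break
        yield chunk


def write_blob_in_chunks(data: bytes, blob: BinaryIO, chunk_size: int) -> int:
    """Write data as a series of smaller blocks; return the number of writes issued."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(data)  # slice without copying the whole payload
    writes = 0
    for offset in range(0, len(view), chunk_size):
        blob.write(view[offset:offset + chunk_size])
        writes += 1
    return writes


# Stand-in for a BLOB field: a 10,240-byte payload stored in memory.
payload = bytes(range(256)) * 40
stored = io.BytesIO()

writes = write_blob_in_chunks(payload, stored, chunk_size=4096)
stored.seek(0)
chunks = list(read_blob_in_chunks(stored, chunk_size=4096))

print(writes)                    # 3
print([len(c) for c in chunks])  # [4096, 4096, 2048]
```

In this sketch only one block of at most `chunk_size` bytes is handled per read or write operation. The read and write calls take the chunk size as separate arguments, reflecting the point above that the size used for one purpose need not match the size best suited to another.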
Because of the limitations described above, there is a need for a system that can determine and optimize the blob chunk size threshold, and optionally determine a transfer chunk size, to allow greater efficiency in situations where large data objects are accessed and transferred between a server and a client.