With the rapid growth and advancement of digital consumer products (e.g., smart phones, digital cameras, PDAs), more digital information is being generated than ever before. According to International Data Corporation, the total amount of digital information in the world will reach 2.7 zettabytes by the end of 2012. The majority of newly generated digital information is data such as log data, digital video, images, and sound files. This poses a significant challenge for existing database management systems, which must search, analyze, and retrieve the information.
One solution is to implement parallel data collections and processes for performing database management and database operations. Multiple instances of a data stream are created to divide the work among many parallel processes or threads. Each instance processes a fraction of the overall data set, and the instances run in parallel.
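The division of work described above can be illustrated with a minimal sketch. It is not taken from any particular database system: the function names `partition`, `count_matches`, and `parallel_count`, the keyword-counting operation, and the default of four streams are all assumptions chosen for illustration. The sketch splits a data set into stream instances, processes each instance in its own thread, and combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_streams):
    """Split records into num_streams contiguous slices of near-equal size."""
    size, extra = divmod(len(records), num_streams)
    slices, start = [], 0
    for i in range(num_streams):
        # The first `extra` slices take one additional record each.
        end = start + size + (1 if i < extra else 0)
        slices.append(records[start:end])
        start = end
    return slices


def count_matches(stream, keyword):
    """Hypothetical per-stream operation: count records containing keyword."""
    return sum(1 for record in stream if keyword in record)


def parallel_count(records, keyword, num_streams=4):
    """Process each stream instance in its own thread and sum the partial counts."""
    streams = partition(records, num_streams)
    with ThreadPoolExecutor(max_workers=num_streams) as pool:
        partial_counts = pool.map(count_matches, streams, [keyword] * num_streams)
        return sum(partial_counts)
```

For example, calling `parallel_count(log_lines, "error", num_streams=2)` on a list of log lines would split the list into two slices and count matching lines in each slice concurrently. Threads are used here only to keep the sketch short; for CPU-bound work in CPython, separate processes (for example `ProcessPoolExecutor`) would typically be the better fit.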