Acquiring and analyzing “big data” is an important part of today's scientific and technological progress. The term big data generally refers to any collection of datasets so large and complex that it cannot be captured or processed by traditional tools, such as common databases or common data processing techniques. Large-scale datasets may be collected from different sources, for instance, financial markets, internet interactions, mobile phone users, industrial production, consumer behavior, and so on. Some large-scale datasets reach sizes of many petabytes, a petabyte being 10^15 bytes, or a thousand terabytes. Collecting and analyzing these datasets is very challenging for today's computer technologies.
Some existing techniques attempt to manage big data. These techniques include using distributed file systems for storing large datasets or using multi-processing for analyzing those datasets. These techniques, however, cannot keep pace with the ever-growing scale of large datasets. Moreover, these techniques are often inaccessible to ordinary users and are instead reserved for institutional, corporate, or government users. In addition, existing technologies often require authorized users to have knowledge of the underlying storage and processing architecture and the related software packages.