The analysis of massive amounts of data is becoming a routine activity in many commercial and academic organizations. However, analyzing these data sets may require processing tens or hundreds of terabytes of data. Such large data sets have become known as “big data.” A data set characterized as big data is so large that, for example, commonly used software tools cannot manage or process it, or at least cannot do so within a reasonable time frame.
Existing big data analytics and management solutions, however, typically focus only on scalability and reliability issues. Thus, there is a need for improved data analytics and management techniques, both in general and in the context of big data applications.