The term “big data” refers to data sets that are so large that they cannot be processed using traditional database management tools or traditional data processing applications and techniques. Big data is sometimes described in terms of volume, velocity and variety. The volume of data available for processing is increasing because of the data that has been captured over the years, the quantity of data collected by sensors and other machines, the widespread use of new applications including but not limited to social media, the proliferation of mobile devices, and many other reasons. Data is being collected at a velocity that previously was impossible. Finally, the variety of formats in which data is provided is unprecedented: structured, unstructured, numeric, text, audio, video and many other forms of data are generated. There is great interest in using this wealth of information in predictive analysis and in various other ways.