According to the online encyclopedia Wikipedia, the term “big data” refers to the use of large amounts of data from multiple sources, processed at high speed, to produce an economic benefit. The principal challenges are the capture, storage, search, distribution, statistical analysis and display of large amounts of data. Such data volumes lie in the terabyte, petabyte and exabyte ranges.
Owing to the volume of data to be processed, conventional electronic data processing systems are often unsuitable, or suitable only to a limited extent, for processing such extensive data in a useful manner. For example, relational database systems that store data on a single local mass storage device and use a schema which is identical for all data sets are generally unsuitable for storing or processing such extensive data. Likewise, many programming languages are unsuitable for the statistical evaluation of data because they lack sufficiently specialized libraries for this purpose.
The R programming language is known inter alia from the book “R in a Nutshell,” 2nd edition, O'Reilly, 2012. It is particularly suitable for statistical calculations based on extensive data. The R programming language is therefore suitable in principle for processing big data problems, e.g., for implementing the “reduction functions” used in the “MapReduce” approach to processing big data problems.
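By way of illustration only, the role of a reduction function in the MapReduce approach may be sketched as follows. The sketch is written in Python rather than R and runs in a single process, whereas an actual MapReduce system distributes the map and reduce steps across many machines. All names used (map_reading, reduce_mean, map_reduce and the sample readings) are hypothetical and chosen solely for this example.

```python
from collections import defaultdict
from statistics import mean


def map_reading(record):
    """Map step: turn one input record into a (key, value) pair.

    The record layout (sensor name, measured value) is an assumption
    made only for this sketch.
    """
    sensor, value = record
    return sensor, value


def reduce_mean(key, values):
    """Reduction function: collapse all values of one key into their mean."""
    return key, mean(values)


def map_reduce(records, mapper, reducer):
    """Single-process stand-in for the map, group and reduce phases."""
    groups = defaultdict(list)
    for record in records:
        key, value = mapper(record)
        groups[key].append(value)  # group values by key
    return dict(reducer(key, values) for key, values in groups.items())


# Hypothetical sample data: (sensor name, measured value).
readings = [("a", 1.0), ("b", 4.0), ("a", 3.0), ("b", 6.0)]
result = map_reduce(readings, map_reading, reduce_mean)
print(result)
```

In this sketch the statistical work is confined to the reduction function, which is the part for which a statistics-oriented language such as R would typically be used.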
One problem with the R programming language is that the runtime environment used to execute it interprets source code written in R. Owing to the effort associated with interpretation, inter alia the parsing of the source code, programs written in interpreted programming languages run more slowly than programs written in other programming languages.
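The cost of repeatedly parsing source code can be made visible with a small experiment. The following sketch uses Python instead of R, purely as an analogy: it evaluates the same expression once from its source text on every call and once from a code object compiled in advance. The expression, the function names and the repetition count are assumptions made for this example, and the measured times depend on the machine and the interpreter version.

```python
import timeit

# Hypothetical expression used only for this comparison.
SOURCE = "sum(x * x for x in range(10))"


def run_from_source():
    # eval() on a string parses and compiles the source text on every call.
    return eval(SOURCE)


# Parse and compile the source text once, ahead of time.
CODE = compile(SOURCE, "<expr>", "eval")


def run_precompiled():
    # eval() on a code object skips the parsing and compilation steps.
    return eval(CODE)


# Both variants compute the same value; only the per-call overhead differs.
t_source = timeit.timeit(run_from_source, number=2000)
t_precompiled = timeit.timeit(run_precompiled, number=2000)
print(f"from source: {t_source:.4f} s, precompiled: {t_precompiled:.4f} s")
```

The difference between the two timings gives a rough indication of the share of the running time that is spent on parsing rather than on the calculation itself.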
It could therefore be helpful to provide devices and methods which accelerate the processing of extensive data, in particular the processing of big data problems using the R programming language. Preferably, existing components should be reused as far as possible to reduce the cost of developing new components.