An increasing amount of data is generated by machines, as the so-called Internet of Things gains momentum. Human-generated content was the focus of the original Internet. Now many types of machines are online and connected. These machines generate many types of data, most of which is never viewed by a human. A single machine can generate many distinct types of data.
It is challenging to make sense of machine-generated data. One of the challenges is developing schemas and extraction rules. Often, the format of the data being collected has not been determined or formally described when data collection begins. The issues that the data will later be used to address may not yet be appreciated when the data is collected. This makes schema and extraction rule development a moving target.