As the information available to computing devices has increased exponentially, the amount of that information that is unstructured has increased correspondingly. Unstructured data can be described as “raw data,” “minimally-processed raw data,” or, more generally, as data that lacks relational structure, such as a relationship between the data and identifiers of what the data is and/or what the data is like. Unstructured data presents problems for humans and computing devices alike, but often in different ways. Computing devices, unlike humans in most cases, cannot automatically discern what a particular piece of data is or is like. Without something more, a computing device may mistreat unstructured data and/or fail to perform an operation on the unstructured data that the computing device is configured to perform. For example, a computing device could be configured to auto-populate fields of a document with information such as names, addresses, brand models, etc. when the computing device has access to such information. However, if the computing device is not aware that data to which it has access corresponds to one of those data types, the computing device would not be able to populate the fields.
On the other hand, human users may not be familiar with a particular type of data and therefore may not be able to characterize it. For example, a United States citizen may easily recognize a string having the pattern (###) ###-#### as a phone number, but that same person may not be able to identify foreign phone numbers or distinguish them from addresses or licensing numbers. Furthermore, humans may mischaracterize data by relying on personal experience that does not account for factors outside that experience. For example, a person without technical training could mischaracterize an IP address as a foreign phone number.
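By way of illustration only, the limitation described above can be sketched as a naive pattern-based recognizer. The pattern table, the labels, and the function name guess_type are hypothetical and are not taken from this disclosure; the sketch merely shows that a recognizer limited to familiar patterns leaves unfamiliar data, such as a foreign phone number, uncharacterized.

```python
import re

# Hypothetical pattern table: the labels and regular expressions are
# illustrative assumptions, not part of the disclosure.
PATTERNS = {
    # Matches the (###) ###-#### form discussed above.
    "us_phone": re.compile(r"\(\d{3}\) \d{3}-\d{4}"),
    # Dotted-quad shape only; octet ranges (0-255) are not validated.
    "ipv4_address": re.compile(r"(?:\d{1,3}\.){3}\d{1,3}"),
}


def guess_type(value: str) -> str:
    """Return the first label whose pattern matches the whole string, else 'unknown'."""
    candidate = value.strip()
    for label, pattern in PATTERNS.items():
        if pattern.fullmatch(candidate):
            return label
    return "unknown"


if __name__ == "__main__":
    for sample in ["(206) 555-0100", "192.168.0.1", "+44 20 7946 0958"]:
        print(sample, "->", guess_type(sample))
```

In this sketch, any string that does not fit one of the known patterns is simply reported as unknown, which mirrors the situation of a user or device confronted with an unfamiliar data type.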
Furthermore, both humans and computing devices often lack access to latent (e.g., hidden, not immediately obvious, inferential) attributes of data. Although human minds, unlike computers, can infer some attributes (e.g., guessing a gender or nationality from a name, guessing the year of a car model based on prior knowledge about the range of years), at the time a user is presented with data the user may not be able to infer latent data without prior knowledge and/or without finding more information regarding the data (e.g., a degree held by an individual associated with a name and address in the data, an IP address associated with a location or individual).