Unstructured Information Management Architecture (UIMA) refers to a software system that provides large-scale analysis of unstructured data in order to discover useful information. UIMA includes a set of three frameworks: the Java Framework, the C++ Framework, and the UIMA Asynchronous Scaleout (UIMA-AS) Framework. The frameworks define a series of interfaces and manage a variety of components and the data flows between those components. The components may perform different functions in analyzing unstructured data. For example, components analyzing unstructured text may include a language identification component, a language-specific segmentation component, a sentence boundary detection component, and an entity detection component (an entity may be, for example, a natural person or a geographical place). UIMA may be deployed on a cluster of networked nodes, and it has the capability to wrap its components as network services and to scale to large data volumes by replicating processing pipelines across the cluster.
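By way of illustration only, the component pipeline described above may be sketched in plain Python. The sketch does not use the UIMA API. The names `Document`, `Annotation`, `run_pipeline`, `run_replicated`, and `GAZETTEER` are hypothetical, the analysis heuristics are deliberately simplistic, and a local thread pool stands in for the replication of pipelines over a cluster.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str        # e.g. "token", "sentence", "entity"
    begin: int       # start offset into the document text
    end: int         # end offset (exclusive)
    label: str = ""  # optional category, e.g. "person"


@dataclass
class Document:
    # Hypothetical shared structure that every component reads and
    # writes; an assumption of this sketch, not a UIMA class.
    text: str
    language: str = "unknown"
    annotations: list = field(default_factory=list)

    def covered(self, kind):
        """Return the text spans covered by annotations of one kind."""
        return [self.text[a.begin:a.end]
                for a in self.annotations if a.kind == kind]


def identify_language(doc):
    # Toy heuristic: look for a few common English stopwords.
    words = set(re.findall(r"[a-z]+", doc.text.lower()))
    doc.language = "en" if words & {"the", "in", "and", "is"} else "unknown"


def segment_tokens(doc):
    # Language-specific segmentation; only an English rule is provided.
    if doc.language == "en":
        for m in re.finditer(r"\w+", doc.text):
            doc.annotations.append(Annotation("token", m.start(), m.end()))


def detect_sentences(doc):
    # A sentence is a run of text ending in '.', '!' or '?'.
    for m in re.finditer(r"[^.!?]+[.!?]", doc.text):
        leading = len(m.group()) - len(m.group().lstrip())
        doc.annotations.append(
            Annotation("sentence", m.start() + leading, m.end()))


# Illustrative lookup table; real entity detection is far more involved.
GAZETTEER = {"Ada Lovelace": "person", "London": "place"}


def detect_entities(doc):
    for name, label in GAZETTEER.items():
        for m in re.finditer(re.escape(name), doc.text):
            doc.annotations.append(
                Annotation("entity", m.start(), m.end(), label))


PIPELINE = [identify_language, segment_tokens,
            detect_sentences, detect_entities]


def run_pipeline(text, components=PIPELINE):
    """Pass one document through each component in order."""
    doc = Document(text)
    for component in components:
        component(doc)
    return doc


def run_replicated(texts, replicas=2):
    # Local stand-in for scale-out: several workers each run the same
    # pipeline on different documents; results keep the input order.
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        return list(pool.map(run_pipeline, texts))
```

Calling `run_pipeline("Ada Lovelace lived in London. She wrote the first program.")` returns a document whose `covered("entity")` is `["Ada Lovelace", "London"]`. Each component depends only on the shared document, so components may be reordered, replaced, or replicated independently of one another.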