Text streams are ubiquitous and contain a wealth of information, but are typically orders of magnitude too large for comprehensive human inspection. Organizations often collect voluminous corpora of data continuously over time. The data may be, for example, email messages, transcriptions of customer comments or of phone conversations, recordings of phone conversations, medical records, news feeds, or the like. Analysts in an organization may wish to learn about the contents of the data and about the changes that occur over time, including when and why those changes occur, so that they can understand and/or act upon the information contained within the data. Because of the large volume of data, reading each document in a corpus individually to determine the changes and summarize the contents can be expensive, difficult, or even impossible.