The use of computer systems and computer-related technologies continues to increase at a rapid pace. This increased use of computer systems has influenced the advances made to computer-related technologies. Indeed, computer systems have increasingly become an integral part of the business world and the activities of individual consumers. Computer systems may be used to carry out several business, industry, and academic endeavors. The widespread use of computers has been accelerated by the increased use of computer networks, including the Internet.
Many businesses use one or more computer networks to communicate and share data between the various computers connected to the networks. The productivity and efficiency of employees often require human and computer interaction. Users of computer technologies continue to demand that the efficiency of these technologies increase. Improving the efficiency of computer technologies is important to anyone who uses and relies on computers.
In the genealogy industry, it has become useful to extract information from various types of documents and records into a format that can be easily discovered using modern computerized search techniques. This approach has become popular for a variety of record types including census records, birth certificates, and military records.
One record type that usually is not extracted is published family history documents. Information in family history documents usually is not organized predictably enough to be extracted easily and affordably using techniques typical in the industry. As a result, many published family history collections have been processed only with basic optical character recognition (OCR) software, which leaves the data with a large number of inaccuracies. More importantly, relationship information and useful inferred information, such as the presumption that children share their father's surname, cannot be effectively captured. The OCR data representing the family history document can be searched in a generic fashion using typical free-form document search techniques, but the amount and type of data that can be used effectively as part of the search is limited and typically highly unreliable.
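As an illustration of the kind of inferred information that plain OCR text does not capture, the following minimal sketch applies the surname presumption to a single family group. It is not drawn from any particular extraction system: the `Person` and `Family` types, the `infer_child_surnames` function, the `father_of` relationship label, and the sample names are all hypothetical and assumed here only for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    # Hypothetical record for one individual named in a family history entry.
    given_name: str
    surname: Optional[str] = None    # None when the document omits the surname
    surname_inferred: bool = False   # True when the surname was presumed, not read


@dataclass
class Family:
    # Hypothetical grouping of a father with the children listed under him.
    father: Person
    children: List[Person] = field(default_factory=list)


def infer_child_surnames(family: Family) -> List[Tuple[str, str, str]]:
    """Fill in missing child surnames from the father and list the relationships.

    Returns (father given name, relationship label, child given name) tuples.
    """
    relationships = []
    for child in family.children:
        # Presume the father's surname only when the child has none recorded.
        if child.surname is None and family.father.surname is not None:
            child.surname = family.father.surname
            child.surname_inferred = True
        relationships.append(
            (family.father.given_name, "father_of", child.given_name)
        )
    return relationships


# Made-up sample data: one child listed without a surname, one with a surname.
family = Family(
    father=Person("John", "Whitaker"),
    children=[Person("Mary"), Person("Eliza", "Hale")],
)
relations = infer_child_surnames(family)
```

In this sketch the inferred surname is flagged rather than silently merged with the transcribed text, so a later search or review step could treat presumed values differently from values that appear in the document itself.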
Some effort has been made to create a completely automated relationship information extraction process for family history documents without any human interaction during the extraction process. This effort has not been seriously pursued due to the extreme inaccuracies that result from such an approach.