Systems that collect and manage data records may include data that comes from many sources of information. For example, in the case of systems that manage genealogical data records for family trees, the underlying data may come from directories, birth and death records, marriage records, and census data, as well as supplemental information that may represent the personal recollections of individuals, all collected over time.
Further, the number of data records available for use in creating family trees has grown steadily. One user of a genealogical system may develop a personal family tree and create a data record for each individual in the family tree, based on currently available sources of information. That user may also add information to the data records (e.g., information on relationships, birthplaces, personal achievements, or other notable facts of which the user is personally aware) to make each data record more complete. Another user creating a family tree (e.g., a relative of the first user) may go through the same process and create data records for some of the same individuals, but with information that adds to or differs from the information in the data records created by the first user. Thus, there may be multiple data records associated with a single individual, and those records may appear in one or more family trees.
Genealogical records that become available from many sources (and users) may create technical problems in the management of data by a genealogical system. Not only is there a great deal of information that needs to be sorted when information on individuals is requested (to retrieve only the records for the individuals of interest), but there may be many records that relate or appear to relate to one individual. With many people contributing to these records, the data may be inconsistent: each contributor may have a different way of entering information, or may provide conflicting values for birth dates, marriage dates, and so forth, pertaining to the same individual.
The result is that a genealogical system (and its associated databases) may have many data records for a single person (sometimes, depending on the number of contributing users, hundreds of individual data records), some of which may only appear to relate to that person and could actually relate to a different person (e.g., a person having the same name or other similar background information). When a user is seeking information or building a family tree and requests data relating to a person of interest (such as a relative or ancestor), the system may return many records that potentially relate to that person. This not only reduces the efficiency of the system (because it must query its associated database to retrieve all of the potentially relevant records), but also presents the user with a large number of data records that need to be individually reviewed in order to determine whether they actually pertain to the person in question.
Thus, there is a need for systems that may hold multiple records relating to the same subject (such as genealogical systems) to process those data records in a way that makes the retrieval of information more efficient and provides the most relevant data records to a user.