1. Field
The field of the disclosure relates to data discovery in a database and, more specifically, to ontology-guided reference data discovery across multiple databases with differing reference schemas.
2. Description of the Related Art
Modern databases frequently contain a large number of tables storing an extensive amount of data. These databases may contain an equally large number of reference tables for storing reference data characterizing the other data in the database. Although two databases may contain equivalent data, the databases may contain different reference data. For instance, a first database may contain a lookup table mapping the country code “GE” to the country “Greece,” whereas a second database may contain a lookup table mapping the country code “GR” to the country “Greece.” In this scenario, data fields from the first database using the country code “GE” are equivalent to data fields from the second database using the country code “GR,” as both data fields refer to the country “Greece.” In other words, even though the reference values “GE” and “GR” are different, both reference values refer to the same country “Greece.”
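The equivalence in this example can be illustrated with a minimal sketch. The table names and the helper `refers_to_same_value` are hypothetical and chosen only for illustration; only the “GE,” “GR,” and “Greece” values come from the example above. Each lookup table is modeled as a plain dictionary, which is an assumption and not a description of how any particular database stores reference data.

```python
# Hypothetical lookup tables; only the "GE"/"GR" -> "Greece" entries
# are taken from the example in the text.
first_db_countries = {"GE": "Greece"}
second_db_countries = {"GR": "Greece"}


def refers_to_same_value(code_a, table_a, code_b, table_b):
    """Return True when both codes resolve to the same described value."""
    value_a = table_a.get(code_a)
    value_b = table_b.get(code_b)
    # A code missing from its table cannot be equivalent to anything.
    return value_a is not None and value_a == value_b


# Different reference values, same underlying country.
same = refers_to_same_value("GE", first_db_countries, "GR", second_db_countries)
```

Comparing the codes directly would report a mismatch; only resolving each code through its own lookup table reveals that both data fields refer to the same country.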
Converting an existing database to a different reference data schema is often a cumbersome task. Currently, this task is done manually and requires a user to determine associations between a first reference schema and a second reference schema. Once these associations are determined, the user must then manually create translation tables, which may be used to translate reference data from the first schema to the second schema. This can be a very time-consuming activity for the user, particularly where the database contains a large amount of reference data. Additionally, when the user is unfamiliar with the first schema, the second schema, or both, the task becomes more difficult still. Furthermore, this manual approach is error-prone, as the user must examine every reference table and attempt to determine relationships from one reference schema to another, often without the aid of any documentation.
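As a rough illustration of the manual approach described above, the following sketch applies a hand-written translation table to records from the first schema. The names `country_code_translation` and `translate_records`, the record layout, and the sample row are all hypothetical; the single “GE” to “GR” entry follows the earlier country code example.

```python
# Hand-built translation table from the first schema to the second schema.
# In the manual approach, a user would have to author every entry.
country_code_translation = {"GE": "GR"}


def translate_records(records, field, translation):
    """Return copies of the records with `field` rewritten via `translation`."""
    translated = []
    for record in records:
        code = record[field]
        if code not in translation:
            # A missing entry means the user has not yet mapped this value.
            raise KeyError(f"no translation for reference value {code!r}")
        translated.append({**record, field: translation[code]})
    return translated


# Hypothetical row from the first database.
first_db_rows = [{"customer": "A", "country_code": "GE"}]
second_schema_rows = translate_records(
    first_db_rows, "country_code", country_code_translation
)
```

Every reference value in every reference table needs such an entry, which is why the effort and the opportunity for error grow with the amount of reference data.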