This specification relates to processing data about entities.
An existing prior art system presents information about entities having geographic locations to users and allows the users to interact with the system to modify the presented information, e.g., if they believe that it is inaccurate or out of date. The entities include businesses, monuments, museums, and other entities capable of being presented on a map. For a particular entity, the information provided by the prior art system can include the name of the entity, the location of the entity on a map, the address of the entity, the phone number of the entity, and values of other attributes that describe the entity. In addition to modifying the presented information, users can also submit additional content related to the entity, e.g., a review of the entity or a rating of the entity, to the system. The information presented to users by the system and the information submitted by the users, e.g., user edits or reviews, are tied to a system-generated identifier that the system uses to identify the entity to which the information refers.
Each edit or review submitted by a user is treated by the system as an action. Each action is applied, in the order in which it is received, to a set of attribute-value pairs that describes the entity and is identified by the system-generated identifier. For systems having a large number of users, this strictly sequential processing can degrade the user experience. For example, if two users attempt to edit the same attribute of the same entity within a short span of time, e.g., before the system can process the first edit and update the presented attribute value, the user submitting the second edit may receive an error message.
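The following is a minimal illustrative sketch of the sequential behavior described above, not the implementation of any actual prior art system. The names `Action`, `EntityStore`, `submit`, and `apply`, the "pending" bookkeeping, and the sample identifier and phone numbers are all assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """A user edit: set one attribute of one entity (hypothetical shape)."""
    entity_id: str   # system-generated identifier
    attribute: str
    value: str


@dataclass
class EntityStore:
    """Holds one set of attribute-value pairs per system-generated identifier."""
    records: dict = field(default_factory=dict)
    # (entity_id, attribute) pairs with an edit received but not yet applied
    pending: set = field(default_factory=set)

    def submit(self, action: Action) -> str:
        key = (action.entity_id, action.attribute)
        if key in self.pending:
            # A second edit arrived before the first one was processed.
            return "error: attribute is being updated"
        self.pending.add(key)
        return "accepted"

    def apply(self, action: Action) -> None:
        # Write the new value and clear the pending marker for this attribute.
        self.records.setdefault(action.entity_id, {})[action.attribute] = action.value
        self.pending.discard((action.entity_id, action.attribute))


store = EntityStore()
first = Action("entity-1", "phone", "555-0100")
second = Action("entity-1", "phone", "555-0199")

result_first = store.submit(first)
result_second = store.submit(second)  # arrives before `first` is applied
store.apply(first)
```

In this sketch the second user's edit is rejected outright and is never applied, which corresponds to the error message scenario described above.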
Once the system updates the appropriate attribute in the set or adds the user review to the set, i.e., applies the action, the action is discarded. Alternatively, the system can determine that the action should not be applied, e.g., because the user has been determined to not be trustworthy or the modified information has been determined to not be reliable. Once the determination is made not to apply the action, the action is discarded. If it is later discovered that the action should have been applied, the action will no longer be available to the system for application.
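The discard behavior can be sketched as follows. This is an illustrative approximation only; the function `process`, the tuple layout of an action, and the `is_trusted` predicate are hypothetical names introduced here and are not drawn from any particular system.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical action layout: (entity_id, attribute, value)
Action = Tuple[str, str, str]


def process(
    queue: List[Action],
    records: Dict[str, Dict[str, str]],
    is_trusted: Callable[[Action], bool],
) -> None:
    """Apply or reject each queued action in arrival order, then discard it."""
    while queue:
        action = queue.pop(0)  # the action leaves the queue here
        entity_id, attribute, value = action
        if is_trusted(action):
            records.setdefault(entity_id, {})[attribute] = value
        # Whether applied or rejected, nothing retains `action` past this point,
        # so a rejected action cannot be re-applied later.


records = {"entity-1": {"name": "Cafe"}}
queue = [("entity-1", "phone", "555-0100"), ("entity-1", "name", "Cafe Uno")]

# Assumption for the example: edits to the "name" attribute are deemed unreliable.
process(queue, records, is_trusted=lambda a: a[1] != "name")
```

After `process` returns, the rejected name edit exists nowhere in the sketch, so it could not be recovered even if it were later found to be correct.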
Additionally, the prior art system receives, from many different data providers, information about entities that is to be presented to users. These data providers provide the information as feeds of actions, with different data providers providing information of varying reliability and at varying intervals. Each action identifies the entity to which it refers using the system-generated identifier for the set of attribute-value pairs that describe the entity.
The prior art system may receive large amounts of information about an entity, with each received piece of information being tied to a system-generated identifier for the entity. The information is received at different intervals and is not always reliable or consistent with other information received about the entity. Additionally, entities having geographic locations can change their locations. For example, a coffee shop at a first location can move to a second location, and an automobile repair shop can open at the first location. Afterwards, some information about the coffee shop may still be valid, e.g., a user review indicating that the coffee shop brews excellent coffee, or the name of the coffee shop; but other information may no longer be applicable, e.g., a user review about the view from the coffee shop, or the address of the coffee shop. If the system-generated identifier for the coffee shop is generated based at least in part on the location of the coffee shop, once the coffee shop changes location, the system-generated identifier will also change. This may result in the loss of information previously associated with the coffee shop, even if the information is still valid. Additionally, if the system receives information that indicates that the coffee shop is actually multiple businesses, e.g., a coffee shop and a separate deli, it may be difficult to determine which previously received information should be applied to which business.
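The identifier problem in the coffee shop example can be sketched as follows. The hashing scheme in `make_identifier`, the coordinates, and the stored reviews are assumptions chosen purely for illustration; the source says only that the identifier may be generated based at least in part on location.

```python
import hashlib


def make_identifier(name: str, latitude: float, longitude: float) -> str:
    """Hypothetical identifier scheme that mixes the location into the key."""
    raw = f"{name}|{latitude:.5f}|{longitude:.5f}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


store = {}

# Information is stored under the identifier derived from the first location.
old_id = make_identifier("Coffee Shop", 40.00000, -75.00000)
store[old_id] = {"reviews": ["excellent coffee", "great view"]}

# The coffee shop moves; the identifier derived from the second location differs.
new_id = make_identifier("Coffee Shop", 40.01000, -75.02000)

# A lookup under the new identifier finds none of the earlier information,
# including the review about the coffee, which is still valid after the move.
reviews_after_move = store.get(new_id, {}).get("reviews", [])
```

In this sketch the earlier reviews remain stored under `old_id` but are unreachable through `new_id`, which mirrors the loss of still-valid information described above.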