Data duplication is a serious problem in law enforcement agencies. It generally has two causes. The first is simply that users often misspell names. When an officer creates a person with the last name “Rodrigues,” but the correct spelling is “Rodriguez,” most systems treat the two records as separate people. The second is that too few preventive measures are taken against duplication. While a user is entering identifying information about a person, the system should continually check the database for possible matches. Few systems do this, so users are rarely alerted that a duplicate may exist.
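As a rough illustration of the kind of check described above, the following Python sketch compares a newly entered last name against names already on file and flags close spellings. It is a minimal sketch under stated assumptions, not a description of any particular system: the function names, the sample list of names, and the 0.85 threshold are all hypothetical, and the standard library's difflib similarity ratio stands in for whatever matching technique a real system would use.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def possible_duplicates(entered: str, existing: list[str], threshold: float = 0.85) -> list[str]:
    """Return names on file that are close enough to the entered name to warrant an alert.

    The threshold is an assumed value chosen for illustration only.
    """
    return [name for name in existing if name_similarity(entered, name) >= threshold]


# Hypothetical last names already in the database.
existing_last_names = ["Rodriguez", "Roberts", "Ramirez"]

# The officer types the misspelled name; the check surfaces the likely match.
matches = possible_duplicates("Rodrigues", existing_last_names)
print(matches)  # ['Rodriguez']
```

A check like this would run while the user is still entering data, so the alert arrives before the duplicate record is saved. A single name comparison is only a starting point, since it says nothing about the rest of the person's profile.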
Removing duplicate information from a system is a complex problem. It is difficult to determine whether two records describe the same person, especially when the data in question is as complex as a person's profile.