The Duplicate Constituent Management Tool uses a hard-coded algorithm to determine whether records in the database are possible duplicates.
- First, the records are put into buckets based on having the same first TWO letters of their Last Name (or Organization Name if applicable), and the same first TWO letters of their First Name (when applicable). In very specific scenarios, the first THREE letters of the First and Last Names are compared.
- Additional records can be added into those buckets if they meet certain other requirements, such as their Alias and Last Name having the same first TWO letters (or initials, if applicable).
- After the records have been placed into buckets, different fields on those records are compared to determine whether they could be duplicates. The more fields that match, the more likely it is that the records are duplicates. Records that appear on the report as potential duplicates are output side by side, with information to help determine which is the "master record" if they are indeed duplicates (for example, which one has the greater number of gifts). The tool does not show which fields matched to qualify the records as potential duplicates. Some examples of how the report processes records are below.
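As a rough illustration of the bucketing step only (this is not one of the tool's own report examples), the sketch below groups records by the first two letters of the last name and first name. The field names `last_name` and `first_name`, the helper names, and the `width` parameter are assumptions made for this example. The tool's actual implementation is not documented here, and the Alias rule and the conditions that trigger three-letter comparison are not modelled.

```python
from collections import defaultdict


def bucket_key(last_or_org, first="", width=2):
    """Build a hypothetical bucket key from lower-cased name prefixes."""
    last_part = last_or_org.strip().lower()[:width]
    first_part = first.strip().lower()[:width]  # empty for organizations
    return (last_part, first_part)


def build_buckets(records, width=2):
    """Group record ids by bucket key (illustrative only)."""
    buckets = defaultdict(list)
    for rec in records:
        key = bucket_key(rec["last_name"], rec.get("first_name", ""), width)
        buckets[key].append(rec["id"])
    return dict(buckets)


records = [
    {"id": 1, "last_name": "Smith", "first_name": "Judy"},
    {"id": 2, "last_name": "Smyth", "first_name": "Judie"},
    {"id": 3, "last_name": "Smith", "first_name": "Judie"},
    {"id": 4, "last_name": "Jones", "first_name": "Judy"},
]

buckets = build_buckets(records)
# Records 1, 2 and 3 share the key ("sm", "ju"); record 4 falls under ("jo", "ju").
```

With `width=3`, "Smith" and "Smyth" would no longer share a bucket, which shows how a three-letter comparison narrows the candidate set.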
Additional processing notes:
- Name comparisons are strictly textual and work on a "sounds-like" basis, rather than treating the values as proper names. For example, "Judy" and "Judie" are textually very close, so records with these two spellings could still be flagged as possible duplicates
- Relatedly, the algorithm tolerates spelling errors (the same "Judy" vs. "Judie" example applies)
- Constituents that are defined as relationships of each other are not counted as duplicates
- Suffixes are also considered. If Suffixes are present on both records, but are different (ex: Jr. vs. III), then the individuals are not considered a match and no further comparisons occur
- When addresses are compared between records:
  - Only the Preferred Address is considered
  - If the State/Province is different between records, the remaining address fields (City, ZIP, Address Lines) will not be compared
- If two records have an exact match on phone number and/or email address, the records are returned as possible duplicates
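
The processing notes above can be read as a sequence of checks on a pair of records. The sketch below is one possible interpretation, not the tool's actual logic. The field names, the order of the checks, the `name_threshold` value, and the final rule that combines name and address matches are all assumptions. Python's `difflib.SequenceMatcher` stands in for the tool's undocumented "sounds-like" comparison.

```python
from difflib import SequenceMatcher

ADDRESS_FIELDS = ("city", "zip", "line1")  # hypothetical field names


def suffixes_conflict(a, b):
    """True only when both records carry a suffix and the two differ."""
    sa = (a.get("suffix") or "").strip().lower()
    sb = (b.get("suffix") or "").strip().lower()
    return bool(sa and sb and sa != sb)


def name_similarity(x, y):
    """Stand-in for the tool's fuzzy name comparison (0.0 to 1.0)."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def address_matches(a, b):
    """Count matching Preferred Address fields; 0 if the states differ."""
    pa = a.get("preferred_address") or {}
    pb = b.get("preferred_address") or {}
    if pa.get("state") != pb.get("state"):
        return 0  # remaining address fields are not compared
    return sum(1 for f in ADDRESS_FIELDS if pa.get(f) and pa.get(f) == pb.get(f))


def shares_contact(a, b):
    """True on an exact phone or email match."""
    return any(a.get(f) and a.get(f) == b.get(f) for f in ("phone", "email"))


def is_possible_duplicate(a, b, related_pairs=frozenset(), name_threshold=0.6):
    """Apply the checks in an assumed order; thresholds are placeholders."""
    if frozenset((a["id"], b["id"])) in related_pairs:
        return False  # constituents related to each other are excluded
    if suffixes_conflict(a, b):
        return False  # differing suffixes stop all further comparison
    if shares_contact(a, b):
        return True   # exact phone/email match
    names_close = (
        name_similarity(a["first_name"], b["first_name"]) >= name_threshold
        and name_similarity(a["last_name"], b["last_name"]) >= name_threshold
    )
    return names_close and address_matches(a, b) > 0


address = {"state": "SC", "city": "Charleston", "zip": "29401", "line1": "1 Main St"}
judy = {"id": 1, "first_name": "Judy", "last_name": "Smith", "preferred_address": address}
judie = {"id": 2, "first_name": "Judie", "last_name": "Smith", "preferred_address": dict(address)}

print(is_possible_duplicate(judy, judie))  # True
print(is_possible_duplicate(judy, judie, related_pairs={frozenset((1, 2))}))  # False
```

The sample records and address are invented for illustration. In this sketch a suffix conflict overrides even an exact email match, which follows from the note that no further comparisons occur once suffixes differ; whether the tool orders its checks this way is not confirmed.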