Last name, ZIP code, and address block are sent in to be checked for duplicates.
If Last name or ZIP is blank, the check stops here.
If Last name and ZIP are filled in, the check continues.
Last name is matched using a Soundex check (http://en.wikipedia.org/wiki/Soundex).
ZIP code is matched from the left, so the first XX digits must match. The number of digits is a setting in the CONSTITUENTDUPLICATESEARCHSETTINGS table.
Full name, address block, and ZIP are passed to the fuzzy matcher, which returns a weighted match percentage for potential duplicates.
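The two gating checks above (blank fields and left-anchored ZIP matching) can be sketched as follows. This is only an illustration: the function name passes_prefilter and the parameter prefix_length are hypothetical, the real digit count comes from the CONSTITUENTDUPLICATESEARCHSETTINGS table, and the Soundex comparison on Last name is left out here.

```python
def passes_prefilter(last_name: str, zip_code: str,
                     candidate_zip: str, prefix_length: int) -> bool:
    """Return True if the incoming record should continue to fuzzy matching.

    prefix_length is a hypothetical stand-in for the configured number of
    leading ZIP digits that must match.
    """
    # A blank Last name or ZIP stops the check immediately.
    if not last_name.strip() or not zip_code.strip():
        return False
    # ZIP codes are compared from the left, over the configured prefix only.
    incoming = zip_code.strip()[:prefix_length]
    existing = candidate_zip.strip()[:prefix_length]
    return incoming == existing


print(passes_prefilter("Smith", "29492", "29401", 3))
print(passes_prefilter("Smith", "29492", "29401", 4))
```

With a prefix length of 3 the two ZIP codes agree on "294"; with a prefix length of 4 they differ ("2949" versus "2940"), so the candidate is dropped.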
If all three fields go in, each is weighted equally at one third (about 33%) of the total match.
If only name and ZIP go in, each is weighted at 50% of the total match.
The percentage that comes back is compared to a minimum match percentage setting from the CONSTITUENTDUPLICATESEARCHSETTINGS table.
There is a hard-coded 70% minimum, so the match percentage can be set to 85%, for example, but not to 60%.
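The weighting and threshold rules above can be illustrated with a small sketch. The per-field percentages are hypothetical inputs, since the fuzzy matcher's internals are not described here. Two further assumptions are made for illustration only: a setting below the hard-coded minimum is rejected with an error, and a score equal to the threshold counts as a match.

```python
from typing import Optional

HARD_MINIMUM = 70.0  # the hard-coded floor for the match percentage setting


def weighted_match(name_pct: float, zip_pct: float,
                   address_pct: Optional[float] = None) -> float:
    """Combine per-field match percentages with equal weights.

    Three fields give one third each; name and ZIP alone give 50% each.
    """
    scores = [name_pct, zip_pct]
    if address_pct is not None:
        scores.append(address_pct)
    return sum(scores) / len(scores)


def is_potential_duplicate(match_pct: float, configured_minimum: float) -> bool:
    """Compare a match percentage with the configured minimum."""
    # Assumption: settings below the hard-coded minimum are refused outright.
    if configured_minimum < HARD_MINIMUM:
        raise ValueError("minimum match percentage cannot be below 70")
    # Assumption: a score exactly at the threshold is treated as a match.
    return match_pct >= configured_minimum


score = weighted_match(100.0, 100.0, 70.0)
print(score, is_potential_duplicate(score, 85.0))
```

Here a perfect name and ZIP match with a 70% address match averages to 90%, which clears an 85% setting. The same call with a setting of 60 raises an error, mirroring the rule that the setting cannot go below 70%.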
* If the Last/Org/Group/Household name does not match based on the algorithm in Step 2, the check stops.
American Soundex differs from the original Soundex algorithm. It works as follows.
The Soundex code for a name consists of a letter followed by three numerical digits: the letter is the first letter of the name, and the digits encode the remaining consonants. Similar sounding consonants share the same digit so, for example, the labial consonants B, F, P, and V are each encoded as the number 1.
The correct value can be found as follows:
Retain the first letter of the name and drop all other occurrences of a, e, i, o, u, y, h, w.
Replace consonants with digits as follows (after the first letter):
b, f, p, v => 1
c, g, j, k, q, s, x, z => 2
d, t => 3
l => 4
m, n => 5
r => 6
Two adjacent letters (in the original name) with the same number are coded as a single number; also two letters with the same number separated by 'h' or 'w' are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter.
Continue until you have one letter and three numbers. If you run out of letters, fill in 0s until there are three numbers.
Using this algorithm, both "Robert" and "Rupert" return the same string "R163", while "Rubin" yields "R150". "Ashcraft" and "Ashcroft" both yield "A261" and not "A226" (the 's' and 'c' receive a single 2 rather than 22, because only an 'h' lies between them). "Tymczak" yields "T522" and not "T520" (the 'z' and 'k' are each coded as 2, because a vowel lies between them). "Pfister" yields "P236" and not "P123" (the first two letters share the number 1, so the 'f' is absorbed into the leading 'P' and adds no digit).
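The rules above can be written as a short Python sketch. It follows the American Soundex description given here and is not necessarily identical to the implementation the product uses. The names CODES and soundex are chosen for illustration.

```python
# Map each coded consonant to its digit, following the table above.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the four-character American Soundex code for a name."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    # The first letter's own code takes part in the adjacency rule.
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # 'h' and 'w' do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels and 'y' have no code
        if code and code != prev:
            digits.append(code)
        # A vowel resets prev, so a repeat across a vowel is coded twice.
        prev = code
    # Pad with zeros, then keep one letter and three digits.
    return (first.upper() + "".join(digits) + "000")[:4]


for example in ("Robert", "Rupert", "Rubin", "Ashcraft", "Tymczak", "Pfister"):
    print(example, soundex(example))
```

Running it on the names discussed above reproduces the codes R163, R163, R150, A261, T522, and P236.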