Something that has been an important part of my work for many of my clients is removing duplicate records from their database. They don't want their sales team making multiple calls to the same person or mailing more than once to each person on their list.

I want to tell you about a process I developed for a client over the past couple months to give you some idea of the scope this can include. If you just want to know how to "de-dup" your files, skip ahead to the section title "The Main Course."

THE BIG PICTURE
In this recent big project, my client gets lists of possible contacts from many sources at least once a week. He imports them into his FileMaker Pro (FMP) database in a multiple step process. I have it compare in multiple fields as described below and combine duplicate records, assuming that most of the contact information in the newest record is more recent and therefore more accurate. It retires the older duplicates so they can be retrieved if need be. Then the process shows him a list of any records that are considered near duplicates so he can manually check the few that remain. After those are retired, he assigns them to his employees for followup phone calls.

For a little different point of view, employees at one association I work with were adding new member or prospect records without checking to see if the caller was already in the file. We instituted a method of controlled data entry. It's important to get your people to ask callers how to spell their names. It could be "Thom" instead of "Tom" or "Jon" instead of "John." After initial data entry, we would present the user with a list of possible name matches. The next screen would check for possible company matches.

THE MAIN COURSE
I use multiple calculation fields to compare records in order to root out possible duplicates that may have crept into the file. First I look for duplicate full names. But if you have a lot of records in your database (with many John Smiths for instance), you may also want to add in the zip code - and depending on if you use zip plus four in the zip code field, you may only want to use the left five characters in the zip field. You can see an image by copying the following link and pasting it into your browser.

http://www.datadesignpros.com/images/dup_fields.jpg

For a second comparison I use a calculation field that combines the first word in the last name field and zip code. The reason I only use the first word is that a person may have an appendix like Jr. or II or PhD, but sometimes you'll get the same person in a different record without those little extras. Full names aren't good enough by themselves because of nicknames and the previously mentioned alternate spellings of first names.

Then I use a calculation field that combines company name and last name. But that won't catch everything because of abbreviations and misspelling of company names. So you'll periodically want to sort your database by company name and provide the exact same company name in all records.

If at all possible, use some unique identifier like Social Security Number or an ID assigned by a national organization. But even then don't trust it completely. We found multiple contacts that had been given the same number for one reason or another. Even though the data comes from a computer, somebody had to enter that data, and they could have had a bad night's sleep or lazy fingers.

Look above for the link to an image that shows the field dialog and four example calculation fields that may be useful for de-duping. After you create the fields, put them on a layout and perform a Find for duplicates. Simply start a Find, put the exclamation mark "!" in one of the fields and finish the find. Now sort the records by the field you're using and look at them in a list view. You'll get a pretty good idea of how much work you have ahead of you.

Although you can find duplicates using this method, combining them may be a little more difficult. Which data do you keep - the newer one, the older one, or parts of each? And if you have related records such as invoices or notes in a portal, how will you make sure those records end up with the surviving record? You will probably need some professional help in order to accomplish that. If it turns out that's what you need, get hold of me.

J **

*****

You can get to my other newsletters by clicking here.

*****

Etc...

Of course, the main purpose of this newsletter is to remind you that I'm here to serve your FileMaker needs - development, support, and training.

If you can think of anyone you might refer me to, I offer a 5% "Thank You!" for the first 2 years of billing on referrals.

 

 
   
Email us here.