Please read this article for a general discussion about duplicates: http://xrm2011.wordpress.com/2011/08/07/crm-2011-and-duplicate-detection/
The way CRM 2011 handles duplicates has always been a bit of a mystery to me. Detecting duplicates through matchcodes is quite straightforward, but reporting on them and understanding the effectiveness of deduplication rules is difficult. The CRM 2011 UI compounds the problem, because it lets you check duplicates only one at a time.
So I decided to do a little research, and this project is the result. It contains a set of stored procedures that let you obtain a list of duplicates, which can then be consolidated in a report.
At a minimum, you can see how duplicates are related to each other by looking at the queries. This is not for the faint-hearted: you need a good understanding of the CRM 2011 data structure and, of course, of T-SQL.
Please note that this is not by any means an official way to look at duplicates. It is only my understanding, obtained by reverse engineering.
Also note that if you run the scripts in this project, you will end up with three new stored procedures in your CRM database schema. Although the procedures are quite harmless, they might cause problems when you install rollups or upgrades, so I suggest removing them before any major overhaul of the deployment.
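As a minimal sketch of that cleanup, the T-SQL below lists the stored procedures in the current database, newest first, so that recently added ones are easy to spot, and then drops one of them. The procedure name dbo.usp_PlaceholderDuplicateReport is a placeholder of mine, not one of the names this project actually creates. Substitute the real names from the scripts you ran.

```sql
-- List stored procedures not shipped with SQL Server, newest first.
-- Procedures added by the scripts should appear near the top.
SELECT SCHEMA_NAME(schema_id) AS SchemaName,
       name                   AS ProcedureName,
       create_date            AS CreatedOn
FROM sys.procedures
WHERE is_ms_shipped = 0
ORDER BY create_date DESC;

-- Drop one procedure if it exists.
-- NOTE: the name is a hypothetical placeholder; replace it with a real one.
IF OBJECT_ID(N'dbo.usp_PlaceholderDuplicateReport', N'P') IS NOT NULL
    DROP PROCEDURE dbo.usp_PlaceholderDuplicateReport;
```

Run the DROP statement once per procedure, and try it against a pre-production copy first.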
Please also note the following good practices:
- Thoroughly test on your pre-production environment before deploying into production. Even better: do not deploy to production at all. If you have a pre-production environment that closely mirrors production, running the procedures there will give you the same value as running them in production.
- Only run these procedures when few or no users are online, especially if you have a large database.
Important Notice
I received some feedback that in some cases the queries don't return any duplicates. This was because the user had never run a duplicate detection job. The queries don't detect duplicates; they merely report on the results of duplicate detection jobs. If you haven't run any jobs, don't expect the queries to find any duplicates.
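A quick way to check whether there is anything to report on is to count the stored job results before running the procedures. The sketch below assumes that results of duplicate detection jobs are stored in the dbo.DuplicateRecordBase table, with one row per detected pair and an AsyncOperationId column identifying the job. That table and column are my assumption about the CRM 2011 schema, not something documented in this project, so verify them against your own database.

```sql
-- Assumption: dbo.DuplicateRecordBase holds one row per duplicate pair
-- found by a duplicate detection job, keyed by AsyncOperationId.
SELECT AsyncOperationId,
       COUNT(*) AS DuplicatePairs
FROM dbo.DuplicateRecordBase
GROUP BY AsyncOperationId
ORDER BY DuplicatePairs DESC;
```

If this returns no rows, no job results are stored, and the reporting procedures will most likely come back empty as well.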