Home > Features > Data > Duplicates

Duplicates

ReportsDNA provides several options for handling duplicate records. These options can be used to find, eliminate or aggregate duplicate records.
Examples shown on this page may be found here.

This following data set will be used to show how each feature changes the reports output.

Data set
ID First Name Last Name Order
1 Jane Doe Milk
2 John Doe Eggs
3 Eleni Smith Milk
1 Jane Doe Butter


Show

This setting determines how duplicates will be handled. The available settings and their results are detailed below.

Setting Description Unique ID Result
All Records

Returns all records

n/a

ID First Name Last Name Order
1 Jane Doe Milk
2 John Doe Eggs
3 Eleni Smith Milk
1 Jane Doe Butter
Only Duplicates

Only returns records that have duplicates.

ID

ID First Name Last Name Order
1 Jane Doe Milk
4 Jane Doe Butter
Only Unique

Eliminates all duplicates from the result.
Each first unique record will be displayed.

I

ID First Name Last Name Order
1 Jane Doe Milk
2 John Doe Eggs
3 Eleni Smith Milk


















Columns for Unique ID

This setting determines which column or columns uniquely identify a record. The fields that comprise the key are highlighted in blue in the two examples below.

Example: Unique ID consists of a single field: Last Name

Columns for Unique ID
Unique Records (duplicates removed)
Last Name
ID First Name Last Name Order
1 Jane Doe Milk
3 Eleni Smith Milk







Example: Unique ID consists of a the combination of two fields: First Name and Last Name

Columns for Unique ID Unique Records (duplicates removed)
First Name, Last Name
ID First Name Last Name Order
1 Jane Doe Milk
2 John Doe Eggs
3 Eleni Smith Milk









Show

This setting determines how duplicates will be handled. The available settings and their results are detailed below.

Setting Description Unique ID Result
All Records

Returns all records

n/a

ID First Name Last Name Order
1 Jane Doe Milk
2 John Doe Eggs
3 Eleni Smith Milk
1 Jane Doe Butter
Only Duplicates

Only returns records that have duplicates.

ID

ID First Name Last Name Order
1 Jane Doe Milk
4 Jane Doe Butter
Only Unique

Eliminates all duplicates from the result.
Each first unique record will be displayed.

I

ID First Name Last Name Order
1 Jane Doe Milk
2 John Doe Eggs
3 Eleni Smith Milk
















Case Sensitive

The Case Sensitive flag indicates how duplicates are determined.  When checked, values must match by letter and by case to be identified as a duplicate (e.g., doe and DOE are not duplicates).   When unchecked. values must match by letter, but the case does not have to match (e.g. doe and DOE ARE duplicates).


Rollup non-ID Fields

When checked, all Rollup Fields may contain multiple values. This setting will concatenate all unique values in each Rollup Field.
Enabled: only when Show is set to Only Unique

Rollup Fields

All values in these fields will be displayed in the one record that represents all duplicates. Each value will be separated by the Rollup Separator character(s). Each value will be displayed only once; therefore, Rollup Fields will not have duplicate
Enabled: executed when Show is set to Only Unique and Rollup non-ID Fields is checked.

Rollup Separator

Specifies the character that will be placed between each unique value in the Rollup Fields
Enabled: executed when Show is set to Only Unique and Rollup non-ID Fields is checked.

Note: Click on

EXAMPLE

Setting Value
Unique ID Last Name
Rollup non-ID Fields On
Rollup Fields Order
Rollup Separator ,
Data set

ID First Name Last Name Order
1 Jane Doe Milk
2 John Doe Eggs
3 Eleni Smith Milk
1 Jane Doe Butter

Report
ID First Name Last Name Order
1 Jane Doe Milk,Eggs,Butter
3 Eleni Smith Milk

See also