Remove Duplicates

It is often useful to remove repeated entries from the rows of a single column or of multiple columns. At the end of the operation, only unique rows remain and the duplicates are removed. Although the meaning of "duplicate" is intuitive, it helps to solidify it with a couple of example cases.


A) Single column

This is the simplest and most intuitive case. In the following image, notice that #1 is duplicated.


Clicking the "Remove Duplicates" button removes the duplicates in the selected column. After removal, an information message is displayed as shown:


As seen below, #1 is removed as it was duplicated.



B) Multiple columns

Here, the idea will be demonstrated with 2 columns; however, the same idea is applicable to any number of selected columns. In the figure below, two columns have been selected and we would like to remove the duplicates.

Notice that we have 3 A's in the first column.


After removal of duplicates, the remaining rows are shown below:


Although we initially had 3 A's, only 1 A has been removed and 2 A's remain. Had we selected only the first column, 2 A's would have been removed. However, when two columns are selected, the cells in each row are paired; therefore, the data in the first row is treated as (A, 1) and the data in the fifth row as (A, 10). Since the pairs (A, 1) and (A, 10) are different, they are not considered duplicates. The same rationale extends to n-tuples.
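The tuple-based comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the application's actual implementation: the function name remove_duplicates and the sample data are hypothetical, and the sketch assumes that the first occurrence of each tuple is the one that is kept.

```python
def remove_duplicates(selected_rows):
    """Keep the first occurrence of each row of selected cells.

    Hypothetical helper: each row is a tuple holding the cells of the
    selected columns, and two rows are duplicates when all cells match.
    """
    seen = set()
    unique = []
    for row in selected_rows:
        # Cells are compared as strings, so build the key from str(cell).
        key = tuple(str(cell) for cell in row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up sample data: three A's, with (A, 1) in row 1 and (A, 10) in row 5.
rows = [("A", 1), ("B", 2), ("A", 1), ("C", 3), ("A", 10)]

# Both columns selected: only the repeated (A, 1) pair is dropped.
both_columns = remove_duplicates(rows)

# First column only: every A after the first one is dropped.
first_column = remove_duplicates([(row[0],) for row in rows])
```

With both columns selected, two A's survive because (A, 1) and (A, 10) are distinct pairs; with only the first column selected, a single A survives.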


Pitfall: When comparisons are made, the values in the cells are compared as strings, not as numbers.
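To see why string comparison matters, consider the following Python sketch. The values are made up for illustration, and the sketch assumes that each cell is compared exactly as its text is written; it does not reproduce the application's code.

```python
# Hypothetical cell contents, exactly as typed into the cells.
values = ["1", "1.0", "01", "1"]

# Compared as strings: only the exact repeat of "1" counts as a duplicate.
# dict.fromkeys keeps the first occurrence of each key, in order.
unique_as_strings = list(dict.fromkeys(values))

# Compared as numbers (shown for contrast): all four cells are equal.
unique_as_numbers = list(dict.fromkeys(float(v) for v in values))
```

Under string comparison, three of the four values remain, even though all of them represent the same number. If numerically equal values should be treated as duplicates, it may help to bring them to a consistent format before removing duplicates.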