I have an excel sheet.
Under column E, I have 425 cells with data. I want to check if the same data (i.e. text inside the cell) is repeated anywhere else in any of the remaining 424 cells under column E. How do I do this?
For example, in E54 I have
Hello Jack
How would I check this value to see if it was in any other of these cells?
If you have to compensate for blank cells, take the formula supplied above by @brettdj and,
Add a zero-length string to the COUNTIFS's criteria arguement.
=SUMPRODUCT((E1:E425<>"")/COUNTIF(E1:E425,E1:E425&""))
Checking for non-blank cells in the numerator means that any blank cell will return a zero. Any fraction with a zero in its numerator will be zero no matter what the denominator is. The empty string appended to the criteria portion of the
COUNTIF
is sufficient to avoid#DIV/0!
errors.More information at Count Unique with SUMPRODUCT() Breakdown.
This formula outputs "unique" or "duplicates" depending if the column values are all unique or not:
This is an array formula. You don't type the enclosing {} explicitly. Instead you enter the formula without {} and then press cmd-enter (or something else if not a Mac - go look it up!) If you want to split your formula over multiples lines for readability, use cmd-ctrl-return on a Mac.
The formula works by comparing two SUM() results. If they are equal, all the nonblank entries (numeric or text) are unique. If they are not equal there are some duplicates. The formula does not tell you where the duplicates are.
The first sum is what you get by adding up the row numbers of every non-blank entry.
The second sum does a lookup of each nonblank entry using MATCH(). If all entries are unique, MATCH() finds each entry at its own position, and the result is the same as the first sum. But if there are duplicate entries then a later duplicate will match an earlier duplicate and the later duplicate will contribute a different value to the sum, and the sums won't match.
You might have to adjust this formula:
if you want cells containing "" to count as blank, then use LEN(...)=0 for ISBLANK(...). I suppose you could put other tests in there if you wanted, but I have not tried that.
if you want to test an array not starting at row 1, then you should subtract a constant from ROW(...).
if you have a huge column of cells, you might get integer overflow when computing this sum. I don't have a solution to that.
It's a shame that Excel does not have an ISUNIQUE() function!
This may be a simpler solution. Assume column A contains data in question. Sort on that column. Then, starting in B2 (or first non-blank cell, use the following formula: =IF(A2=A1,1,0).
Than sum on that column. When sum = 0, all values are unique.
I came here because I had this question and one of the responses helped me come up with this...
highlight E and on the home tab select conditional formatting > Highlight Cell Rules > Duplicate Values... It will then highlight everything that is repeated.
Use Conditional Formatting on all the cells that will highlight based on this formula:
This is based on the column being E, and starting on E1, modify otherwise.
In Excel 2010 it's even easier, just go into Conditional Formatting and choose