Excel array formula to find duplicate row across m

Is there a way to flag duplicate rows across multiple columns using an array formula?

Data:

AA1   BB1   CC2   duplicate
AA1   BB2   CC1
AA1   BB1   CC2   duplicate
AA1   BB1   CC1

In the above table, rows 1 and 3 are the ones I need to indicate, by putting "duplicate" in column 4.

I know of the remove duplicates functionality in Excel, but I have to see the duplicate lines before actually deleting them. Also, adding a hidden helper column is not an option because of what happens with the file further down in the process...

If the data were in just one column, a COUNTIF formula would work. So I was hoping some sort of countif(col1 & col2 & col3, range(A:A & B:B & C:C)) could do the trick...

Thanks!

Tags: excel-formula

3 answers

够拽才男人

#2 · 2019-12-16 20:11

An array formula isn't necessary here; COUNTIFS will do the job.

=COUNTIFS($A$1:$A$4,A1,$B$1:$B$4,B1,$C$1:$C$4,C1)
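The formula above returns a count rather than the text the question asks for. As a minimal sketch, assuming the data sits in A1:C4 as in the question and that the formula goes in D1 and is filled down, wrapping the count in IF shows the label directly:

```excel
=IF(COUNTIFS($A$1:$A$4,$A1,$B$1:$B$4,$B1,$C$1:$C$4,$C1)>1,"duplicate","")
```

With the sample data this marks rows 1 and 3 and leaves rows 2 and 4 blank, because only those two rows have a count greater than 1.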


Ridiculous、

#3 · 2019-12-16 20:12

Since the goal is to remove only the repeated rows while keeping the first occurrence, and a helper column is not an option, here is one way to achieve it.

Using a slightly different formula from Adirmola's answer:

In column D, note how the addresses are locked, e.g. A$1:A1 for the formula in row 1. As you fill the formula down, the row number on the left stays fixed while the row number on the right increases, so the formula counts how many times the current row has occurred so far.
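The column D formula itself is not reproduced in the text here. Judging from the locking pattern just described and from the conditional formatting variant quoted at the end of this answer, it is presumably along these lines, entered in D1 and filled down (a reconstruction, not a verbatim copy):

```excel
=COUNTIFS(A$1:A1,A1,B$1:B1,B1,C$1:C1,C1)
```

On the sample data from the question this gives 1, 1, 2, 1 for rows 1 to 4, so only row 3 (the second occurrence of AA1 / BB1 / CC2) exceeds 1.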

Then, since adding a helper column is not an option, let's use conditional formatting to highlight the 2nd, 3rd, 4th... occurrences, filter by color, and delete them.

Here is how: first select the region where the duplicates occur. The active cell (the cell shown in white rather than gray within the selection) must be in the first row of the selection.

Add a conditional formatting rule using the same formula as in column D above for row 1, but this time lock all the columns and append the condition >1.
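As a sketch, assuming the selection starts at A1 and the data occupies columns A to C, the rule formula would look like this:

```excel
=COUNTIFS($A$1:$A1,$A1,$B$1:$B1,$B1,$C$1:$C1,$C1)>1
```

Locking the columns with $ keeps every cell in a row comparing the same three columns, while the unlocked row numbers let the rule shift down with each row of the selection.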

Apply the condition, and you can go ahead and filter by color and delete the duplicates!

Additional info: COUNTIF and COUNTIFS are very inefficient on large data sets (roughly 10,000 rows and up, depending on how many columns are involved). Excel may become slow to respond, so it is a good idea to delete the formula after removing the duplicate rows. Alternatively, wrap the formula in double quotes to disable it so that it can be reused next time:

="COUNTIFS($A$1:$A1,$A1,$B$1:$B1,$B1,$C$1:$C1,$C1) > 1"

Hope this helps


迷人小祖宗

#4 · 2019-12-16 20:26

You first have to decide what "duplicate" means. Strictly speaking, a duplicate is a later occurrence of a value that has already appeared. In your example, the first row is NOT a duplicate because nothing identical comes before it. The next matching row is a duplicate because it is the second occurrence. Here is a method to pick out the duplicates and mark them as needed.

Formula in cell D1:

=CONCATENATE(A1,B1,C1)

Formula in cell E1:

=COUNTIF( D$1:D1, D1 )

Formula in cell F1:

=IF(E1>1,"Duplicate","")

--Edit:

If you want to show all duplicates (including the original value):

Formula in cell D1:

=CONCATENATE(A1,B1,C1)

Formula in cell E1:

=IF(COUNTIF($D$1:$D$4,D1)=1,0,1)

Formula in cell F1:

=IF(E1>0,"Duplicate","")
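One caveat with plain concatenation, offered as a hedged aside: joining values with no separator can make different rows look identical. For example, "AB", "C", "D" and "A", "BC", "D" both become "ABCD". A minimal sketch, assuming the character | never appears in the data, puts a delimiter between the parts in D1:

```excel
=CONCATENATE(A1,"|",B1,"|",C1)
```

The formulas in columns E and F then work unchanged, since they only compare the combined strings in column D.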

Cheers!

