
Question:

I want to find the indices of all the rows of a matrix that have duplicates. For example:

A = [1 2 3 4
     1 2 3 4
     2 3 4 5
     1 2 3 4
     6 5 4 3]

The vector to be returned would be [1,2,4].

Many similar questions suggest using the unique function. I've tried it, but the closest I can get to what I want is:

[C, ia, ic] = unique(A, 'rows');   % gives ia = [1 3 5]

m = 5;                             % number of rows in A
setdiff(1:m, ia)                   % returns [2 4]

But with unique I can only extract the second, third, fourth, etc. instance of a row, and I also need the first. Is there any way I can do this?

NB: It must be a method which doesn't involve looping through the rows, as I'm dealing with large sparse matrices.

Answer 1:

How about:

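[~, ia, ic] = unique(A, 'rows');

% Count how many rows fall in each group of identical rows, then
% remove the indices of rows that occur only once
setdiff(1:size(A,1), ia( sum(bsxfun(@eq,ic,(1:max(ic))))<=1 ))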

Answer 2:

Three other possibilities:

Sort rows of the matrix (with sortrows), detect equal rows (with diff) and use indexing to undo the sorting:
```
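[As, inds] = sortrows(A);              % As = A(inds,:), identical rows become adjacent
ind = find(all(diff(As)==0,2));        % positions in As where a row equals the next one
result = inds(union(ind,ind+1));       % map both rows of each equal pair back to A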
```
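
As a quick check of the sorting approach on the example matrix from the question, the snippet below is a minimal sketch. Note that `result` follows the order of the sorted rows, so it is not necessarily ascending in general; the final `sort` call is an addition here, not part of the answer, for cases where ascending indices are wanted.

```matlab
% Example matrix from the question
A = [1 2 3 4
     1 2 3 4
     2 3 4 5
     1 2 3 4
     6 5 4 3];

[As, inds] = sortrows(A);              % identical rows are now adjacent in As
ind = find(all(diff(As) == 0, 2));     % rows of As equal to the row below them
result = inds(union(ind, ind + 1));    % indices into A, in sorted-row order

% Added step (assumption: ascending order is wanted)
result = sort(result);                 % for this A: [1; 2; 4]
```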

Directly compare every row against every other row (with bsxfun):

match = squeeze(all((bsxfun(@eq, A, permute(A, [3 2 1]))), 2));
result = find(any(match - eye(size(A,1))));

Use pdist (from the Statistics Toolbox) with Hamming distance instead of bsxfun:

match = ~squareform(pdist(A,'hamming'));
result = find(any(match - eye(size(A,1))));

The advantage of the second and third approaches (bsxfun and pdist) is that you additionally get a symmetric matrix, match, which tells you which rows are equal to which others. For your example,

    >> match
    match =
      1     1     0     1     0
      1     1     0     1     0
      0     0     1     0     0
      1     1     0     1     0
      0     0     0     0     1
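
To illustrate how `match` can be used beyond listing the duplicates, here is a small sketch on the question's matrix. The names `sameAsFirst` and `nCopies` are illustrative and do not come from the answer.

```matlab
% Example matrix from the question
A = [1 2 3 4
     1 2 3 4
     2 3 4 5
     1 2 3 4
     6 5 4 3];

% match(i,k) is true when row i of A equals row k of A
match = squeeze(all(bsxfun(@eq, A, permute(A, [3 2 1])), 2));

% All rows equal to row 1, including row 1 itself
sameAsFirst = find(match(1, :));   % for this A: [1 2 4]

% Number of rows equal to each row (counting the row itself)
nCopies = sum(match, 2);           % for this A: [3; 3; 1; 3; 1]
```

Because `match` is n-by-n for an n-row matrix, this is likely only practical when the number of rows is moderate.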

Answer 3:

One way to identify duplicates is to apply accumarray to the ic vector from unique, which counts how many rows map to each unique row. Then setdiff returns the full list of indices of duplicate rows.

[~, ia, ic] = unique(A,'rows');
dupRows = setdiff(1:size(A,1),ia(accumarray(ic,1)<=1))
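
A possible variant of the same idea, sketched here and not taken from the answer: the counts from accumarray can be indexed back through ic, which flags every row whose group has more than one member and skips the setdiff step. The variable name `counts` is illustrative.

```matlab
% Example matrix from the question
A = [1 2 3 4
     1 2 3 4
     2 3 4 5
     1 2 3 4
     6 5 4 3];

% ic maps each row of A to its group in the list of unique rows
[~, ~, ic] = unique(A, 'rows');

% counts(k) is the number of rows of A that belong to group k
counts = accumarray(ic(:), 1);

% A row is a duplicate if its group has more than one member
dupRows = find(counts(ic) > 1);   % for this A: [1; 2; 4]
```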