Find set intersection of multiple arrays in MATLAB

2019-02-26 10:55发布

问题:

I tried to solve this problem, but I could not implement. Could you help me anything for this?

Problem

Mat1 | Mat2 | Mat3

 1 2 | 1 3  | 2 6

 1 3 | 2 6  | 2 5

 2 4 | 3 1  | 3 1

 3 1 | 3 5  | 5 2

 4 5 |

When there are 3 matrices(for example above), I want to get this result for the intersection rows in [column1 column2 matrixnumber] form.

The result for above example would be

1 3 1

1 3 2

2 6 2

2 6 3

3 1 1

3 1 2

3 1 3

It would be OK if the result is in the form [column1 column2 firstmatrix secondmatrix, ...]

1 3 1 2

2 6 2 3

3 1 1 2 3

For this problem, I want to use at most one for-loop.

Do you have any idea for this?

回答1:

Here an alternative solution (which seems to run faster than Gunther's) using MATLAB's intersect:

Mat = {[1 2; 1 3; 2 4; 3 1; 4 5],
       [1 3; 2 6; 3 1; 3 5],
       [2 6; 2 5; 3 1; 5 2]};

result = zeros(sum(cellfun(@(x)size(x, 1), Mat)), 3); % # Preallocate memory
k = 1;
for cc = transpose(nchoosek(1:numel(Mat), 2))
    x = intersect(Mat{cc}, 'rows');                   % # Find intersection
    y = ones(size(x, 1), 2) * diag(cc);               % # Generate matrix indices
    result(k:k + numel(y) - 1, :) = [[x; x], y(:)];
    k = k + numel(y);
end

result(all(~result, 2), :) = [];                      % # Discard zero rows
result = unique(result, 'rows');                      % # Discard repeated rows

The matrix result should now contain the unique intersection rows and their corresponding matrix indices, just like you want:

result =   
     1     3     1
     1     3     2
     2     6     2
     2     6     3
     3     1     1
     3     1     2
     3     1     3


回答2:

If I understand correctly, you have a number of sets of pairs: Mat1,Mat2, Mat3, ... MatN. Now you want to find the unique pairs and then find out in which set every unique pair appears.

If you have a large number of sets, I suggest you start using a cell array to hold them all, makes things a lot easier:

N = 3; % total number of data sets
Mat = cell(N,1);
Mat{1} = [1 2;
          1 3;
          2 4;
          3 1;
          4 5];
Mat{2} = [1 3;
          2 6;
          3 1;
          3 5];
Mat{3} = [2 6;
          2 5;
          3 1;
          5 2];
% etc.

First let's find the unique pairs:

uniq_pairs = unique(cat(1,Mat{:}),'rows');
M = size(uniq_pairs ,1);

Then use ismember to check which sets contain which pairs:

matcontpair = false(M,N); %preallocate
for ii=1:N % unavoidable loop
    matcontpair(:,ii) = ismember(uniq_pairs,Mat{ii},'rows');
end

To translate this intersection matrix to a set of matrix numbers for each pair, loop through it again and store the final result in a cell array (you can't use an array, because they might not be of same size (some pairs only found once, other twice, other three times ...)

pair_occurence= cell(M,1);
d=1:N;
for jj=1:M
    pair_occurence{jj} = d(matcontpair(jj,:));
end

Now you have a matrix uniq_pairs of size Mx2 containing the unique pairs, and a occurence cell array pair_occurence of size Mx1: each cell corresponds to a pair and contains a list of matrices where the pair is present.

If you want to remove pairs from the list which are only present in one matrix, use the following:

% find them
lonely_pairs = cellfun(@numel,pair_occurence)<2;
% and destroy them
uniq_pairs(lonely_pairs,:) = [];
pair_occurence(lonely_pairs) = [];