Vector of the occurrence number

Published 2019-04-13 11:03

Question:

I have a vector a=[1 2 3 1 4 2 5]'

I am trying to create a new vector that gives, for each row, the occurrence number of the element in a. For instance, with this vector, the result would be [1 1 1 2 1 2 1]': the fourth element is 2 because this is the second time that 1 appears.

The only way I can see to achieve this is to create a zero vector c with one row per unique element (here c = [0 0 0 0 0], because a has 5 unique elements) and a zero vector d of the same length as a. Then, going through a row by row, I add one to the row of c that corresponds to the element just read and copy that updated count into the current row of d.
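The counter-based idea described above can be sketched as follows. This is only an illustration: the use of unique to map each value to a row of c is an assumption added here so that the sketch also works when the values are not exactly 1..5, and the names idx and k are illustrative.

```matlab
a = [1 2 3 1 4 2 5]';

% idx(k) is the position of a(k) in the sorted list of unique values
% (assumption: this mapping is not in the question, which uses the values directly)
[~, ~, idx] = unique(a);

c = zeros(max(idx), 1);   % running count for each unique value
d = zeros(size(a));       % occurrence number for each row of a

for k = 1:numel(a)
    c(idx(k)) = c(idx(k)) + 1;   % one more occurrence of this value
    d(k) = c(idx(k));            % record the count seen so far
end
```

For the example vector this should leave d equal to [1 1 1 2 1 2 1]', and c holding the total count of each unique value.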

Can anyone think of something better?

Answer 1:

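This is a nice way of doing it: compare every pair of elements, keep only the matches that occur at or before the current position (the upper triangle), and count them per column: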

C=sum(triu(bsxfun(@eq,a,a.')))

My first suggestion was this not-very-nice for loop:

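F = zeros(size(a));   % preallocate instead of growing F inside the loop
for i=1:length(a)
    F(i)=sum(a(1:i)==a(i));   % count matches among the first i elements
end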


Answer 2:

This does what you want, without loops (it assumes that a contains positive integers):

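m = max(a);
% aux(k+1,j) = 1 + number of times j appears in a(1:k)
aux = cumsum([ ones(1,m); bsxfun(@eq, a(:), 1:m) ]);
% keep the running count only in the rows where a(k) == j
aux = (aux-1).*diff([ ones(1,m); aux ]);
% sum along dimension 2 explicitly so that m == 1 also works
result = sum(aux(2:end,:), 2).';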


Answer 3:

My first thought:

M = cumsum(bsxfun(@eq,a,1:max(a)));   % M(k,j) = number of times j appears in a(1:k)
v = M(sub2ind(size(M),1:numel(a),a'))   % pick entry (k, a(k)) from each row


Answer 4:

On a completely different level, you can look into tabulate (part of the Statistics and Machine Learning Toolbox) to get information about the frequency of the values. For example:

tabulate([1 2 4 4 3 4])

  Value  Count  Percent
  1      1      16.67%
  2      1      16.67%
  3      1      16.67%
  4      3      50.00%
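If the toolbox that provides tabulate is not available, a similar frequency table can probably be built from base functions. The following is a minimal sketch, not what tabulate does internally; the variable names x, vals, counts and freq are illustrative.

```matlab
x = [1 2 4 4 3 4];

% vals: sorted unique values; idx: index of each element of x into vals
[vals, ~, idx] = unique(x(:));

% accumulate a 1 for every element, grouped by unique value
counts = accumarray(idx, 1);

% columns: value, count, percentage of all elements
freq = [vals, counts, 100 * counts / numel(x)];
```

Note that this gives the total count per value, like tabulate, rather than the running occurrence number asked for in the question.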


Answer 5:

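Please note that the solutions proposed by David, chappjc and Luis Mendo are beautiful but cannot be used if the vector is big, because they build large intermediate matrices (for example, bsxfun(@eq,a,a.') is numel(a)-by-numel(a)). In this case a couple of naïve approaches are: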

% Big vector
a = randi(1e4, [1e5, 1]);
a1 = a;
a2 = a;

% Super-naive solution
tic
x = sort(a);
x = x([find(diff(x)); end]);
for hh = 1:size(x, 1)
  inds = (a == x(hh));
  a1(inds) = 1:sum(inds);
end
toc

% Other naive solution
tic
x = sort(a);
y(:, 1) = x([find(diff(x)); end]);
y(:, 2) = histc(x, y(:, 1));
for hh = 1:size(y, 1)
  a2(a == y(hh, 1)) = 1:y(hh, 2);
end
toc

% The two solutions are of course equivalent:
all(a1(:) == a2(:))

Actually, now the question is: can we avoid the last loop? Maybe using arrayfun?
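One possible way to avoid the loop, sketched here as a suggestion rather than as part of the answers above, relies on sort keeping equal elements in their original order. After sorting, each value forms a contiguous run, and the occurrence number is the position within that run. The names order, starts, runId, occSorted and occ are illustrative.

```matlab
a = [1 2 3 1 4 2 5]';   % small example; the same steps apply to a large vector

[s, order] = sort(a(:));          % ties keep their original relative order
n = numel(s);

starts = [true; diff(s) ~= 0];    % true at the first element of each run
pos = (1:n)';
startIdx = pos(starts);           % position where each run begins
runId = cumsum(starts);           % which run each sorted element belongs to

occSorted = pos - startIdx(runId) + 1;   % position inside the run

occ = zeros(n, 1);
occ(order) = occSorted;           % scatter back to the original order
```

For the example vector, occ should come out as [1 1 1 2 1 2 1]'. Memory use grows linearly with numel(a), so this should stay usable for vectors such as randi(1e4, [1e5, 1]).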



Tags: matlab vector