I have a large cell array of strings in Matlab. I need to find the indexes of duplicate strings in this array. That is, the output I expect is an array of the indices of strings that appear two or more times in the cell array of strings.
How can I do this?
This can be done with unique:
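A minimal sketch of that idea, assuming the goal is the indices of the repeated occurrences (every copy after the first). The variable names `strs`, `idxFirst`, `isRepeat` and `dupIdx`, and the sample data, are only illustrative:

```matlab
% Hypothetical sample data
strs = {'a','b','a','c','b','a','d'};

% idxFirst(k) is the index of the first occurrence of the k-th unique string.
% 'first' is passed explicitly because older releases default to the last occurrence.
[~, idxFirst] = unique(strs, 'first');

% Mark every position, then clear the first occurrences; what remains are repeats
isRepeat = true(1, numel(strs));
isRepeat(idxFirst) = false;
dupIdx = find(isRepeat);   % here: [3 5 6]
```

If you also want the first copy of each duplicated string included, the result needs one more step, for example by counting occurrences per string.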
You can sort the array and then check, for each cell, whether it equals the following cell. The runtime is O(N log N), dominated by the sort.
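Roughly, the sort-and-compare idea could look like this. It is only a sketch: the names `sorted`, `order`, `sameAsNext`, `inGroup` and `dupIdx` are made up for the example, and this version returns every member of a duplicated group, including the first copy:

```matlab
% Hypothetical sample data
strs = {'a','b','a','c','b','a','d'};

% Sort as a column; order(k) is the original index of sorted{k}
[sorted, order] = sort(strs(:));

% sameAsNext(k) is true when sorted{k} equals sorted{k+1}
sameAsNext = strcmp(sorted(1:end-1), sorted(2:end));

% An element belongs to a duplicate group if it equals its successor or its predecessor
inGroup = [sameAsNext; false] | [false; sameAsNext];

% Map back to original positions and return them in ascending order
dupIdx = sort(order(inGroup))';   % here: [1 2 3 5 6]
```

The comparison pass is linear, so the sort is the only O(N log N) step.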
I don't recall a built-in function for that. Another approach: get integer labels using unique, count their occurrences with histc, and pick those that appear more than once.
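Something along these lines should work. Treat it as a sketch: the names `labels`, `counts` and `dupIdx` and the sample data are assumptions for illustration, and the result includes every copy of a repeated string:

```matlab
% Hypothetical sample data
strs = {'a','b','a','c','b','a','d'};

% labels(i) is an integer identifying which unique string strs{i} is
[~, ~, labels] = unique(strs);

% counts(k) is how many times the string with label k occurs
counts = histc(labels, 1:max(labels));

% Keep the positions whose string occurs more than once
dupIdx = find(counts(labels) > 1);
dupIdx = dupIdx(:)';   % force a row vector; here: [1 2 3 5 6]
```

Note that histc is discouraged in recent MATLAB releases; `accumarray(labels(:), 1)` gives the same per-label counts if you prefer to avoid it.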