Matlab: Dividing chunks of data randomly into equa

Posted 2019-07-30 05:22

Question:

I have a large dataset that I need to divide randomly into 5 almost equal-sized sets for cross-validation. I have happily used _crossvalind_ to divide data into sets before; however, this time I need to assign whole chunks of data to these groups rather than individual elements.

Let's say my data looks like this:

data = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18];

Then I want to divide them randomly into 5 groups in chunks of 2, e.g. like this:

g1 = [3 4], [11 12]  
g2 = [9 10]  
g3 = [1 2], [15 16]  
g4 = [7 8], [17 18]  
g5 = [5 6], [13 14]

I think I can do this with some for-loops, but I'm guessing there must be a much more efficient way to do it in MATLAB :-)

Any suggestions?

Answer 1:

I'm interpreting your requirement as a random ordering of the sets, where the order of elements within each set is unchanged from the parent set. You can use randperm to generate a random order for the sets and use indexing to pick out the elements.

dataElements = numel(data);                   % number of elements
totalGroups = 5;
groupSize = dataElements / totalGroups;       % assumes numel(data) is evenly divisible by totalGroups
randOrder = randperm(totalGroups);            % random permutation of 1:totalGroups
g = reshape(data, groupSize, totalGroups).';  % one group per row
g = g(randOrder, :);                          % shuffle the order of the groups

The different rows of g give you the different groupings.
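If the goal is the one in the question's example, i.e. assigning whole chunks of 2 to 5 nearly equal groups, the same randperm idea can be applied to chunk labels instead of rows. This is only a minimal sketch, assuming numel(data) is divisible by the chunk size; the names chunkSize, labels and groups are illustrative and do not come from the answer above.

```matlab
data = 1:18;
chunkSize = 2;        % elements per chunk (assumed to divide numel(data))
totalGroups = 5;

numChunks = numel(data) / chunkSize;            % 9 chunks for this example
chunks = reshape(data, chunkSize, numChunks);   % each column is one chunk

% Build nearly equal group labels (1..totalGroups, repeated), then shuffle
% them so that each chunk is assigned to a random group.
labels = mod(0:numChunks-1, totalGroups) + 1;
labels = labels(randperm(numChunks));

groups = cell(totalGroups, 1);
for k = 1:totalGroups
    groups{k} = chunks(:, labels == k);         % columns = chunks of group k
end
```

Here groups{k} is a 2-by-n matrix whose columns are the chunks assigned to group k. With 9 chunks and 5 groups, four groups receive two chunks and one group receives a single chunk.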



Answer 2:

You can shuffle the array (randperm) and then divide it into consecutive equal parts.

data = [10 20 30 40 50 60 70 80 90 100 110 120 130 140 150];
permuted = data(randperm(numel(data)));   % shuffle the individual elements
% padding may be required if numel(data) is not divisible by the chunk size
k = 5;                                    % chunk size
g = reshape(permuted, k, numel(data)/k);  % each column of g is one chunk
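The padding mentioned in the comment could look roughly like the sketch below. It assumes NaN is an acceptable filler value (i.e. the data itself contains no NaN) and uses a 17-element example that is not divisible by k; the names numCols and padded are illustrative.

```matlab
data = 10:10:170;                          % 17 elements, not divisible by 5
k = 5;                                     % chunk size

permuted = data(randperm(numel(data)));    % shuffle the individual elements

% Pad with NaN up to the next multiple of k so that reshape succeeds.
numCols = ceil(numel(permuted) / k);
padded = [permuted, nan(1, numCols*k - numel(permuted))];

g = reshape(padded, k, numCols);           % each column of g is one chunk
```

The NaN entries all end up at the bottom of the last column and can be dropped later with something like col(~isnan(col)).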