Dividing a matrix into two parts

Posted 2020-05-08 08:39

Question:

I am trying to classify my dataset using its 4th column. If the 4th column of a row is equal to 1, that row should be added to a new matrix called Q1. If it is equal to 2, that row should be added to a matrix called Q2.

My code:

i = input('Enter a start row: ');
j = input('Enter a end row: ');
search = importfiledataset('search-queries-features.csv',i,j);
[n, p] = size(search);

if j>n
   disp('Please enter a smaller number!');
end

for s = i:j
    class_id = search(s,4);
    if class_id == 1 
       Q1 = search(s,1:4)        
    elseif class_id ==2  
       Q2 = search(s,1:4)
    end
end

This produces Q1 and Q2, but each one is only 1x4: whenever a new matching row is found, the previous contents of Q1 (or Q2) are overwritten. I need each matching row to be appended instead, so that Q1 grows to 2x4, 3x4, and so on.

In short, I am trying to split my dataset into two parts using for loops and if statements.

Dataset:

I need an outcome like:

Q1 = [30  64  1  1
      30  62  3  1
      30  65  0  1
      31  59  2  1
      31  65  4  1
      33  58 10  1
      33  60  0  1
      34  58 30  1
      34  60  1  1 
      34  61 10  1]

Q2 = [34 59 0 2
      34 66 9 2]

How can I prevent my code from deleting previous rows of Q1 and Q2 and obtain the entire matrices?

Answer 1:

The main problem in your calculation is that you overwrite Q1 and Q2 on every loop iteration. Best solution: get rid of the loops and use logical indexing.

You can use logical indexing to quickly determine where a column is equal to 1 or 2:

search = [
  30 64 1 1 
  30 62 3 1
  30 65 0 1
  31 59 2 1
  31 65 4 1
  33 58 10 1
  33 60 0 1
  34 59 0 2
  34 66 9 2
  34 58 30 1
  34 60 1 1 
  34 61 10 1
];
Q1 = search(search(:,4)==1,:)  % == compares each entry in the fourth column to 1

Q2 = search(search(:,4)==2,:)

Q1 =
    30    64     1     1
    30    62     3     1
    30    65     0     1
    31    59     2     1
    31    65     4     1
    33    58    10     1
    33    60     0     1
    34    58    30     1
    34    60     1     1
    34    61    10     1
Q2 =
    34    59     0     2
    34    66     9     2
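To make the indexing step more explicit, the mask can be stored in a variable before it is used. This is only a small sketch: the matrix `data` and the name `isClass1` are made up for illustration and are not part of the original code.

```matlab
% Small illustrative matrix (made-up values); column 4 holds the class id
data = [30 64 1 1
        34 59 0 2
        31 65 4 1];

isClass1 = data(:,4) == 1;      % logical column vector, one entry per row
Q1 = data(isClass1, :);         % keep only the rows where the mask is true
Q2 = data(data(:,4) == 2, :);   % same idea, with the mask written inline
```

Rows keep their original relative order, because the mask only selects rows and never reorders them.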

Warning: Slow solution!

If you are hell-bent on using loops, make sure not to overwrite your variables. Either extend them on each iteration (which is very, very slow):

Q1=[];
Q2=[];

for ii = 1:size(search,1) % loop over all rows
   if search(ii,4)==1
       Q1 = [Q1;search(ii,:)];
   end
   if search(ii,4)==2
       Q2 = [Q2;search(ii,:)];
   end
end

The MATLAB editor will underline Q1 and Q2 in orange, because growing an array inside a loop forces it to be reallocated repeatedly. Alternatively, you can preallocate them as large as search and strip off the excess:

Q1 = zeros(size(search)); % Initialise to be as large as search
Q2 = zeros(size(search));

Q1kk = 1; % Initialise counters
Q2kk = 1;

for ii = 1:size(search,1) % loop over all rows
   if search(ii,4)==1
       Q1(Q1kk,:) = search(ii,:); % store
       Q1kk = Q1kk + 1; % Increase row counter
   end
   if search(ii,4)==2
       Q2(Q2kk,:) = search(ii,:);
       Q2kk = Q2kk + 1;
   end
end

Q1 = Q1(1:Q1kk-1,:); % strip off excess rows
Q2 = Q2(1:Q2kk-1,:);


Answer 2:

Another option using accumarray, if Q is your original matrix:

Q = accumarray(Q(:,4),1:size(Q,1),[],@(x){Q(x,:)});

You can access the result with Q{1} (for class_id = 1), Q{2} (for class_id = 2) and so on...
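One caveat with the accumarray approach, as far as I know: unless the subscripts are already sorted, accumarray does not promise the order in which values reach the function, so the rows inside each group may come back shuffled. A hedged variant that sorts the row indices and writes to a new variable instead of overwriting the input is sketched below. The names `data`, `rowIdx` and `groups` are made up for illustration, and the sketch assumes the class ids are positive integers.

```matlab
% Small illustrative matrix (made-up values); column 4 holds the class id
data = [30 64 1 1
        34 59 0 2
        31 65 4 1];

rowIdx = (1:size(data,1)).';   % row numbers as a column vector

% Collect the row numbers of each class, then pull those rows out of data.
% sort(x) puts the rows of each group back in their original order.
groups = accumarray(data(:,4), rowIdx, [], @(x) {data(sort(x), :)});

Q1 = groups{1};   % rows with class id 1
Q2 = groups{2};   % rows with class id 2
```

Keeping the result in a separate cell array leaves the original matrix available for later use.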