Distance between box plots with unequal samples

Posted 2019-07-25 17:46

Question:

I would like to draw a box plot for groups with unequal sample sizes. Here is some example code:

A = [16 20 15 17 22 19 17]';
B = [22 15 16 16 16 18]';
C = [23 9 15 18 13 27 17 14 16 15 21 19 17]';
group = [ones(size(A));
         2 * ones(size(B));
         3 * ones(size(C))];
figure
boxplot([A; B; C], group)
set(gca, 'XTickLabel', {'A','B','C'})

The output is shown below:

However, I would like a gap separating groups 1 and 2 from group 3, as in the figure below (this figure is copied from another source, but it shows the kind of spacing between groups that I mean):

I tried to use the 'factorgap' option in the following command:

figure
boxplot([A; B; C], group, 'factorgap', [50,1])

However, it did not work, which I assume is because the number of samples differs between groups.

Any suggestions?

Answer 1:

The first solution I propose is a small workaround: insert another, invisible group between the second and the third one:

A = [16 20 15 17 22 19 17]';
B = [22 15 16 16 16 18]';
C = [23 9 15 18 13 27 17 14 16 15 21 19 17]';

group = [
  ones(size(A));
  2 * ones(size(B));
  3;
  4 * ones(size(C))
];

figure();
boxplot([A; B; NaN; C],group);
set(gca,'XTickLabel',{'A','B','','C'});

Here is the output:
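
As a variation on the dummy-group workaround above, the same spacing can probably be obtained by telling boxplot where to draw each box. This is only a sketch: it swaps in the 'Positions' and 'Labels' options of boxplot instead of the NaN group, and the spacing values in pos are an assumption chosen for illustration.

```matlab
% Sample data from the question.
A = [16 20 15 17 22 19 17]';
B = [22 15 16 16 16 18]';
C = [23 9 15 18 13 27 17 14 16 15 21 19 17]';

data  = [A; B; C];
group = [ones(size(A)); 2 * ones(size(B)); 3 * ones(size(C))];

% Assumed spacing: A and B one unit apart, C two units after B.
pos = [1 2 4];

figure();
% One position and one label per group, in group order.
boxplot(data, group, 'Positions', pos, 'Labels', {'A','B','C'});
```

Changing the last entry of pos widens or narrows the gap before group C without touching the data.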


Now, let's build something more robust:

% Define the sample data...
A = [16 20 15 17 22 19 17]';
B = [22 15 16 16 16 18]';
C = [23 9 15 18 13 27 17 14 16 15 21 19 17]';

% Find the length of the largest vector...
A_len = numel(A);
B_len = numel(B);
C_len = numel(C);
max_len = max([A_len B_len C_len]);

% Transform vectors into fixed size vectors of length max_len...
A = [A; NaN(max_len - A_len,1)];
B = [B; NaN(max_len - B_len,1)];
C = [C; NaN(max_len - C_len,1)];

% Define labels and groups...
L1 = [repmat('A',1,numel(A)),repmat('B',1,numel(B))];
L2 = repmat('C',1,numel(C));
L = [L1 L2];
G = [repmat('1',1,numel(L1)),repmat('2',1,numel(L2))];

% Plot the boxes...
boxplot([A B C],{G';L'},'FactorGap',50);

Here is the output:
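
The padding step above is written out once per vector. For more groups it could be generalised roughly as follows; this is a minimal sketch, and the variable names samples, padded and X are hypothetical, not part of the answer's code.

```matlab
% Sample data from the question, collected in a cell array (one cell per group).
samples = { [16 20 15 17 22 19 17]', ...
            [22 15 16 16 16 18]', ...
            [23 9 15 18 13 27 17 14 16 15 21 19 17]' };

% Length of the longest vector.
max_len = max(cellfun(@numel, samples));

% Append NaNs to each column vector so that all of them have max_len rows.
padded = cellfun(@(v) [v; NaN(max_len - numel(v), 1)], samples, ...
                 'UniformOutput', false);

% One column per group; boxplot treats the NaN entries as missing values.
X = [padded{:}];
```

The resulting matrix X plays the role of [A B C] in the boxplot call above.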