MATLAB: Create 3D Histogram from sampled data

Posted 2019-09-10 06:13

Question:

I have sampled data in the interval [0,1] in an array transitions = zeros(101,101), which I want to plot as a 3D histogram. transitions is filled with data similar to the example data provided at the end of this question.

The first column refers to the first observed variable X, the second column to the second variable Y, and the third column is the normalized frequency. For example, in the first row the observed normalized frequency of the variable pair (0,0) is 0.9459. The normalized frequencies for (0,Y) thus sum to 1.

I tried to make (sort of) a 3D histogram with the following code:

        % Map the [0,1] coordinates onto 1-based matrix indices (1..101)
        x_c = round(transitions(:,1) * 100) + 1;
        y = round(transitions(:,2) * 100) + 1;
        % Normalized frequency (column 4 here; the sample data below shows it in column 3)
        z = transitions(:,4);
        A = zeros(max(x_c), max(y));
        for i = 1:length(x_c)
            try
                if z(i) > 0
                    A(x_c(i), y(i)) = abs(log(z(i)));
                else
                    % avoid log(0) = -Inf
                    A(x_c(i), y(i)) = 0;
                end
            catch
                disp('');
            end
        end
        bar3(A);

However, since the data is sampled in a discrete space, the output of bar3(A) looks like the plot below. This is misleading because there are 'gaps' in the plot (z-value = 0 at coordinates for which I have no sampled data). I would rather have the sampled data assigned to their corresponding bins, resulting in a 'real' 3D histogram.

By the way, as a result of my 'hack' of creating A, the x-, y- and z-scales are also incorrect. All three axes of the 3D histogram should span the interval [0,1].

ans =

     0         0    0.9459
     0    0.0500    0.0256
     0    0.1000    0.0098
     0    0.1100    0.0004
     0    0.1500    0.0055
     0    0.1600    0.0002
     0    0.2000    0.0034
     0    0.2100    0.0001
     0    0.2500    0.0024
     0    0.2600    0.0001
     0    0.3000    0.0018
     0    0.3200    0.0000
     0    0.3700    0.0000
     0    0.4000    0.0010
     0    0.4200    0.0000
     0    0.4500    0.0007
     0    0.5000    0.0007
     0    0.5300    0.0000
     0    0.5500    0.0005
     0    0.6000    0.0005
     0    0.6300    0.0000
     0    0.7000    0.0002
     0    0.7400         0
     0    0.7500    0.0003
     0    0.7900    0.0000
     0    0.8000    0.0002
     0    0.8400    0.0000
     0    0.8500    0.0002
     0    0.8900    0.0000
     0    0.9000    0.0002
     0    0.9500    0.0001
     0    1.0000    0.0001
0.0500         0    0.0235
0.0500    0.0500    0.0086
0.0500    0.1000    0.0045

     .         .         .
     .         .         .
     .         .         .
     .         .         .
     .         .         .
0.9500    0.9000    0.0035
0.9500    0.9500    0.0066
0.9500    1.0000    0.0180
1.0000         0    0.0001
1.0000    0.0500    0.0001
1.0000    0.1000    0.0001
1.0000    0.1100    0.0000
1.0000    0.1500    0.0001
1.0000    0.1600    0.0000
1.0000    0.2000    0.0001
1.0000    0.2100    0.0000
1.0000    0.2500    0.0001
1.0000    0.2600    0.0000
1.0000    0.3000    0.0001
1.0000    0.3200    0.0000
1.0000    0.3700    0.0000
1.0000    0.4000    0.0002
1.0000    0.4200         0
1.0000    0.4500    0.0002
1.0000    0.5000    0.0003
1.0000    0.5300    0.0000
1.0000    0.5500    0.0004
1.0000    0.6000    0.0004
1.0000    0.6300    0.0000
1.0000    0.7000    0.0007
1.0000    0.7400    0.0000
1.0000    0.7500    0.0010
1.0000    0.7900    0.0000
1.0000    0.8000    0.0015
1.0000    0.8400    0.0001
1.0000    0.8500    0.0024
1.0000    0.8900    0.0002
1.0000    0.9000    0.0042
1.0000    0.9500    0.0111
1.0000    1.0000    0.3998

Answer 1:

I found a solution by working on the non-aggregated data. In particular, each row of the data set transitions contains one observation of X and Y. I used the code below to produce a normalized 3D histogram (and a 2D map) as follows:

function createHistogram(transitions, device)
   % device: string used in the plot title and in the output file names
   uniqueValues = unique(transitions(:,1));
   biases = cell(numel(uniqueValues),1);

   for i = 1:numel(uniqueValues)
       % collect all Y observations that share this X value
       biases{i} = transitions(transitions(:,1) == uniqueValues(i), 2);
   end

   % padcat is a File Exchange function (not built in); it pads shorter vectors with NaN
   combinedBiases = padcat(biases{:});

   bins = 0:0.1:1;
   [f, x] = hist(combinedBiases, bins);

   %
   % normalize
   %
   for i = 1:numel(f(1,:))
       for j = 1:numel(f(:,i))
            f(j,i) = f(j,i)/numel(biases{i});
       end
   end
   bHandle = bar3(x, f);
   ylim([-0.04,1.04])
   for k = 1:length(bHandle)
        zdata = get(bHandle(k),'ZData');
        set(bHandle(k),'CData',zdata, 'FaceColor','interp');
   end
   colormap('autumn');
   hcol = colorbar();
   axis('square');
   cpos=get(hcol,'Position');
   cpos(4)=cpos(4)/3; % Shrink the colorbar height to one third
   cpos(2)=0.4; % Move it down outside the plot
   cpos(1)=0.82;
   set(hcol, 'Position',cpos);
   xlabel('Enrollment biases');
   ylabel('Aging biases');
   zlabel('Bias transition probability');
   title(strcat('Probability mass function of bias transitions (', device,')'));
   set(gca,'XTick',0:2:20);
   set(gca,'XTickLabel',0:0.1:1);
   print('-dpng','-r600',strcat('tau_PMF3D_enrollment-ageing-', device));
   view(2);
   cpos(1)=0.84;
   set(hcol, 'Position',cpos);
   print('-dpng','-r600',strcat('tau_PMF2D_enrollment-ageing-', device));
end
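
The per-group normalization above can also be tried without padcat. The following is a minimal sketch using only built-in functions, assuming the same layout (column 1 = X, column 2 = Y, one observation per row); the small transitions matrix here is made-up stand-in data, not the data from the question:

```matlab
% Hypothetical stand-in data: column 1 = X, column 2 = Y, one observation per row
transitions = [0 0.02; 0 0.12; 0 0.97; 0.5 0.41; 0.5 0.48];

bins = 0:0.1:1;                                   % bin centers, as in the function above
[uniqueValues, ~, group] = unique(transitions(:,1));
f = zeros(numel(bins), numel(uniqueValues));      % one column per distinct X value

for i = 1:numel(uniqueValues)
    yi = transitions(group == i, 2);              % all Y observations for this X value
    counts = hist(yi, bins);                      % counts per bin center
    f(:,i) = counts(:) / numel(yi);               % normalize so the column sums to 1
end

bar3(bins, f);                                    % rows follow the bin centers
```

Each column of f is then a probability mass function over the Y bins for one X value, which is what the nested normalization loop in the function computes.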


Answer 2:

From the comments on the question, it appears that you already have the value you want each bin to show. If so, an alternative is to call hist3 with "junk" data and the correct x and y bin centers, and then overwrite the ZData of the surface object it creates with your own bin values (converted to the layout hist3 uses).
The conversion is fairly simple: reshape the values into a matrix, then replicate and zero-pad each element. The method is included in the code below.
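
The replicate-and-pad step can be tried on a tiny matrix first. This is a minimal sketch, assuming a hypothetical 2-by-2 matrix counts and the same 5-by-5 pad mask that the code further down uses:

```matlab
% 5-by-5 mask: the four ones mark the top-face corners of one bar
pad = [0 0 0 0 0; 0 1 1 0 0; 0 1 1 0 0; 0 0 0 0 0; 0 0 0 0 0];

counts = [1 2; 3 4];          % hypothetical bin values, 2 bins per axis

% kron replaces every element counts(i,j) with the block counts(i,j)*pad
ztrans = kron(counts, pad);   % 10-by-10 matrix

disp(size(ztrans));           % size of the padded matrix
disp(ztrans(7:8, 2:3));       % block for counts(2,1)
```

Every bin value ends up copied four times inside its own 5-by-5 block, and all other entries of the block stay zero.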

Based on the ans variable at the end of the question, assuming

  • ans(:,1) gives x values
  • ans(:,2) gives y values
  • ans(:,3) gives the normalised bin counts

Code:

% Inputs
zdata = ans(:,3);  % zdata = rand(21*21,1); % for testing
xvalues = 0:0.05:1;
yvalues = 0:0.05:1;

% plot with junk data, [0,0] in this case
nx = numel(xvalues); ny = numel(yvalues);
bincenters = { xvalues , yvalues };
hist3([0,0], bincenters);
Hsurface = get(gca,'children');

% apply bin count format
pad = [0 0 0 0 0;0 1 1 0 0;0 1 1 0 0;0 0 0 0 0;0 0 0 0 0]; % padding for each point
% the rows of ans list every y for one x before moving to the next x, so
% reshape to [ny,nx] and transpose to put x along the rows, as hist3 does
zmat = reshape(zdata,[ny,nx]).';
ztrans = kron(zmat,pad); % apply padding to each point

% update plot
set(Hsurface,'ZData',ztrans)

% to set colour based on bar height
colormap('autumn');
set(Hsurface,'CData',ztrans,'FaceColor','interp')
