Show rows on clustered kmeans data

Posted 2019-09-19 05:26

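Hi, I was wondering: when you plot clustered data in a figure window, is there a way to show which row each data point belongs to when you hover over it?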

From the picture above, I was hoping there would be a way to select or hover over a point and see which row it belongs to.

Here is the code:

%% dimensionality reduction 
columns = 6;
[U,S,V]=svds(fulldata,columns);
%% randomly select dataset
rows = 1000;
columns = 6;

%# pick random rows
indX = randperm( size(fulldata,1) );
indX = indX(1:rows);

%# pick random columns
indY = randperm( size(U,2) );   %# U has only 'columns' columns, so index into U rather than fulldata
indY = indY(1:columns);

%# filter data
data = U(indX,indY);
%% apply normalization method to every cell
data = data./repmat(sqrt(sum(data.^2)),size(data,1),1);

%% generate sample data
K = 6;
numObservarations = 1000;
dimensions = 6;

%% cluster
opts = statset('MaxIter', 100, 'Display', 'iter');
[clustIDX, clusters, interClustSum, Dist] = kmeans(data, K, 'options',opts, ...
'distance','sqEuclidean', 'EmptyAction','singleton', 'replicates',3);

%% plot data+clusters
figure, hold on
scatter3(data(:,1),data(:,2),data(:,3), 5, clustIDX, 'filled')
scatter3(clusters(:,1),clusters(:,2),clusters(:,3), 100, (1:K)', 'filled')
hold off, xlabel('x'), ylabel('y'), zlabel('z')

%% plot clusters quality
figure
[silh,h] = silhouette(data, clustIDX);
avrgScore = mean(silh);

%% Assign data to clusters
% calculate distance (squared) of all instances to each cluster centroid
D = zeros(numObservarations, K);     % init distances
for k=1:K
%d = sum((x-y).^2).^0.5
D(:,k) = sum( ((data - repmat(clusters(k,:),numObservarations,1)).^2), 2);
end

% find, for every instance, the cluster closest to it
[minDists, clusterIndices] = min(D, [], 2);

% compare it with what you expect it to be
sum(clusterIndices == clustIDX)

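Or possibly a way to output the clustered data, normalized and reorganized into its original format, with an extra column appended at the end that says which row of the original "fulldata" each point came from.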

Answer 1:

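You could use the data cursor feature, which displays a tooltip when you select a point on the plot. You can supply a modified update function to display all sorts of information about the selected point.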

Here is a working example:

function customCusrorModeDemo()
    %# data
    D = load('fisheriris');
    data = D.meas;
    [clustIdx,labels] = grp2idx(D.species);
    K = numel(labels);
    clr = hsv(K);

    %# instance indices grouped according to class
    ind = accumarray(clustIdx, 1:size(data,1), [K 1], @(x){x});

    %# plot
    %#gscatter(data(:,1), data(:,2), clustIdx, clr)
    hLine = zeros(K,1);
    for k=1:K
        hLine(k) = line(data(ind{k},1), data(ind{k},2), data(ind{k},3), ...
            'LineStyle','none', 'Color',clr(k,:), ...
            'Marker','.', 'MarkerSize',15);
    end
    xlabel('SL'), ylabel('SW'), zlabel('PL')
    legend(hLine, labels)
    view(3), box on, grid on

    %# data cursor
    hDCM = datacursormode(gcf);
    set(hDCM, 'UpdateFcn',@updateFcn, 'DisplayStyle','window')
    set(hDCM, 'Enable','on')

    %# callback function
    function txt = updateFcn(~,evt)
        hObj = get(evt,'Target');   %# line object handle
        idx = get(evt,'DataIndex'); %# index of nearest point

        %# class index of data point
        cIdx = find(hLine==hObj, 1, 'first');

        %# instance index (index into the entire data matrix)
        idx = ind{cIdx}(idx);

        %# output text
        txt = {
            sprintf('SL: %g', data(idx,1)) ;
            sprintf('SW: %g', data(idx,2)) ;
            sprintf('PL: %g', data(idx,3)) ;
            sprintf('PW: %g', data(idx,4)) ;
            sprintf('Index: %d', idx) ;
            sprintf('Class: %s', labels{clustIdx(idx)}) ;
        };
    end

end

Here is how it looks in both 2D and 3D views (with different display styles):
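The same idea can be adapted to the variables in the question. The following is only a rough sketch, not the code from the answer above: it assumes the points are drawn with a single scatter3 call, that `indX` holds the original row numbers as in the question, and it uses random stand-in data (the `fulldata` generated here and the function name `kmeansCursorSketch` are hypothetical). Instead of relying on the data index reported by the cursor event, it looks up the selected point by its position, so if two points coincide exactly it reports the first one.

```matlab
function kmeansCursorSketch()
    %# stand-in data (assumption): replace with your own fulldata, indX, data, clustIDX
    rng(0);
    fulldata = rand(5000, 6);
    indX = randperm(size(fulldata,1));
    indX = indX(1:1000);              %# original row numbers of the sampled rows
    data = fulldata(indX, :);
    clustIDX = kmeans(data, 6, 'Replicates', 3);

    %# plot the first three dimensions, colored by cluster
    figure
    scatter3(data(:,1), data(:,2), data(:,3), 5, clustIDX, 'filled')
    xlabel('x'), ylabel('y'), zlabel('z')

    %# data cursor with a custom tooltip
    hDCM = datacursormode(gcf);
    set(hDCM, 'UpdateFcn',@updateFcn, 'Enable','on')

    %# callback function
    function txt = updateFcn(~, evt)
        pos = get(evt, 'Position');   %# [x y z] of the selected point

        %# nearest plotted point (an exact match is expected; min guards rounding)
        d = sum(bsxfun(@minus, data(:,1:3), pos(:)').^2, 2);
        [~, idx] = min(d);

        %# output text
        txt = {
            sprintf('Row in data: %d', idx) ;
            sprintf('Row in fulldata: %d', indX(idx)) ;
            sprintf('Cluster: %d', clustIDX(idx)) ;
        };
    end
end
```

Because `data` was built as a subset of rows selected by `indX`, `indX(idx)` maps a plotted point back to its row in the original matrix.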



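For the second part of the question (writing the clustered data back out together with the row each point came from), one possible approach is to append the cluster label and the original row number as extra columns. This is a minimal sketch on random stand-in data; `out` is a hypothetical variable name, and in the question's setup the normalized `data` matrix would take the place of the one built here.

```matlab
%# stand-in data (assumption): replace with your own fulldata, indX, data, clustIDX
rng(0);
fulldata = rand(200, 6);
indX = randperm(size(fulldata,1));
indX = indX(1:50);                    %# original row numbers of the sampled rows
data = fulldata(indX, :);
clustIDX = kmeans(data, 3, 'Replicates', 3);

%# append the cluster label and the original row number as extra columns
out = [data, clustIDX, indX(:)];

%# optional: put the rows back into their original order
out = sortrows(out, size(out,2));

%# sanity check: every row matches the fulldata row it claims to come from
assert(isequal(out(:,1:6), fulldata(out(:,end), :)))
```

The last column of `out` is then the row index into `fulldata`, and the second-to-last column is the cluster assignment.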