
dc.js Scatter Plot with multiple values for a sing

Posted 2019-07-23 12:35

Question:

We have scatter plots working great in our dashboard, but we have been thrown a curve ball. We have a new dataset that provides multiple y values for a single key. We have other datasets where this occurs, but there we flattened the data first, and we do not want to flatten this dataset.

The scatter plot should use the uid for the x-axis and each value in the inj field for the y-axis values. The inj field will always be an array of numbers, but each row could have 1 .. n values in the array.

var data = [
    {"uid":1, "actions": {"inj":[2,4,10], "img":[10,15,25], "res":[15,19,37]}},
    {"uid":2, "actions": {"inj":[5,8,15], "img":[5,8,12],   "res":[33,45,57]}},
    {"uid":3, "actions": {"inj":[9],      "img":[2],        "res":[29]}}
];

We can define the dimension and group to plot the first value from the inj field.

var ndx   = crossfilter(data);
var spDim = ndx.dimension(function(d){ return [d.uid, d.actions.inj[0]];});
var spGrp = spDim.group();

But are there any suggestions on how to define the scatter plot to handle multiple y values for each x value?

Here is a jsfiddle example showing how I can display the first element or the last element. But how can I show all elements of the array?

--- Additional Information ---

Above is just a simple example to demonstrate a requirement. We have developed a dynamic data explorer that is fully data driven. Currently the datasets being used are protected. We will be adding a public dataset soon to show off the various features. Below are a couple of images.

I have hidden some legends. For the Scatter Plot we added a vertical only brush that is enabled when pressing the "Selection" button. The notes section is populated on scatter plot chart initialization with the overall dataset statistics. Then when any filter is performed the notes section is updated with statistics of just the filtered data.

The field selection tree displays the metadata for the selected dataset. The user can decide which fields to show as charts and in datatables (not shown). Currently for the dataset shown we only have 89 available fields, but for another dataset there are 530 fields the user can mix and match.

I have not shown the various tabs below the charts DIV that hold several datatables with the actual data.

The metadata has several fields that are defined to help us dynamically build the explorer dashboard.

Answer 1:

I warned you the code would not be pretty! You will probably be happier if you can flatten your data, but it's possible to make this work.

We can first aggregate all the inj values within each uid. The dimension is keyed on uid, so filtering still acts on whole rows of the data, and the group's custom reduction counts the instances of each inj value:

        var uidDimension = ndx.dimension(function (d) {
            return +d.uid;
        }),
        uidGroup = uidDimension.group().reduce(
            function(p, v) { // add
                v.actions.inj.forEach(function(i) {
                    p.inj[i] = (p.inj[i] || 0) + 1;
                });
                return p;
            },
            function(p, v) { // remove
                v.actions.inj.forEach(function(i) {
                    p.inj[i] = p.inj[i] - 1;
                    if(!p.inj[i])
                        delete p.inj[i];
                });
                return p;
            },
            function() { // init
                return {inj: {}};
            }
        );

Here we assume that there might be rows of data with the same uid and different inj arrays. This is more general than needed for your sample data: you could probably do something simpler if there is indeed only one row of data for each uid.
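
To see why the reduction counts instances instead of just storing the array, here is a minimal sketch that runs the same three reducer functions by hand, without crossfilter. The names addInj, removeInj and initInj and the two sample rows are made up for illustration; in the real code crossfilter calls these functions itself as rows enter and leave the current filter.

```javascript
// Standalone copies of the reducers above, so they can run without crossfilter.
function addInj(p, v) {
    v.actions.inj.forEach(function (i) {
        p.inj[i] = (p.inj[i] || 0) + 1;
    });
    return p;
}

function removeInj(p, v) {
    v.actions.inj.forEach(function (i) {
        p.inj[i] = p.inj[i] - 1;
        if (!p.inj[i])
            delete p.inj[i];
    });
    return p;
}

function initInj() {
    return {inj: {}};
}

// Hypothetical rows: both have uid 1 and they overlap on the inj value 4.
var rowA = {uid: 1, actions: {inj: [2, 4, 10]}};
var rowB = {uid: 1, actions: {inj: [4, 7]}};

var bin = initInj();
addInj(bin, rowA);
addInj(bin, rowB);
// Snapshot of the counts once both rows are in the bin.
var afterAdd = JSON.parse(JSON.stringify(bin.inj));

// Simulate rowA being filtered out by some other chart.
removeInj(bin, rowA);
// Snapshot after removal: 4 survives because rowB still contributes it.
var afterRemove = JSON.parse(JSON.stringify(bin.inj));

console.log(afterAdd, afterRemove);
```

Because 4 is counted twice, removing rowA leaves it in the bin with a count of 1, while 2 and 10 drop to zero and are deleted. With a plain array or set per uid you could not tell those two cases apart.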

To flatten out the resulting group, we can use a "fake group" to create one group-like {key, value} data item for each [uid, inj] pair:

    function flatten_group(group, field) {
        return {
            all: function() {
                var ret = [];
                group.all().forEach(function(kv) {
                    Object.keys(kv.value[field]).forEach(function(i) {
                        ret.push({
                            key: [kv.key, +i],
                            value: kv.value[field][i]
                        });
                    });
                });
                return ret;
            }
        };
    }
    var uidinjGroup = flatten_group(uidGroup, 'inj');
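
As a quick check of what the fake group hands to the chart, this sketch feeds flatten_group a hand-written stand-in for uidGroup. The stubUidGroup object is hypothetical; its values are what I would expect the reduction to produce for your three sample rows with no filters applied.

```javascript
// Same fake-group helper as above, repeated so the example runs on its own.
function flatten_group(group, field) {
    return {
        all: function () {
            var ret = [];
            group.all().forEach(function (kv) {
                Object.keys(kv.value[field]).forEach(function (i) {
                    ret.push({
                        key: [kv.key, +i],        // [uid, inj] becomes the [x, y] point
                        value: kv.value[field][i] // how many times that pair occurs
                    });
                });
            });
            return ret;
        }
    };
}

// Hypothetical stand-in for uidGroup: one bin per uid, each holding inj counts.
var stubUidGroup = {
    all: function () {
        return [
            {key: 1, value: {inj: {2: 1, 4: 1, 10: 1}}},
            {key: 2, value: {inj: {5: 1, 8: 1, 15: 1}}},
            {key: 3, value: {inj: {9: 1}}}
        ];
    }
};

var points = flatten_group(stubUidGroup, 'inj').all();
var pointKeys = points.map(function (kv) { return kv.key; });

console.log(JSON.stringify(pointKeys));
// [[1,2],[1,4],[1,10],[2,5],[2,8],[2,15],[3,9]]
```

Each uid contributes one item per distinct inj value, so the three rows become seven points. The flattening is redone on every call to all(), so the points follow whatever filters are currently applied to the underlying group.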

Fork of your fiddle

In the fiddle, I've added a bar chart to demonstrate filtering by UID. Filtering on the bar chart works, but filtering on the scatter plot does not. If you need to filter on the scatter plot, that could probably be fixed, but it could only filter on the uid dimension because your data is too coarse to allow filtering by inj.
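
If you do go down that road, the core of it would be to reduce the rectangular brush to its x extent and apply only that to the uid dimension. Here is a rough sketch of that conversion, assuming the brush arrives as [[x0, y0], [x1, y1]]. The helper names uidRangeFromBrush and rowsInUidRange are hypothetical, and the second one only imitates a half-open range filter on plain rows so the example runs without crossfilter or dc.js.

```javascript
// Keep only the x (uid) extent of a rectangular brush; the y (inj) extent
// is dropped because a row cannot be split by inj value.
// Assumes the brush has the shape [[x0, y0], [x1, y1]].
function uidRangeFromBrush(brush) {
    var x0 = brush[0][0], x1 = brush[1][0];
    return [Math.min(x0, x1), Math.max(x0, x1)];
}

// Imitates a half-open [lo, hi) range filter on plain rows (illustration only).
function rowsInUidRange(rows, range) {
    return rows.filter(function (d) {
        return d.uid >= range[0] && d.uid < range[1];
    });
}

var rows = [
    {uid: 1, actions: {inj: [2, 4, 10]}},
    {uid: 2, actions: {inj: [5, 8, 15]}},
    {uid: 3, actions: {inj: [9]}}
];

// Hypothetical brush covering uids 0.5 to 2.5 and inj values 3 to 12.
var range = uidRangeFromBrush([[0.5, 3], [2.5, 12]]);
var selectedUids = rowsInUidRange(rows, range).map(function (d) { return d.uid; });

console.log(range, selectedUids);
```

Note that uid 1 stays selected with all three of its inj values, including 2, which lies below the brushed rectangle. That is the coarseness mentioned above. In the dashboard the range would presumably be passed to uidDimension.filterRange from a custom filter handler on the chart; I have not tested that wiring.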