I was trying to create a bubble chart with D3. Everything worked exactly as in the example, but then I noticed that the data is rendered incorrectly.
So I ran an experiment: I created four "groups" with different combinations of children, each adding up to a total value of 100: 1 x 100, 2 x 50, 3 x 33.33 and 4 x 25. The data looks like this:
[{
title: "X",
children: [
{
title: "100",
weight: 100
},
]
},
{
title: "X",
children: [
{
title: "50",
weight: 50
},
{
title: "50",
weight: 50
},
]
},
{
title: "X",
children: [
{
title: "33",
weight: 33.33
},
{
title: "33",
weight: 33.33
},
{
title: "33",
weight: 33.33
},
]
},
{
title: "X",
children: [
{
title: "25",
weight: 25
},
{
title: "25",
weight: 25
},
{
title: "25",
weight: 25
},
{
title: "25",
weight: 25
},
]
}]
Then I render the chart like this:
const rootNode = d3.hierarchy(data);
rootNode.sum(d => d.weight || 0);
const bubbleLayout = d3.pack()
.size([chartHeight, chartHeight])
.radius(d => d.data.weight); // toggling this line on and off makes no difference
let nodes = null;
try {
nodes = bubbleLayout(rootNode).descendants();
} catch (e) {
console.error(e);
throw e;
}
But the resulting bubbles are not sized consistently by any measure:
To see what is wrong, consider the bubbles in the middle of the screenshot: the blue one with no children has a radius of 100 and its actual size is 180 px. The two bubbles to the right of it both have a radius of 50, so together they should be 180 px wide (when placed along the same axis). Instead, their total diameter is 256 px, which makes me think the render is incorrect:
The questions are: why does this happen, and how can I make the chart render correctly, so that the circle with r = 100 has the same size as the two circles with r = 50 combined?
Based on the question I'm not entirely clear on the end goal, but we can go through each possibility for completeness.
I think you want inter-generational circles to have the same areal scaling factor or diameter scaling factor (area or diameter proportional to some specific value of each node across generations).
Alternatively, you might just want to have areas or diameters proportional to some specific value of each node across one generation, though I think this is less likely.
In addition to these organizing strategies, we could have areas or diameters proportional to some value of leaf nodes.
Given the discussion in the comments and another recent question on this topic, I'll take the opportunity to go over each of the above mentioned organizational strategies. Ideally that covers both this question and the linked one.
Here are the six strategies based on the above:
Proportionality of Areas
- Pack circles so that leaf (childless children) circles have proportional areas
- Pack circles so that one generation of circles has proportional areas
- Pack circles so that all or multiple generations have circles that are proportional in area.
Proportionality of Diameters/Radii
- Pack circles so that leaf (childless children) circles have proportional diameters
- Pack circles so that one generation of circles has proportional diameters
- Pack circles so that all or multiple generations have circles that have proportional diameters.
Outcomes
Essentially: one, two, four and five can be achieved with d3.pack(). Three is not possible. Six is not a circle pack.
1. Proportional Areas For Leaves
This is the expected behavior for d3.pack(), so it doesn't require much discussion. Only leaves will have proportional areas; any parent is drawn as the minimum enclosing circle of its children. Its radius is determined by what is needed to enclose the children, not by any consideration for the sizing values of the children.
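As a rough check against the numbers in the question, and assuming d3.pack() sizes leaves so that area (not radius) is proportional to the summed value, a leaf's diameter scales with the square root of its value. The sketch below is plain JavaScript rather than a D3 call; `expectedDiameter` is a hypothetical helper, and the 180 px reference is the measurement reported in the question.

```javascript
// Hypothetical helper (not part of D3): diameter of a leaf, given a reference
// leaf whose value and on-screen diameter are known.
// Assumption: area is proportional to value, so diameter ~ sqrt(value).
function expectedDiameter(value, refValue, refDiameter) {
  return refDiameter * Math.sqrt(value / refValue);
}

const single = expectedDiameter(100, 100, 180); // the lone leaf with weight 100
const half = expectedDiameter(50, 100, 180);    // each leaf with weight 50
const pairWidth = 2 * half;                     // two weight-50 leaves side by side

console.log(single.toFixed(1), half.toFixed(1), pairWidth.toFixed(1));
```

Under that assumption the pair comes out at roughly 255 px, close to the 256 px measured in the question, which suggests the rendering is consistent with area-proportional leaves rather than being a bug.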
2. Proportional Areas For A Single Generation
This is also possible with d3.pack() out of the box - but with a twist. d3.pack() will give leaf nodes an area proportional to some sizing value. This cannot be altered without essentially re-writing the module (which is already the least friendly of all d3 modules to tamper with).
The algorithm can't give proportional areas to some arbitrary generation, so we can't accomplish this strategy unless we use multiple circle packs.
Example
If we wanted to scale the highest level parents (the root's first generation descendants, called parents for the rest of this section) then we could create a parent circle pack. The parent circle pack will only be fed a hierarchy that contains the root and the parents. Since all parents are leaves in this truncated hierarchy, they will all be scaled proportionally in area based on some assigned value. We then draw this circle pack using a g for each node.
After that, we make each parent node in the circle pack spawn its own circle pack for its own descendants (this also uses a truncated hierarchy: the original root is dropped, and each parent becomes the root of its own circle pack). The area of each of the leaf nodes within a child circle pack will be sized proportionally to some assigned value. The scaling of the leaf nodes will differ between child circle packs, because the nature and structure of these now separately packed hierarchies determine leaf scaling.
This approach requires us to keep track of the radii of the parent nodes to set the size of the child circle packs and to correctly position the circles in the child packs (I use a local variable for the latter in the below snippet). That's about as difficult as the implementation gets, the code is largely the same as it would be if you were appending two circle packs on the same page.
Here's a crude demonstration:
var svg = d3.select("svg"), diameter = +svg.attr("width"), g = svg.append("g").attr("transform", "translate(2,2)"), colors = ["#ffffcc","#a1dab4","#41b6c4","#225ea8"];
var pack = d3.pack().size([diameter - 4, diameter - 4]);
var local = d3.local();
var root = {"name": "root","children": [{"name": "Node A","size": 100},{"name": "Node B","size": 100},{"name": "Node C","size": 100}]}
var children = [{"name":"NodeA","children":[{"name":"Node1","size":34},{"name":"Node2","size":33},{"name":"Node3","size":33}]},{"name":"NodeB","children":[{"name":"Node1","size":50},{"name":"Node2","size":50}]},{"name":"NodeC","children":[{"name":"Node1","children":[{"name":"Nodea","size":15},{"name":"Nodeb","size":12},{"name":"Nodec","size":10}]},{"name":"Node2","size":10},{"name":"Node3","size":13},{"name":"Node4","size":9},{"name":"Node5","size":6},{"name":"Node6","size":10},{"name":"Node7","size":15}]}]
// parent pack:
root = d3.hierarchy(root)
.sum(function(d) { return d.size; })
.sort(function(a, b) { return b.value - a.value; });
// Parent Circle Pack
var node = g.selectAll(null)
.data(pack(root).descendants())
.enter().append("g")
.attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; })
.attr("fill", function(d) { return colors[d.depth]; });
// Parent circle:
node.append("circle")
.attr("r", function(d) { return d.r; });
// get radii
var radii = pack(root).descendants().filter(function(d) { return d.depth == 1; }).map(function(d) { return d.r; });
// Create child pack data:
var childRoots = children.map(function(child,i) {
var childPack = d3.pack().size([radii[i]*2 - 2, radii[i]*2 - 2]);
var childRoot = d3.hierarchy(child)
.sum(function(d) { return d.size; })
.sort(function(a,b) { return b.value - a.value; });
return childPack(childRoot).descendants();
})
// Swap node data for child node data, but keep the original data handy.
var childNodes = node.each(function(d,i) {
local.set(this, d); // but store the data in the local variable.
})
.filter(function(d,i) {
return i > 0;
})
.data(childRoots)
.selectAll("g")
.data(function(d) { return d; })
.enter()
.append("g")
.attr("transform", function(d) { var offset = local.get(this).r; return "translate(" + (d.x-offset) + "," + (d.y-offset) + ")"; })
.attr("fill", function(d) { return colors[d.depth+1]; });
// Append child elements to each node:
childNodes.filter(function(d) { return d.depth > 0 }) // skip parent - it's already drawn.
.append("circle")
.attr("r", function(d) { return d.r; });
childNodes.filter(function(d) { return !d.children })
.append("text")
.text(function(d) { return d.data.name; })
.attr("fill","black")
.style("text-anchor","middle")
.attr("dy", 5);
<svg width="600" height="600"></svg>
<script src="https://d3js.org/d3.v4.min.js"></script>
Each parent has a size value of 100, which is coincidentally (ok, not a coincidence, I intentionally did it) the cumulative total of all the deepest child (leaf) nodes' sizes for each parent. Each parent node is also the same size:
Naturally we could feed the parent circle pack the root's grandchildren if we wanted to scale that generation proportionately.
3. Proportionality of Areas across Generations
Let's use a simple two generation circle pack: a parent with some children.
If the parent has the same areal scaling factor as its children then the cumulative area of the children will be equal to the area of its parent.
If we are to pack these children into their parent, we must do so in a manner that does not create void space. This is not possible when dealing with circles.
Void space is why this is not possible - parents of more than one child will always have an area greater than the sum of the areas of their children.
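To put a number on the void space, here is a minimal sketch for the simplest case: a parent with exactly two children that touch, no padding. It is plain geometry rather than D3 output, and `parentToChildrenAreaRatio` is a hypothetical name.

```javascript
// Two children that touch each other: the smallest circle around them has
// radius r1 + r2 (its diameter spans both children end to end).
function parentToChildrenAreaRatio(r1, r2) {
  const parentArea = Math.PI * Math.pow(r1 + r2, 2);
  const childrenArea = Math.PI * (r1 * r1 + r2 * r2);
  return parentArea / childrenArea;
}

const equalPair = parentToChildrenAreaRatio(1, 1);   // parent has twice the children's area
const unequalPair = parentToChildrenAreaRatio(3, 1); // about 1.6, still larger than 1

console.log(equalPair, unequalPair);
```

The ratio only reaches 1 when one of the radii is zero, i.e. when there is effectively a single child.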
If intergenerational proportionality is critical then a treemap can achieve this; the tradeoff, as described in the d3 documentation, is:
Although circle packing does not use space as efficiently as a
treemap, the “wasted” space more prominently reveals the hierarchical
structure. (docs)
Exceptions
- If parents have size values that are greater than the cumulative size values of their children then, depending on the values, circle packing may be possible. To demonstrate the limited nature of this, consider a parent of two equally sized children. The most efficient circle pack with these two children will require a parent to be twice the area of the combined area of the children (note, if it is larger than 2x then we aren't circle packing, as either we aren't using a minimum enclosing circle or the children don't touch).
- Likewise it may be possible (depending on the values and hierarchy) to have the same areal scaling factor across generations if there are enough leaf nodes between the generations or in the parents so that the cumulative sizing values (and thus areas) of the two generations' nodes are not equal.
- If we use the second approach above (single generation scaled proportionally) and the chosen generation only has only-children, then the children will be proportionately scaled, as the only way to pack circles in another circle without voids is to pack a single circle in a parent.
The first two bullets would probably need manually corrected/validated values to still be circle packs; if they stray from circle packing (minimum enclosing circles as parents, no padding or margins) then d3.pack() is no longer the correct tool.
I add these exceptions for completeness. I think they are extraordinarily unlikely, other than the exception arising from single children (which, if scaled the same as their parents, completely cover the parents anyway).
4. Proportional Diameters for Leaves
If d3.pack() assumes that the sizing value should be proportional to the area of the leaf circle, then we can use the relationship between area and diameter to derive a sizing value that makes each leaf's diameter proportional to its original size:
size = Math.pow(size/2,2);
We're treating the initial size value as a diameter and finding out the area of the circle with that diameter (proportionately, so we don't need π since we would multiply every result by π). Here's a quick demo:
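The conversion can be sanity-checked without D3. This sketch assumes the layout gives each leaf a radius proportional to the square root of its value; `diameterToAreaValue` and `leafRadius` are hypothetical helpers, and `k` is an arbitrary stand-in for whatever scale the layout ends up choosing (2.98 is borrowed from the pixel measurements quoted further down in this answer).

```javascript
// Treat each original size as a diameter and convert it to a pi-free area,
// which is the kind of value passed to .sum() in this strategy.
function diameterToAreaValue(size) {
  return Math.pow(size / 2, 2);
}

// Assumption: the pack layout draws a leaf with radius k * sqrt(value).
function leafRadius(value, k) {
  return k * Math.sqrt(value);
}

const sizes = [100, 50, 20];
const k = 2.98; // illustrative scale only, not computed by D3 here
const diameters = sizes.map(s => 2 * leafRadius(diameterToAreaValue(s), k));
const ratios = diameters.map((d, i) => d / sizes[i]);

console.log(diameters, ratios);
```

Every entry in `ratios` comes out at about 2.98, so the drawn diameters (roughly 298, 149 and 59.6) stay proportional to the original sizes.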
var svg = d3.select("svg"), diameter = +svg.attr("width"), g = svg.append("g").attr("transform", "translate(2,2)"), colors = ["#ffffcc","#a1dab4","#41b6c4","#225ea8"];
var pack = d3.pack().size([diameter - 4, diameter - 4]);
var root = {"name": "root","children": [{"name": "Node A","size": 100},{"name": "Node B",children:[{"name": "Node 1", "size":50},{"name": "Node 2", "size":50}]}]}
root = d3.hierarchy(root)
.sum(function(d) { return Math.pow(d.size/2,2); })
.sort(function(a, b) { return b.value - a.value; });
var node = g.selectAll(null)
.data(pack(root).descendants())
.enter().append("g")
.attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; })
.attr("fill", function(d) { return colors[d.depth]; });
node.append("circle")
.attr("r", function(d) { return d.r; });
<svg width="600" height="600"></svg>
<script src="https://d3js.org/d3.v4.min.js"></script>
And a visual of the snippet:
The (leaf) circle on the left has a size of 100, the (parent) circle on the right has two children (leaves), each with size 50 (cumulatively 100). It may appear as though by scaling this way that we have scaled both leaves and parents the same. This is just a happy coincidence that occurs when dealing with two identically sized child circles.
In a pack the children must touch. If a parent has exactly two children and they touch, the minimum enclosing circle will always have a diameter equal to the sum of the children's diameters (this holds whether or not the two are the same size). The pattern breaks as soon as a parent has three or more children.
We can see that a parent with five children will not appear the same size as a parent with two children even if the sum of the diameters of the children circles is the same for each parent:
Here, the leaf nodes are all proportionate in terms of diameter (or radius). For example, the large first generation leaf on the left has a size value of 100 - it is 298 pixels across (1 : 2.98), the two leaves in the large right hand circle have a size value of 50 each and are 149 pixels across (1 : 2.98). The five leaves in the lower circle have a size value of 20 each, and are 59.6 pixels across (1 : 2.98).
Despite the proportionality of diameter (or radii) in the leaf nodes, this is lost as soon as one moves up the hierarchy: the five children circles at the bottom and the two children circles on the right have the same cumulative diameter (and the same cumulative size value in the data), but the parents are obviously different sizes.
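The same effect can be reproduced with plain geometry. The sketch below uses three equal children as a simpler stand-in for the five-child parent, assumes the children are mutually touching with no padding, and uses hypothetical helper names; it is not D3 output.

```javascript
// Smallest enclosing radius for two equal circles of radius r that touch.
function enclosingRadiusTwo(r) {
  return 2 * r;
}

// Smallest enclosing radius for three equal, mutually touching circles.
// Their centers form an equilateral triangle with side 2r, whose
// circumradius is 2r / sqrt(3); add r to reach the outer edge.
function enclosingRadiusThree(r) {
  return r + (2 * r) / Math.sqrt(3);
}

// Both parents have children whose diameters add up to 100.
const twoChildren = 2 * enclosingRadiusTwo(25);          // children of diameter 50
const threeChildren = 2 * enclosingRadiusThree(100 / 6); // children of diameter 33.33

console.log(twoChildren, threeChildren.toFixed(2));
```

The two-child parent is 100 across, while the three-child parent is only about 71.8 across, even though the children's diameters sum to 100 in both cases.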
5. Proportional Diameters For a Single Generation
Using the relationship between diameter and area we can create scaled values to pass to d3.pack() which represent areas for given diameters (the same as in #4 above).
Once we get an area value, the procedure is the same as scaling areas for a single generation (the same as #2 above). That's it.
6. Proportionality of Diameters across Generations
Unlike areas, this is possible, but not with d3.pack(). We aren't packing circles here - we are packing diameters, and diameters are lines. We are packing one dimensional lines (which happen to be represented by circles).
Let's assume a simple one parent multiple children example. The parent's diameter must be equal to the sum of the children's diameters if the scaling factor is consistent across generations. There is only one way to achieve this with a minimum enclosing circle:
If you were to apply this to all generations, then all circles would be anchored on a line - as we are in fact line packing.
d3.pack won't work here as it packs 2D circles in 2D space, whereas we just need to pack 1D lines on a 1D line to achieve this strategy.
This strategy can probably be achieved with some fairly simple math.
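One possible sketch of that math, in plain JavaScript: each leaf's diameter is its size times a scale, each parent's diameter is the sum of its children's diameters, and children are laid side by side along a single axis. `linePack` is a hypothetical helper, not a D3 function, and the `name`/`size`/`children` field names are assumptions borrowed from the snippets above.

```javascript
// Hypothetical helper (not part of D3): packs diameters along one line.
// Pushes { name, depth, x, r } for every node into `out` (parents before
// their children) and returns the diameter of `node`.
// All centers share the same y coordinate, so only x is computed.
function linePack(node, scale, left, depth, out) {
  const placed = { name: node.name, depth: depth, x: 0, r: 0 };
  out.push(placed);
  let diameter = 0;
  if (node.children && node.children.length > 0) {
    for (const child of node.children) {
      // Each child starts where the previous sibling ended.
      diameter += linePack(child, scale, left + diameter, depth + 1, out);
    }
  } else {
    diameter = node.size * scale;
  }
  placed.r = diameter / 2;
  placed.x = left + placed.r;
  return diameter;
}

const tree = {
  name: "root",
  children: [
    { name: "A", size: 100 },
    { name: "B", children: [{ name: "B1", size: 50 }, { name: "B2", size: 50 }] }
  ]
};

const circles = [];
linePack(tree, 1, 0, 0, circles);
console.log(circles);
```

With this input, A and B end up with the same radius (50), and B's two children exactly span B's diameter. The resulting circles could be drawn with the same `translate` and `r` attributes used in the earlier snippets, with a constant y.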
Exceptions
There are exceptions in certain cases, such as those examined in strategy #3.
There is one other exception: a hierarchy where every node has two children with equal size. I'm not sure, d3 might just plot it on a line, but it would work with d3.pack. However, it is not clear why some sort of tree layout wouldn't be superior here.
Summary
Soapbox
Circle packing is a poor method to convey quantitative data in a hierarchy. It is better for conveying the hierarchy itself, as noted above with Mike's quote. I'd further venture that people's areal judgements of circles are poor. I'd also suggest that sizing leaf nodes with the same scale factor is likely not intuitive for the reader if leaves are scattered across different generations. If quick and intuitive quantitative understanding of the underlying values is needed, circle packing is not the ideal solution. That said,
Conclusions
Circle packing does not and cannot represent all areas with a consistent areal scale factor: circle packing means void space, and void space means parent circles will have areas greater than the sum of their child circle areas. If you need all generations to have a constant areal scale, a treemap may be what you need. Yes, there are some exceptions noted in #3, but these are essentially theoretical with little practical use.
Circle packing can only represent either one generation with a constant areal scale factor or all leaf nodes - not both. Either approach can be accomplished with d3.pack.
Circle packing can represent diameters proportionally for either leaves or one generation. Again, either approach can be accomplished with d3.pack.
Circle packing with diameters proportionate across some or all generations is not possible. A layout can be made - but it is not circle packing. We could tighten the arrangement of the three child circles in the image above, but then we don't have a minimum enclosing circle (and thus we don't have circle packing). Leaving them in a line also isn't circle packing. For this, d3.pack() is of no use - as we aren't packing circles anymore.
There could be other layout options that don't use minimum enclosing circles or that use different size scales for different generations (which would likely (in practice, not theory) always require ditching minimum enclosing circles as well). This moves us well outside circle packing and I'm not sure what is out there that can help.