Thinking with Joins

最新推荐文章于 2019-12-23 02:58:51 发布

samantha_wang

最新推荐文章于 2019-12-23 02:58:51 发布

阅读量436

点赞数

分类专栏： D3 文章标签： data joins D3

D3 专栏收录该内容

12 篇文章

订阅专栏

本文深入解析D3.js中数据关联（join）的概念，通过实例演示如何利用数据关联来创建SVG元素，并阐述其在数据可视化中的应用。文章详细解释了数据关联的三个关键状态：进入（enter）、更新（update）和退出（exit），并展示如何通过简单的代码实现动态数据可视化。此外，文章还介绍了如何通过数据关联进行元素操作，如添加、更新和移除元素，以实现平滑的数据集变化过渡。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

点击打开链接

February 5, 2012Mike Bostock

Thinking with Joins

Say you’re making a basic scatterplot using D3, and you need to create some SVG circle elements to visualize your data. You may be surprised to discover that D3 has no primitive for creating multiple DOM elements. Wait, WAT?

Sure, there’s the append method, which you can use to create a single element.

Here svg refers to a single-element selection containing an <svg> element created previously (or selected from the current page, say).

svg.append("circle")
    .attr("cx", d.x)
    .attr("cy", d.y)
    .attr("r", 2.5);

But that’s just a single circle, and you want many circles: one for each data point. Before you bust out a for loop and brute-force it, consider this mystifying sequence from one of D3’s examples.

Here data is an array of JSON objects with x and y properties, such as: [{"x": 1.0, "y":1.1}, {"x": 2.0, "y":2.5}, …].

svg.selectAll("circle")
    .data(data)
  .enter().append("circle")
    .attr("cx", function(d) { return d.x; })
    .attr("cy", function(d) { return d.y; })
    .attr("r", 2.5);

This code does exactly what you need: it creates a circle element for each data point, using the x andy data properties for positioning. But what’s with the selectAll("circle")? Why do you have to select elements that you know don’t exist in order to create new ones? WAT.

Here’s the deal. Instead of telling D3 how to do something, tell D3 what you want. You want the circle elements to correspond to data. You want one circle per datum. Instead of instructing D3 to create circles, then, tell D3 that the selection "circle" should correspond to data. This concept is called the data join:

DataEnterUpdateElementsExit

Data points joined to existing elements produce the update (inner) selection. Leftover unbound data produce the enter selection (left), which represents missing elements. Likewise, any remaining unbound elements produce the exit selection (right), which represents elements to be removed.

Now we can unravel the mysterious enter-append sequence through the data join:

First, svg.selectAll("circle") returns a new empty selection, since the SVG container was empty. The parent node of this selection is the SVG container.
This selection is then joined to an array of data, resulting in three new selections that represent the three possible states: enter, update, and exit. Since the selection was empty, the update and exit selections are empty, while the enter selection contains a placeholder for each new datum.
The update selection is returned by selection.data, while the enter and exit selections hang off the update selection; selection.enter thus returns the enter selection.
The missing elements are added to the SVG container by calling selection.append on the enter selection. This appends a new circle for each data point to the SVG container.

Thinking with joins means declaring a relationship between a selection (such as "circle") and data, and then implementing this relationship through the three enter, update and exit states.

But why all the trouble? Why not just a primitive to create multiple elements? The beauty of the data join is that it generalizes. While the above code only handles the enter selection, which is sufficient for static visualizations, you can extend it to support dynamic visualizations with only minor modifications for update and exit. And that means you can visualize realtime data, allow interactive exploration, and transition smoothly between datasets!

Here’s an example of handling all three states:

var circle = svg.selectAll("circle")
    .data(data);

circle.exit().remove();

circle.enter().append("circle")
    .attr("r", 2.5);

circle
    .attr("cx", function(d) { return d.x; })
    .attr("cy", function(d) { return d.y; });

To control how data is assigned to elements, you can provide a key function.

Whenever this code is run, it recomputes the data join and maintains the desired correspondence between elements and data. If the new dataset is smaller than the old one, the surplus elements end up in the exit selection and get removed. If the new dataset is larger, the surplus data ends up in theenter selection and new nodes are added. If the new dataset is exactly the same size, then all the elements are simply updated with new positions, and no elements are added or removed.

Thinking with joins means your code is more declarative: you handle these three states without any branching (if) or iteration (for). Instead you describe how elements should correspond to data. If a given enter, update or exit selection happens to be empty, the corresponding code is a no-op.

Joins also let you target operations to specific states, if needed. For example, you can set constant attributes (such as the circle’s radius, defined by the "r" attribute) on enter rather than update. By reselecting elements and minimizing DOM changes, you vastly improve rendering performance! Similarly, you can target animated transitions to specific states. For example, for entering circles to expand-in:

circle.enter().append("circle")
    .attr("r", 0)
  .transition()
    .attr("r", 2.5);

Likewise, to shrink-out:

circle.exit().transition()
    .attr("r", 0)
    .remove();

Now you’re thinking with joins!

Comments or questions? Discuss on HN.

Addendum

I’ve written a series of examples on the general update pattern as a followup to this post.

February 5, 2012 Mike Bostock