Most underused data visualization [closed]

Posted 2019-01-12 13:21

Histograms and scatterplots are great methods of visualizing data and the relationship between variables, but recently I have been wondering about what visualization techniques I am missing. What do you think is the most underused type of plot?

Answers should:

  1. Not be very commonly used in practice.
  2. Be understandable without a great deal of background discussion.
  3. Be applicable in many common situations.
  4. Include reproducible code to create an example (preferably in R). A linked image would be nice.

15 answers
Fickle 薄情
#2 · 2019-01-12 13:30

Check out Edward Tufte's work and especially this book

You can also try to catch his travelling presentation. It's quite good and includes a bundle of four of his books. (I swear I don't own his publisher's stock!)

By the way, I like his sparkline data visualization technique. Surprise! Google's already written it and put it out on Google Code.
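
For readers who want to try the sparkline idea directly in R, here is a minimal base-graphics sketch. It is not Tufte's or Google's implementation; `draw_sparklines` is a hypothetical helper written for this illustration, and the EuStockMarkets data that ships with R is used only as a convenient example.

```r
# Minimal base-R sparklines: one tiny, axis-free line per series.
# draw_sparklines() is a hypothetical helper, not part of any package.
draw_sparklines <- function(series_list) {
  old_par <- par(mfrow = c(length(series_list), 1),
                 mar = c(0.5, 6, 0.5, 1))  # thin rows, room for labels on the left
  on.exit(par(old_par))                    # restore the previous layout afterwards
  for (name in names(series_list)) {
    y <- series_list[[name]]
    plot(y, type = "l", axes = FALSE, xlab = "", ylab = "")
    points(length(y), y[length(y)], pch = 19, col = "red", cex = 0.6)  # mark the last value
    mtext(name, side = 2, las = 1, line = 0.5, cex = 0.7)              # series label
  }
  invisible(length(series_list))           # number of sparklines drawn
}

# Example: the four European stock indices shipped with R
idx <- as.data.frame(EuStockMarkets)
draw_sparklines(as.list(idx))
```

Dropping the axes and stacking the series keeps the focus on shape rather than exact values, which is the point of a sparkline.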

在下西门庆
#3 · 2019-01-12 13:31

Mosaic plots seem to me to meet all four criteria mentioned. There are examples in R, in the help page for mosaicplot.
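
A short sketch of what this looks like, using only base R and the Titanic contingency table from the datasets package. The choice of data and the collapsing to two variables are illustrative assumptions, not something the answer specified.

```r
# Titanic is a 4-way contingency table (Class, Sex, Age, Survived) that ships with R.
# Collapse it to Class x Survived to keep the first picture simple.
tab <- margin.table(Titanic, c(1, 4))
mosaicplot(tab, main = "Titanic: survival by class", color = TRUE)

# shade = TRUE colours tiles by their residuals from the independence model
mosaicplot(Titanic, main = "Titanic: all four variables", shade = TRUE)
```

Tile area is proportional to the cell count, so both the marginal split (class sizes) and the conditional split (survival within each class) can be read from one figure.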

再贱就再见
#4 · 2019-01-12 13:31

Summary plots? Like the ones mentioned on this page:

Visualizing Summary Statistics and Uncertainty

一纸荒年 Trace。
#5 · 2019-01-12 13:33

Horizon graphs (pdf), for visualising many time series at once.

Parallel coordinates plots (pdf), for multivariate analysis.

Association and mosaic plots, for visualising contingency tables (see the vcd package)
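
Of the plots listed above, parallel coordinates are probably the quickest to try. The sketch below uses `parcoord` from the MASS package on the iris data; the dataset and the colour scheme are assumptions made for the example, not taken from the linked papers.

```r
library(MASS)  # recommended package, ships with standard R distributions

# One line per flower, one vertical axis per measurement, coloured by species.
species_cols <- c("red", "darkgreen", "blue")[as.integer(iris$Species)]
parcoord(iris[, 1:4], col = species_cols, var.label = TRUE)
```

Each variable is rescaled to its own range, so crossings between adjacent axes hint at negative association and parallel bundles hint at positive association.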

Anthone
#6 · 2019-01-12 13:40

I really like dotplots and find when I recommend them to others for appropriate data problems they are invariably surprised and delighted. They don't seem to get much use, and I can't figure out why.

Here's an example from Quick-R: dotplot on car data

I believe Cleveland is most responsible for the development and promulgation of these, and the example in his book (in which faulty data was easily detected with a dotplot) is a powerful argument for their use. Note that the example above only puts one dot per line, whereas their real power comes when you have multiple dots on each line, with a legend explaining which is which. For instance, you could use different symbols or colors for three different time points, and thence easily get a sense of time patterns in different categories.

In the following example (done in Excel of all things!), you can clearly see which category might have suffered from a label swap.

Dotplot with 2 groups
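
The multiple-dots-per-line idea can also be sketched in R with the lattice package and its barley dataset. Whether this is the book example referred to above is an assumption on my part, but barley is the dataset commonly used to make the same point.

```r
library(lattice)  # recommended package, ships with standard R distributions

# barley: yields for 10 varieties at 6 sites in 1931 and 1932.
# One row per variety, one dot per year, one panel per site.
p <- dotplot(variety ~ yield | site, data = barley, groups = year,
             auto.key = list(space = "right"), layout = c(1, 6),
             xlab = "Yield (bushels/acre)")
print(p)  # lattice objects must be printed explicitly inside scripts
```

Because both years share each line, a site where the two symbols trade places relative to the others stands out at once; the Morris panel is the one usually pointed to.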

啃猪蹄的小仙女
#7 · 2019-01-12 13:42

Regarding sparklines and other Tufte ideas, the YaleToolkit package on CRAN provides the functions sparkline and sparklines.

Another package that is useful for larger datasets is hexbin, which bins the data into hexagonal cells so that datasets too large for a naive scatterplot remain readable.
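
A minimal hexbin sketch, assuming the package has been installed from CRAN. The data are simulated purely for illustration, and the bin count of 40 is an arbitrary choice.

```r
# Requires the hexbin package from CRAN: install.packages("hexbin")
library(hexbin)

set.seed(1)                       # simulated data, for illustration only
n <- 50000
x <- rnorm(n)
y <- x + rnorm(n)

bins <- hexbin(x, y, xbins = 40)  # count the points falling in each hexagonal cell
plot(bins, main = "50,000 points, hexagonally binned")
```

A plain plot(x, y) of the same data would be a solid blob in the middle; the binned version shows where the density actually peaks.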
