How can I create a (100%) stacked histogram in R?

My dataset:

I have data in the following format (here, imported from a CSV file). You can find an example dataset as CSV here.

PAIR   PREFERENCE
1      5
1      3
1      2
2      4
2      1
2      3

… and so on. In total, there are 19 pairs, and the PREFERENCE ranges from 1 to 5, as discrete values.

What I'm trying to achieve:

What I need is a stacked histogram, e.g. a 100% high column, for each pair, indicating the distribution of the PREFERENCE values.

Something similar to the "100% stacked columns" in Excel, or (although not quite the same, a so-called "mosaic plot"):

What I tried:

I figured it'd be easiest using ggplot2, but I don't even know where to start. I know I can create a simple bar chart with something like:

ggplot(d, aes(x=factor(PAIR), y=factor(PREFERENCE))) + geom_bar(position="fill")

… that however doesn't get me very far. So I tried this, and it gets me somewhat closer to what I'm trying to achieve, but it still uses the count of PREFERENCE, I suppose? Note the ylab being "count" here, and the values ranging to 19.

qplot(factor(PAIR), data=d, geom="bar", fill=factor(PREFERENCE_FIXED))

Results in:

enter image description here

So, what do I have to do to get the stacked bars to represent a histogram?
Or do they actually do this already?
If so, what do I have to change to get the labels right (e.g. have percentages instead of the "count")?

_{By the way, this is not really related to this question, and only marginally related to this (i.e. probably same idea, but not continuous values, instead grouped into bars).}

标签： r plot ggplot2 histogram

1条回答

Lonely孤独者°

2楼-- · 2020-06-16 10:19

Maybe you want something like this:

ggplot() + 
    geom_bar(data = dat,
             aes(x = factor(PAIR),fill = factor(PREFERENCE)),
             position = "fill")

where I've read your data into dat. This outputs something like this:

enter image description here

The y label is still "count", but you can change that manually by adding:

+ scale_x_discrete("Pairs") + scale_y_continuous("Votes")

0人赞添加讨论(0) 举报