Comparing multiple data frames

Posted 2019-04-17 15:51

I need some help with data analysis.
I have two datasets (before and after) and I want to see how large the difference between them is.

Before

11330    STAT1
2721    STAT2
52438    STAT3
6124    SUZY

After

17401    STAT1
3462    STAT2
0    STAT3
72    SUZY

I tried to group them with tapply(before$V1, before$V2, FUN=mean).
But when I plot the result, the x axis shows numbers instead of the group names. How can I plot such tapply output, with frequency on the y axis and group name on the x axis?

I would also like to know the proper command in R for comparing such datasets, since I want to find the difference between them.


Edited

dput(before$V1)
c(11330L, 2721L, 52438L, 6124L)

dput(before$V2)
structure(1:4, .Label = c("STAT1", "STAT2", "STAT3","SUZY"),class = "factor")

1 answer
神经病院院长
#2 · 2019-04-17 16:49

Here are a couple of ideas.

This is what I think your data look like:

before <- data.frame(val=c(11330,2721,52438,6124),
                     lab=c("STAT1","STAT2","STAT3","SUZY"))
after <- data.frame(val=c(17401,3462,0,72),
                     lab=c("STAT1","STAT2","STAT3","SUZY"))

Combine them into a single data frame with a period variable:

combined <- rbind(data.frame(before,period="before"),
      data.frame(after,period="after"))

Reformat to a matrix and plot with (base R) dotchart:

library(reshape2)
m <- acast(combined,lab~period,value.var="val")
dotchart(m)
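
If you would rather stay with the tapply() result from the question, one option is to pass it to barplot(), which labels the bars with the group names. This is a minimal sketch, assuming the before data frame defined above (repeated here so the block runs on its own); the axis labels are placeholders.

```r
before <- data.frame(val = c(11330, 2721, 52438, 6124),
                     lab = c("STAT1", "STAT2", "STAT3", "SUZY"))

# tapply() returns a 1-d array whose names are the group labels
means <- tapply(before$val, before$lab, FUN = mean)

# barplot() picks up those names and uses them as x-axis labels
barplot(means, xlab = "group", ylab = "mean of val")
```

Each label occurs only once here, so the "mean" is simply the original value; with repeated labels the bars would show the per-group averages.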

Plot with ggplot:

library(ggplot2)
# qplot() is deprecated since ggplot2 3.4.0; this is the ggplot() equivalent
ggplot(combined, aes(lab, val, colour = period)) + geom_point()
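
The plots show the change visually but do not put a number on it. To get the difference per group, one possibility is to merge the two original data frames on lab and subtract. This is only a sketch: the column names diff and pct_change are made up for illustration, and it assumes each label appears exactly once in each data frame.

```r
before <- data.frame(val = c(11330, 2721, 52438, 6124),
                     lab = c("STAT1", "STAT2", "STAT3", "SUZY"))
after  <- data.frame(val = c(17401, 3462, 0, 72),
                     lab = c("STAT1", "STAT2", "STAT3", "SUZY"))

# Join on the label; the suffixes distinguish the two val columns
wide <- merge(before, after, by = "lab", suffixes = c(".before", ".after"))

# Absolute and relative change per label (hypothetical column names)
wide$diff <- wide$val.after - wide$val.before
wide$pct_change <- 100 * wide$diff / wide$val.before

print(wide)
```

Merging by label rather than subtracting the columns row by row keeps the comparison correct even if the two data frames list the groups in a different order.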