Get multiple comparisons with ggplot2

Posted 2019-06-09 22:15

Question:

I produced a plot with the base R function plot(), called as plot(dataframe), which draws a matrix of pairwise scatterplots.

Now I want to make the same plot with ggplot2, but everything I have tried so far has failed. My data frame looks like this:

structure(list(tRap_pear = c(0.0350096175177328, 0.234255507711743, 
0.23714999195134, 0.185536020521134, 0.191585098617356, 0.201402054387186, 
0.220911538536031, 0.216072802572045, 0.132247101763063, 0.172753098431029
), Beeml_pear = c(0.179209909971615, 0.79129167285928, 0.856908302056589, 
0.729078080521886, 0.709346164378725, 0.669599784720647, 0.585348196746785, 
0.639355942917055, 0.544909349368496, 0.794652394149651), Mash_spear = c(0.158648548431316, 
0.53887352819363, 0.457888265527408, 0.563127988391531, 0.535626487998822, 
0.339363025936821, 0.347487640634066, 0.446668310403948, 0.327120869232769, 
0.597005214316607), tRap_spear = c(0.0401250136715237, 0.511012317625831, 
0.328979081566789, 0.518148084654934, 0.469847452665152, 0.264057161482016, 
0.312517231623128, 0.430052514388429, 0.338233671643239, 0.417881662695103
), Beeml_spear = c(0.0961259035034072, 0.70273493789764, 0.466746274696884, 
0.817805518009015, 0.722756585905275, 0.407861493627591, 0.423745193368859, 
0.534971415799068, 0.519199516553983, 0.748709415442623), Mash_pear2080 = c(0.823944540480775, 
0.816630852343513, 0.81134728399675, 0.801065036203532, 0.799630945085954, 
0.799195606444727, 0.798637867344115, 0.798478922129054, 0.798090734787886, 
0.797673368802285), Mash_spear2080 = c(0.687131069446869, 0.704882483221722, 
0.696045373880582, 0.716722524407137, 0.74354480616146, 0.684047794911021, 
0.718132260792985, 0.639437653298423, 0.671605390101442, 0.670239912705399
)), .Names = c("tRap_pear", "Beeml_pear", "Mash_spear", "tRap_spear", 
"Beeml_spear", "Mash_pear2080", "Mash_spear2080"), row.names = c("Aft1", 
"Alx3_3418.2", "Alx4_1744.1", "Arid3a_3875.1_v1_primary", "Arid3a_3875.1_v2_primary", 
"Arid3a_3875.2_v1_primary", "Arid3a_3875.2_v2_primary", "Arid5a_3770.2_v1_primary", 
"Arid5a_3770.2_v2_primary", "Aro80"), class = "data.frame")

I know it has something to do with facets in ggplot2, but I still don't know how to implement this correctly.

Answer 1:

To get a plot similar to the one from plotmatrix() in the ggplot2 package, but with variable names on the diagonal, you first need to reshape the data from wide format to long format.

This code (written by @Arun) builds all combinations of variable names with expand.grid() and then stacks the data for each combination into one long data frame.

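# All ordered pairs of column names, including each name paired with itself
combs <- expand.grid(names(dataframe), names(dataframe))

# For each pair: take the two columns, rename them V1/V2, tag them with the pair, then stack
out <- do.call(rbind, apply(combs, 1, function(x) {
  tt <- dataframe[, x]; names(tt) <- c("V1", "V2")
  tt <- cbind(tt, id1 = x[1], id2 = x[2])
}))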
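To see what the reshaping step produces, here is a minimal sketch on a tiny made-up data frame. The name toy and its values are assumptions for illustration only; the logic follows the code above, with x[[1]] and x[[2]] used instead of x[1] and x[2] to drop the element names.

```r
# Toy stand-in for the real data (hypothetical values).
toy <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Every ordered pair of column names, including (a, a) and (b, b).
combs <- expand.grid(names(toy), names(toy), stringsAsFactors = FALSE)

# One block of rows per pair: V1 holds the first column, V2 the second,
# and id1/id2 record which pair the rows belong to.
out <- do.call(rbind, apply(combs, 1, function(x) {
  tt <- toy[, c(x[[1]], x[[2]])]
  names(tt) <- c("V1", "V2")
  cbind(tt, id1 = x[[1]], id2 = x[[2]])
}))

dim(out)  # 4 pairs x 3 observations = 12 rows, 4 columns
```

Each (id1, id2) pair later becomes one panel of the facet grid, so a data frame with k columns yields k * k blocks of rows.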

Next, make a new data frame for the text labels. Each label's position is the midpoint of its variable's range (halfway between the minimum and the maximum), so the label sits in the middle of its panel.

library(plyr)
df.text=ddply(out[out$id1==out$id2,],.(id1,id2),summarise,
                       pos=max(V1)-(max(V1)-min(V1))/2)
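If you would rather not load plyr, the same midpoints can probably be computed in base R. This is only a sketch on made-up data (toy is a hypothetical name); it relies on the fact that mean(range(v)) equals max(v) - (max(v) - min(v)) / 2.

```r
# Hypothetical toy data; any all-numeric data frame would do.
toy <- data.frame(a = c(1, 2, 3), b = c(10, 20, 40))

# Midpoint of each column's range: (min + max) / 2.
mid <- vapply(toy, function(v) mean(range(v)), numeric(1))

# Same layout as df.text above: one row per variable, with id1 equal to id2.
df.text <- data.frame(id1 = names(mid), id2 = names(mid), pos = unname(mid))
df.text
```

Because the diagonal panels plot a variable against itself, the single pos value can serve as both the x and the y coordinate of the label.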

Now replace the values where both variables are the same (the data on the diagonal) with NA. This must be done after the text data frame is made.

out[out$id1==out$id2,c("V1","V2")]<-NA

Now plot the data, using both variable ids for faceting, and add the labels on the diagonal with geom_text().

ggplot(data = out, aes(x = V2, y = V1)) + geom_point() +
  facet_grid(id1 ~ id2,scales="free")+
  geom_text(data=df.text,aes(pos,pos,label=id1))
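One thing to watch: depending on your R version, id1 and id2 may end up as character columns rather than factors (the stringsAsFactors default changed in R 4.0.0), and facet_grid() then orders the panels alphabetically instead of in the original column order. A possible fix, sketched here on made-up data (toy is a hypothetical name), is to convert both ids to factors with explicit levels before plotting:

```r
# Hypothetical toy data whose column order is not alphabetical.
toy <- data.frame(zeta = c(1, 2), alpha = c(3, 4))

combs <- expand.grid(names(toy), names(toy), stringsAsFactors = FALSE)
out <- do.call(rbind, apply(combs, 1, function(x) {
  tt <- toy[, c(x[[1]], x[[2]])]
  names(tt) <- c("V1", "V2")
  cbind(tt, id1 = x[[1]], id2 = x[[2]])
}))

# Pin the panel order to the original column order.
out$id1 <- factor(out$id1, levels = names(toy))
out$id2 <- factor(out$id2, levels = names(toy))

levels(out$id1)  # "zeta" "alpha"
```

Applying the same conversion to the id1 and id2 columns of df.text should keep the labels aligned with the panels.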



Answer 2:

If your data is called mydf,

plotmatrix(mydf)

A warning says: "This function is deprecated. For a replacement, see the ggpairs function in the GGally package."

So use ggpairs() instead:

library(GGally)

ggpairs(mydf, upper=list(continuous = "points", combo = "box"))

Have a look at the help page (?ggpairs) to see which parameters you can adjust.



Answer 3:

I think your question is answered in this blog post. In particular,

plotmatrix(iris[1:4])

However, this function is now deprecated, so use the ggpairs function in GGally instead.



Tags: r ggplot2