working with ggplot2

My question is about ggplot2 in R (3.2.3). I have 4 datasets, each containing 2 columns and about 80,000-100,000 rows. I used the code below to create a plot for one of the datasets:

dataset1 <- read.table("file1.txt", header=T)
ggplot(data=dataset1, aes(dataset1$length))+ geom_histogram (binwidth =500)+
scale_x_continuous(breaks=seq(300,1000,by=200),seq(1001,15000,by=1000))

Here, length is the second column of my dataset. I would like the x-axis to show lengths between 300 and 1200 at intervals of 200 (300,500,700,900,1200) and lengths between 1201 and 1500 at intervals of 1000. I used the scale_x_continuous call above for this, but it didn't produce what I want. Could you please help me with the correct code for this plot?

Here is a short sample of data:

case length
C1099757    300
C1099759    300
C1099761    300
C1099763    300
C1100993    301
C1100995    301
C1100997    301
C1100999    301
C1101377    302
C1101379    302
C1101919    303
C1101921    303
C1102979    304
C1102981    304
C1102983    304
C1103475    305
C1103477    305
C1104267    306
C1104269    306

In the original data file, the values continue up to 12000 in the same pattern as the sample above. In your opinion, is ggplot2 suitable for this plot? If not, please suggest a better tool.

I'm also looking for a way to show the length distribution of all 4 datasets in one graph so that I can compare them easily. I would greatly appreciate it if you could tell me how to do that.

Many thanks in advance.

Tags: r plot ggplot2

1 answer

淡お忘

#2 · 2019-08-07 11:00

I think you should replace

scale_x_continuous(breaks=seq(300,1000,by=200),seq(1001,15000,by=1000))

with either

scale_x_continuous(breaks=c(seq(300,1000,by=200),seq(1001,15000,by=1000)))

or

scale_x_continuous(breaks=seq(300,1000,by=200))+
scale_y_continuous(breaks=seq(1001,15000,by=1000))

(I wasn't entirely sure which one you meant.)
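To illustrate why the c() wrapper matters: breaks expects one numeric vector. In the original call the second seq() is passed as a separate unnamed argument, so it never becomes part of breaks (as far as I can tell it gets matched positionally to another parameter, probably the axis name). A minimal base R sketch of what the combined vector looks like; the variable names are only for illustration:

```r
# Breaks requested in the question: steps of 200 up to 1000, then steps of 1000.
fine_breaks   <- seq(300, 1000, by = 200)     # 300, 500, 700, 900
coarse_breaks <- seq(1001, 15000, by = 1000)  # 1001, 2001, ..., 14001

# c() joins both sequences into the single numeric vector that `breaks` expects.
all_breaks <- c(fine_breaks, coarse_breaks)

print(head(all_breaks, 6))
print(length(all_breaks))
```

Passing all_breaks as breaks=all_breaks to scale_x_continuous should then be equivalent to the corrected call above.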

Based on your sample data, I generated some artificial data that contains only the length column:

 library(ggplot2)
 # artificial data: 300 uniformly distributed lengths per dataset
 df1 = data.frame(length=runif(300,300,1200))
 df2 = data.frame(length=runif(300,300,12000))
 df3 = data.frame(length=runif(300,900,1200))
 df4 = data.frame(length=runif(300,300,12000))

# plotting a single dataset
 ggplot(data=df4, aes(length))+ geom_histogram (binwidth =500)+
     scale_x_continuous(breaks=c(seq(300,1000,by=200),seq(1001,15000,by=1000)))
# combine the data frames (one column per dataset)
df = data.frame(df1$length,df2$length,df3$length,df4$length)
library(reshape)
melted <- melt(df)
ggplot(data=melted, aes(value))+aes(fill=variable)+ geom_histogram (binwidth =500)+
     scale_x_continuous(breaks=c(seq(300,1000,by=200),seq(1001,15000,by=1000)))

 ggplot(data=melted, aes(value))+aes(fill=variable)+ geom_histogram (binwidth =500,position="dodge")+
    scale_x_continuous(breaks=c(seq(300,1000,by=200),seq(1001,15000,by=1000)))

To get slightly nicer x-axis labelling, I shifted the breaks a bit and rotated the labels by 45 degrees:

ggplot(data=melted, aes(value))+aes(fill=variable)+ geom_histogram (binwidth =500,position="dodge")+
     scale_x_continuous(breaks=c(seq(300,1000,by=200),seq(1100,15000,by=1000)))+theme(axis.text.x = element_text(angle = 45, hjust = 1))

And here is the chart with the bins adjusted to match the axis breaks. I actually like the different bar widths.

ggplot(data=melted, aes(value))+
        aes(fill=variable)+
        geom_histogram(breaks=c(seq(300,1000,by=200),seq(1100,15000,by=1000)),position="dodge")+
        scale_x_continuous(breaks=c(seq(300,1000,by=200),seq(1100,15000,by=1000)))+
        theme(axis.text.x = element_text(angle = 45, hjust = 1))
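One caveat with the data.frame() + melt() approach: it needs all datasets to have the same number of rows, which may not hold for the real files (80,000-100,000 rows each). A possible alternative, sketched here under assumptions, is to stack the datasets into long format with a label column. The names datasets, combined and dataset1..dataset4 are hypothetical, the data is artificial, and geom_freqpoly is swapped in for the histogram because overlapping lines can be easier to compare than dodged bars:

```r
set.seed(1)  # reproducible artificial data

# Hypothetical stand-ins for the four real files; note the unequal row counts.
datasets <- list(
  dataset1 = data.frame(length = runif(300, 300, 1200)),
  dataset2 = data.frame(length = runif(250, 300, 12000)),
  dataset3 = data.frame(length = runif(400, 900, 1200)),
  dataset4 = data.frame(length = runif(350, 300, 12000))
)

# Stack into long format: one row per observation plus a label column.
combined <- do.call(rbind, lapply(names(datasets), function(nm) {
  data.frame(dataset = nm, length = datasets[[nm]]$length)
}))

print(table(combined$dataset))

# Draw only if ggplot2 is installed, so the data preparation runs on its own.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(combined, aes(x = length, colour = dataset)) +
    geom_freqpoly(binwidth = 500) +
    scale_x_continuous(breaks = c(seq(300, 1000, by = 200),
                                  seq(1100, 15000, by = 1000))) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
  print(p)
}
```

With the real data, each list element would be the result of read.table() on the corresponding file. Replacing geom_freqpoly with geom_histogram(position="dodge") and colour with fill should give the same kind of chart as above.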
