-->

Low, mid, high colors for deciles in ggplot

Posted 2019-04-10 02:25

Question:

I want to draw a polygon-based spatial plot with ggplot, in which each polygon's fill color is determined by its weight.

A sample of the data frame looks like this (the full data file is CompleteDataFile):

    polyNr  x           y                   centroidX   centroidY           weight
    1       4459425.25  5328202.595264193   4459675.25  5328202.595264193   -13.055709633886783
    1       4459550.25  5328419.101615138   4459675.25  5328202.595264193   -13.055709633886783
    1       4459800.25  5328419.101615138   4459675.25  5328202.595264193   -13.055709633886783
    1       4459925.25  5328202.595264193   4459675.25  5328202.595264193   -13.055709633886783
    1       4459800.25  5327986.088913247   4459675.25  5328202.595264193   -13.055709633886783
    1       4459550.25  5327986.088913247   4459675.25  5328202.595264193   -13.055709633886783
    2       4457550.25  5337512.3683548765  4457800.25  5337512.3683548765  -118.36760699572329
    2       4457675.25  5337728.874705822   4457800.25  5337512.3683548765  -118.36760699572329
    2       4457925.25  5337728.874705822   4457800.25  5337512.3683548765  -118.36760699572329
    2       4458050.25  5337512.3683548765  4457800.25  5337512.3683548765  -118.36760699572329
    2       4457925.25  5337295.862003931   4457800.25  5337512.3683548765  -118.36760699572329
    2       4457675.25  5337295.862003931   4457800.25  5337512.3683548765  -118.36760699572329
    3       4475175.25  5336862.849302039   4475425.25  5336862.849302039   -3.397375074455629
    3       4475300.25  5337079.355652984   4475425.25  5336862.849302039   -3.397375074455629
    3       4475550.25  5337079.355652984   4475425.25  5336862.849302039   -3.397375074455629
    3       4475675.25  5336862.849302039   4475425.25  5336862.849302039   -3.397375074455629
    3       4475550.25  5336646.342951093   4475425.25  5336862.849302039   -3.397375074455629
    3       4475300.25  5336646.342951093   4475425.25  5336862.849302039   -3.397375074455629
    4       4464675.25  5343358.039830423   4464925.25  5343358.039830423   -51.57522722796112
    4       4464800.25  5343574.546181369   4464925.25  5343358.039830423   -51.57522722796112
    4       4465050.25  5343574.546181369   4464925.25  5343358.039830423   -51.57522722796112
    4       4465175.25  5343358.039830423   4464925.25  5343358.039830423   -51.57522722796112
    4       4465050.25  5343141.533479477   4464925.25  5343358.039830423   -51.57522722796112
    4       4464800.25  5343141.533479477   4464925.25  5343358.039830423   -51.57522722796112
    3438    4459050.25  5338378.393758661   4459300.25  5338378.393758661   1.066256760712294
    3438    4459175.25  5338594.900109607   4459300.25  5338378.393758661   1.066256760712294
    3438    4459425.25  5338594.900109607   4459300.25  5338378.393758661   1.066256760712294
    3438    4459550.25  5338378.393758661   4459300.25  5338378.393758661   1.066256760712294
    3438    4459425.25  5338161.887407715   4459300.25  5338378.393758661   1.066256760712294
    3438    4459175.25  5338161.887407715   4459300.25  5338378.393758661   1.066256760712294

My steps are:

  1. Divide the whole data set into deciles as

    breaks = unique(quantile(df$weight, probs = seq(0, 1, by = 0.1)))
    df$deciles = cut(df$weight, breaks = breaks, include.lowest = TRUE)

  2. Build the color scale (I want positive numbers in red and negative numbers in green)

    library(RColorBrewer)
    colours = brewer.pal(name = "RdYlGn", n = nlevels(df$deciles))
    names(colours) = rev(levels(df$deciles))

  3. Plot

    library(ggplot2)
    ggplot(df, aes(x = x, y = y)) +
      geom_polygon(aes(group = polyNr, fill = factor(deciles))) +
      scale_fill_manual(values = colours)

This gives me a plot (figure omitted).

My other requirement is that zero should be white. For a continuous scale, I can do this using

    library(scales)  # provides muted()
    scale_fill_gradient2(low = muted("green"), mid = "white", high = muted("red"), midpoint = 0)

But I cannot use it with my discrete scale.

So first, is that possible? If yes, how can I get the low, mid, and high colors along with the decile scale? If this is a duplicate, please point me to the original question that I missed.

P.S. I use the same code for different data sets, so setting the color scale manually is not preferable.

EDIT:

For the color settings (red, white, green) I also tried colorRampPalette (thanks to @Pewi for pointing it out):

    colours = colorRampPalette(c("red", "white", "green"))(11)

This gives me the following plot (figure omitted).

Still, setting white for zero weight remains the major problem. I encountered the same issue with the base package as well.

Answer 1:

I interpret your question as "how do I create n colours ranging from x to y, passing through white". One answer to that question is to use the function colorRampPalette from the package grDevices:

    library(grDevices)
    colours <- colorRampPalette(c("red", "white", "green"))(n = 21)
    plot(1:21, col = colours, pch = 18, cex = 4)

If you replace your step 2 with something similar, you will probably be able to get the desired output.
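As a rough illustration of that substitution, the sketch below sizes the ramp to the number of decile levels instead of a fixed 21. The simulated `weight` vector is an assumption standing in for `df$weight`, and the colour order (green for low, red for high) follows the question rather than the snippet above. Note that white lands in the middle of the palette, not necessarily on the decile that contains zero, and with an even number of levels no colour is exactly white.

```r
# Simulated weights standing in for df$weight (an assumption, not the real data)
set.seed(1)
weight <- c(rnorm(200, mean = -20, sd = 30), 0)

# Step 1 from the question: cut the weights into deciles
breaks  <- unique(quantile(weight, probs = seq(0, 1, by = 0.1)))
deciles <- cut(weight, breaks = breaks, include.lowest = TRUE)

# Step 2 replacement: one colour per decile level, green (low) to red (high)
colours <- colorRampPalette(c("green", "white", "red"))(nlevels(deciles))
names(colours) <- levels(deciles)

print(colours)
```

The named vector can then be passed to `scale_fill_manual(values = colours)` as in step 3 of the question.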

Edit:

I am afraid this won't be very elegant but bear with me.

    # simulated data
    dat = data.frame(x = -3:9)

    # cut into deciles
    dat$y = cut(dat$x, breaks = quantile(dat$x, seq(0, 1, 0.1)), include.lowest = TRUE)

    # find which decile holds the value closest to zero
    dat$part = as.numeric(dat$y) <= as.numeric(dat$y[which.min(abs(dat$x))])

    # split the color range into two parts
    highcolours <- colorRampPalette(c("red", "white"))(n = sum(dat$part == TRUE))
    lowcolours <- colorRampPalette(c("white", "green"))(n = sum(dat$part == FALSE) + 1)

    # combine the colors
    cols = c(highcolours, lowcolours)

    # both ramps contain the midpoint (white); remove one copy
    cols = cols[!duplicated(cols)]

    # example
    plot(1:nrow(dat), col = cols, pch = 18, cex = 4)
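The example above assigns one colour per data point. For a decile fill scale, the same two-ramp idea can be applied per decile level instead. The following is only a sketch: `zero_centred_palette` is a hypothetical helper name, the colour order (green for low, red for high) follows the question, and when all values share one sign the bin nearest zero is made white by assumption. It also assumes the data yield at least two distinct breaks.

```r
# Hypothetical helper: one colour per decile level, white on the decile containing zero
zero_centred_palette <- function(values, probs = seq(0, 1, by = 0.1),
                                 low = "green", mid = "white", high = "red") {
  breaks <- unique(quantile(values, probs = probs))
  bins   <- cut(values, breaks = breaks, include.lowest = TRUE)
  n      <- nlevels(bins)

  # Index of the bin that contains zero, using the same intervals as above
  zero_bin <- as.integer(cut(0, breaks = breaks, include.lowest = TRUE))
  # Assumption: if zero lies outside the data range, the nearest end bin is white
  if (is.na(zero_bin)) zero_bin <- if (0 < breaks[1]) 1L else n

  below <- colorRampPalette(c(low, mid))(zero_bin)[-zero_bin]  # drop its white end
  above <- colorRampPalette(c(mid, high))(n - zero_bin + 1)    # starts on white
  cols  <- c(below, above)
  names(cols) <- levels(bins)
  list(bins = bins, colours = cols)
}

res <- zero_centred_palette(-3:9)
print(res$colours)
```

With a data frame like the one in the question, `df$deciles <- res$bins` followed by `scale_fill_manual(values = res$colours)` would be one way to use the result.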



Answer 2:

Here is what I finally did.

    df = read.table("sampleData.txt", header = TRUE)
    breaks = unique(quantile(df$weight, probs = seq(0, 1, by = 0.1)))
    df$deciles = cut(df$weight, breaks = breaks, include.lowest = TRUE)
    # TRUE for rows at or below the decile of the weight closest to zero;
    # which.min() picks a single row even though each polygon repeats its weight
    df$part = as.numeric(df$deciles) <= as.numeric(df$deciles[which.min(abs(df$weight))])

After this step, I cannot simply have as many colors as data points. So I split the ten colors between the two sides in the ratio given by df$part:

    # share of the 10 colours given to the deciles above the zero decile
    highCols = round(sum(df$part == FALSE) * 10 / (sum(df$part == FALSE) + sum(df$part == TRUE)))
    lowCols = 10 - highCols
    highColours = colorRampPalette(c("red", "darkred"))(n = highCols)
    lowColours = colorRampPalette(c("darkgreen", "green", "white"))(n = lowCols)
    cols = c(lowColours, highColours)
    cols = cols[!duplicated(cols)]
    ggplot(df, aes(x = x, y = y)) +
      geom_polygon(aes(group = polyNr, fill = factor(deciles))) +
      scale_fill_manual(values = cols)

This gives me the plot (figure omitted).

The plot is fine, but in the longer term I would prefer a more direct method if possible. Thanks.
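One possibly more direct route, offered only as a sketch, is to colour each decile by where its midpoint falls on a green-white-red ramp centred on zero, which roughly mimics what `scale_fill_gradient2(midpoint = 0)` does for continuous data. `midpoint_palette` is a hypothetical helper name, and pinning the zero-spanning bin to white is a choice made here, not something from the thread.

```r
# Hypothetical helper: colour each bin by its midpoint on a ramp centred at zero
midpoint_palette <- function(breaks, low = "green", mid = "white", high = "red") {
  lo   <- head(breaks, -1)
  hi   <- tail(breaks, -1)
  mids <- (lo + hi) / 2                # midpoint of every bin
  mids[lo < 0 & hi >= 0] <- 0          # the bin spanning zero is pinned to white
  limit <- max(abs(mids))
  if (limit == 0) limit <- 1           # guard against division by zero
  pos <- (mids / limit + 1) / 2        # map [-limit, limit] onto [0, 1]
  m <- colorRamp(c(low, mid, high))(pos)
  rgb(m[, 1], m[, 2], m[, 3], maxColorValue = 255)
}

# Small simulated example (an assumption, not the real data)
weight  <- -3:9
breaks  <- unique(quantile(weight, probs = seq(0, 1, by = 0.1)))
deciles <- cut(weight, breaks = breaks, include.lowest = TRUE)
cols    <- midpoint_palette(breaks)
names(cols) <- levels(deciles)
print(cols)
```

Unlike the ratio-based split, the colour intensity here reflects the magnitude of the weights, so a data set that is mostly negative gets strong greens and only pale reds. The named vector can be passed to `scale_fill_manual(values = cols)` without hand-tuning per data set.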