I have a graph where the extrapolation does not match the initial interpolation. I would like the heatmap to fill the entire image.
First, the interpolation code:
library(akima)
library(reshape2)
xmin <- signif(min(CBLo2$MD1))
xmax <- signif(max(CBLo2$MD1))
ymin <- signif(min(CBLo2$MD2))
ymax <- signif(max(CBLo2$MD2))
gridint <- 100
fld <- with(CBLo2, interp(x = MD1, y = MD2, z = Abundance,
xo=seq(xmin, xmax, length=gridint), yo=seq(ymin, ymax, length=gridint) ))
df <- melt(fld$z, na.rm = TRUE)
names(df) <- c("MD1", "MD2", "Abundance")
df$MD1 <- fld$x[df$MD1]
df$MD2 <- fld$y[df$MD2]
contour(fld) # test plot
I won't post the entire ggplot code (used for the plot below), just that needed to produce the heatmap:
ggplot() +
geom_tile(inherit.aes=FALSE,data = df, aes(x = MD1, y = MD2,fill = Abundance)) +
scale_fill_continuous(name = "Rain (mm)", low = "yellow", high = "green")
However, when I try to extrapolate the data (following examples from other posts), I get the following plot, which does not match the first well at all:
fld <- with(CBLo2, interp(x = MD1, y = MD2, z = Abundance, extrap=TRUE, linear=FALSE,
xo=seq(xmin, xmax, length=gridint), yo=seq(ymin, ymax, length=gridint) ))
Here is the data:
Abundance MD1 MD2
9 -0.59042 0.76793119
42 -0.48544284 -0.09465043
13 0.51250586 -0.24599322
84 -0.30857525 -0.21529624
2 0.90449257 0.679926
16 0.24536209 0.24016424
52 -0.43144002 -0.75474149
4 1.23830339 -0.11985391
37 -1.10235817 0.33886773
79 0.01757236 -0.59635386
What am I doing wrong? How can I make the extrapolation more accurate?
TLDR solution:
Add
linear = FALSE
to all theinterp()
codes for consistency, and specify the same limits inscale_fill_continuous()
.Explanation:
There are two issues here.
Issue 1: the code used to generate the first
fld
does not include the parameterlinear = FALSE
, while the code used for the second does.Let's compare the interpolated values:
Now add
linear = FALSE
to the firstfld
:Issue 2: The range of interpolated values are very different.
We can see that even when values match at the same MD1/MD2 coordinates, the range of values in the expanded df2 far exceeds the range for df3. In order to ensure the same mapping between Abundance values and colours, we have to specify the fill limits based on the combined range of the two.
I'll use an ugly but visually distinct gradient to illustrate this:
If we overlay the plots, they overlap exactly (transparency tweaked to show the edges for df3):