What is the meaning of include.lowest in reclassif

Posted 2019-07-12 18:46

Question:

Given the definitions of an open interval (does not include the end points) and a closed interval (includes the end points), the right argument in reclassify is easy to understand. But I am confused about the include.lowest argument. The help page describes it as

indicating if a value equal to the lowest value in rcl (or highest value in the second column, for right = FALSE) should be included

The lowest value in rcl would be the first value, which according to right is not included by default, so setting include.lowest to TRUE will include it. But I don't understand what "highest value in the second column" refers to. And what does "for right = FALSE" mean? The highest value in the second column should already be included anyway.

So if I have rcl = c(0,1,5, 1,Inf,10), by default it means 0 < x <= 1 becomes 5, and x > 1 becomes 10. What happens if include.lowest is TRUE? 0 <= x <= 1 and ...?

I find it confusing because the example given on the reclassify help file says that

# all values >= 0 and <= 0.25 become 1, etc.
m <- c(0, 0.25, 1, 0.25, 0.5, 2, 0.5, 1, 3)

but the reclassify call in that example doesn't set include.lowest, so it should be all values > 0, not >= 0.

EDIT: I find the help page very confusing and, given the answer, the example's explanation in the help page is wrong.

Answer 1:

As I said in my comment, right and include.lowest work exactly the same as in the base R function cut. For a simple illustration, I will use cut below, with the vector 1:10 and break points 1, 5, 10.

By default, right = TRUE, so all intervals are left-open and right-closed, giving two intervals: (1, 5], (5, 10]. Note that together these form another left-open, right-closed interval (1, 10], where the lowest value 1 is not included. include.lowest = TRUE will instead consider [1, 10] and split it into [1, 5], (5, 10]. Compare

cut(1:10, right = TRUE, breaks = c(1, 5, 10))
# [1] <NA>   (1,5]  (1,5]  (1,5]  (1,5]  (5,10] (5,10] (5,10] (5,10] (5,10]
#Levels: (1,5] (5,10]

cut(1:10, right = TRUE, breaks = c(1, 5, 10), include.lowest = TRUE)
# [1] [1,5]  [1,5]  [1,5]  [1,5]  [1,5]  (5,10] (5,10] (5,10] (5,10] (5,10]
#Levels: [1,5] (5,10]

Now, if we set right = FALSE, all intervals will be left-closed and right-open: [1, 5), [5, 10). In this case, include.lowest = TRUE essentially includes the highest value, 10. Compare

cut(1:10, right = FALSE, breaks = c(1, 5, 10))
# [1] [1,5)  [1,5)  [1,5)  [1,5)  [5,10) [5,10) [5,10) [5,10) [5,10) <NA>  
#Levels: [1,5) [5,10)

cut(1:10, right = FALSE, breaks = c(1, 5, 10), include.lowest = TRUE)
# [1] [1,5)  [1,5)  [1,5)  [1,5)  [5,10] [5,10] [5,10] [5,10] [5,10] [5,10]
#Levels: [1,5) [5,10]
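
To see all four combinations side by side, here is a minimal sketch that bins only the break points themselves. show_bins is a hypothetical helper name introduced for this illustration; everything else is base R.

```r
# Only the break points themselves, since they are the only values
# whose bin depends on right / include.lowest.
x <- c(1, 5, 10)
b <- c(1, 5, 10)

# Hypothetical helper: returns the interval label each value falls into.
show_bins <- function(right, include.lowest) {
  as.character(cut(x, breaks = b, right = right,
                   include.lowest = include.lowest))
}

show_bins(right = TRUE,  include.lowest = FALSE)  # NA      "(1,5]"  "(5,10]"
show_bins(right = TRUE,  include.lowest = TRUE)   # "[1,5]" "[1,5]"  "(5,10]"
show_bins(right = FALSE, include.lowest = FALSE)  # "[1,5)" "[5,10)" NA
show_bins(right = FALSE, include.lowest = TRUE)   # "[1,5)" "[5,10]" "[5,10]"
```

So include.lowest only ever changes the fate of one value: the lowest break when right = TRUE, and the highest break when right = FALSE.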

Back to raster::reclassify.

I find it confusing because the example given on the reclassify help file says that

# all values >= 0 and <= 0.25 become 1, etc.
m <- c(0, 0.25, 1, 0.25, 0.5, 2, 0.5, 1, 3)

Why? With the above m, you have the rcl matrix:

matrix(m, ncol = 3L, byrow = TRUE, dimnames = list(NULL, c("from", "to", "value")))
#     from   to value
#[1,] 0.00 0.25     1
#[2,] 0.25 0.50     2
#[3,] 0.50 1.00     3

With right = TRUE and include.lowest = FALSE (default behaviour), you have

(0.00, 0.25]   --->   1
(0.25, 0.50]   --->   2
(0.50, 1.00]   --->   3

With right = TRUE and include.lowest = TRUE, you have

[0.00, 0.25]   --->   1
(0.25, 0.50]   --->   2
(0.50, 1.00]   --->   3
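
The right = FALSE case, which is where "highest value in the second column" comes in, can be sketched the same way. The code below does not call raster::reclassify. It only mimics the interval logic with cut, on the assumption (stated at the start of this answer) that the two arguments behave identically there. rcl_sketch is a hypothetical helper, and NA here simply means "not covered by any interval".

```r
# Hypothetical helper (not part of raster): applies a 3-column rcl matrix
# with base R's cut(). It assumes the rows are sorted and contiguous,
# i.e. each "to" equals the next row's "from".
rcl_sketch <- function(x, rcl, right = TRUE, include.lowest = FALSE) {
  breaks <- c(rcl[, 1], rcl[nrow(rcl), 2])
  idx <- cut(x, breaks = breaks, right = right,
             include.lowest = include.lowest, labels = FALSE)
  rcl[idx, 3]
}

m <- matrix(c(0, 0.25, 1,  0.25, 0.5, 2,  0.5, 1, 3), ncol = 3, byrow = TRUE)
x <- c(0, 0.25, 0.5, 1)

rcl_sketch(x, m)                                        # NA  1  2  3
rcl_sketch(x, m, include.lowest = TRUE)                 #  1  1  2  3
rcl_sketch(x, m, right = FALSE)                         #  1  2  3 NA
rcl_sketch(x, m, right = FALSE, include.lowest = TRUE)  #  1  2  3  3

# The rcl from the question: c(0,1,5, 1,Inf,10)
q <- matrix(c(0, 1, 5,  1, Inf, 10), ncol = 3, byrow = TRUE)
rcl_sketch(c(0, 1, 2), q)                               # NA  5 10
rcl_sketch(c(0, 1, 2), q, include.lowest = TRUE)        #  5  5 10
```

With right = FALSE the last interval is [0.50, 1.00), so the value 1 (the highest value in the "to" column) is left out unless include.lowest = TRUE closes it to [0.50, 1.00].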