I have been trying to use a custom function that I found on here to recalculate median household income from census tracts aggregated to neighborhoods. My data looks like this:
> inc_df[, 1:5]
              San Francisco Bayview Hunters Point Bernal Heights Castro/Upper Market Chinatown
2500-9999             22457                  1057            287                 329      1059
10000-14999           20708                   920            288                 463      1327
1500-19999            12701                   626            145                 148       867
20000-24999           12106                   491            285                 160       689
25000-29999           10129                   554            238                 328       167
30000-34999           10310                   338            257                 179       289
35000-39999            9028                   383            184                 163       326
40000-44999            9532                   472            334                 173       264
45000-49999            8406                   394            345                 241       193
50000-59999           17317                   727            367                 353       251
60000-74999           25947                  1037            674                 794       236
75000-99999           36378                  1185            980                 954       289
100000-124999         33890                   990            640                1208       199
125000-149999         24935                   522            666                 957       234
150000-199999         37190                   814           1310                1535       150
200000-250001         65763                   796           2122                3175       302
The function is as follows:
GroupedMedian <- function(frequencies, intervals, sep = NULL, trim = NULL) {
  # If "sep" is specified, the function will try to create the
  # required "intervals" matrix. "trim" removes any unwanted
  # characters before attempting to convert the ranges to numeric.
  if (!is.null(sep)) {
    if (is.null(trim)) pattern <- ""
    else if (trim == "cut") pattern <- "\\[|\\]|\\(|\\)"
    else pattern <- trim
    intervals <- sapply(strsplit(gsub(pattern, "", intervals), sep), as.numeric)
  }

  Midpoints <- rowMeans(intervals)
  cf <- cumsum(frequencies)
  Midrow <- findInterval(max(cf)/2, cf) + 1
  L <- intervals[1, Midrow]      # lower class boundary of median class
  h <- diff(intervals[, Midrow]) # size of median class
  f <- frequencies[Midrow]       # frequency of median class
  cf2 <- cf[Midrow - 1]          # cumulative frequency class before median class
  n_2 <- max(cf)/2               # total observations divided by 2

  unname(L + (n_2 - cf2)/f * h)
}
And the code to apply the function looks like this:
GroupedMedian(inc_df[, "Bernal Heights"], rownames(inc_df), sep="-", trim="cut")
This all works fine, but I can't figure out how to apply it to each column of the matrix without typing out each column name and running it again and again. I have tried this:
> minc_hood <- data.frame(apply(inc_df, 2, function(x) GroupedMedian(inc_df[, x],
rownames(inc_df), sep="-", trim="cut")))
But I get this error message:
Error in inc_df[, x] : subscript out of bounds
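
A likely explanation, offered as a sketch rather than a confirmed fix: `apply(inc_df, 2, ...)` hands each column's values to the anonymous function, so `x` is already the vector of frequencies. `inc_df[, x]` then ends up using counts such as 1057 as column indices, which would produce the out-of-bounds error. Passing the column straight through as the `frequencies` argument should avoid that. The example below uses a small made-up matrix (`toy`, with hypothetical neighborhoods `Hood A` and `Hood B`), not the real census data, and repeats the function so that it runs on its own.

```r
# Same function as above, repeated so this example is self-contained.
GroupedMedian <- function(frequencies, intervals, sep = NULL, trim = NULL) {
  if (!is.null(sep)) {
    if (is.null(trim)) pattern <- ""
    else if (trim == "cut") pattern <- "\\[|\\]|\\(|\\)"
    else pattern <- trim
    intervals <- sapply(strsplit(gsub(pattern, "", intervals), sep), as.numeric)
  }
  cf <- cumsum(frequencies)
  Midrow <- findInterval(max(cf)/2, cf) + 1
  L <- intervals[1, Midrow]
  h <- diff(intervals[, Midrow])
  f <- frequencies[Midrow]
  cf2 <- cf[Midrow - 1]
  n_2 <- max(cf)/2
  unname(L + (n_2 - cf2)/f * h)
}

# Toy data (made up for illustration): income brackets as row names,
# one column of household counts per neighborhood.
toy <- matrix(
  c(10, 20, 30, 40,
    40, 30, 20, 10),
  ncol = 2,
  dimnames = list(
    c("0-9999", "10000-19999", "20000-29999", "30000-39999"),
    c("Hood A", "Hood B")
  )
)

# apply() passes each column's values as the first argument (frequencies);
# the remaining arguments are forwarded to GroupedMedian unchanged.
medians <- apply(toy, 2, GroupedMedian,
                 intervals = rownames(toy), sep = "-", trim = "cut")

print(medians)
# Hood A is roughly 26666 and Hood B roughly 13333 for this toy input.
```

The result is a named numeric vector with one median per column, so it can be wrapped in `data.frame()` if a data frame is needed. If an anonymous function is preferred, `function(x) GroupedMedian(x, rownames(toy), sep = "-", trim = "cut")` should behave the same way, since `x` is the column itself rather than its name.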