unused arguments error using apply() in R

2019-07-24 04:53发布

问题:

I get an error message when I attempt to use apply() conditional on a column of dates to return a set of coefficients.

I have a dataset (herein modified for simplicity, but reproducible):

ADataset <- data.table(Epoch = c("2007-11-15", "2007-11-16", "2007-11-17", 
                       "2007-11-18", "2007-11-19", "2007-11-20", "2007-11-21"),
                       Distance = c("92336.22", "92336.23", "92336.22", "92336.20",
                       "92336.19", "92336.21", "92336.18))
ADataset
        Epoch Distance
1: 2007-11-15 92336.22
2: 2007-11-16 92336.23
3: 2007-11-17 92336.22
4: 2007-11-18 92336.20
5: 2007-11-19 92336.19
6: 2007-11-20 92336.21
7: 2007-11-21 92336.18

The analysis begins with establishing start and end dates:

############## Establish dates for analysis
#4.Set date for center of duration
StartDate <- "2007-11-18"
as.numeric(as.Date(StartDate)); StartDate
EndDate <- as.Date(tail(Adataset$Epoch,1)); EndDate

Then I establish time durations for analysis:

#5.Quantify duration of time window
STDuration <-  1
LTDuration  <- 3

Then I write functions to regress over both durations and return the slopes:

# Write STS and LTS functions, each with following steps
#6.Define time window- from StartDate less ShortTermDuration to 
StartDate plus ShortTermDuration
#7.Define Short Term & Long Term datasets
#8. Run regression over dataset
my_STS_Function <- function (StartDate) {

  STAhead  <- as.Date(StartDate) + STDuration; STAhead
  STBehind <- as.Date(StartDate) - STDuration; STBehind
  STDataset  <- subset(Adataset, as.Date(Epoch) >= STBehind & as.Date(Epoch)<STAhead)
  STResults <- rlm( Distance ~ Epoch, data=STDataset); STResults
  STSummary <- summary( STResults ); STSummary
  # Return coefficient (Slope of regression)
  STNum <- STResults$coefficients[2];STNum
}
my_LTS_Function <- function (StartDate) {
  LTAhead  <- as.Date(StartDate) + LTDuration; LTAhead
  LTBehind <- as.Date(StartDate) - LTDuration; LTBehind
  LTDataset  <- subset(Adataset, as.Date(Epoch) >= LTBehind & as.Date(Epoch)<LTAhead)
  LTResults <- rlm( Distance ~ Epoch, data=LTDataset); LTResults
  LTSummary <- summary( LTResults ); LTSummary
  # Return coefficient (Slope of regression)
  LTNum <- LTResults$coefficients[2];LTNum

Then I test the function to make sure it works for a single date:

myTestResult <- my_STS_Function("2007-11-18")

It works, so I move on to apply the function over the range of dates in the dataset:

mySTSResult <- apply(Adataset, 1, my_STS_Function, seq(StartDate : EndDate))

...in which my desired result is a list or array or vector of mySTSResult (slopes) (and, subsequently, a separate list/array/vector of myLTSResults so then I can create a STSlope:LTSlope ratio over the duration), something like (mySTSResults fabricated)...

> Adataset
    Epoch Distance mySTSResults
1: 2007-11-15 92336.22            3
2: 2007-11-16 92336.23            4
3: 2007-11-17 92336.22            5
4: 2007-11-18 92336.20            6
5: 2007-11-19 92336.19            7
6: 2007-11-20 92336.21            8
7: 2007-11-21 92336.18            9

Only I get this error:

Error in FUN(newX[, i], ...) : unused argument(s) (1:1185)

What is this telling me and how to do correct it? I've done some looking and cannot find the correction.

Hopefully I've explained this sufficiently. Please let me know if you need further details.

回答1:

Ok, it seems the problem is in the additional arguments to my_STS_Function as stated in your apply function call (as you have defined it with only one parameter). The date range is being passed as an additional parameter to that function, and R is complaining that it is unused (a vector of 1185 elements it seems). Are you rather trying to pull a subset of the rows restricted by date range first, then wishing to apply the my_STS_Function? I'd have to think a bit on an exact solution to that.

Sorry - I did my working out in the comments there. A possible solution is this:

subSet <- Adataset[Adataset[,1] %in% seq(StartDate:EndDate),][order(na.exclude(match(Adataset[,1], seq(StartData,EndDate))),]

Adapted from the answer in this question:

R select rows in matrix from another vector (match, %in)



回答2:

Adding this as a new answer as the previous one was getting confused. A previous commenter was correct, there are bugs in your code, but they aren't a sticking point.

My updated approach was to use seq.Date to generate the date sequence (only works if you have a data point for each day between the start and end - though you could use na.exclude as above):

dates = seq.Date(as.Date(StartDate),as.Date(EndDate),"days")

You then use this as the input to apply, with some munging of types to get things working correctly (I've done this with a lamda function):

mySTSResult <- apply(as.matrix(dates), 1, function(x) {class(x) <- "Date"; my_STS_Function(x)})

Then hopefully you should have a vector of the results, and you should be able to do something similar for LTS, and then manipulate that into another column in your original data frame/matrix.



标签: r regression