performing previous tick aggregation using lapply

2019-07-07 17:04发布

I am trying to solve this issue for past 3 months. Please help.

I have tick data (Price and Volume) for many stocks belonging to a single exchange. Each stock has its own .rds file on the hard disk. I am interested in cleaning it up:

  1. merge multiple same time stamps by taking median

  2. subset data for exchange hours only

  3. aggregate it over 20 minutes by previous tick aggregation

I know that the

function aggregatets in highfrequency package

can perform the previous tick aggregation operation. However, the function takes one stock one day data only.

To demonstrate the problem I am using raw tick data (named trade) for a single stock.

    dput(head(trade,50))
structure(c(54.7, 54.7, 54.5, 54.5, 54.5, 54.6, 54.6, 54.65, 
54.65, 54.6, 54.65, 54.65, 54.65, 54.65, 54.7, 54.7, 54.8, 54.8, 
54.85, 54.85, 54.85, 54.85, 54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 
54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 54.65, 54.75, 54.65, 54.7, 
54.7, 54.7, 54.75, 54.75, 54.75, 54.75, 54.75, 54.7, 54.7, 54.7, 
54.65, 54.65, 8, 542, 110, 600, 88, 200, 150, 100, 700, 250, 
75, 100, 25, 200, 100, 600, 1546, 940, 100, 6250, 89, 6911, 89, 
211, 100, 50, 1410, 1090, 913, 4737, 50, 300, 2486, 400, 25, 
85, 250, 168, 50, 100, 40, 40, 60, 50, 40, 10, 91, 6072, 229, 
1000), class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"
), tclass = c("POSIXct", "POSIXt"), .indexTZ = "Asia/Calcutta", tzone = "Asia/Calcutta", index = structure(c(1459481853, 
1459481853, 1459482302, 1459482302, 1459482305, 1459482306, 1459482306, 
1459482307, 1459482307, 1459482308, 1459482312, 1459482314, 1459482314, 
1459482315, 1459482317, 1459482317, 1459482318, 1459482318, 1459482319, 
1459482319, 1459482320, 1459482320, 1459482322, 1459482322, 1459482330, 
1459482330, 1459482331, 1459482331, 1459482336, 1459482336, 1459482337, 
1459482337, 1459482338, 1459482338, 1459482339, 1459482340, 1459482344, 
1459482348, 1459482351, 1459482351, 1459482356, 1459482357, 1459482357, 
1459482361, 1459482362, 1459482364, 1459482367, 1459482367, 1459482369, 
1459482369), tzone = "Asia/Calcutta", tclass = c("POSIXct", "POSIXt"
)), .Dim = c(50L, 2L), .Dimnames = list(NULL, c("value", "size"
)))

I use the following code to do previous tick aggregation to 20 minute intervals:

require(xts)
require(highfrequency)
trade<-xts(trade[,-1], order.by = trade[,1])
trade2<-do.call(rbind, lapply(split(trade,"days"), mergeTradesSameTimestamp))
colnames(trade)[c(1,2)]<-c("PRICE", "SIZE")
trade2<-trade2["T09:30:00/T15:30:00"]
trade2<-trade2[,1]
fundo=function(x) aggregatets(FUN = previoustick,on="minutes",k=20, dropna =F)

As aggregatets() only takes data for 1 day I am splitting trade2 into days and apply it on them

trade3<-do.call(rbind, lapply(split(trade2, "days"), fundo))

But I get the error for function aggregatets:

    trade3<-do.call(rbind, lapply(split(trade2, "days"), fundo))
Error in FUN != "previoustick" : 
  comparison (2) is possible only for atomic and list types
Called from: aggregatets(FUN = previoustick, on = "minutes", k = 20, dropna = F)

Please suggest how to solve this error.

1条回答
仙女界的扛把子
2楼-- · 2019-07-07 17:59

This code works, based on the limited data you provided. Your error was from not passing though an object to argument ts. (Also in your sample data, none of the ticks happened before 9:30am, so for reproducibility of this answer I changed it to 8.30am. i.e. trade2<-trade2["T08:30:00/T15:30:00"]):

trade <- structure(c(54.7, 54.7, 54.5, 54.5, 54.5, 54.6, 54.6, 54.65, 
    54.65, 54.6, 54.65, 54.65, 54.65, 54.65, 54.7, 54.7, 54.8, 54.8, 
    54.85, 54.85, 54.85, 54.85, 54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 
    54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 54.65, 54.75, 54.65, 54.7, 
    54.7, 54.7, 54.75, 54.75, 54.75, 54.75, 54.75, 54.7, 54.7, 54.7, 
    54.65, 54.65, 8, 542, 110, 600, 88, 200, 150, 100, 700, 250, 
    75, 100, 25, 200, 100, 600, 1546, 940, 100, 6250, 89, 6911, 89, 
    211, 100, 50, 1410, 1090, 913, 4737, 50, 300, 2486, 400, 25, 
    85, 250, 168, 50, 100, 40, 40, 60, 50, 40, 10, 91, 6072, 229, 
    1000), class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"
    ), tclass = c("POSIXct", "POSIXt"), .indexTZ = "Asia/Calcutta", tzone = "Asia/Calcutta", index = structure(c(1459481853, 
    1459481853, 1459482302, 1459482302, 1459482305, 1459482306, 1459482306, 
    1459482307, 1459482307, 1459482308, 1459482312, 1459482314, 1459482314, 
    1459482315, 1459482317, 1459482317, 1459482318, 1459482318, 1459482319, 
    1459482319, 1459482320, 1459482320, 1459482322, 1459482322, 1459482330, 
    1459482330, 1459482331, 1459482331, 1459482336, 1459482336, 1459482337, 
    1459482337, 1459482338, 1459482338, 1459482339, 1459482340, 1459482344, 
    1459482348, 1459482351, 1459482351, 1459482356, 1459482357, 1459482357, 
    1459482361, 1459482362, 1459482364, 1459482367, 1459482367, 1459482369, 
    1459482369), tzone = "Asia/Calcutta", tclass = c("POSIXct", "POSIXt"
    )), .Dim = c(50L, 2L), .Dimnames = list(NULL, c("value", "size"
    )))

# mergeTradesSameTimestamp wants "PRICE" column, so rename now:
colnames(trade) <- c("PRICE", "SIZE")

trade2<-do.call(rbind, lapply(split(trade,"days"), mergeTradesSameTimestamp))
trade2<-trade2["T08:30:00/T15:30:00"]
# Your error was from not passing through x to argument ts:
fundo=function(x) aggregatets(ts = x, FUN = "previoustick",on="minutes",k=20, dropna =F)
trade3<-do.call(rbind, lapply(split(trade2, "days"), fundo))
查看更多
登录 后发表回答