I have been trying to solve this issue for the past 3 months. Please help.

I have tick data (Price and Volume) for many stocks belonging to a single exchange. Each stock has its own .rds file on the hard disk. I am interested in cleaning it up:

merge ticks that share the same timestamp by taking the median
subset the data to exchange hours only
aggregate it to 20-minute intervals using previous tick aggregation
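
For reference, these three steps can be sketched in base R on a handful of made-up ticks, without the highfrequency package. The toy values, the helper name prev_tick_grid, the choice to sum sizes when merging, and the 09:30 to 15:30 grid are all assumptions for illustration. This is not a reimplementation of aggregatets, and its interval and NA conventions may differ.

```r
tz <- "Asia/Calcutta"
# Toy ticks (made-up values): three of them share the timestamp 09:31:10.
ticks <- data.frame(
  time = as.POSIXct(c("2016-04-01 09:07:33", "2016-04-01 09:31:10",
                      "2016-04-01 09:31:10", "2016-04-01 09:31:10",
                      "2016-04-01 09:58:00", "2016-04-01 10:12:45",
                      "2016-04-01 15:45:00"), tz = tz),
  price = c(54.7, 54.5, 54.6, 54.9, 54.8, 54.65, 55.0),
  size  = c(8, 110, 600, 88, 200, 150, 100)
)

# Step 1: one row per timestamp; median price, summed size (an assumption).
secs <- as.numeric(ticks$time)
merged <- data.frame(
  time  = as.POSIXct(sort(unique(secs)), origin = "1970-01-01", tz = tz),
  price = as.vector(tapply(ticks$price, secs, median)),
  size  = as.vector(tapply(ticks$size, secs, sum))
)

# Step 2: keep exchange hours only (zero-padded time strings compare correctly).
hms <- format(merged$time, "%H:%M:%S")
merged <- merged[hms >= "09:30:00" & hms <= "15:30:00", ]

# Step 3: per day, take the last price at or before each 20-minute grid point.
prev_tick_grid <- function(time, price, by_sec = 20 * 60) {
  day <- format(time, "%Y-%m-%d")
  out <- lapply(split(seq_along(time), day), function(i) {
    grid <- seq(as.POSIXct(paste(day[i[1]], "09:30:00"), tz = tz),
                as.POSIXct(paste(day[i[1]], "15:30:00"), tz = tz),
                by = by_sec)
    # Number of ticks at or before each grid point = index of the previous tick.
    idx <- findInterval(as.numeric(grid), as.numeric(time[i]))
    idx[idx == 0L] <- NA  # no tick yet on that day
    data.frame(time = grid, price = price[i][idx])
  })
  res <- do.call(rbind, out)
  rownames(res) <- NULL
  res
}

bars <- prev_tick_grid(merged$time, merged$price)
head(bars, 4)
```

Splitting by day inside the helper keeps a tick from one day from being carried into the next day's first interval.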

I know that the function aggregatets in the highfrequency package can perform the previous tick aggregation operation. However, the function takes data for only one stock and one day at a time.

To demonstrate the problem I am using raw tick data (named trade) for a single stock.

    dput(head(trade,50))
structure(c(54.7, 54.7, 54.5, 54.5, 54.5, 54.6, 54.6, 54.65, 
54.65, 54.6, 54.65, 54.65, 54.65, 54.65, 54.7, 54.7, 54.8, 54.8, 
54.85, 54.85, 54.85, 54.85, 54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 
54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 54.65, 54.75, 54.65, 54.7, 
54.7, 54.7, 54.75, 54.75, 54.75, 54.75, 54.75, 54.7, 54.7, 54.7, 
54.65, 54.65, 8, 542, 110, 600, 88, 200, 150, 100, 700, 250, 
75, 100, 25, 200, 100, 600, 1546, 940, 100, 6250, 89, 6911, 89, 
211, 100, 50, 1410, 1090, 913, 4737, 50, 300, 2486, 400, 25, 
85, 250, 168, 50, 100, 40, 40, 60, 50, 40, 10, 91, 6072, 229, 
1000), class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"
), tclass = c("POSIXct", "POSIXt"), .indexTZ = "Asia/Calcutta", tzone = "Asia/Calcutta", index = structure(c(1459481853, 
1459481853, 1459482302, 1459482302, 1459482305, 1459482306, 1459482306, 
1459482307, 1459482307, 1459482308, 1459482312, 1459482314, 1459482314, 
1459482315, 1459482317, 1459482317, 1459482318, 1459482318, 1459482319, 
1459482319, 1459482320, 1459482320, 1459482322, 1459482322, 1459482330, 
1459482330, 1459482331, 1459482331, 1459482336, 1459482336, 1459482337, 
1459482337, 1459482338, 1459482338, 1459482339, 1459482340, 1459482344, 
1459482348, 1459482351, 1459482351, 1459482356, 1459482357, 1459482357, 
1459482361, 1459482362, 1459482364, 1459482367, 1459482367, 1459482369, 
1459482369), tzone = "Asia/Calcutta", tclass = c("POSIXct", "POSIXt"
)), .Dim = c(50L, 2L), .Dimnames = list(NULL, c("value", "size"
)))

I use the following code to do previous tick aggregation to 20 minute intervals:

require(xts)
require(highfrequency)
trade<-xts(trade[,-1], order.by = trade[,1])
trade2<-do.call(rbind, lapply(split(trade,"days"), mergeTradesSameTimestamp))
colnames(trade)[c(1,2)]<-c("PRICE", "SIZE")
trade2<-trade2["T09:30:00/T15:30:00"]
trade2<-trade2[,1]
fundo=function(x) aggregatets(FUN = previoustick,on="minutes",k=20, dropna =F)

Since aggregatets() only takes data for one day, I split trade2 into days and apply it to each day:

trade3<-do.call(rbind, lapply(split(trade2, "days"), fundo))

But I get the following error from aggregatets:

    trade3<-do.call(rbind, lapply(split(trade2, "days"), fundo))
Error in FUN != "previoustick" : 
  comparison (2) is possible only for atomic and list types
Called from: aggregatets(FUN = previoustick, on = "minutes", k = 20, dropna = F)

Please suggest how to solve this error.

Tags: r aggregate xts lapply

1 Answer

仙女界的扛把子

Post #2 · 2019-07-07 17:59

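This code works, based on the limited data you provided. Your function did not pass an object to the ts argument of aggregatets, and FUN was given as the bare name previoustick rather than the string "previoustick"; the error message itself points at the comparison FUN != "previoustick", which fails when FUN is neither an atomic value nor a list. (Also, in your sample data all of the ticks happen before 9:30am, so for reproducibility of this answer I changed the start of the window to 8:30am, i.e. trade2<-trade2["T08:30:00/T15:30:00"]):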

trade <- structure(c(54.7, 54.7, 54.5, 54.5, 54.5, 54.6, 54.6, 54.65, 
    54.65, 54.6, 54.65, 54.65, 54.65, 54.65, 54.7, 54.7, 54.8, 54.8, 
    54.85, 54.85, 54.85, 54.85, 54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 
    54.8, 54.8, 54.8, 54.8, 54.65, 54.65, 54.65, 54.75, 54.65, 54.7, 
    54.7, 54.7, 54.75, 54.75, 54.75, 54.75, 54.75, 54.7, 54.7, 54.7, 
    54.65, 54.65, 8, 542, 110, 600, 88, 200, 150, 100, 700, 250, 
    75, 100, 25, 200, 100, 600, 1546, 940, 100, 6250, 89, 6911, 89, 
    211, 100, 50, 1410, 1090, 913, 4737, 50, 300, 2486, 400, 25, 
    85, 250, 168, 50, 100, 40, 40, 60, 50, 40, 10, 91, 6072, 229, 
    1000), class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"
    ), tclass = c("POSIXct", "POSIXt"), .indexTZ = "Asia/Calcutta", tzone = "Asia/Calcutta", index = structure(c(1459481853, 
    1459481853, 1459482302, 1459482302, 1459482305, 1459482306, 1459482306, 
    1459482307, 1459482307, 1459482308, 1459482312, 1459482314, 1459482314, 
    1459482315, 1459482317, 1459482317, 1459482318, 1459482318, 1459482319, 
    1459482319, 1459482320, 1459482320, 1459482322, 1459482322, 1459482330, 
    1459482330, 1459482331, 1459482331, 1459482336, 1459482336, 1459482337, 
    1459482337, 1459482338, 1459482338, 1459482339, 1459482340, 1459482344, 
    1459482348, 1459482351, 1459482351, 1459482356, 1459482357, 1459482357, 
    1459482361, 1459482362, 1459482364, 1459482367, 1459482367, 1459482369, 
    1459482369), tzone = "Asia/Calcutta", tclass = c("POSIXct", "POSIXt"
    )), .Dim = c(50L, 2L), .Dimnames = list(NULL, c("value", "size"
    )))

# mergeTradesSameTimestamp wants "PRICE" column, so rename now:
colnames(trade) <- c("PRICE", "SIZE")

trade2<-do.call(rbind, lapply(split(trade,"days"), mergeTradesSameTimestamp))
trade2<-trade2["T08:30:00/T15:30:00"]
# Pass x through to argument ts, and give FUN as a quoted string:
fundo <- function(x) aggregatets(ts = x, FUN = "previoustick", on = "minutes", k = 20, dropna = FALSE)
trade3<-do.call(rbind, lapply(split(trade2, "days"), fundo))
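
To see why the quotes around "previoustick" matter, here is a minimal sketch of the comparison named in the error message. The function previoustick_fn is a hypothetical stand-in for whatever function object the bare name resolves to, and the exact wording of the message may vary between R versions.

```r
# Hypothetical stand-in for a function passed as FUN instead of a string.
previoustick_fn <- function(a) a[length(a)]

# Comparing a function with a string is an error in R; capture the message.
msg <- tryCatch(previoustick_fn != "previoustick",
                error = function(e) conditionMessage(e))
print(msg)

# Comparing a string with a string works and returns a logical value.
ok <- "previoustick" != "previoustick"
print(ok)
```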
