KDB query performance improvement

2019-08-13 15:01发布

问题:

I have a simple table containing prices that I'm using for stock algo back testing.

price_hist:([pxkey:`$()]price:`float$())
update `g#pxkey from `price_hist

pxkey is a concatenated string in the format 'MSFT_5M_201710060945', so stock=MSFT, price bar intervals=5 mins and datetime=201710060945. I used the concatenated string instead of individual columns because it's simple and I'm a KDB novice and I wanted to get something running quickly.

I have about 5 million rows in there and the performance is only marginally faster than MySql using the exact same data. Any ideas on how to improve this (either thru table structure, attributes, query, anything..)? FYI I'm using C# with qSharp library and to query i'm using this format which returns a dictionary:-

price_hist`MSFT_5M_201710060945

回答1:

Creating millions of generated symbols is never a good idea in kdb+. I would recommend using a keyed table instead of a dictionary:

bar5m:([sym:`$();time:`timestamp$()]price:`float$())

Once you populate it, you should be able to query it as follows

bar5m[(`MSFT;2017.10.06D09:45);`price]

To improve the performance, make sure the table is sorted by sym,time and put the p attribute on sym.

KDB query performance improvement

问题:

回答1:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮