Numpy output based on 3 sequential conditions?

I'm trying to build a vectorized/parallel stock backtesting program. I have implemented a sequential version with loops, but I'm stuck on vectorizing it. I'd like to use Pandas/NumPy for that. Here's a quick outline:

There are 2 given columns: the left one is the order quantity (to be added to the position), the right one is the stop flag (if stop is 1, the position is reset to 0).

M = [[0.1, 0], # left column is order quantity, right is stop
     [0.1, 0],
     [0.5, 0],
     [0.5, 0],
     [0.3, 0],
     [-0.3, 0], # negative order quantity means short or sell
     [-0.1, 1]] # right column (stop) is 1, so position is reset to 0

And there are 2 columns that I want to calculate from the initial matrix M: the left column is the position (it ranges from -1 to 1 and can't go beyond those limits), based on the order quantity, and the right column is the executed order quantity.

R = [[0.1, 0.1],
     [0.2, 0.1],
     [0.7, 0.5], # position (left column) is equal to cumsum of order quantity (from last stop trigger)
     [1, 0.3], # executed quantity is < order quantity as it's the remainder to position's max of 1
     [1, 0],
     [0.7, -0.3],
     [-0.1, -0.8]] # stop triggered, so position is reset to 0, and then -0.1 in order quantity is executed

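Position is basically the cumsum of order quantity, but clamped to the range -1 to 1, and restarted whenever a stop is triggered
Executed order quantity is the order quantity if the position limits are not exceeded, otherwise the remainder up to the limit (on a stop row it also includes closing the previous position, as in the last row of R)
Stops (when 1) reset the position to 0 before the order on that row is applied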

The problem is that each condition depends on the others. Does that mean this task can't be solved in parallel?
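One possible angle on the parallel question, offered as a sketch rather than a definitive answer: if each row is read as a function of the previous position, x -> clip(x + quantity, -1, 1), or a constant when a stop fires, then such functions compose into functions of the same shape. Composition is associative, so in principle the running positions could come from a parallel prefix scan instead of a strict loop. The helper names `step`, `compose` and `apply_step` are hypothetical, and the code assumes the position is simply clamped to [-1, 1]. It folds the compositions sequentially with `itertools.accumulate` only to show that the algebra reproduces R.

```python
from itertools import accumulate

LO, HI = -1.0, 1.0  # position limits from the question


def step(quantity, stop):
    """Encode one row as (shift, lo, hi), meaning x -> clip(x + shift, lo, hi)."""
    if stop:
        # A stop discards the previous position, so the result is a constant.
        c = min(HI, max(LO, quantity))
        return (0.0, c, c)
    return (quantity, LO, HI)


def compose(f, g):
    """Return the triple that means 'apply f first, then g'."""
    a1, l1, h1 = f
    a2, l2, h2 = g

    def clip_g(v):
        return min(h2, max(l2, v))

    return (a1 + a2, clip_g(l1 + a2), clip_g(h1 + a2))


def apply_step(f, x):
    """Evaluate the encoded function f at position x."""
    a, lo, hi = f
    return min(hi, max(lo, x + a))


M = [[0.1, 0], [0.1, 0], [0.5, 0], [0.5, 0], [0.3, 0], [-0.3, 0], [-0.1, 1]]

# Prefix compositions: element i is "rows 0..i applied in order".
prefixes = list(accumulate((step(q, s) for q, s in M), compose))

# Every prefix is applied to the same starting position of 0.
positions = [apply_step(f, 0.0) for f in prefixes]
executed = [p - prev for p, prev in zip(positions, [0.0] + positions[:-1])]

print(positions)
print(executed)
```

Up to floating-point rounding, the two printed lists match the columns of R. Whether a scan like this is faster in practice would depend on finding a library that runs the `compose` step in parallel, which this sketch does not attempt.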

I can imagine an approach that takes the cumsum of the quantities and uses the indices where stops trigger to restart it, and then derives the executed quantity from that. I would appreciate any tips on elegant ways to solve this, for example which NumPy functions to look into besides cumsum.
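To make that idea concrete, here is a minimal sketch of a cumsum that restarts at stop rows, using `np.cumsum`, `np.maximum.accumulate` and `np.where`. The variable names are only illustrative. It handles the stop resets but deliberately shows that clipping the result afterwards is not enough, because the -1 to 1 limit depends on the already-limited position of the previous row.

```python
import numpy as np

M = np.array([[0.1, 0], [0.1, 0], [0.5, 0], [0.5, 0],
              [0.3, 0], [-0.3, 0], [-0.1, 1]])
quantity = M[:, 0]
stop = M[:, 1] == 1

running = np.cumsum(quantity)
before_row = running - quantity  # cumulative sum just before each row

rows = np.arange(len(quantity))
# Index of the most recent stop row at or before each row (-1 if none yet).
last_stop = np.maximum.accumulate(np.where(stop, rows, -1))
# Amount to subtract so that the sum restarts from 0 on every stop row.
offset = np.where(last_stop >= 0, before_row[last_stop], 0.0)

unclamped = running - offset            # cumsum with stop resets, no limits
naive = np.clip(unclamped, -1.0, 1.0)   # limits applied after the fact

print(unclamped)
print(naive)
```

On the sample data the naive result is 1 in the sixth row, while the wanted position there is 0.7: the unclamped sum has drifted to about 1.2 by then, so selling 0.3 from it still looks like a position at the limit.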

Edit: A very simplified version of the sequential version:

orders = [
    {'quantity': 0.1, 'stop': 0},
    {'quantity': 0.1, 'stop': 0},
    {'quantity': 0.5, 'stop': 0},
    {'quantity': 0.5, 'stop': 0},
    {'quantity': 0.3, 'stop': 0},
    {'quantity': -0.3, 'stop': 0},
    {'quantity': -0.1, 'stop': 1},
]
position = 0

for order in orders:
    position_beginning = position
    # A stop resets the position to 0 before the order is applied
    if order['stop'] == 1:
        position = 0

    if -1 <= order['quantity'] + position <= 1:
        # The order fits within the limits, so it is executed in full
        position += order['quantity']
    elif position < 0 and order['quantity'] < 0:
        # A sell order would push the position below -1: cap it at -1
        position = -1
    elif position > 0 and order['quantity'] > 0:
        # A buy order would push the position above 1: cap it at 1
        position = 1

    # Executed quantity is the signed change in position over this row
    executed_quantity = position - position_beginning
    print(position, executed_quantity)
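For checking any vectorized attempt against known-good output, the loop could be wrapped in a function that returns both columns of R. This is only a sketch: `backtest_loop` is a hypothetical name, and it assumes the position is always clamped to [-1, 1], which is my reading of the rules. The loop above leaves the position unchanged in some edge cases, for example an order that would cross a limit from a position of exactly 0.

```python
def backtest_loop(rows):
    """Sequential reference: return (positions, executed), one entry per row.

    Each row is a (quantity, stop) pair, like the rows of M.
    """
    positions, executed = [], []
    position = 0.0
    for quantity, stop in rows:
        start = position
        if stop == 1:
            position = 0.0  # stop: reset before applying the order
        # Assumption: the new position is clamped to the range [-1, 1].
        position = max(-1.0, min(1.0, position + quantity))
        positions.append(position)
        executed.append(position - start)  # signed change over this row
    return positions, executed


M = [[0.1, 0], [0.1, 0], [0.5, 0], [0.5, 0], [0.3, 0], [-0.3, 0], [-0.1, 1]]
positions, executed = backtest_loop(M)
for p, e in zip(positions, executed):
    print(round(p, 10), round(e, 10))
```

The rounding in the print is only there to hide floating-point noise; the returned lists are left unrounded so they can be compared with a tolerance.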

In the actual app, the order quantities are much more complex, e.g. divided into sub-quantities. Because the backtester has to run over millions of orders with sub-quantities, this loop approach is really slow.

Tags: python pandas numpy parallel-processing scipy
