I am researching/backtesting a trading system.
I have a pandas DataFrame containing OHLC data, to which I have added several calculated columns that identify price patterns. I will use these patterns as signals to initiate positions.
I would now like to add a further column that keeps track of the current net position. I have tried df.apply(), passing the whole dataframe to my function rather than the row object, because with the row alone I seem unable to look back at previous rows to check whether they produced any price patterns:
    from collections import namedtuple

    # buy_pattern, calc_long_open, calc_position_size and calc_long_isl
    # are my own helper functions (not shown here)
    open_campaigns = []
    Campaign = namedtuple('Campaign', 'open position stop')

    def calc_position(df):
        # sum of current positions + any new positions
        if entered_long(df):
            open_campaigns.append(
                Campaign(
                    calc_long_open(df.High.shift(1)),
                    calc_position_size(df),
                    calc_long_isl(df)
                )
            )
        return sum(campaign.position for campaign in open_campaigns)

    def entered_long(df):
        # element-wise comparison over whole columns
        return buy_pattern(df) & (df.High > df.High.shift(1))

    df["Position"] = df.apply(lambda row: calc_position(df), axis=1)
However, this returns the following error:
ValueError: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', u'occurred at index 1997-07-16 08:00:00')
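My reading of the error, which may be incomplete, is that comparing a whole column with its shifted self gives a boolean Series rather than a single True/False, and the `if` cannot decide what to do with it. The sketch below reproduces this with made-up numbers (the `high` series is hypothetical, not my real data); the exact wording of the message seems to vary between library versions.

```python
import pandas as pd

# Made-up prices, purely for illustration
high = pd.Series([10.0, 11.0, 12.0, 11.5])

# Element-wise comparison: one boolean per row, not a single bool
signal = high > high.shift(1)

try:
    if signal:  # asks for the truth value of a whole Series
        pass
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)
```

So it looks as though the condition in calc_position() is being evaluated on the entire column at once rather than on the current row.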
Rolling window functions would seem to be the natural fit, but as I understand it, they only act on a single time series or column, so wouldn't work either as I need to access the values of multiple columns at multiple timepoints.
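To show what I mean, here is a small sketch with made-up data (`df_demo` and `probe` are hypothetical names used only for this illustration). As far as I can tell, the function given to rolling().apply() is called separately for each column and only ever receives a one-dimensional window, so it cannot see High and Low together.

```python
import numpy as np
import pandas as pd

# Made-up OHLC-style data, purely for illustration
df_demo = pd.DataFrame({
    "High": [10.0, 11.0, 12.0, 11.5],
    "Low": [9.0, 10.0, 11.0, 10.5],
})

seen_ndims = set()

def probe(window):
    # Record the dimensionality of what the rolling window passes in
    seen_ndims.add(np.ndim(window))
    # Return the most recent value in the window
    return window[-1]

# raw=True passes each window as a NumPy array
result = df_demo.rolling(2).apply(probe, raw=True)
```

After running this, `seen_ndims` contains only 1, which matches my understanding that each call sees a single column's window.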
How should I in fact be doing this?