I'm sorry for the poor phrasing of the question, but it was the best I could do. I know exactly what I want, but not exactly how to ask for it.
Here is the logic demonstrated by an example:
Two conditions, A and B, each take on the values 1 or 0, and together they drive a signal that also takes on the values 1 or 0. Condition A always triggers the signal (if A = 1 then signal = 1). Condition B does NOT trigger the signal by itself, but the signal stays triggered for as long as condition B stays equal to 1 after the signal has been triggered by condition A. The signal goes back to 0 only once both A and B have gone back to 0.
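To pin the rule down, here is a minimal plain-Python sketch of it as a recurrence: the signal is 1 if A is 1, or if the previous signal was 1 and B is 1. The function name `latch` and the starting value of 0 before the first row are my own assumptions for illustration; the input strings are the ones used in the snippet in this question.

```python
# Example inputs, copied from the reproducible snippet in this question
condition_a = [int(c) for c in '00001100000110']
condition_b = [int(c) for c in '01110011111000']


def latch(a_values, b_values):
    """Return signal where signal[t] = 1 if A[t] == 1,
    or if signal[t-1] == 1 and B[t] == 1; otherwise 0."""
    signal = []
    previous = 0  # assumption: the signal is off before the first row
    for a, b in zip(a_values, b_values):
        current = 1 if a == 1 or (previous == 1 and b == 1) else 0
        signal.append(current)
        previous = current
    return signal


result = ''.join(str(s) for s in latch(condition_a, condition_b))
print(result)  # 00001111111110, the same values as signal_d
```

Each value depends on the previous output, not only on the inputs, which is the part I can't express with a single vectorized call.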
1. Input:
2. Desired output (signal_d) and confirmation that a for loop can solve it (signal_l):
3. My attempt using numpy.where():
4. Reproducible snippet:
# Settings
import datetime

import numpy as np
import pandas as pd

# Data frame with input and desired output in column signal_d
df = pd.DataFrame({'condition_A': list('00001100000110'),
                   'condition_B': list('01110011111000'),
                   'signal_d': list('00001111111110')})
colnames = list(df)
df[colnames] = df[colnames].apply(pd.to_numeric)
# pd.datetime has been removed from pandas, so the datetime module is used instead
datelist = pd.date_range(datetime.date.today().strftime('%Y-%m-%d'), periods=14).tolist()
df['dates'] = datelist
df = df.set_index(['dates'])

# Solution using a for loop with nested ifs in column signal_l
# (.ix has been removed from pandas, so positional .iloc is used instead)
df['signal_l'] = df['condition_A'].copy(deep=True)
col_a = df.columns.get_loc('condition_A')
col_b = df.columns.get_loc('condition_B')
col_l = df.columns.get_loc('signal_l')
for i in range(len(df)):
    if df.iloc[i, col_a] == 1:
        df.iloc[i, col_l] = 1
    else:
        # Signal previously triggered by condition_A
        # AND kept "alive" by condition_B
        # (i > 0 keeps the first row from looking back at the last row):
        if i > 0 and df.iloc[i - 1, col_l] == 1 and df.iloc[i, col_b] == 1:
            df.iloc[i, col_l] = 1
        else:
            df.iloc[i, col_l] = 0

# My attempt with np.where in column Signal_v1
df['Signal_v1'] = df['condition_A'].copy()
df['Signal_v1'] = np.where(df.condition_A == 1, 1,
                           np.where((df.shift(1).Signal_v1 == 1) & (df.condition_B == 1), 1, 0))
print(df)
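As far as I can tell, the np.where attempt falls short because shift(1) is evaluated once, on Signal_v1 as it exists before the assignment, and at that point it is only a copy of condition_A. The lag therefore never sees a signal that condition_B kept alive on an earlier row. A small sketch that isolates this, using the same input strings (the variable names `lagged` and `attempt` are mine, for illustration only):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'condition_A': [int(c) for c in '00001100000110'],
    'condition_B': [int(c) for c in '01110011111000'],
})

# Signal_v1 starts as a copy of condition_A, so its one-step lag is
# simply condition_A shifted by one row; it is not updated row by row.
lagged = df['condition_A'].shift(1)
df['Signal_v1'] = np.where(
    df['condition_A'] == 1, 1,
    np.where((lagged == 1) & (df['condition_B'] == 1), 1, 0),
)

attempt = ''.join(df['Signal_v1'].astype(str))
print(attempt)  # 00001110000110, but signal_d is 00001111111110
```

The signal is extended by exactly one row after each run of condition_A and then drops, instead of staying on while condition_B remains 1.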
This is pretty straightforward using a for loop with lagged values and nested if statements, but I can't figure out how to do it with vectorized functions like numpy.where(). And I know a vectorized version would be much faster for bigger data frames.
Thank you for any suggestions!