Here's some example data for the problem I'm working on:
index Quarter Sales_Growth
0 2001q1 0
1 2002q2 0
2 2002q3 1
3 2002q4 0
4 2003q1 0
5 2004q2 0
6 2004q3 1
7 2004q4 1
The Sales_Growth
column tells me if there was indeed sales growth in the quarter or not. 0 = no growth, 1 = growth.
First, I'm trying to return the first Quarter
when there were two consecutive quarters of no sales growth.
With the data above this answer would be 2001q1
.
Then, I want to return the 2nd quarter of consecutive sales growth that occurs AFTER the initial two quarters of no growth.
The answer to this question would be 2004q4
.
I've searched and searched but the closest answer I can find I can't get to work: https://stackoverflow.com/a/26539166/3225420
Thanks in advance for helping a Pandas newbie, I'm hacking away as best I can but stuck on this one.
You're doing subsequence matching. This is a bit strange, but bear with me:
growth = df.Sales_Growth.astype(str).str.cat()
That gives you:
'00100011'
Then:
growth.index('0011')
Gives you 4 (obviously you'd add a constant 3 to get the index of the last row matched by the pattern).
I feel this approach starts off a bit ugly, but the end result is really usable--you can search for any fixed pattern with no additional coding.
For Q1:
temp = df.Sales_Growth + df.Sales_Growth.shift(-1)
df[temp == 0].head(1)
For Q2:
df[(df.Sales_Growth == 1) & (df.Sales_Growth.shift(1) == 1) & (df.Sales_Growth.shift(2) == 0) & (df.Sales_Growth.shift(3) == 0)].head(1)
Building on the earlier answers. Q1:
temp = df.Sales_Growth.rolling_apply(window=2, min_periods=2, \
kwargs={pattern: [0,0]}, func=lambda x, pattern: x == pattern)
print(df[temp==1].head())
In the rolling_apply call, window
and min_periods
must match the length of the pattern list being passed to the rolling_apply function.
Q2: Same approach, different pattern:
temp = df.Sales_Growth.rolling_apply(window=4, min_periods=4, \
kwargs={pattern: [0,0,1,1]}, func=lambda x, pattern: x == pattern)
print(df[temp==1].head())