I am getting an exception when I try to slice my Pandas dataframe with a logical expression.
My data have the following form:
df
      GDP_norm  SP500_Index_deflated_norm
Year
1980  2.121190                   0.769400
1981  2.176224                   0.843933
1982  2.134638                   0.700833
1983  2.233525                   0.829402
1984  2.395658                   0.923654
1985  2.497204                   0.922986
1986  2.584896                    1.09770
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 38 entries, 1980 to 2017
Data columns (total 2 columns):
GDP_norm 38 non-null float64
SP500_Index_deflated_norm 38 non-null float64
dtypes: float64(2)
memory usage: 912.0 bytes
The command is the following:
df[((df['GDP_norm'] >=3.5 & df['GDP_norm'] <= 4.5) & (df['SP500_Index_deflated_norm'] > 3)) | (
(df['GDP_norm'] >= 4.0 & df['GDP_norm'] <= 5.0) & (df['SP500_Index_deflated_norm'] < 3.5))]
The error message is the following:
TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]
Your advice will be appreciated.
I suggest creating the boolean masks separately for better readability and easier error handling.
Your code is missing parentheses () around the individual comparisons in both conditions (call them m1 and m2). The problem is operator precedence: see the Python docs, 6.16. Operator precedence, where & has higher priority than >=.
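A minimal sketch of the two masks with every comparison wrapped in its own parentheses. The small dataframe is hypothetical sample data (the values are not from the question), chosen only so that some rows pass the filter; m1, m2 and result are illustrative names.

```python
import pandas as pd

# Hypothetical sample data, not the asker's real values.
df = pd.DataFrame(
    {
        "GDP_norm": [2.1, 3.6, 4.2, 4.8, 5.5],
        "SP500_Index_deflated_norm": [0.8, 3.2, 3.4, 2.9, 4.0],
    },
    index=pd.Index([1980, 1990, 2000, 2010, 2017], name="Year"),
)

# Each comparison gets its own parentheses so that & combines
# boolean Series instead of binding to the scalars 3.5 and 4.0.
m1 = (
    (df["GDP_norm"] >= 3.5)
    & (df["GDP_norm"] <= 4.5)
    & (df["SP500_Index_deflated_norm"] > 3)
)
m2 = (
    (df["GDP_norm"] >= 4.0)
    & (df["GDP_norm"] <= 5.0)
    & (df["SP500_Index_deflated_norm"] < 3.5)
)

# Rows that satisfy either condition.
result = df[m1 | m2]
print(result)
```

Keeping m1 and m2 as separate variables also makes it easy to inspect each mask on its own (for example with m1.sum()) when the filter returns something unexpected.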
You are suffering from the effects of chained comparisons. Because & binds more tightly than >= and <=, the expression
df['GDP_norm'] >=3.5 & df['GDP_norm'] <= 4.5
is evaluated as something like:
df['GDP_norm'] >= (3.5 & df['GDP_norm']) <= 4.5
Of course, this fails since float cannot be compared with bool, as described in your error message. Instead, use parentheses to isolate each Boolean mask and assign to variables.
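One way to isolate each Boolean mask, sketched under assumptions: the dataframe is hypothetical sample data, the variable names are illustrative, and Series.between is swapped in for the paired >= / <= comparisons (it is inclusive on both ends by default, so it should be equivalent to the parenthesized form).

```python
import pandas as pd

# Hypothetical sample data, not the asker's real values.
df = pd.DataFrame(
    {
        "GDP_norm": [2.1, 3.6, 4.2, 4.8, 5.5],
        "SP500_Index_deflated_norm": [0.8, 3.2, 3.4, 2.9, 4.0],
    },
    index=pd.Index([1980, 1990, 2000, 2010, 2017], name="Year"),
)

gdp = df["GDP_norm"]
sp = df["SP500_Index_deflated_norm"]

# One named mask per condition; between() includes both endpoints.
gdp_mid = gdp.between(3.5, 4.5)    # same as (gdp >= 3.5) & (gdp <= 4.5)
gdp_high = gdp.between(4.0, 5.0)   # same as (gdp >= 4.0) & (gdp <= 5.0)
sp_above = sp > 3
sp_below = sp < 3.5

# Combine the named masks; parentheses make the grouping explicit.
selected = df[(gdp_mid & sp_above) | (gdp_high & sp_below)]
print(selected)
```

With the masks named, no comparison operator ever sits next to & or | without parentheses, so the precedence problem cannot occur.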