I am trying to iterate over columns AND rows in Pandas to cross-reference a list I have and count the co-occurrences.
My dataframe looks like:
+-------+-----+-----+----+----+-------+-------+------+
| Lemma | Dog | Cat | Sg | Pl | Good | Okay | Bad |
+-------+-----+-----+----+----+-------+-------+------+
| Dog | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Cat | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-------+-----+-----+----+----+-------+-------+------+
I have a list like:
c = [['dog', 'Sg', 'Good'], ['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Bad']]
I want to go through every item in Lemma, find it in c, and then for that list item look for any of the column names. If those column names are seen, I want to add +1. I also want to add a count if the Lemma items occur in a 3-word window of each other.
I've tried something like the following (ignoring the word window issue):
for idx, row in df.iterrows():
    for columns in df:
        for i in c:
            if i[0]==row:
                if columns in c[1]:
                    df.ix['columns','row'] +=1
But I get the error: "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
My ideal results look like:
+-------+-----+-----+----+----+-------+-------+------+
| Lemma | Dog | Cat | Sg | Pl | Good | Okay | Bad |
+-------+-----+-----+----+----+-------+-------+------+
| Dog | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
| Cat | 2 | 0 | 0 | 1 | 0 | 1 | 0 |
+-------+-----+-----+----+----+-------+-------+------+
Thanks!
You have several things that need to be changed.
1) Your list probably needs to have Dog instead of dog, and Cat instead of cat.
2) You probably want:
    for column in df.columns
instead of:
    for columns in df
3) You probably want:
    if i[0] == row['Lemma']
instead of:
    if i[0]==row:
(this is where it was breaking)
4) You probably want:
    if column in i
instead of:
    if columns in c[1]
5) You probably want:
    df.ix[idx, column] += 1
instead of:
    df.ix['columns','row'] +=1
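Putting the five changes together gives roughly the loop below. This is a sketch under assumptions, not a tested drop-in: the frame is rebuilt here the way the question describes it, with Lemma as an ordinary column, and df.at is swapped in for df.ix because .ix was deprecated in pandas 0.20 and removed in pandas 1.0. The 3-word-window part of the question is still ignored.

```python
import pandas as pd

# Assumed reconstruction of the question's data, with the capitalisation fixed (point 1)
c = [['Dog', 'Sg', 'Good'], ['Cat', 'Pl', 'Okay'], ['Dog', 'Pl', 'Bad']]
count_cols = ['Dog', 'Cat', 'Sg', 'Pl', 'Good', 'Okay', 'Bad']
df = pd.DataFrame(0, index=range(2), columns=count_cols)
df.insert(0, 'Lemma', ['Dog', 'Cat'])

for idx, row in df.iterrows():
    for column in df.columns:
        for i in c:
            # Compare against the scalar in the Lemma column, not the whole row;
            # comparing a string to the row Series is what raised the ValueError
            if i[0] == row['Lemma'] and column in i:
                df.at[idx, column] += 1

print(df)
```

Note that with this loop the Dog/Dog cell ends up as 2, because every list that matches the lemma also contains the lemma itself.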
... cat in the dog column and vice versa.
... DataFrame, I'd unpack the list of lists into a dict, then load the dict into a DataFrame, as shown below.
Code:
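A minimal sketch of that approach, under a few assumptions: the dict is named Lemma, its inner keys are typed out by hand, every word in a list (the lemma included) gets counted, and pd.DataFrame.from_dict with orient='index' is the way the dict is loaded.

```python
import pandas as pd

c = [['dog', 'Sg', 'Good'], ['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Bad']]

# One inner dict of counters per lemma; keys written out by hand (assumed layout)
Lemma = {
    'dog': {'dog': 0, 'cat': 0, 'Sg': 0, 'Pl': 0, 'Good': 0, 'Okay': 0, 'Bad': 0},
    'cat': {'dog': 0, 'cat': 0, 'Sg': 0, 'Pl': 0, 'Good': 0, 'Okay': 0, 'Bad': 0},
}

for x in c:
    # x[0] is the lemma (outer key); every word in x is an inner key to increment
    for key in x:
        Lemma[x[0]][key] += 1

# Outer keys become the row index, inner keys become the columns
df = pd.DataFrame.from_dict(Lemma, orient='index')
print(df)
```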
Note: Each value in a list from c is a key in Lemma. Reference Python dictionaries. E.g., with x = ['dog', 'Sg', 'Good'], Lemma[x[0]][x[2]] is the same as Lemma['dog']['Good']. The initial value of Lemma['dog']['Good'] = 0, therefore Lemma['dog']['Good'] = 0 + 1, then next time it would be 1 + 1, etc.
Output:
Plot
Create the dict programmatically:
create sets of words for the dict keys from the list of lists:
create dict of dicts:
final dict:
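One way these steps could look, assuming c is the list of lists from the question. The names outer_keys and inner_keys are illustrative, and the "final dict" is taken here to mean the zero-initialised dict of dicts, ready for the counting loop.

```python
c = [['dog', 'Sg', 'Good'], ['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Bad']]

# Sets of words for the dict keys (hypothetical names)
outer_keys = {x[0] for x in c}                  # the lemmas
inner_keys = {word for x in c for word in x}    # every word seen, lemmas included

# Dict of dicts, with each lemma getting its own set of counters starting at 0
Lemma = {lemma: {word: 0 for word in inner_keys} for lemma in outer_keys}

print(Lemma)
```

Because the inner dicts are built inside the comprehension, each lemma gets an independent dict, so incrementing Lemma['dog']['Good'] does not touch Lemma['cat']['Good'].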