How to iterate over rows in a DataFrame in Pandas?

Posted 2018-12-31 06:12

I have a DataFrame from pandas:

import pandas as pd
inp = [{'c1':10, 'c2':100}, {'c1':11,'c2':110}, {'c1':12,'c2':120}]
df = pd.DataFrame(inp)
print(df)

Output:

   c1   c2
0  10  100
1  11  110
2  12  120

Now I want to iterate over the rows of this frame. For every row I want to be able to access its elements (values in cells) by the name of the columns. For example:

for row in df.rows:
   print(row['c1'], row['c2'])

Is it possible to do that in pandas?

I found this similar question. But it does not give me the answer I need. For example, it is suggested there to use:

for date, row in df.T.iteritems():

or

for row in df.iterrows():

But I do not understand what the row object is and how I can work with it.

14 answers
高级女魔头
#2 · 2018-12-31 06:33

To loop over all rows of a DataFrame and work with each row's values conveniently, the namedtuples returned by itertuples can be converted to ndarrays. For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]}, index=['a', 'b'])

Iterating over the rows:

for row in df.itertuples(index=False, name='Pandas'):
    print(np.asarray(row))

results in:

[ 1.   0.1]
[ 2.   0.2]

Please note that if index=True, the index is added as the first element of the tuple, which may be undesirable for some applications.
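
Because itertuples yields namedtuples, the fields can also be read by column name as attributes, which is close to what the question asks for. A minimal sketch, assuming the same col1/col2 frame as above; the `collected` list is a hypothetical helper used only to show what the loop sees:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]}, index=['a', 'b'])

collected = []  # illustrative only: gathers the values seen in the loop
for row in df.itertuples(index=True, name='Pandas'):
    # row.Index holds the index label; each column is an attribute named after it
    collected.append((row.Index, row.col1, row.col2))
    print(row.Index, row.col1, row.col2)
```

Attribute access only works for column names that are valid Python identifiers; for other names, positional access on the tuple is probably the safer choice.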

时光乱了年华
#3 · 2018-12-31 06:33

Adding to the answers above, sometimes a useful pattern is:

# Borrowing @KutalmisB df example
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]}, index=['a', 'b'])
# The to_dict call results in a list of dicts
# where each row_dict is a dictionary with k:v pairs of columns:value for that row
for row_dict in df.to_dict(orient='records'):
    print(row_dict)

Which results in:

{'col1': 1, 'col2': 0.1}
{'col1': 2, 'col2': 0.2}
长期被迫恋爱
#4 · 2018-12-31 06:35

DataFrame.iterrows is a generator that yields both the index and the row (as a Series):

for index, row in df.iterrows():
    print(row['c1'], row['c2'])

Output: 
   10 100
   11 110
   12 120
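
To address the "what is the row object" part of the question: each `row` from iterrows is a pandas Series whose labels are the column names. A small sketch, assuming the frame from the question; `row_types` and `row_dicts` are hypothetical names used only for inspection:

```python
import pandas as pd

df = pd.DataFrame([{'c1': 10, 'c2': 100}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 120}])

row_types = []  # illustrative only: records the type of each yielded row
row_dicts = []  # illustrative only: each row converted to a plain dict
for index, row in df.iterrows():
    row_types.append(type(row).__name__)
    row_dicts.append(row.to_dict())
    # a Series supports both row['c1'] and attribute-style row.c2
    print(index, row['c1'], row.c2)
```

Since every row is packed into a single Series, iterrows does not preserve dtypes across a row when the columns have mixed types; itertuples is usually the better fit if that matters.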
笑指拈花
#5 · 2018-12-31 06:35

IMHO, the simplest solution:

for ind in df.index:
    print(df['c1'][ind], df['c2'][ind])
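
If only single cells are needed, the label-based scalar accessor `DataFrame.at` can do the same lookup in one step. A sketch, assuming the frame from the question; `pairs` is a hypothetical list used only to show the values read:

```python
import pandas as pd

df = pd.DataFrame([{'c1': 10, 'c2': 100}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 120}])

pairs = []  # illustrative only: the (c1, c2) values read for each index label
for ind in df.index:
    # df.at[row_label, column_label] returns a single scalar
    c1 = df.at[ind, 'c1']
    c2 = df.at[ind, 'c2']
    pairs.append((c1, c2))
    print(c1, c2)
```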
栀子花@的思念
#6 · 2018-12-31 06:36

To loop over all rows in a DataFrame by position, you can use:

# date_example is assumed to be an existing DataFrame with a 'Date' column
for x in range(len(date_example.index)):
    print(date_example['Date'].iloc[x])
萌妹纸的霸气范
#7 · 2018-12-31 06:37

Why complicate things?

Simple.

import pandas as pd
import numpy as np

# Here is an example dataframe
df_existing = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))

for idx,row in df_existing.iterrows():
    print(row['A'], row['B'], row['C'], row['D'])