Python crashes using pandas and str.strip

Posted 2019-05-10 22:58

Question:

This minimal script crashes my Python interpreter. (Environment: pandas 0.13.0, Python 2.7.3 AMD64, Windows 7.)

import pandas as pd
input_file = r"c3.csv"
input_df = pd.read_csv(input_file)
for col in input_df.columns:  # strip whitespace from string values
    if input_df[col].dtype == object:
        input_df[col] = input_df[col].apply(lambda x: x.strip())
print 'start'
for idx in range(len(input_df)):
    input_df['LL'].iloc[idx] = 3
    print idx
print 'finished'

Output:

start
0

Process finished with exit code -1073741819

What I have observed:

  1. Removing lines from c3.csv prevents the crash.
  2. Removing .strip() from the code prevents the crash.
  3. Changing c3.csv changes, in unexpected ways, the number of loop iterations completed before the crash.

Contents of c3.csv:

 Size    , B/S , Symbol    , Type , BN , Duration , VR , Time    , SR ,LL,
0, xxxx , xxxx0 ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
00, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,

Answer 1:

You are doing a chained assignment, which can behave in unexpected ways; see http://pandas.pydata.org/pandas-docs/dev/indexing.html#indexing-view-versus-copy. This is fixed in master and will work in 0.13.1 (coming soon); see https://github.com/pydata/pandas/pull/6031.

This is not the correct way to assign, because it indexes twice (first the column, then the row):

input_df['LL'].iloc[idx] = 3

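Instead, select the row and the column in a single indexing operation (.ix was later deprecated and removed in pandas 1.0; .loc is the replacement for label-based indexing):

input_df.ix[idx,'LL'] = 3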

Or, even better, since you are assigning 3 to all rows:

input_df['LL'] = 3

If you are assigning to only some of the rows (and have, say, an integer or boolean indexer):

input_df.ix[indexer,'LL'] = 3
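
As a rough illustration of the partial-assignment case, here is a minimal sketch for a current pandas release on Python 3, where .ix is no longer available and .loc / .iloc take its place. The frame df, its values, and the names mask and positions are all hypothetical and are not taken from c3.csv.

```python
import pandas as pd

# Hypothetical data, not from c3.csv.
df = pd.DataFrame({"Size": [0, 5, 0, 7], "LL": [0, 0, 0, 0]})

# Boolean indexer: rows and column are selected in one .loc call,
# so there is no chained assignment.
mask = df["Size"] > 0
df.loc[mask, "LL"] = 3

# Integer (positional) indexer: .iloc needs the column position too.
positions = [0]
df.iloc[positions, df.columns.get_loc("LL")] = 9

print(df)
```

Both assignments go through a single indexing operation on df itself, which is the point of the advice above.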

You should also strip the whitespace with the vectorized string method, which leaves missing values (NaN) untouched instead of raising on them:

input_df[col] = input_df[col].str.strip()
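
Putting the suggestions together, the original script might be rewritten roughly as follows. This is only a sketch under several assumptions: it targets Python 3 and a recent pandas rather than the 0.13.0 / 2.7.3 setup from the question, it reads a shortened, hypothetical stand-in for c3.csv from a string, it also strips the column names (which the original code did not do), and it uses is_string_dtype instead of comparing against object so that newer string dtypes are covered as well.

```python
import io

import pandas as pd
from pandas.api.types import is_string_dtype

# Hypothetical three-row stand-in for c3.csv (same style, fewer columns).
csv_text = (
    " Size , B/S , Symbol ,LL,\n"
    "0, xxxx , xxxx0 , 00:00:00 ,\n"
    "0, xxxx , xxxxx , 00:00:00 ,\n"
    "0, xxx , xxxxx , 00:00:00 ,\n"
)
input_df = pd.read_csv(io.StringIO(csv_text))

# Assumption: the padded header names should be cleaned as well.
input_df.columns = input_df.columns.str.strip()

# Strip whitespace from every string-valued column.
for col in input_df.columns:
    if is_string_dtype(input_df[col]):
        input_df[col] = input_df[col].str.strip()

# Assign the whole column at once instead of looping with chained indexing.
input_df["LL"] = 3

print(input_df)
```

The trailing comma on each line produces an extra, all-NaN column; it is numeric, so the loop skips it.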