Pickling issue with Python pathos

Posted 2019-02-15 23:30

Question:

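import fnmatch
import logging
import os

import pandas
from sqlalchemy import create_engine
import pathos.multiprocessing as mp

import constants  # project module that defines anly_dir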
class Model_Output_File():
    """
    Class to read Model Output files
    """
    def __init__(self, ftype = ''):
        """
        Constructor
        """
        # Create a sqlite database in the analysis directory
        self.db_name = 'sqlite:///' + constants.anly_dir + os.sep + ftype + '_' + '.db'
        self.engine  = create_engine(self.db_name)
        self.ftype   = ftype

    def parse_DGN(self, fl):
        df      = pandas.read_csv(...)
        df.to_sql(self.db_name, self.engine, if_exists='append')

    def collect_epic_output(self, fls):
        # Parse the files in parallel with a pool of 4 worker processes
        pool = mp.ProcessingPool(4)
        if self.ftype == 'DGN':
            pool.map(self.parse_DGN, fls)
        else:
            logging.info('Wrong file type')

if __name__ == '__main__':
    list_fls = fnmatch.filter(...)
    obj = Model_Output_File(ftype = 'DGN')
    obj.collect_epic_output(list_fls)

In the code above, I am using the pathos multiprocessing library to avoid Python multiprocessing issues with classes. However, I am getting a pickling error:

  pool.map(self.parse_DGN, fls)
  File "C:\Anaconda64\lib\site-packages\pathos-0.2a1.dev0-py2.7.egg\pathos\multiprocessing.py", line 131, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "C:\Anaconda64\lib\multiprocessing\pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Anaconda64\lib\multiprocessing\pool.py", line 567, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

How do I fix this?

Answer 1:

I'm the pathos author. You are getting a cPickle.PicklingError, which you should not get with pathos. Make sure you have multiprocess installed and, if you do, that you have a C++ compiler. You can check for pickling errors by importing dill and calling dill.copy(self.parse_DGN) inside your class, or externally on the bound method of an instance of the class. If that works, then you probably have an installation issue where pathos is picking up the Python standard library multiprocessing instead of multiprocess. If so, you probably need to install a compiler, such as Microsoft Visual Studio Community. See: github.com/mmckerns/tuthpc. Make sure to rebuild multiprocess after installing the MS compiler.
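The dill.copy check described above could be wrapped in a small helper. This is only a sketch: try_copy is a hypothetical name, not part of dill or pathos, and the fallback to the standard library pickle is an assumption added so the snippet still runs where dill is missing. It is written for Python 3, whereas the traceback in the question comes from Python 2.7.

```python
import importlib.util
import pickle


def try_copy(obj):
    """Round-trip obj through a serializer; return (ok, serializer_name, error)."""
    if importlib.util.find_spec("dill") is not None:
        import dill  # the serializer pathos relies on
        name, copy = "dill", dill.copy
    else:
        # Fallback so the sketch still runs without dill installed;
        # a result from plain pickle says nothing about pathos.
        name, copy = "pickle", lambda o: pickle.loads(pickle.dumps(o))
    try:
        copy(obj)
        return True, name, None
    except Exception as exc:  # serializers raise several exception types
        return False, name, exc


if __name__ == "__main__":
    # In the question's code the call would be: try_copy(obj.parse_DGN)
    print(try_copy([1, 2, 3]))
```

If the helper reports success with dill for the bound method while pool.map still raises cPickle.PicklingError, that would be consistent with the installation issue described in this answer.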



Answer 2:

I encountered the same problem. The mystery was that identical code worked on one Windows 7 machine but not on another. When I checked the versions, it turned out that dill and multiprocess were each one version higher on the machine that failed. I downgraded dill and multiprocess to 0.2.5 and 0.70.4 respectively, and that solved the problem. Hope that helps.
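To compare two machines as this answer describes, the installed versions can be printed on each one. The snippet below is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata); installed_versions is a hypothetical helper name, and the default package list is simply the three libraries mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages=("pathos", "multiprocess", "dill")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it on both the working and the failing machine should show whether the dill and multiprocess versions differ. Whether pinning the older versions is appropriate depends on the environment.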