Pickling objects that refer to each other

Posted 2019-08-08 19:40

Question:

I have three Python classes, Student, Event, and StudentEvent.

For simplicity:

class Student:
    def __init__(self, id):
        self.id = id

class Event:
    def __init__(self, id):
        self.id = id
        self.studentevents = []

class StudentEvent:
    def __init__(self, student, event, id):
        self.student = student
        self.event = event
        self.id = id

I have between thousands and millions of instances of each of these classes, which I put into dictionaries that I can read and analyze. Reading and creating the objects takes a lot of time, so I'd like to pickle them into 3 dictionaries: students_dict, events_dict, and studentevents_dict.

So, fine, I can do that. But, if I un-pickle the dictionaries at a later date, the students and events in the studentevents_dict won't refer to the same Students and Events in the students_dict and events_dict, correct?

If I modify the objects at a later time, for example, populating the list of associated StudentEvents in the Event objects, that might be problematic because the event referenced by the StudentEvent won't be the Event with the same id in the events_dict.
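The concern can be reproduced in a few lines. This is a minimal sketch, assuming the standard-library pickle module and one dumps call per dictionary; the sample ids (1, 2, 3) are made up for illustration.

```python
import pickle

class Student:
    def __init__(self, id):
        self.id = id

class Event:
    def __init__(self, id):
        self.id = id
        self.studentevents = []

class StudentEvent:
    def __init__(self, student, event, id):
        self.student = student
        self.event = event
        self.id = id

# Illustrative data: one object of each kind, sharing references.
student = Student(1)
event = Event(2)
students_dict = {1: student}
events_dict = {2: event}
studentevents_dict = {3: StudentEvent(student, event, 3)}

# Each dumps call tracks already-seen objects independently, so the
# Student and Event are serialized once per dictionary that reaches them.
students_blob = pickle.dumps(students_dict)
studentevents_blob = pickle.dumps(studentevents_dict)

students_loaded = pickle.loads(students_blob)
studentevents_loaded = pickle.loads(studentevents_blob)

# Same id value, but two distinct Student objects after loading.
same_id = studentevents_loaded[3].student.id == students_loaded[1].id
same_object = studentevents_loaded[3].student is students_loaded[1]
print(same_id, same_object)  # True False
```

So a change made to students_loaded[1] would not be visible through studentevents_loaded[3].student.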

Answer 1:

Correct. If you need to preserve the reference relationships between the objects, you have to pickle them together in a single dump call, for example as a tuple. Here I'm using dill instead of pickle, but the effect should be the same. This works for class instances (as shown), dicts, or otherwise.

>>> class A:
...   def __init__(self, b):
...     self.b = b
... 
>>> class B:
...   pass
... 
>>> import dill
>>>           
>>> b = B()
>>> a = A(b)
>>> 
>>> f = open('_sed', 'wb')
>>> dill.dump(({1:a},{2:b}), f)
>>> f.close()

Then later…

Python 2.7.8 (default, Jul 13 2014, 02:29:54) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> f = open('_sed', 'rb')
>>> t = dill.load(f)
>>> f.close()
>>> t
({1: <__main__.A instance at 0x10906a440>}, {2: <__main__.B instance at 0x10906a830>})
>>> t[0][1].b
<__main__.B instance at 0x10906a830>
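Applied to the classes from the question, the same idea might look like the sketch below. It assumes the standard-library pickle module rather than dill, and the ids and the in-memory dumps/loads round trip are illustrative choices, not part of the original post.

```python
import pickle

class Student:
    def __init__(self, id):
        self.id = id

class Event:
    def __init__(self, id):
        self.id = id
        self.studentevents = []

class StudentEvent:
    def __init__(self, student, event, id):
        self.student = student
        self.event = event
        self.id = id

# Illustrative data, including a back-reference from Event to StudentEvent.
student = Student(1)
event = Event(2)
studentevent = StudentEvent(student, event, 3)
event.studentevents.append(studentevent)

students_dict = {1: student}
events_dict = {2: event}
studentevents_dict = {3: studentevent}

# One dumps call for all three dictionaries: every object is written once
# and later occurrences are stored as references to the first.
blob = pickle.dumps((students_dict, events_dict, studentevents_dict))
students2, events2, studentevents2 = pickle.loads(blob)

shared_student = studentevents2[3].student is students2[1]
shared_event = studentevents2[3].event is events2[2]
back_reference = events2[2].studentevents[0] is studentevents2[3]
print(shared_student, shared_event, back_reference)  # True True True
```

Because the three dictionaries travel in one pickle, modifying an Event through events2 is also visible through any StudentEvent that refers to it.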


Tags: python pickle