pympler asizeof vs sys.getsizeof

Posted 2019-08-07 05:41

I have a pickled file. Its size is 9.3MB.

-rw-r--r-- 1 ankit ankit 9.3M Jan  7 17:43 agg_397127.pkl

I load it in Python using cPickle and tried to determine its in-memory size using pympler's asizeof. But there is a considerable difference between the sizes reported by asizeof and sys.getsizeof:

>>> from pympler import asizeof
>>> import cPickle as pickle
>>> path = "agg_397127.pkl"
>>> temp = pickle.load(open(path, 'rb'))
>>> temp
{397127: RandomForestRegressor(bootstrap=True, criterion='band_predict',
           max_depth=None, max_features='auto', max_leaf_nodes=None,
           min_samples_leaf=1, min_samples_split=2,
           min_weight_fraction_leaf=0.0, n_estimators=1000, n_jobs=1,
           oob_score=False, random_state=0, verbose=0, warm_start=False)}
>>> asizeof.asizeof(temp)
1328504
>>> asizeof.flatsize(temp)
>>> import sys
>>> sys.getsizeof(temp)
280

Can someone explain why there is such a difference?

Tags: python pickle
1 answer
对你真心纯属浪费
Reply #2 · 2019-08-07 06:25

sys.getsizeof() returns the size of the object passed to it - which is a dictionary with one entry, in your example. It does NOT include the size of the complex class instance referred to by the dictionary, nor any of the objects referred to by that instance. ANY dictionary with only a few entries (up to 5, on my Python version) would return exactly the same number.
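
To illustrate the point about shallow sizes, here is a minimal sketch (written for Python 3; the names small and large are made up for the example). It compares two one-entry dictionaries whose values differ enormously in size. On CPython the two dictionaries should report the same shallow size, while the large value is only visible when measured on its own:

```python
import sys

# Two dictionaries with a single entry each; only the value differs.
small = {1: "x"}
large = {1: list(range(100000))}

# sys.getsizeof measures the dict object itself, not what it refers to.
shallow_small = sys.getsizeof(small)
shallow_large = sys.getsizeof(large)

# The list held by `large` has to be measured separately.
value_size = sys.getsizeof(large[1])

print("dict with small value:", shallow_small)
print("dict with large value:", shallow_large)
print("the large list itself:", value_size)
```

Even value_size is still shallow: it covers the list's pointer array but not the integer objects the list refers to.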

The asizeof module you're using attempts to recursively add up the sizes of all these referenced objects. It doesn't seem to have done a very good job in this case, considering the huge discrepancy between the size it returned (about 1.3 MB) and the pickle size (9.3 MB). Note, though, that these numbers would never be exactly equal, since the format of a pickle on disk is necessarily different from the format of the actual objects in memory.
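
The idea of recursively adding up sizes can be sketched roughly as follows (Python 3). This is not pympler's actual implementation, only a simplified illustration: deep_getsizeof and the Model class are hypothetical names, and the function only follows dicts, lists, tuples, sets and instance __dict__ attributes. Anything it cannot see into, such as memory held inside C extension objects, is counted only as whatever sys.getsizeof reports for it, which may be one way a recursive total can fall short of the real footprint.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Rough recursive size: sum sys.getsizeof over reachable objects."""
    if seen is None:
        seen = set()
    # Count each object once, even if it is referenced several times.
    if id(obj) in seen:
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    # Follow instance attributes stored in __dict__ (ignores __slots__).
    if hasattr(obj, "__dict__"):
        size += deep_getsizeof(vars(obj), seen)
    return size

class Model:
    """Hypothetical stand-in for a large fitted estimator."""
    def __init__(self):
        self.trees = [list(range(1000)) for _ in range(10)]

data = {397127: Model()}
shallow = sys.getsizeof(data)
deep = deep_getsizeof(data)
print("shallow:", shallow, "deep:", deep)
```

The seen set keeps shared objects from being counted twice and stops the recursion on reference cycles.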
