My understanding is that os.urandom(size) returns a random string of bytes of the given size, but then:
import os
import sys
print(sys.getsizeof(os.urandom(42)))
>>>
75
Why is this not 42?
And a related question:
import base64
import binascii
print(sys.getsizeof(base64.b64encode(os.urandom(42))))
print(sys.getsizeof(binascii.hexlify(os.urandom(42))))
>>>
89
117
Why are these so different? Which encoding would be the most memory efficient way to store a string of bytes such as that given by os.urandom?
Edit: It seems like quite a stretch to say that this question is a duplicate of What is the difference between len() and sys.getsizeof() methods in python? My question is not about the difference between len() and getsizeof(). I was confused about the memory used by Python objects in general, which the answer to this question has clarified for me.
Python byte string objects are more than just the bytes that comprise them. They are fully fledged objects. As such they require extra space to accommodate the object's components, such as the type pointer (needed to identify what kind of object the bytestring even is) and the length (needed for efficiency, and because Python bytestrings can contain null bytes).
Even the simplest object, a bare object instance, requires space of its own before it holds any data at all.
The second part of your question is simply because the strings produced by b64encode() and hexlify() have different lengths; the latter is 28 characters longer, which, unsurprisingly, is exactly the difference in the values reported by sys.getsizeof().
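The arithmetic can be checked directly: the per-object overhead of a bytes object is a fixed constant (its exact value is a CPython and platform detail), and the encoded lengths follow from the encodings themselves — Base64 emits 4 characters for every 3 input bytes, hex emits 2 characters per byte. A small sketch:

```python
import base64
import binascii
import os
import sys

raw = os.urandom(42)

# Fixed per-object overhead of a bytes object: total size minus payload.
# The exact value is an implementation detail (33 on 64-bit CPython builds).
overhead = sys.getsizeof(b"")
print(sys.getsizeof(raw) - len(raw) == overhead)      # True

# Base64 produces 4 output characters per 3 input bytes, hex produces
# 2 characters per byte: 56 vs. 84 characters for 42 input bytes.
b64 = base64.b64encode(raw)
hexed = binascii.hexlify(raw)
print(len(b64), len(hexed))                           # 56 84
print(len(hexed) - len(b64))                          # 28

# getsizeof() reports payload plus the same fixed overhead for each,
# which reproduces the 89 and 117 seen in the question.
print(sys.getsizeof(b64) - len(b64) == overhead)      # True
print(sys.getsizeof(hexed) - len(hexed) == overhead)  # True
```

So all three sizes are just payload length plus one constant, and the 28-byte gap between 117 and 89 is purely the difference in encoded length.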
Unless you use some form of compression, there is no encoding that will be more efficient than the binary string you already have. This is particularly true in this case because the data is random, and random data is inherently incompressible.
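To see why compression does not help here, compare what a general-purpose compressor does to random bytes versus highly repetitive ones. A quick sketch using the standard-library zlib module (the 10,000-byte sample size is arbitrary):

```python
import os
import zlib

random_data = os.urandom(10_000)
repetitive_data = b"a" * 10_000

# Random bytes have maximal entropy: the "compressed" form ends up
# slightly *larger*, because of the zlib container overhead.
print(len(zlib.compress(random_data)))      # a little over 10000

# Repetitive bytes, by contrast, compress almost to nothing.
print(len(zlib.compress(repetitive_data)))  # a few dozen bytes
```

Compression only pays off when the input has redundancy to exploit; os.urandom output has none, so the raw bytes are already the most compact representation.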