What is an object reference in Python?

Posted 2019-06-23 17:44

An introductory Python textbook defined 'object reference' as follows, but I didn't understand it:

An object reference is nothing more than a concrete representation of the object’s identity (the memory address where the object is stored).

The textbook tried illustrating this by using an arrow to show an object reference as some sort of relation going from a variable a to an object 1234 in the assignment statement a = 1234.

From what I gathered from Wikipedia, the (object) reference in a = 1234 would be an association between a and 1234, where a is "pointing" to 1234 (feel free to clarify "reference vs. pointer"). This has been difficult to verify because (1) I'm teaching myself Python, (2) many search results talk about references in Java, and (3) few search results are about object references.

So, what is an object reference in Python? Thanks for the help!

2 answers
Melony?
#2 · 2019-06-23 18:38

Whatever is associated with a variable name has to be stored somewhere in the program's memory. An easy way to think of this is that every byte of memory has an index number. For simplicity's sake, let's imagine a simple computer where these index numbers go from 0 (the first byte) up to however many bytes there are.

Say we have a sequence of 37 bytes that a human might interpret as some words:

"The Owl and the Pussy-cat went to sea"

The computer stores them in a contiguous block, starting at some index position in memory. This index position is usually called an "address". The address is just a number: the byte number of the memory where these letters reside.

@12000 The Owl and the Pussy-cat went to sea

So at address 12000 is a T, at 12001 an h, at 12002 an e ... up to the last a at 12036 (37 bytes starting at 12000 end at 12000 + 36).

I am labouring the point here because it's fundamental to every programming language. That 12000 is the "address" of this string. It's also a "reference" to its location. For most intents and purposes, an address, a pointer and a reference are the same idea. Different languages have differing syntactic handling of these, but essentially they all deal with a block of data at a given number.

Python and Java try to hide this addressing as much as possible, where languages like C are quite happy to expose pointers for exactly what they are.

The take-away from this is that an object reference is, in effect, the number that says where the data is stored in memory. (As is a pointer.)
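As a rough illustration of this idea, the built-in id() returns an integer that identifies an object. In CPython that integer happens to be the object's memory address; the language itself only promises that it is unique while the object is alive. The names data and alias below are made up for this sketch.

```python
# Sketch only: in CPython, id(obj) is the object's memory address.
# Other Python implementations may return a different kind of number.
data = "The Owl and the Pussy-cat went to sea"
alias = data  # a second reference to the same object, not a copy

print(hex(id(data)))            # some address; the value differs on every run
print(id(data) == id(alias))    # True: both names hold the same reference
print(data is alias)            # True: "is" compares identity
```

The printed address itself is not meaningful to a Python program; what matters is that two references to the same object report the same identity.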

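Now, many programming languages distinguish between simple types (characters and numbers) and complex types (strings, lists and other compound types). Python does not: every value is an object, and every variable holds a reference to one. What matters in Python is whether the object is immutable (numbers, strings, tuples) or mutable (lists, dictionaries). This is where the reference to an object makes a difference.

An immutable object cannot be changed in place, so assigning a new value to a name simply makes that name refer to a different object. Other names that refer to the old object are unaffected. Imagine the following sequence in Python: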

>>> a = 3
>>> b = a
>>> b
3
>>> b = 4
>>> b
4
>>> a
3      # <-- original has not changed

After b = 4, the variables a and b refer to different objects: the assignment rebound b and left the object 3, which a still refers to, untouched. But with a mutable type:

>>> s = [ 1, 2, 3 ]
>>> t = s
>>> t
[1, 2, 3]
>>> t[1] = 8
>>> t
[1, 8, 3]
>>> s
[1, 8, 3]  # <-- original HAS changed

We assigned s to t, and in this case t is s: both names are references to the same list object, so they simply share (point to) the same address in memory. The statement t[1] = 8 did not rebind t; it modified that one list in place, which is why the change is also visible through s.
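A minimal sketch of the same situation, with one addition that is not in the example above: a third name u bound to an explicit copy made with list(). It suggests how to tell a shared reference apart from an independent object.

```python
s = [1, 2, 3]
t = s          # t refers to the same list object as s; nothing is copied
u = list(s)    # u refers to a new list with the same contents (shallow copy)

t[1] = 8       # modifies the one list that both s and t refer to

print(s)       # [1, 8, 3]
print(u)       # [1, 2, 3]
print(t is s)  # True
print(u is s)  # False
```

If an independent list is wanted, the copy has to be requested explicitly; plain assignment never makes one.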

Python strings are immutable, so in this respect they behave like numbers: assigning a new string to k rebinds k and leaves j alone.

>>> j = 'Pussycat'
>>> k = j
>>> k
'Pussycat'
>>> k = 'Owl'
>>> j
'Pussycat'  # <-- Original has not changed

In C, by contrast, a string is an array of characters that can be modified in place through any pointer to it, so it would behave like the Python list example.
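Returning to the Python string example, the sketch below suggests what is going on underneath: after k = j both names refer to one string object, and only the later assignment separates them. The variables shared_before and mutation_failed are introduced here purely for illustration.

```python
j = 'Pussycat'
k = j
shared_before = k is j      # True: k and j refer to the same string object

k = 'Owl'                   # rebinds k to another object; j is not involved
print(shared_before)        # True
print(j)                    # Pussycat
print(k is j)               # False

# Strings cannot be modified in place, so there is no way to change
# the object that j refers to "through" another name.
mutation_failed = False
try:
    j[0] = 'p'
except TypeError:
    mutation_failed = True
print(mutation_failed)      # True
```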

The upshot of all this is that when a mutable object is modified, every reference to that object "sees" the change. So if the object is passed to a function that modifies it (i.e. the content of the memory holding the data is changed), the change is reflected outside that function too.

An immutable object is also passed by reference rather than copied, but the function has no way to modify it. An operation such as number += 1 creates a new object and rebinds only the function's local name, so the change is not seen in the original.

For example:

def fnA( my_list ):
    my_list.append( 'A' )

a_list = [ 'B' ]
fnA( a_list )
print( str( a_list ) )
['B', 'A']        # <-- a_list was changed inside the function

But:

def fnB( number ):
    number += 1

x = 3
fnB( x )
print( x )
3                # <-- x was NOT changed inside the function

So, keeping in mind that an object's memory is shared by every name that refers to it, that mutable objects can be changed in place, and that immutable objects cannot, it's clear why the two kinds behave differently.
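To make the function examples concrete, here is a small sketch contrasting rebinding a parameter with mutating the object it refers to. The function names rebind, mutate and same_object are hypothetical helpers written for this illustration.

```python
def rebind(items):
    # Builds a new list and rebinds the local name only;
    # the caller's list is untouched.
    items = items + ['A']
    return items

def mutate(items):
    # Changes the caller's list in place through the shared reference.
    items.append('A')

def same_object(value, expected_id):
    # Reports whether the argument is the very object the caller has.
    return id(value) == expected_id

original = ['B']
rebind(original)
print(original)                 # ['B']
mutate(original)
print(original)                 # ['B', 'A']

x = 3
print(same_object(x, id(x)))    # True: the integer is not copied either
```

The last line suggests that even an integer argument arrives as a reference to the caller's object; it only appears to be copied because it cannot be changed in place.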

冷血范
#3 · 2019-06-23 18:49

Objects are things. Generally, they're what you see on the right hand side of an equation.

Variable names (often just called "names") are references to the actual object. When a name is on the right hand side of an equation [1], the object that it references is automatically looked up and used in the equation. The result of the expression on the right hand side is an object. The name on the left hand side of the equation becomes a reference to this (possibly new) object.

Note, you can have object references that aren't explicit names if you are working with container objects (like lists or dictionaries):

a = []  # the name a is a reference to a list.
a.append(12345)  # the container list holds a reference to an integer object
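As a sketch of the point about containers, the list below holds two references to one inner list. The names inner and outer are chosen only for this example.

```python
inner = [1, 2]
outer = [inner, inner]    # the container stores two references to one list

inner.append(3)           # modify the single inner list in place

print(outer)              # [[1, 2, 3], [1, 2, 3]]
print(outer[0] is inner)  # True
print(outer[0] is outer[1])  # True
```

Both slots of outer show the change because neither slot holds its own copy; each holds a reference to the same object.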

In a similar way, multiple names can refer to the same object:

a = []
b = a

We can demonstrate that they are the same object by looking at the id of a and b and noting that they are the same. Or, we can look at the "side-effects" of mutating the object referenced by a or b (if we mutate one, we mutate both because they reference the same object).

a.append(1)
print(a, b)  # look mom, both are [1]!
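The id check mentioned above can be sketched as follows. A third name c, bound to a separate empty list, is added here as an assumption of the example, to show that equal contents do not imply the same object.

```python
a = []
b = a
c = []                  # a different list that merely looks the same

print(id(a) == id(b))   # True: a and b reference one object
print(a is c)           # False: two distinct objects
print(a == c)           # True: equal contents

a.append(1)
print(a, b, c)          # [1] [1] []
```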

[1] More accurately, when a name is used in an expression.
