Possible Duplicate:
Python ‘self’ explained
I'm learning Python and I have a question, more theoretical than practical, about accessing class variables from a method of that class.
For example, we have:
    class ExampleClass:
        x = 123

        def example_method(self):
            print(self.x)
Why is it necessary to write exactly self.x, not just x? x belongs to the namespace of the class, and the method using it belongs to it too. What am I missing? What rationale stands behind this style?
In C++ you can write:
    class ExampleClass {
    public:
        int x;
        void example_method()
        {
            x = 123;
            cout << x;
        }
    };
And it will work!
From The History of Python: Adding Support for User-defined Classes:
Instead, I decided to give up on the idea of implicit references to
instance variables. Languages like C++ let you write this->foo to
explicitly reference the instance variable foo (in case there’s a
separate local variable foo). Thus, I decided to make such explicit
references the only way to reference instance variables. In addition,
I decided that rather than making the current object ("this") a
special keyword, I would simply make "this" (or its equivalent) the
first named argument to a method. Instance variables would just always
be referenced as attributes of that argument.
With explicit references, there is no need to have a special syntax
for method definitions nor do you have to worry about complicated
semantics concerning variable lookup. Instead, one simply defines a
function whose first argument corresponds to the instance, which by
convention is named "self." For example:
    def spam(self,y):
        print self.x, y
This approach resembles something I had seen in Modula-3, which had
already provided me with the syntax for import and exception handling.
Modula-3 doesn’t have classes, but it lets you create record types
containing fully typed function pointer members that are initialized
by default to functions defined nearby, and adds syntactic sugar so
that if x is such a record variable, and m is a function pointer
member of that record, initialized to function f, then calling
x.m(args) is equivalent to calling f(x, args). This matches the
typical implementation of objects and methods, and makes it possible
to equate instance variables with attributes of the first argument.
So, as stated by the BDFL himself, the main reasons he chose explicit self over implicit self are that:
- it is explicit
- it is easier to implement: attribute lookup has to happen at runtime (not at compile time, as in some other languages), and an implicit self would have made the lookup rules more complex (and thus more costly).
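To make the equivalence described in the quote concrete, here is a small sketch (Python 3 syntax; the class Spam and its names are made up for illustration). Calling a method on an instance amounts to calling the underlying function with the instance passed explicitly as the first argument:

```python
class Spam:
    def __init__(self, x):
        self.x = x  # instance variable, always reached through `self`

    def describe(self, y):
        # `self` is an ordinary parameter; nothing is looked up implicitly
        return "{} {}".format(self.x, y)


s = Spam(1)

# The usual call: Python passes `s` as the first argument for us.
via_instance = s.describe(2)

# The same call spelled out: the function is fetched from the class
# and the instance is passed by hand.
via_class = Spam.describe(s, 2)

print(via_instance)
print(via_class)
```

Both calls go through the same function object, which is why no special syntax for method definitions is needed.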
Edit: There is also an answer in the Python FAQ.
It seems to be related to module vs. class scope handling, in Python:
    COLOR = 'blue'

    class TellColor(object):
        COLOR = 'red'

        def tell(self):
            print self.COLOR   # references class variable
            print COLOR        # references module variable

    a = TellColor()
    a.tell()

    > red
    > blue
Here's what I wrote in an old answer about this feature:
The problem you encountered is due to this:
A block is a piece of Python program text that is executed as a unit.
The following are blocks: a module, a function body, and a class
definition.
(...)
A scope defines the visibility of a name within a
block.
(...)
The scope of names defined in a class block is limited to
the class block; it does not extend to the code blocks of methods –
this includes generator expressions since they are implemented using a
function scope. This means that the following will fail:
    class A:
        a = 42
        b = list(a + i for i in range(10))
http://docs.python.org/reference/executionmodel.html#naming-and-binding
The above means:
a function body is a code block and a method is a function, so names defined in the class definition but outside the function body are not visible inside the function body.
It seemed strange to me when I first read this, but that's how Python is designed:
The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods
That is what the official documentation says.
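A short sketch of what this rule means in practice (Python 3 syntax; the class names A and B and the method names are only illustrative). A bare name inside a method skips the class block, so it raises NameError unless a global of that name happens to exist, while going through self or the class name works. The generator expression from the documentation fails for the same reason:

```python
class A:
    x = 123

    def bare(self):
        return x  # not found: the class block is not an enclosing scope

    def via_self(self):
        return self.x  # attribute lookup on the instance, then on the class

    def via_class(self):
        return A.x  # explicit lookup on the class object


obj = A()

try:
    obj.bare()
    bare_result = "no error"
except NameError:
    bare_result = "NameError"

# The body of a generator expression runs in its own function scope,
# which cannot see names bound in the class block.
try:
    class B:
        a = 42
        b = list(a + i for i in range(10))
    genexp_result = "no error"
except NameError:
    genexp_result = "NameError"

print(bare_result, obj.via_self(), obj.via_class(), genexp_result)
```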
EDIT
heltonbiker posted an interesting piece of code:
    COLOR = 'blue'

    class TellColor(object):
        COLOR = 'red'

        def tell(self):
            print self.COLOR   # references class variable
            print COLOR        # references module variable

    a = TellColor()
    a.tell()

    > red
    > blue
It made me wonder how the instruction print COLOR, written inside the method tell(), ends up printing the value of the global object COLOR defined outside the class.
I found the answer in this part of the official documentation:
Methods may reference global names in the same way as ordinary
functions. The global scope associated with a method is the module
containing its definition. (A class is never used as a global scope.)
While one rarely encounters a good reason for using global data in a
method, there are many legitimate uses of the global scope: for one
thing, functions and modules imported into the global scope can be
used by methods, as well as functions and classes defined in it.
Usually, the class containing the method is itself defined in this
global scope (...)
http://docs.python.org/2/tutorial/classes.html#method-objects
When the interpreter has to execute print self.COLOR, as COLOR isn't an instance attribute (that is to say, the identifier 'COLOR' doesn't belong to the namespace of the instance), the interpreter searches the namespace of the instance's class for the identifier 'COLOR' and finds it, so it prints the value of TellColor.COLOR.
When the interpreter has to execute print COLOR, as there is no attribute access in this instruction and no local variable of that name in tell(), it searches for the identifier 'COLOR' in the global namespace, which the official documentation says is the module's namespace.
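The two lookups described above can be sketched as follows (Python 3 syntax; the instance attribute assigned near the end is an addition for illustration, not part of heltonbiker's code). self.COLOR consults the instance namespace first and falls back to the class, whereas the bare COLOR never looks at either:

```python
COLOR = 'blue'  # module (global) variable


class TellColor:
    COLOR = 'red'  # class variable

    def tell(self):
        # Attribute lookup: instance namespace first, then the class.
        # Bare name: local scope, then the module; the class is skipped.
        return self.COLOR, COLOR


a = TellColor()
before = a.tell()

# Give this one instance its own attribute; it now shadows the class variable.
a.COLOR = 'green'
after = a.tell()

print(before)
print(after)
print(TellColor.COLOR)  # the class variable itself is unchanged
```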
What attribute names are attached to an object (and its class, and the ancestors of that class) is not decidable at compile time. So you either make attribute lookup explicit, or you:
- eradicate local variables (in methods) and always use instance variables. This does no good, as it essentially removes local variables with all their advantages (at least in methods).
- decide whether a bare x refers to an attribute or a local at runtime (with some extra rules to decide when x = ... adds a new attribute if there's no self.x). This makes code less readable, as you never know which one a name is supposed to be, and essentially turns every local variable in all methods into part of the public interface (as attaching an attribute of that name changes the behavior of a method).
Both have the added disadvantage that they require special casing for methods. Right now, a "method" is just a regular function that happens to be accessible through a class attribute. This is very useful for a wide variety of good use cases.
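As a sketch of that last point (Python 3 syntax; Greeter and greet are hypothetical names): a function defined at module level becomes a method simply by being assigned to a class attribute. This works without any special method syntax because the instance arrives as an ordinary first argument:

```python
def greet(self, name):
    # A plain module-level function; `self` is just its first parameter.
    return "{} says hello to {}".format(self.owner, name)


class Greeter:
    def __init__(self, owner):
        self.owner = owner


# Attach the function to the class after the fact.
Greeter.greet = greet

g = Greeter("Alice")
as_method = g.greet("Bob")     # called as a method
as_function = greet(g, "Bob")  # called as the plain function it still is

print(as_method)
print(as_function)
```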