comparing two strings with 'is' — not perf

Posted 2019-07-30 18:59

Question:

I'm attempting to compare two strings with is. One string is returned by a function, and the other is a literal written directly in the comparison. is tests for object identity, but according to this page, it also works with two identical strings because of Python's memory optimization. However, the following doesn't work:

def uSplit(ustring):
    # return user minus host
    return ustring.split('!', 1)[0]

user = uSplit('theuser!host')
print type(user)
print user
if user is 'theuser':
    print 'ok'
else:
    print 'failed'

user = 'theuser'

if user is 'theuser':
    print 'ok'

The output:

<type 'str'>
theuser
failed
ok

I'm guessing the reason is that a string returned by a function is a different "type" of string than a string literal. Is there any way to get a function to return a string literal? I know I could use ==, but I'm just curious.

Answer 1:

The site you quote says this:

If two string literals are equal, they have been put to same memory location.

But

uSplit('theuser!host')

is not a string literal -- it's the result of an operation on the literal 'theuser!host'.

In any case, you usually shouldn't check strings for equality with is: this memory optimization is just an implementation detail that you shouldn't rely on.
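
As a rough illustration of the difference, here is the function from the question compared both ways. This is only a sketch: it is written for Python 3 (the question uses Python 2 print statements), and the names u_split, expected, values_equal and same_object are made up for the example.

```python
def u_split(ustring):
    """Return the part of 'user!host' before the first '!'."""
    return ustring.split('!', 1)[0]

user = u_split('theuser!host')
expected = 'theuser'

# Value comparison: True whenever the two strings hold the same characters.
values_equal = (user == expected)

# Identity comparison: True only if both names refer to one object.
# The result depends on the interpreter, so nothing should rely on it.
same_object = (user is expected)

print(values_equal)   # True
print(same_object)    # implementation-dependent
```

Only the == result is something a program can depend on.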


Also, you should use is for things like is None. Use it to check whether two objects, of classes that you designed, are the same instance. You can't easily use it for strings or numbers because the rules for creating objects of those built-in classes are complex: some strings are interned, and similarly some numbers (in CPython, small integers) are cached and reused.
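
A small sketch of the cases where is fits well, written for Python 3. The Account class and the find_account helper are hypothetical and exist only for this illustration.

```python
class Account:
    """Hypothetical class, used only to illustrate identity checks."""

    def __init__(self, name):
        self.name = name


def find_account(accounts, name):
    """Return the first account with the given name, or None."""
    for account in accounts:
        if account.name == name:
            return account
    return None


a = Account('theuser')
b = Account('theuser')
accounts = [a, b]

found = find_account(accounts, 'theuser')
missing = find_account(accounts, 'nobody')

print(found is a)       # True: the very same instance
print(found is b)       # False: same name, but a different instance
print(missing is None)  # True: None is a singleton
```

Here identity is exactly the question being asked, so is gives a dependable answer.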



Answer 2:

The page you quoted says "If two string *literals* are equal, they have been put to same memory location" (emphasis mine). Python may intern string literals, but a string returned from some arbitrary function is generally a separate object. The is operator can be thought of as a pointer comparison, so two different objects will not compare as identical, even if they contain the same characters (i.e. they are equal).
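
The pointer-comparison view can be sketched with the built-in id(), which returns a value that is unique to an object for as long as that object is alive. This is an illustration for Python 3; the helper same_object and the variable names are assumptions made for the example.

```python
def same_object(a, b):
    """Identity check spelled out with id(); agrees with 'a is b' for live objects."""
    return id(a) == id(b)


first = ''.join(['the', 'user'])   # string built at runtime
second = first                     # a second name for the same object
third = ''.join(['the', 'user'])   # built again from the same pieces

print(same_object(first, second))  # True: two names, one object
print(first == third)              # True: same characters

# Whether first and third share an object is up to the interpreter,
# but 'is' and the id() comparison always give the same answer.
print((first is third) == same_object(first, third))  # True
```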



Answer 3:

What you have run into is the fact that Python does not always intern all of its strings. More detail here:

http://mail.python.org/pipermail/tutor/2009-July/070157.html
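
For the curious, a function can be made to return an interned string by interning it explicitly. The sketch below assumes Python 3, where this is sys.intern (Python 2 has a built-in intern() instead); the name u_split_interned is hypothetical.

```python
import sys


def u_split_interned(ustring):
    """Return the part before the first '!', as an interned string."""
    return sys.intern(ustring.split('!', 1)[0])


user1 = u_split_interned('theuser!host')
user2 = u_split_interned('theuser!otherhost')

print(user1 == user2)  # True: same characters
print(user1 is user2)  # True: both names refer to the one interned copy
```

This is mostly a curiosity; == remains the right way to compare string values.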