String comparison technique used by Python

2018-12-31 05:29发布

I'm wondering how Python does string comparison, more specifically how it determines the outcome when a less than (<) or greater than (>) operator is used.

For instance if I put print('abc' < 'bac') I get True. I understand that it compares corresponding characters in the string, however its unclear as to why there is more, for lack of a better term, "weight" placed on the fact that a is less than b (first position) in first string rather than the fact that a is less than b in the second string (second position).

8条回答
姐姐魅力值爆表
2楼-- · 2018-12-31 05:52

Here is a sample code which compare two strings lexicographically.

  a = str(input())
  b = str(input())
  if 1<=len(a)<=100 and 1<=len(b)<=100:
    a = a.lower()
    b = b.lower()
    if a > b:
       print('1')
    elif a < b:
       print( '-1')
    elif a == b:
       print('0') 

for different inputs the outputs are-

1- abcdefg
   abcdeff
   1

2- abc
   Abc
   0

3- abs
   AbZ
  -1
查看更多
骚的不知所云
3楼-- · 2018-12-31 05:56

A pure Python equivalent for string comparisons would be:

def less(string1, string2):
    # Compare character by character
    for idx in range(min(len(string1), len(string2))):
        # Get the "value" of the character
        ordinal1, ordinal2 = ord(string1[idx]), ord(string2[idx])
        # If the "value" is identical check the next characters
        if ordinal1 == ordinal2:
            continue
        # If it's smaller we're finished and can return True
        elif ordinal1 < ordinal2:
            return True
        # If it's bigger we're finished and return False
        else:
            return False
    # We're out of characters and all were equal, so the result is False
    return False

This function does the equivalent of the real method (Python 3.6 and Python 2.7) just a lot slower. Also note that the implementation isn't exactly "pythonic" and only works for < comparisons. It's just to illustrate how it works. I haven't checked if it works like Pythons comparison for combined unicode characters.

A more general variant would be:

from operator import lt, gt

def compare(string1, string2, less=True):
    op = lt if less else gt
    for char1, char2 in zip(string1, string2):
        ordinal1, ordinal2 = ord(char1), ord(char1)
        if ordinal1 == ordinal2:
            continue
        elif op(ordinal1, ordinal2):
            return True
        else:
            return False
    return False
查看更多
登录 后发表回答