Printed length of a string in Python

Posted 2019-04-27 03:04

Question:

Is there any way to find (even as a best guess) the "printed" length of a string in Python? E.g. 'potaa\bto' has a len() of 8 but is only 6 characters wide when printed on a tty.

Expected usage:

s = 'potato\x1b[01;32mpotato\x1b[0;0mpotato'
len(s)   # 32
plen(s)  # 18

Answer 1:

At least for ANSI TTY escape sequences of the form ESC [ parameters letter, this works:

import re
strip_ANSI_pat = re.compile(r"""
    \x1b     # literal ESC
    \[       # literal [
    [;\d]*   # zero or more digits or semicolons
    [A-Za-z] # a letter
    """, re.VERBOSE).sub

def strip_ANSI(s):
    return strip_ANSI_pat("", s)

s = 'potato\x1b[01;32mpotato\x1b[0;0mpotato'

print(s, len(s))          # raw string: the escape codes count toward len()
s1 = strip_ANSI(s)
print(s1, len(s1))        # escape codes removed

Prints:

potato[01;32mpotato[0;0mpotato 32
potatopotatopotato 18

For backspaces (\b), vertical tabs, or \r versus \n, the result depends on how and where the string is printed, no?
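
One way to make a best guess for those control characters is to simulate a cursor on a single line. The sketch below only assumes how a typical terminal behaves; it is not a general solution. plen is the name used in the question, ANSI_CSI is the same pattern as above in compact form, \b moves the cursor one column left, \r returns it to column 0, and every other character is assumed to be one column wide. Tabs, newlines and double-width characters are not handled.

```python
import re

# Same CSI pattern as in the answer above: ESC [ params letter
ANSI_CSI = re.compile(r"\x1b\[[;\d]*[A-Za-z]")

def plen(s):
    """Best-guess printed width of s on one terminal line (a sketch)."""
    col = 0      # current cursor column
    width = 0    # rightmost column written so far
    for ch in ANSI_CSI.sub("", s):
        if ch == "\b":
            col = max(col - 1, 0)   # backspace moves left, never past column 0
        elif ch == "\r":
            col = 0                 # carriage return goes back to the start
        else:
            col += 1                # assume one column per remaining character
            width = max(width, col)
    return width

print(plen('potaa\bto'))
print(plen('potato\x1b[01;32mpotato\x1b[0;0mpotato'))
```

With the two strings from the question this gives 6 and 18.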



Answer 2:

The bash shell had exactly the same need, in order to know when the user's typed input wraps to the next line, in the presence of non-printable characters in the prompt string. Their solution was to not even try - instead, they require that anyone setting a prompt string put \[ and \] around non-printing portions of the prompt. The printed length is calculated to be the length of the string, with these special sequences and all text between them filtered out. (The special sequences are omitted on output, of course.)
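
The same convention can be imitated in Python. This is a sketch of the idea, not of what bash does internally: the marker characters \x01 and \x02 and the helper names visible_len and render are chosen here for illustration only.

```python
import re

START, END = "\x01", "\x02"   # hypothetical markers around non-printing text

# Non-greedy match of one marked region, markers included
MARKED = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)

def visible_len(s):
    """Length of s with every marked region filtered out."""
    return len(MARKED.sub("", s))

def render(s):
    """Text to actually print: markers dropped, marked contents kept."""
    return s.replace(START, "").replace(END, "")

green = START + "\x1b[01;32m" + END
reset = START + "\x1b[0;0m" + END
s = "potato" + green + "potato" + reset + "potato"

print(visible_len(s))
print(render(s))
```

The caller, not the library, takes responsibility for marking what does not print, so no knowledge of particular escape sequences is needed.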



Answer 3:

The printed length of a string depends on the type of the string.

Normal strings (str) in Python 2.x are byte strings, here encoded as UTF-8, so len() returns the number of bytes, not the number of characters. Convert the string to unicode and len() returns the number of printed characters. With both lengths known, formatting works:

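# -*- coding: utf-8 -*-
# Python 2: value is a UTF-8 encoded byte string
value = 'abcäöücdf'
width = 12                                  # desired column width (example value; originally self['size'])
len_value  = len(value)                     # number of bytes
len_uvalue = len(unicode(value, 'utf-8'))   # number of characters
size = width + len_value - len_uvalue       # widen the byte count by the multi-byte overhead
print value[:min(len(value), size)].ljust(size)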
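
In Python 3 the distinction is explicit: str holds Unicode code points and bytes holds encoded data, so the byte/character adjustment is not needed. A minimal sketch, assuming a column width of 12 (an example value) and characters that each occupy one terminal column:

```python
value = 'abcäöücdf'
width = 12   # example column width (assumption)

n_chars = len(value)                   # code points
n_bytes = len(value.encode('utf-8'))   # encoded bytes; non-ASCII characters take more than one

# ljust pads by code points, so the result is `width` characters long
padded = value[:width].ljust(width)
print(n_chars, n_bytes, repr(padded))
```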