Basically, if I have a line of text that starts with indentation, what's the best way to grab that indentation and put it into a variable in Python? For example, if the line is:
\t\tthis line has two tabs of indentation
Then it would return '\t\t'. Or, if the line was:
    this line has four spaces of indentation
Then it would return four spaces.
So I guess you could say that I just need to strip everything from the first non-whitespace character to the end of the string. Thoughts?
A sneaky way: abuse lstrip! This way you don't have to work through all the details of whitespace!
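A minimal sketch of the lstrip trick, assuming the idea is to measure how many characters lstrip() removes and then slice that many off the front of the original line (the name leading_whitespace is only illustrative):

```python
def leading_whitespace(line):
    """Return the run of whitespace at the start of `line`."""
    # lstrip() drops the leading whitespace, so the difference in length
    # is the number of characters that make up the indentation.
    indent_len = len(line) - len(line.lstrip())
    return line[:indent_len]


print(repr(leading_whitespace("\t\tthis line has two tabs")))      # '\t\t'
print(repr(leading_whitespace("    four spaces of indentation")))  # '    '
```

Because lstrip() decides what counts as whitespace, tabs, spaces, and mixed indentation are all handled without listing the characters yourself.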
(Thanks Adam for the correction)
And to strip leading spaces, use lstrip.
As there are downvotes, probably questioning the efficiency of regex, I've done some profiling to check the efficiency of each case.
Very long string, very short leading space
RegEx > Itertools >> lstrip
Very short string, very short leading space
lstrip > RegEx > Itertools
If you can limit the string's length to thousands of characters or fewer, the lstrip trick may be better.
This shows the lstrip trick scales roughly as O(√n), where n is the length of the string, while the RegEx and itertools methods are O(1) if the number of leading spaces is small.
Very short string, very long leading space
lstrip >> RegEx >>> Itertools
If there are a lot of leading spaces, don't use RegEx.
Very long string, very long leading space
lstrip >>> RegEx >>>>>>>> Itertools
This shows all methods scale roughly as O(m), where m is the number of leading spaces, if the non-space part is short.
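To check these orderings on your own machine, a comparison along the following lines could be used. This is only a sketch, not the profiling code behind the results above: the function names, the benchmark helper, and the string sizes are all assumptions chosen for illustration.

```python
import re
import timeit
from itertools import takewhile

LEADING_WS = re.compile(r"\s*")


def indent_lstrip(line):
    # Slice off as many characters as lstrip() would remove.
    return line[:len(line) - len(line.lstrip())]


def indent_regex(line):
    # \s* can match the empty string, so match() never returns None here.
    return LEADING_WS.match(line).group()


def indent_takewhile(line):
    # Collect characters one by one until the first non-whitespace one.
    return "".join(takewhile(str.isspace, line))


def benchmark(n_spaces, n_text, repeat=200):
    """Return total seconds per method for `repeat` calls (illustrative helper)."""
    line = " " * n_spaces + "x" * n_text
    return {
        func.__name__: timeit.timeit(lambda: func(line), number=repeat)
        for func in (indent_lstrip, indent_regex, indent_takewhile)
    }


if __name__ == "__main__":
    # (leading spaces, length of the non-space part); sizes are arbitrary.
    for n_spaces, n_text in [(4, 100_000), (4, 10), (10_000, 10), (10_000, 100_000)]:
        print(n_spaces, n_text, benchmark(n_spaces, n_text))
```

Absolute timings will vary with the Python version and hardware, so only the relative ordering within each row is worth comparing.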
Basically, my idea is:
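A minimal sketch of a regex-based approach, assuming re.match with the pattern \s* (the helper name get_indentation is hypothetical):

```python
import re


def get_indentation(line):
    # match() only tries the start of the string; \s* matches zero or more
    # whitespace characters, so an unindented line yields ''.
    return re.match(r"\s*", line).group()


indent = get_indentation("\t\tthis line has two tabs of indentation")
print(repr(indent))  # '\t\t'
```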
This can also be done with str.isspace and itertools.takewhile instead of regex.
How about using the regex \s*, which matches any whitespace characters? You only want the whitespace at the beginning of the line, so either search with the regex ^\s* or simply match with \s*.
If you're interested in using regular expressions you can use that. /\s/ usually matches one whitespace character, so /^\s+/ would match the whitespace starting a line.
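The difference between these patterns is easiest to see side by side. The sketch below assumes Python's re module, where the /.../ delimiters are dropped and the pattern is written as a raw string; the sample strings are made up for illustration.

```python
import re

indented = "    this line has four spaces"
flush = "no indentation here"

# search() scans the whole string, so the pattern needs an explicit ^ anchor.
print(repr(re.search(r"^\s*", indented).group()))  # '    '
# match() only tries position 0, so no anchor is needed.
print(repr(re.match(r"\s*", indented).group()))    # '    '

# \s+ requires at least one whitespace character, so on an unindented
# line there is no match and the call returns None.
print(re.match(r"\s+", flush))                     # None
# \s* also accepts zero characters and returns an empty match instead.
print(repr(re.match(r"\s*", flush).group()))       # ''
```

If lines without indentation are possible, \s* avoids having to check for None before calling group().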