I have researched this issue. It seems Python 2.7 uses ASCII by default, and I can't switch to Python 3 (Unicode by default) because of the libraries I depend on.
# -*- coding: utf-8 -*-
print u'порядке'
This seems to print fine; without the u prefix it prints ??????. But:
print list(os.walk(ur'c:\somefoler'))
returns \u0438\u0442...
Why is this not readable like the first print? Also, I use os.walk with variables, so I can't use the ur prefix there.
I'm just trying to understand how to make the following code work with folder and file names in any language. I use os.walk and then save the results to a file; neither seems to work reliably, and I get ???? wherever the names contain Cyrillic.
import fnmatch
import os
import re

def findit(root, exclude_files=[], exclude_dirs=[]):
    # Build one regex that matches any of the excluded file name patterns
    exclude_files = (fnmatch.translate(i) for i in exclude_files)
    exclude_files = '(' + ')|('.join(exclude_files) + ')'
    exclude_files = re.compile(exclude_files)
    # Normalize the excluded directories so they compare equal to walked paths
    exclude_dirs = (os.path.normpath(i) for i in exclude_dirs)
    exclude_dirs = (os.path.normcase(i) for i in exclude_dirs)
    exclude_dirs = set(exclude_dirs)
    for root, dirs, files in os.walk(root):
        if os.path.normpath(os.path.normcase(root)) in exclude_dirs:
            # exclude this dir and subdirectories
            dirs[:] = []
            continue
        for f in files:
            if not exclude_files.match(os.path.normcase(f)):
                yield os.path.join(root, f)

filelist = list(findit('c:\\', exclude_files=['*.dll', '*.dat', '*.log', '*.exe'], exclude_dirs=['c:/windows', 'c:/program files', 'c:/else']))
When the value is in a variable, it seems I have to use .decode('utf-8'). Why is there no equivalent of the u'...' prefix for variables, if one exists? And why are there so often exceptions saying the conversion is not possible? I have run into them and seen a lot of answers about such errors, but I'm having a hard time understanding it. Isn't there a way to make it just work?
Try printing each item on its own rather than the whole list. You should see the proper text (alongside the repr that is used in the list). The problem is that when you print a list, it prints the repr of its items.
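The point about repr can be sketched roughly as follows. This is only an illustration, not the answerer's original snippet: `names` is a hypothetical stand-in for what os.walk might return, and `list_text` and `readable_text` are made-up helper names. The code is written so that it runs under both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch only: `names` is a hypothetical stand-in for names returned by os.walk.
names = [u'порядке', u'файл.txt']


def list_text(items):
    """Return the text shown when the whole list is printed.

    str() of a list is the same as its repr(), and that is built from
    repr() of each item rather than from the item's own text.
    """
    return repr(items)


def readable_text(items):
    """Return the items as plain text, one per line."""
    return u'\n'.join(items)


# Python 2.7: as_list contains escapes such as \u043f, because repr() of a
# unicode string escapes non-ASCII characters. Python 3 leaves them as-is.
as_list = list_text(names)

# as_lines holds the actual characters, so printing it can show readable text.
as_lines = readable_text(names)
```

Under Python 2.7, `print as_lines` should then show the names readably, assuming the console encoding can represent Cyrillic (as it apparently can, given that the first print in the question works), while `print as_list` shows the escaped form seen in the question.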