Remove u202a from Python string

Posted 2020-03-26 07:52

I'm trying to open a file in Python, but I get an error, and the error message shows a \u202a character at the beginning of the string. Does anyone know how to remove it?

def carregar_uml(arquivo, variaveis):
    cadastro_uml = {}
    id_uml = 0

    for i in open(arquivo):
        linha = i.split(",")


carregar_uml("‪H:\\7 - Script\\teste.csv", variaveis)

OSError: [Errno 22] Invalid argument: '\u202aH:\7 - Script\teste.csv'

Tags: python
6 answers
唯我独甜
#2 · 2020-03-26 08:27

Write the drive letter in lowercase rather than uppercase.

For example: H: -> error, h: -> no error

Luminary・发光体
#3 · 2020-03-26 08:30

Or you can slice out that character

file_path = r"‪C:\Test3\Accessing_mdb.txt"
file_path = file_path[1:]
with open(file_path, 'a') as f_obj:
    f_obj.write('some words')
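
Slicing off the first character only helps when the invisible character is really there; on a clean path it removes the drive letter. A minimal sketch of a more guarded variant, assuming the stray characters are Unicode directional formatting marks (the BIDI_CONTROLS set and the strip_bidi name are illustrative, not part of the original answer):

```python
# Directional formatting characters (an assumed set; extend it if needed).
BIDI_CONTROLS = "\u202a\u202b\u202c\u202d\u202e\u200e\u200f"

def strip_bidi(path):
    """Remove leading/trailing directional marks and leave everything else alone."""
    return path.strip(BIDI_CONTROLS)

dirty = "\u202aC:\\Test3\\Accessing_mdb.txt"   # U+202A written as an explicit escape
clean = "C:\\Test3\\Accessing_mdb.txt"

print(strip_bidi(dirty) == clean)   # True
print(strip_bidi(clean) == clean)   # True: a clean path is unchanged
print(clean[1:])                    # :\Test3\Accessing_mdb.txt (blind slicing drops the "C")
```

Unlike file_path[1:], this is safe to call on a path that was typed correctly.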
手持菜刀,她持情操
#4 · 2020-03-26 08:37

When you initially created your .py file, your text editor introduced a non-printing character.

Consider this line:

carregar_uml("‪H:\\7 - Script\\teste.csv", variaveis)

Let's carefully select the string, including the quotes, and copy-paste it into an interactive Python session:

$ python
Python 3.6.1 (default, Jul 25 2017, 12:45:09) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> "‪H:\\7 - Script\\teste.csv"
'\u202aH:\\7 - Script\\teste.csv'
>>> 

As you can see, there is a character with codepoint U+202A immediately before the H.

As someone else pointed out, the character at codepoint U+202A is LEFT-TO-RIGHT EMBEDDING. Returning to our Python session:

>>> s = "‪H:\\7 - Script\\teste.csv"
>>> import unicodedata
>>> unicodedata.name(s[0])
'LEFT-TO-RIGHT EMBEDDING'
>>> unicodedata.name(s[1])
'LATIN CAPITAL LETTER H'
>>> 

This further confirms that the first character in your string is not H, but the non-printing LEFT-TO-RIGHT EMBEDDING character.

I don't know what text editor you used to create your program. Even if I knew, I'm probably not an expert in that editor. Regardless, some text editor that you used inserted, unbeknownst to you, U+202A.

One solution is to use a text editor that won't insert that character, and/or will highlight non-printing characters. For example, in vim that line appears like so:

carregar_uml("<202a>H:\\7 - Script\\teste.csv", variaveis)

Using such an editor, simply delete the character between " and H.

carregar_uml("H:\\7 - Script\\teste.csv", variaveis)

Even though this line is visually identical to your original line, I have deleted the offending character. Using this line will avoid the OSError that you report.
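
If your editor cannot show non-printing characters, Python itself can locate them. A minimal sketch, assuming it is enough to flag every character in the Unicode "Cf" (format) category; the find_format_chars helper and the sample text are made up for illustration:

```python
import unicodedata

def find_format_chars(text):
    """Yield (line, column, codepoint, name) for each Unicode 'Cf' character."""
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                yield line_no, col, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")

# Stand-in for the contents of a .py file; U+202A is written as an escape here.
sample = 'x = 1\ncarregar_uml("\u202aH:\\\\7 - Script\\\\teste.csv", variaveis)\n'

for hit in find_format_chars(sample):
    print(hit)   # (2, 15, 'U+202A', 'LEFT-TO-RIGHT EMBEDDING')
```

To check a real file, read it with open(..., encoding="utf-8").read() and pass the text to the helper; the reported line and column tell you where to delete.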

地球回转人心会变
#5 · 2020-03-26 08:39

Try strip() on the path before passing it to the function:

def carregar_uml(arquivo, variaveis):
    cadastro_uml = {}
    id_uml = 0

    for i in open(arquivo):
        linha = i.split(",")


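# The invisible character is written here as the explicit escape \u202a.
caminho = "\u202aH:\\7 - Script\\teste.csv"

# strip() is a string method, so call it on the path, not on the function.
carregar_uml(caminho.strip("\u202a"), variaveis)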
家丑人穷心不美
#6 · 2020-03-26 08:53

The problem is that the file's directory path is not being read properly. Pass it as a raw string and it should work.

carregar_uml(r'H:\7 - Script\teste.csv', variaveis)
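
For reference, a raw string and a string with doubled backslashes produce the same path, so switching notation by itself changes nothing. A minimal sketch (the variable names are only illustrative) suggesting that a retyped literal works because the invisible U+202A is no longer in it:

```python
escaped = "H:\\7 - Script\\teste.csv"   # doubled backslashes
raw = r"H:\7 - Script\teste.csv"        # raw string

print(escaped == raw)            # True: both spell the same path

pasted = "\u202a" + raw          # the same path with U+202A in front
print(pasted == raw)             # False
print(len(pasted) - len(raw))    # 1: the one extra, invisible character
```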
Bombasti
#7 · 2020-03-26 08:53

You can use this sample code to remove \u202a from a file path.

import pandas as pd

# The invisible character is written here as the explicit escape \u202a.
st = "\u202aF:\\somepath\\filename.xlsx"
data = pd.read_excel(st)

If I try this, it raises an OSError. In detail:

Traceback (most recent call last):
  File "F:\CodeRepo\PythonWorkSpace\demo\removepartofstring.py", line 14, in <module>
    data = pd.read_excel(st)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\util\_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\util\_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel.py", line 350, in read_excel
    io = ExcelFile(io, engine=engine)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel.py", line 653, in __init__
    self._reader = self._engines[engine](self._io)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel.py", line 424, in __init__
    self.book = xlrd.open_workbook(filepath_or_buffer)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\xlrd\__init__.py", line 111, in open_workbook
    with open(filename, "rb") as f:
OSError: [Errno 22] Invalid argument: '\u202aF:\\somepath\\filename.xlsx'

But if I do it like this:

st = "\u202aF:\\somepath\\filename.xlsx"
data = pd.read_excel(st.strip("\u202a"))  # replace with your own path

it works for me.
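
One caveat: str.strip treats its argument as a set of characters, not as a prefix, so the backslash in "\u202a" matters. A minimal sketch with a made-up path, showing what happens if the letters u, 2, 0 and a end up in that set:

```python
path = "\u202aa2020\\data.csv"   # hypothetical path whose first folder is "a2020"

# The argument is the single character U+202A: only that character is removed.
print(path.strip("\u202a"))             # a2020\data.csv

# The argument is U+202A plus the plain letters "u202a" (no backslash escape):
# every leading or trailing character from that set is removed.
print(path.strip("\u202a" + "u202a"))   # \data.csv
```

With the escaped form, only the invisible character is stripped and the rest of the path is left intact.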
