FTP filename encoding

Posted 2019-08-06 02:10

Hi, I use the Twisted library to connect to an FTP server, but I have a problem with filename encoding. I receive 'Illusion-N\xf3z.txt', so it's not Unicode. Is there any FTP command to force a specific encoding? Thanks in advance! MK

2 answers
我命由我不由天
#2 · 2019-08-06 02:20

There are two possibilities:

  • Basic FTP is not Unicode-aware; filenames are just bytes. The server you're talking to in this example appears to be sending Latin-1 encoded bytes (0xF3 is 'ó' in Latin-1), so you need to decode the bytes with that encoding when you receive them.
  • There is an RFC (RFC 2640) that updates FTP to be UTF-8-aware. Check the reply to the FEAT command to see whether UTF8 is listed (it probably isn't, since the example bytes are not valid UTF-8). If it is, decode the bytes as UTF-8.

Twisted's FTP client won't do anything Unicode-related for you, since it only implements the basic FTP RFC.
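To make the two cases concrete, here is a minimal sketch in plain Python 3 that picks a codec from a FEAT reply. The helper names `server_supports_utf8` and `decode_filename` are made up for this illustration and are not part of Twisted, and the sample FEAT reply is invented. It assumes you already have the FEAT reply as text and the filename as raw bytes.

```python
def server_supports_utf8(feat_reply):
    """Return True if a FEAT reply lists the UTF8 feature."""
    for line in feat_reply.splitlines():
        # Feature lines are indented by a space; compare case-insensitively.
        if line.strip().upper() == "UTF8":
            return True
    return False


def decode_filename(raw, feat_reply, fallback="latin-1"):
    """Decode raw filename bytes as UTF-8 if advertised, else with the fallback."""
    encoding = "utf-8" if server_supports_utf8(feat_reply) else fallback
    return raw.decode(encoding)


# Invented sample reply from a server that does not advertise UTF8.
feat_reply = "211-Features:\r\n MDTM\r\n SIZE\r\n211 End"
name = decode_filename(b"Illusion-N\xf3z.txt", feat_reply)
```

The fallback codec is a guess about the server, not something FTP tells you, so treat `latin-1` here as an assumption you may need to change per server.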

叼着烟拽天下
#3 · 2019-08-06 02:38

FTP ignores encodings and treats filenames as plain bytes; as long as a filename does not contain a '\0' (null character) and '/' (slash) separates directories, it happily accepts anything.

Do your own decoding and encoding of the filenames. It is quite probable that the encoding used in your example is "cp1252", the Windows code page for Western European languages (Windows-1252).

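In your case, when you receive 'Illusion-N\xf3z.txt', convert it to Unicode with 'Illusion-N\xf3z.txt'.decode('cp1252'). That works as written on Python 2, where the value is a byte string; on Python 3 the value has to be a bytes object, e.g. b'Illusion-N\xf3z.txt'.decode('cp1252').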
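If you want to do the decoding and encoding in both directions, something like the following Python 3 sketch might help. The names `to_text` and `to_wire` are hypothetical helpers, and trying UTF-8 first before falling back to cp1252 is my own assumption about a sensible order, not something the protocol prescribes.

```python
def to_text(raw, fallback="cp1252"):
    """Decode filename bytes: try UTF-8 first, then the fallback code page."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # 0xF3 followed by 'z' is not valid UTF-8, so the example lands here.
        return raw.decode(fallback)


def to_wire(name, encoding="cp1252"):
    """Encode a filename back to the bytes the server expects."""
    return name.encode(encoding)


raw = b"Illusion-N\xf3z.txt"
text = to_text(raw)          # 'Illusion-Nóz.txt'
round_trip = to_wire(text)   # same bytes as received
```

Sending the name back with the same codec you decoded it with keeps the bytes identical, which is what the server matches on. Note that a few byte values are undefined in cp1252, so `to_text` can still raise for unusual names.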
