FTP filename encoding

Posted 2019-08-06 02:19

Question:

Hi, I use the Twisted library to connect to an FTP server, but I have a problem with filename encoding. I receive 'Illusion-N\xf3z.txt', so it's not Unicode. Is there any FTP command to force a specific encoding? Thanks in advance! MK

Answer 1:

There are two possibilities:

  • Basic FTP is not Unicode-aware; filenames are just bytes. It looks like the server you're talking to in this example is sending Latin-1 encoded bytes, so you need to decode the bytes using that encoding when you receive them.
  • RFC 2640 updates FTP to be UTF-8-aware. Check the reply to the FEAT command to see whether UTF8 is listed (it probably isn't, since the example bytes are not valid UTF-8). If it is, decode the bytes using UTF-8.

Twisted's FTP client won't do anything Unicode-related for you, since it implements only the basic FTP RFC.
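The two possibilities above can be combined into one small helper. This is only a sketch in plain Python, not part of Twisted: the names decode_ftp_name and server_advertises_utf8 are made up for illustration, and it assumes you already have the raw filename bytes and the lines of the FEAT reply in hand.

```python
def server_advertises_utf8(feat_lines):
    """Return True if any line of a FEAT reply names the UTF8 feature.

    feat_lines is assumed to be the reply split into text lines,
    e.g. ['211-Features:', ' MDTM', ' UTF8', '211 End'].
    """
    return any(line.strip().upper() == "UTF8" for line in feat_lines)


def decode_ftp_name(raw, fallback="latin-1"):
    """Decode filename bytes received from an FTP server.

    Strict UTF-8 is tried first; bytes that are not valid UTF-8
    are decoded with a single-byte fallback encoding instead.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# The name from the question is not valid UTF-8, so the fallback is used.
print(decode_ftp_name(b"Illusion-N\xf3z.txt"))  # Illusion-Nóz.txt
```

Trying UTF-8 first is reasonably safe because text in a single-byte encoding that contains non-ASCII characters is rarely valid UTF-8 by accident, but the fallback encoding is still a guess you have to make per server.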



Answer 2:

FTP itself ignores encodings: as long as a filename contains no '\0' (null character) and '/' (slash) is used to separate directories, it happily accepts any bytes.

Do your own decoding and encoding of the filenames. The encoding used in your example is quite probably "cp1252", the Windows Western European code page (Windows-1252).

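In your case, when you receive 'Illusion-N\xf3z.txt', convert it to Unicode with 'Illusion-N\xf3z.txt'.decode('cp1252'). On Python 3 the received name has to be a bytes object for this to work, i.e. b'Illusion-N\xf3z.txt'.decode('cp1252').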
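Doing your own decoding and encoding usually means converting at the boundary in both directions: bytes to text when a name arrives, text back to bytes when you send a name to the server. A minimal sketch, assuming Python 3 and assuming the server really uses cp1252 (the constant and function names here are hypothetical):

```python
# Assumption: this particular server stores names as cp1252 bytes.
FTP_NAME_ENCODING = "cp1252"


def name_from_server(raw):
    """Convert filename bytes received from the server to text."""
    return raw.decode(FTP_NAME_ENCODING)


def name_to_server(name):
    """Convert a text filename back to the bytes the server expects."""
    return name.encode(FTP_NAME_ENCODING)


received = b"Illusion-N\xf3z.txt"
text = name_from_server(received)

# Encoding the decoded name again gives back the original bytes.
assert name_to_server(text) == received
```

For this particular filename Latin-1 and cp1252 give the same result, since both map byte 0xF3 to "ó"; the two only differ for bytes in the 0x80 to 0x9F range.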