Ftp create a filename with utf-8 chars such as gre

Posted 2020-02-12 10:26

Question:

I am trying to upload a file to an FTP server with the following code (I tried with the UseBinary option set to both true and false):

string username = "name";
string password = "password";
string remotefolder = "ftp://ftp.myhost.gr/public_html/test/";
string remoteFileName = "δοκιμαστικό αρχείοüß-äCopy.txt";
string localFile = @"C:\test\δοκιμαστικό αρχείο - Copy.txt";
String ftpname = "ftp://ftp.myhost.gr/public_html/test" + @"/" + Uri.EscapeUriString(Program.remoteFileName);


FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpname);
request.Proxy = null;
request.Credentials = new NetworkCredential(username, password);


request.UsePassive = true;
request.KeepAlive = true;
request.Method = WebRequestMethods.Ftp.UploadFile;
request.UseBinary = true;
//request.UseBinary = false;

byte[] content = System.IO.File.ReadAllBytes(localFile);
byte[] fileContents = new Byte[content.Length];

Array.Copy(content, 0, fileContents, 0, content.Length);

using (Stream uploadStream = request.GetRequestStream())
{
    int contentLength = fileContents.Length;
    uploadStream.Write(fileContents, 0, contentLength);
}

FtpWebResponse response = (FtpWebResponse)request.GetResponse();
Console.WriteLine(response.ExitMessage);

The problem is that the file on my FTP server does not get the name I requested, which contains English, Greek and German characters: "δοκιμαστικό αρχείοüß-äCopy.txt".

1) What can I do about that?

There is some improvement once I change my regional settings (Current language for non-Unicode programs) to Greek, but I still lose the German characters.

2) Why does a C# program depend on this setting? Is there a special approach I should follow to avoid depending on this setting?

Encoding nightmares arose again :(

Answer 1:

It is not enough just to encode your string as UTF-8 and send it as the filename to the FTP server. In the past FTP servers understood ASCII only, and nowadays, to maintain backward compatibility, even Unicode-aware servers treat all filenames as ASCII when a session starts.

To make it all work, your program must first check what your server is capable of. A server lists its features in reply to the FEAT command; in your case you must check that the list includes UTF8. If it does, the server understands UTF-8. Even so, you must tell it explicitly that from now on you will send your filenames UTF-8 encoded, and that is the step your program lacks (your server supports UTF-8, as you've stated).
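The capability check could be sketched roughly as below. This assumes you already have the raw text of the server's FEAT reply in a string (FtpWebRequest does not hand it to you directly, as far as I can tell); the class and method names are made up for illustration. The reply layout, with each feature on its own line indented by a single space, follows RFC 2389.

```csharp
using System;
using System.Linq;

public static class FtpFeatures
{
    // Returns true if a raw FEAT reply lists UTF8 among the server's features.
    // Hypothetical helper: the caller is assumed to have read the reply text
    // from the control connection already.
    public static bool SupportsUtf8(string featReply)
    {
        return featReply
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
            // Feature lines start with one space; status lines start with the reply code.
            .Where(line => line.StartsWith(" ", StringComparison.Ordinal))
            .Select(line => line.Trim())
            .Any(feature => feature.Equals("UTF8", StringComparison.OrdinalIgnoreCase));
    }
}
```

For example, FtpFeatures.SupportsUtf8("211-Features:\r\n MDTM\r\n UTF8\r\n211 End\r\n") returns true.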

Your client must send the following command to the FTP server: OPTS UTF8 ON. After sending that you may use UTF-8 (speak UTF8-ish, so to speak) with your server.

For details, read RFC 2640, "Internationalization of the File Transfer Protocol".
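As a rough sketch of the OPTS UTF8 ON step: FTP commands travel over the control connection as lines terminated by CRLF, so once UTF-8 is switched on, the filename inside a command such as STOR is sent as UTF-8 bytes. The helper below only builds those bytes. It assumes you drive the control connection yourself (for example over a TcpClient) or use a library that does, since FtpWebRequest, as far as I can tell, offers no public way to send arbitrary commands. FtpCommands and Encode are hypothetical names.

```csharp
using System;
using System.Text;

public static class FtpCommands
{
    // Encodes one control-connection command as UTF-8 and appends
    // the CRLF terminator that FTP expects.
    public static byte[] Encode(string command)
    {
        return Encoding.UTF8.GetBytes(command + "\r\n");
    }
}
```

A client would write FtpCommands.Encode("OPTS UTF8 ON") to the control stream first, wait for the server's reply, and only then send commands that carry non-ASCII filenames.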



Answer 2:

In your code change:

string localFile = @"C:\test\δοκιμαστικό αρχείο - Copy.txt";
String ftpname = "ftp://ftp.myhost.gr/public_html/test" + @"/" + Uri.EscapeUriString(Program.remoteFileName);

FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpname);

To:

string remoteFileName = "δοκιμαστικό αρχείο - Copy.txt";
String ftpname = "ftp://ftp.myhost.gr/public_html/test" + @"/" + remoteFileName;

var escapedUriString = Uri.EscapeUriString(Encoding.UTF8.GetString(Encoding.ASCII.GetBytes(ftpname)));
var request = (FtpWebRequest)WebRequest.Create(escapedUriString);

This needs to be done because EscapeUriString's input parameter is escaped according to the RFC 2396 specification.

RFC 3986, which superseded RFC 2396, states:

When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded.

Hence the code change shown above is meant to force this string to be input in UTF-8 format.
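To see the quoted rule in action, here is a minimal sketch using Uri.EscapeDataString, which percent-encodes the UTF-8 octets of every character outside the unreserved set. It also escapes "/", so it should be applied to the file name only, not to the whole URI. The second method reproduces the ASCII round-trip from the snippet above so you can inspect its result on your own input: Encoding.ASCII replaces any character it cannot represent with '?', so it is worth checking before relying on it. EscapeDemo and its method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class EscapeDemo
{
    // Percent-encodes the UTF-8 octets of each character that is not unreserved.
    public static string EscapeFileName(string name)
    {
        return Uri.EscapeDataString(name);
    }

    // Same round-trip as in the snippet above: ASCII bytes decoded as UTF-8.
    // Non-ASCII characters are replaced with '?' by Encoding.ASCII.
    public static string AsciiRoundTrip(string text)
    {
        return Encoding.UTF8.GetString(Encoding.ASCII.GetBytes(text));
    }
}
```

For example, EscapeDemo.EscapeFileName("äß") gives "%C3%A4%C3%9F", while EscapeDemo.AsciiRoundTrip("äb") gives "?b".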

With regards to:

2) Why does a C# program depend on this setting? Is there a special approach I should follow to avoid depending on this setting?

Uri.EscapeUriString needs input which follows the RFC 2396 specification, hence the need to pass it data in a format which it will understand.