I am using C# to get a directory listing from an FTP server, but the output comes back formatted as HTML. What I need is a plain, unformatted list (as returned by the Linux ls command), because I am trying to avoid parsing HTML to get the list of files.
Observations:
- The FTP server is vsftpd, run by the client.
- The problem does not occur when listing a directory on, e.g., a FileZilla FTP server.
- The FileZilla client, connected to the vsftpd server, times out when getting the directory listing:
  Error: Connection timed out
  Error: Failed to retrieve directory listing
Referring to the code below, the following behaviour / errors happen:
- WHEN a proxy server is set in the code, THEN the server returns the list formatted as HTML instead of simple ls output.
- WHEN the proxy is set to null OR WebRequest.DefaultWebProxy OR GlobalProxySelection.GetEmptyWebProxy() OR new WebProxy(), THEN: The remote server returned an error: (550) File unavailable (e.g., file not found, no access).
- WHEN no proxy server is specified in the code AND the proxy is NOT set to null, THEN: The remote server returned an error: (407) Proxy Authentication Required.
Questions
- How do I set up the C# code to get the plain ls directory listing and not HTML? OR
- Is there anything that can be done on the vsftpd (server) side to prevent the HTML directory listing?
Details:
Code extract
// Requires: using System.IO; using System.Net; using System.Text.RegularExpressions;
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(uri);
request.Method = WebRequestMethods.Ftp.ListDirectory;
// 1. Works, but returns HTML
request.Proxy = new WebProxy("http://xxx.xxx.xxx.xxx:8080", true);
request.Proxy.Credentials = System.Net.CredentialCache.DefaultCredentials;
// 2. Does not work (each of these alternatives was tried)
//request.Proxy = null; // WebRequest.DefaultWebProxy; // GlobalProxySelection.GetEmptyWebProxy(); // new WebProxy();
request.Credentials = server.Credential;
request.KeepAlive = true;
request.UsePassive = true;
// The using blocks close the reader and the response even if processing throws
using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
using (Stream responseStream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(responseStream))
{
    Regex filter = FileUtils.GetRegex(clientSource.FileFilter);
    while (!reader.EndOfStream)
    {
        ProcessFileLine(reader.ReadLine(), filter, files);
    }
}
Directory listing formatted as HTML
<HTML>
<meta http-equiv="Content-Type" content="text-html; charset=UTF-8">
<HEAD>
<TITLE>FTP root at ftp-jhb.saicomvoice.co.za. </TITLE>
</HEAD>
<BODY>
<H1>FTP root at ftp-jhb.saicomvoice.co.za. </H1>
<HR>
<PRE>
12/11/15 04:36PM [GMT] <DIR> <A HREF="/bin/">bin</A>
12/11/15 12:56PM [GMT] <DIR> <A HREF="/boot/">boot</A>
02/22/13 12:00AM [GMT] <DIR> <A HREF="/cgroup/">cgroup</A>
12/11/15 03:36PM [GMT] <DIR> <A HREF="/dev/">dev</A>
01/19/15 01:32PM [GMT] <DIR> <A HREF="/etc/">etc</A>
12/12/15 11:45AM [GMT] <DIR> <A HREF="/home/">home</A>
12/11/15 12:51PM [GMT] <DIR> <A </PRE>
<HR>
</BODY>
</HTML>
From the description it looks like you are required to use an HTTP proxy to access the FTP server. Such a proxy is not accessed over the FTP protocol and does not simply forward FTP commands; instead, it is accessed over HTTP. The proxy then issues the necessary FTP commands on your behalf and returns the result inside the HTTP response. What this result looks like depends entirely on the proxy. Since most users access an HTTP proxy through a browser, HTTP proxies usually return an HTML page with the result, so that the user can simply click through to the relevant files.
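If the proxy really cannot be bypassed, one fallback is to pull the entry names out of the anchors in the HTML page the proxy returns. The following is only a rough sketch: the class name ProxyListingParser and the method ExtractNames are made up for illustration, and the regular expression assumes the listing looks like the sample in the question (one <A HREF="...">name</A> per entry). Another proxy may format its page differently or add extra links, so treat this as an assumption and not as a general solution.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code in the question.
public static class ProxyListingParser
{
    // Assumption: every entry is rendered as <A HREF="...">name</A>,
    // as in the sample listing. The link text is taken as the entry name.
    private static readonly Regex Anchor = new Regex(
        "<A\\s+HREF=\"[^\"]*\"\\s*>(?<name>[^<]*)</A>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static List<string> ExtractNames(string html)
    {
        var names = new List<string>();
        foreach (Match m in Anchor.Matches(html))
        {
            // Decode entities such as &amp; that may appear in the link text.
            names.Add(WebUtility.HtmlDecode(m.Groups["name"].Value));
        }
        return names;
    }
}
```

With the code from the question, this could be called as ProxyListingParser.ExtractNames(reader.ReadToEnd()) instead of processing the response line by line. Incomplete anchors, such as the truncated last entry in the sample, are simply skipped because they do not match the pattern.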
In summary: since the result depends entirely on the proxy, there is no way to get it in a different format as long as you have to use this specific proxy. The best option is to check with your administrators whether there is another way to use FTP, i.e. without this HTTP proxy.
I found this code and it helped me: