I'm developing an FTP client in Python with ftplib. How do I add proxy support to it (most FTP apps I have seen seem to have it)? I'm especially thinking about SOCKS proxies, but also other types: FTP, HTTP (is it even possible to use HTTP proxies with an FTP program?)
Any ideas how to do it?
As per this source.
Depends on the proxy, but a common method is to ftp to the proxy, then use
the username and password for the destination server.
E.g. for ftp.example.com:
Server address: proxyserver (or open proxyserver from within ftp)
User: anonymous@ftp.example.com
Password: password
In Python code:
from ftplib import FTP
site = FTP('my_proxy')
site.set_debuglevel(1)
msg = site.login('anonymous@ftp.example.com', 'password')
site.cwd('/pub')
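The user@host convention above can be wrapped in a small helper. This is only a minimal sketch, assuming the proxy accepts the user@host login form; the names proxy_login_user and connect_via_ftp_proxy are made up for illustration, and other proxies may expect a different scheme.

```python
from ftplib import FTP


def proxy_login_user(user, target_host):
    """Build the 'user@host' login name that tells the proxy where to connect."""
    return '{}@{}'.format(user, target_host)


def connect_via_ftp_proxy(proxy_host, target_host, user='anonymous', password='password'):
    """Connect to the proxy and log in to target_host through it."""
    site = FTP(proxy_host)  # the control connection goes to the proxy, not the target
    site.login(proxy_login_user(user, target_host), password)
    return site
```

With a reachable proxy, connect_via_ftp_proxy('my_proxy', 'ftp.example.com') would return a logged-in FTP object equivalent to the one built by hand above.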
You can use the ProxyHandler in urllib2 (urllib.request in Python 3):
import urllib2
ph = urllib2.ProxyHandler({'ftp': proxy_server_url})
server = urllib2.build_opener(ph)
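For Python 3, the same idea might look like the sketch below, where urllib2 has become urllib.request. The function name build_ftp_proxy_opener and the example URLs are placeholders, and this assumes the proxy itself knows how to fetch ftp:// URLs on the client's behalf.

```python
import urllib.request


def build_ftp_proxy_opener(proxy_server_url):
    """Return an opener that sends ftp:// requests through the given proxy."""
    handler = urllib.request.ProxyHandler({'ftp': proxy_server_url})
    return urllib.request.build_opener(handler)


# Usage (needs a reachable proxy, so it is left commented out):
# opener = build_ftp_proxy_opener('http://proxy.example.com:3128')
# data = opener.open('ftp://ftp.example.com/pub/README').read()
```

Note that this route gives you URL-style downloads rather than an ftplib session, so commands such as cwd() are not available.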
I had the same problem and needed to keep using the ftplib module (rather than rewrite all my scripts with urllib2).
I managed to write a script that installs transparent HTTP tunneling at the socket layer (which ftplib uses).
Now I can do FTP over HTTP transparently!
You can get it here:
http://code.activestate.com/recipes/577643-transparent-http-tunnel-for-python-sockets-to-be-u/
The standard ftplib module doesn't support proxies. It seems the only solution is to write your own customized version of ftplib.
Patching the builtin socket library definitely won't be an option for everyone, but my solution was to patch socket.create_connection() to use an HTTP proxy when the hostname matches a whitelist:
from base64 import b64encode
from functools import wraps
import socket

_real_create_connection = socket.create_connection
_proxied_hostnames = {}  # hostname: (proxy_host, proxy_port, proxy_auth)

def register_proxy(host, proxy_host, proxy_port, proxy_username=None, proxy_password=None):
    proxy_auth = None
    if proxy_username is not None or proxy_password is not None:
        credentials = '{}:{}'.format(proxy_username or '', proxy_password or '')
        # b64encode() works on bytes, so encode before and decode after.
        proxy_auth = b64encode(credentials.encode('utf-8')).decode('ascii')
    _proxied_hostnames[host] = (proxy_host, proxy_port, proxy_auth)

@wraps(_real_create_connection)
def create_connection(address, *args, **kwds):
    host, port = address
    if host not in _proxied_hostnames:
        return _real_create_connection(address, *args, **kwds)
    proxy_host, proxy_port, proxy_auth = _proxied_hostnames[host]
    conn = _real_create_connection((proxy_host, proxy_port), *args, **kwds)
    try:
        request = 'CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n{auth_header}\r\n'.format(
            host=host, port=port,
            auth_header=('Proxy-Authorization: Basic {}\r\n'.format(proxy_auth) if proxy_auth else '')
        )
        conn.sendall(request.encode('ascii'))
        # Read the proxy's reply up to the blank line that ends the headers.
        response = b''
        while not response.endswith(b'\r\n\r\n'):
            chunk = conn.recv(4096)
            if not chunk:
                raise socket.error('Proxy closed the connection during CONNECT')
            response += chunk
        if response.split()[1] != b'200':
            raise socket.error('CONNECT failed: {}'.format(response.decode('latin-1').strip()))
    except socket.error:
        conn.close()
        raise
    return conn

socket.create_connection = create_connection
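To make the handshake easier to inspect without opening a socket, the request building and the status check can be pulled out into pure functions. This is an illustrative sketch rather than part of the patch above; build_connect_request and connect_succeeded are hypothetical names.

```python
from base64 import b64encode


def build_connect_request(host, port, username=None, password=None):
    """Return the bytes of an HTTP CONNECT request for host:port."""
    lines = [
        'CONNECT {}:{} HTTP/1.1'.format(host, port),
        'Host: {}:{}'.format(host, port),
    ]
    if username is not None or password is not None:
        credentials = '{}:{}'.format(username or '', password or '')
        token = b64encode(credentials.encode('utf-8')).decode('ascii')
        lines.append('Proxy-Authorization: Basic {}'.format(token))
    # A blank line terminates the header block.
    return ('\r\n'.join(lines) + '\r\n\r\n').encode('ascii')


def connect_succeeded(response):
    """Return True if the status line of the proxy's reply carries code 200."""
    status_line = response.split(b'\r\n', 1)[0]
    parts = status_line.split()
    return len(parts) >= 2 and parts[1] == b'200'
```

Once the proxy answers with 200, the same socket carries the raw FTP control traffic, which is why ftplib can use it unchanged.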
I also had to create a subclass of ftplib.FTP that ignores the host returned by the PASV and EPSV FTP commands. Example usage:
from ftplib import FTP
import paramiko  # For SFTP
from proxied_socket import register_proxy

class FTPIgnoreHost(FTP):
    def makepasv(self):
        # Ignore the host returned by PASV or EPSV commands (only use the port).
        return self.host, FTP.makepasv(self)[1]

register_proxy('ftp.example.com', 'proxy.example.com', 3128, 'proxy_username', 'proxy_password')
# Use the subclass so that data connections also go to the whitelisted hostname.
ftp_connection = FTPIgnoreHost('ftp.example.com', 'ftp_username', 'ftp_password')

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # If you don't care about security.
ssh.connect('ftp.example.com', username='sftp_username', password='sftp_password')
sftp_connection = ssh.open_sftp()
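A likely reason the PASV host has to be ignored: the server advertises an IP address, which would not match the hostname whitelist, so the data connection would bypass the proxy. The sketch below illustrates this with ftplib.parse227 on a made-up reply; data_endpoint is a hypothetical helper, not part of ftplib.

```python
from ftplib import parse227


def data_endpoint(control_host, pasv_reply):
    """Return (host, port) for the data connection, ignoring the advertised host."""
    _advertised_host, port = parse227(pasv_reply)
    return control_host, port


# Hypothetical reply from a server: host 10.0.0.5, port 195 * 256 + 80.
reply = '227 Entering Passive Mode (10,0,0,5,195,80)'
```

Here parse227(reply) yields the advertised address, while data_endpoint('ftp.example.com', reply) keeps the hostname used for the control connection and takes only the port.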
Here is a workaround using requests, tested with a squid proxy that does NOT support CONNECT tunneling:
import requests
from unittest.mock import patch  # on Python 2: from mock import patch

def ftp_fetch_file_through_http_proxy(host, user, password, remote_filepath, http_proxy, output_filepath):
    """
    This function lets us make an FTP RETR query through an HTTP proxy that does NOT support CONNECT tunneling.
    It is equivalent to: curl -x $HTTP_PROXY --user $USER:$PASSWORD ftp://$FTP_HOST/path/to/file
    It returns the 'Last-Modified' HTTP header value from the response.
    More precisely, this function sends the following HTTP request to $HTTP_PROXY:
        GET ftp://$USER:$PASSWORD@$FTP_HOST/path/to/file HTTP/1.1
    Note that in doing so, the host in the request line does NOT match the host we send this packet to.
    The Python `requests` lib does not let us easily "cheat" like this.
    In order to achieve what we want, we need:
    - to mock urllib3.poolmanager.parse_url so that it returns a (host, port) pair indicating to send the request to the proxy
    - to register a connection adapter for the 'ftp://' prefix. This is basically an HTTP adapter, but it uses the FULL URL of
      the resource to build the request line, instead of only its relative path.
    """
    url = 'ftp://{}:{}@{}/{}'.format(user, password, host, remote_filepath)
    proxy_host, proxy_port = http_proxy.split(':')

    def parse_url_mock(url):
        # Keep the original URL but point the connection at the proxy.
        return requests.packages.urllib3.util.url.parse_url(url)._replace(host=proxy_host, port=proxy_port, scheme='http')

    with open(output_filepath, 'w+b') as output_file, patch('requests.packages.urllib3.poolmanager.parse_url', new=parse_url_mock):
        session = requests.session()
        session.mount('ftp://', FTPWrappedInFTPAdapter())
        response = session.get(url)
        response.raise_for_status()
        output_file.write(response.content)
        return response.headers['last-modified']

class FTPWrappedInFTPAdapter(requests.adapters.HTTPAdapter):
    def request_url(self, request, _):
        # Send the full ftp:// URL in the request line, as a proxy expects.
        return request.url
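One caveat with embedding credentials in the URL this way: a user name or password containing characters such as '@', ':' or '/' would produce an ambiguous URL. A possible refinement, sketched here with a hypothetical helper build_ftp_url, is to percent-encode those parts with urllib.parse.quote. Whether a given proxy decodes them correctly is an assumption that would need testing.

```python
from urllib.parse import quote


def build_ftp_url(user, password, host, remote_filepath):
    """Build ftp://user:password@host/path with reserved characters escaped."""
    return 'ftp://{}:{}@{}/{}'.format(
        quote(user, safe=''),      # escape everything, including '@', ':' and '/'
        quote(password, safe=''),
        host,
        quote(remote_filepath),    # keeps '/' as the path separator
    )
```

For credentials made only of letters and digits, the result is identical to the plain format() call used in the function above.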