How do I read a CSV from Secure FTP Server into a

Posted 2019-08-26 04:14

Question:

I have a set of CSV files on a secure FTP server that I'm trying to read into (separate) Pandas DataFrames in memory so that I can manipulate them and then pass them elsewhere via an API. The FTP server requires authentication, which means I'm not able to use the otherwise very useful pd.read_csv() to read the csv straight from the server.

The following (Python 3.x) code will connect and then write the file out to disk:

from ftplib import FTP
import pandas as pd

server = "server.ip"
username = "user"
password = "psswd"

file1 = "file1.csv"  # Just one of the files; I'll eventually loop through...

ftp = FTP(server)
ftp.login(user=username, passwd=password)

# Use the file1 name defined above ("filename" was never defined)
with open(file1, "wb") as file:
    ftp.retrbinary("RETR " + file1, file.write)

# Do some other logic not relevant to the question

I'd like to avoid writing the file to disk and then reading it back in. I know that pd.read_csv() will read csv files straight from public addresses, but I can't see any examples of how to do so when the files are gated behind a login.

Answer 1:

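IIRC you can perform authenticated FTP requests using urllib (urllib.request in Python 3, urllib2 in Python 2). The HTTP-style "Authorization: Basic" header does not apply to FTP, so the credentials go into the URL instead. Perhaps something like

import io
from urllib.parse import quote
from urllib.request import urlopen
import pandas as pd

# Percent-encode the credentials in case they contain characters such as "@" or ":"
url = "ftp://%s:%s@%s/%s" % (quote(username, safe=""), quote(password, safe=""), server, file1)
with urlopen(url) as response:
    # read_csv expects a path or file-like object, so wrap the downloaded bytes
    data = pd.read_csv(io.BytesIO(response.read()))

Not tested, but you can find more information in the urllib.request documentation.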
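
Another option that stays with the ftplib connection from the question is to point retrbinary at an in-memory buffer instead of a file on disk, then hand that buffer to pd.read_csv(). The following is only a sketch under some assumptions: the helper names read_csv_from_ftp and load_frames are made up for illustration, and the server is assumed to speak plain FTP as in the question's code.

```python
from ftplib import FTP
from io import BytesIO

import pandas as pd


def read_csv_from_ftp(ftp, remote_name, **read_csv_kwargs):
    """Fetch remote_name over an already logged-in FTP connection.

    Hypothetical helper: the file contents are kept in memory only.
    """
    buffer = BytesIO()
    # retrbinary calls buffer.write once for every chunk it receives
    ftp.retrbinary("RETR " + remote_name, buffer.write)
    buffer.seek(0)  # rewind so read_csv starts at the first byte
    return pd.read_csv(buffer, **read_csv_kwargs)


def load_frames(server, username, password, filenames):
    """Hypothetical helper: return a dict mapping each filename to a DataFrame."""
    with FTP(server) as ftp:
        ftp.login(user=username, passwd=password)
        return {name: read_csv_from_ftp(ftp, name) for name in filenames}
```

With the variables from the question, load_frames(server, username, password, [file1]) would give one DataFrame per file without touching the disk. If "secure" here means FTP over TLS, ftplib.FTP_TLS (followed by a prot_p() call after login) may be the class to use instead of FTP; an SFTP server would need a different library altogether.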