So let's say I'm using Python's ftplib to retrieve a list of log files from an FTP server. How would I parse that list of files to get just the file names (the last column) inside a list? See the link above for example output.
Using retrlines() probably isn't the best idea there, since it just prints to the console and so you'd have to do tricky things to even get at that output. A likely better bet would be to use the nlst() method, which returns exactly what you want: a list of the file names.
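A minimal sketch of the nlst() approach. The helper name list_filenames is only illustrative, and ftp is assumed to be an already connected and logged-in ftplib.FTP object:

```python
def list_filenames(ftp):
    """Return the names the server reports for the current directory."""
    # nlst() sends the NLST command and returns the reply as a list of strings.
    return ftp.nlst()
```

What exactly comes back (plain files only, or directories as well) depends on the server, so it is worth checking the result against the real listing.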
Since every filename in the output starts at the same column, all you have to do is get the position of the dot on the first line, then slice the filename out of the other lines using the position of that dot as the starting index. Since the dot is the last character on the line, you can use the length of the line minus 1 as the index.
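The steps above can be sketched roughly as follows. This assumes, as the answer does, that the first line of the listing is the "." entry and that all names start at the same column. The function name and the sample listing are made up for illustration; a real listing would be collected with something like ftp.retrlines("LIST", lines.append).

```python
def names_from_listing(lines):
    """Slice file names out of a column-aligned LIST-style listing."""
    # The first line is assumed to be the "." entry, so its last character
    # marks the column where every name starts.
    start = len(lines[0].rstrip()) - 1
    names = [line[start:].rstrip() for line in lines[1:]]
    # Drop the parent-directory entry, which is not a real file.
    return [name for name in names if name != ".."]


# Hypothetical sample listing (not taken from the question).
sample = [
    "drwxr-xr-x 2 ftp ftp   4096 Jan 01 10:00 .",
    "drwxr-xr-x 3 ftp ftp   4096 Jan 01 10:00 ..",
    "-rw-r--r-- 1 ftp ftp 123456 Jan 02 11:30 access.log",
    "-rw-r--r-- 1 ftp ftp   7890 Jan 03 09:15 error log.txt",
]
filenames = names_from_listing(sample)
```

Because the slice runs to the end of the line, names containing spaces survive intact, but the approach fails if the server does not list "." first or does not align its columns.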
And a slightly less-optimal method, by the way, if you're stuck using retrlines() for some reason, is to pass a function as the second argument to retrlines(); it'll be called for each item in the list. So something like this (assuming you have an FTP object named 'ftp') would work as well:
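A sketch of that callback approach. Using the "NLST" command here is an assumption (the original snippet is not shown), and the wrapper function collect_filenames is only there to keep the example self-contained:

```python
def collect_filenames(ftp):
    """Gather one name per response line through a retrlines() callback."""
    filenames = []
    # retrlines() calls the callback once per response line, with the
    # trailing CRLF already stripped; NLST asks the server for names only.
    ftp.retrlines("NLST", filenames.append)
    return filenames
```

With a connected FTP object named ftp, filenames = collect_filenames(ftp) would fill the list.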
The list 'filenames' will then be a list of the file names.
Is there any reason why ftplib.FTP.nlst() won't work for you? I just checked and it returns only names of the files in a given directory.
You may want to use ftp.nlst() instead of ftp.retrlines(). It will give you exactly what you want. If you can't, read the following:
Generators for sysadmin processes
In his now famous tutorial, Generator Tricks for Systems Programmers: An Introduction, David M. Beazley gives a lot of recipes for answering this kind of data problem with quick and reusable code.
E.g.:
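The original snippet is not reproduced here, so the following is only a sketch of what such a generator pipeline might look like. The function name last_column and the sample lines are hypothetical; with a live connection the lines could be gathered via ftp.retrlines("LIST", lines.append).

```python
def last_column(lines):
    """Lazily yield the last whitespace-separated column of each line."""
    columns = (line.split() for line in lines)      # split each line into fields
    non_empty = (cols for cols in columns if cols)  # skip blank lines
    return (cols[-1] for cols in non_empty)         # keep only the name field


# Hypothetical LIST-style lines.
lines = [
    "-rw-r--r-- 1 ftp ftp 123456 Jan 02 11:30 access.log",
    "-rw-r--r-- 1 ftp ftp   7890 Jan 03 09:15 error.log",
]
files_list = list(last_column(lines))
```

Nothing is computed until list() consumes the final generator, which is what makes it cheap to add further stages.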
Why don't we generate the list immediately?
Well, it's because doing it this way offers you a lot of flexibility: you can apply any intermediate generator to filter files before turning the result into files_list. It's just like a pipe: add a line and you add a process without overhead (since these are generators). And if you get rid of retrlines, it still works, and it's even better because you never store the list even once.
EDIT: well, I read the comment on the other answer and it says that this won't work if there is any space in the name.
Cool, this will illustrate why this method is handy. If you want to change something in the process, you just change a line: swap the step that extracts the file name for one that copes with spaces.
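The two snippets being swapped did not survive here, so this is only a guess at the kind of change meant. It assumes the common nine-column Unix "ls -l" layout, and both function names and the sample line are hypothetical:

```python
def name_by_last_field(lines):
    # Original step: takes the last field, so a name with a space is cut short.
    return (line.split()[-1] for line in lines)


def name_by_ninth_field(lines):
    # Replacement step: split at most 8 times so the ninth field keeps
    # any spaces it contains.
    return (line.split(None, 8)[-1] for line in lines)


# Hypothetical LIST-style line with a space in the file name.
line = "-rw-r--r-- 1 ftp ftp 7890 Jan 03 09:15 error log.txt"
broken = list(name_by_last_field([line]))
fixed = list(name_by_ninth_field([line]))
```

Only one stage of the pipeline changes; everything before and after it stays as it was.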
OK, this may not be obvious here, but for huge batch-processing scripts, it's nice :-)
If the FTP server supports the MLSD command, then please see section “single directory case” from that answer.
Use an instance (say ftpd) of the FTPDirectory class, call its .getdata method with a connected ftplib.FTP instance in the correct folder, then you can: