Reading end of huge and dynamic file via SFTP from

Posted 2020-03-26 08:45

Question:

I am trying to find a way to read just the end of a huge, constantly changing log file (the last 20-30 lines or so) from a server via SFTP, and to save the point I read up to, so that if I need more lines I can read further up from that point.

Everything I've tried takes too long. I tried copying the file to my machine and then reading it from the end with ReversedLinesFileReader, because that class needs a File object while SFTP only gives you an InputStream, but downloading the file takes a long time.

I also tried counting the lines and reading from line n, but that also takes too long and throws an exception, because the file is sometimes modified in the meantime. Another approach was to connect via SSH and run tail -100, which gave the desired result, but only once: the next time I would also get the new log entries, whereas I need to go further up. Is there a fast way to get the end of the file, save that point, and later read more lines above it? Any ideas?

Answer 1:

You don't say what SFTP library you're using, but the most widely used Java SSH/SFTP library is JSch, so I'll assume you're using that.

The SFTP protocol has operations to perform random-access I/O on remote files. Unfortunately, the JSch SFTP client doesn't expose the full range of operations. However, it does have versions of the get operation (for getting a file from the remote server) which permit skipping over the first part of the remote file. You can use one of these operations to read, for example, only the last 10 KB of a file.

Several of the JSch get operations return an InputStream. You can read the contents of the remote file from the input stream. If you want to access the remote file line by line, you can wrap the stream in a Reader using InputStreamReader.
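As a minimal sketch of that last step, the helper below wraps any InputStream in an InputStreamReader and a BufferedReader and collects its lines. The class name StreamLines and the choice of UTF-8 are my assumptions for illustration; use whatever encoding your log is actually written in.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JSch.
final class StreamLines {

    /** Reads every line from the stream; assumes the content is UTF-8 text. */
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader and the underlying stream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which may differ between your machine and the server that writes the log.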

So, a process might do the following:

  1. Call stat() on the remote file to get its size.
  2. Figure out where in the file you want to start reading from. You could keep track of where you stopped reading last time, or you could guess based on the amount of data you're willing to download and the expected size in bytes of these last 20-30 lines.
  3. Call get() with that position as the number of bytes to skip, so that reading starts there.
  4. Process data read from the InputStream returned by the get() call.
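The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: RemoteTailReader, RangeOpener and Chunk are names I made up, the log is assumed to be UTF-8 with '\n' line endings, and the stream is opened through a small interface so that the offset logic does not depend on a live SFTP session. With JSch, I would expect the opener to call ChannelSftp.get(path, null, offset), and the initial end position to come from ChannelSftp.stat(path).getSize().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JSch.
final class RemoteTailReader {

    /** Opens the remote file starting at a byte offset (with JSch: sftp.get(path, null, offset)). */
    public interface RangeOpener {
        InputStream openFrom(long offset) throws IOException;
    }

    /** Complete lines from one read, plus the byte offset where the first of them starts. */
    static final class Chunk {
        final long startOffset;
        final List<String> lines;

        Chunk(long startOffset, List<String> lines) {
            this.startOffset = startOffset;
            this.lines = lines;
        }
    }

    /**
     * Reads at most 'budget' bytes that end at byte offset 'end' and returns the
     * complete lines found there. Save chunk.startOffset and pass it as 'end'
     * on the next call to continue further up the file.
     */
    static Chunk readBefore(RangeOpener opener, long end, int budget) throws IOException {
        long start = Math.max(0L, end - budget);
        byte[] buf = new byte[(int) (end - start)];
        int filled = 0;
        try (InputStream in = opener.openFrom(start)) {
            while (filled < buf.length) {
                int n = in.read(buf, filled, buf.length - filled);
                if (n < 0) {
                    break; // file is shorter than expected, e.g. it was rotated
                }
                filled += n;
            }
        }
        // Unless the window starts at byte 0, its first line is probably cut off:
        // skip past the first '\n' and leave that line for the next call.
        int from = 0;
        if (start > 0) {
            while (from < filled && buf[from] != '\n') {
                from++;
            }
            from = Math.min(filled, from + 1);
        }
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(buf, from, filled - from), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return new Chunk(start + from, lines);
    }
}
```

Because only bytes before the saved offset are read, lines appended to the log in the meantime do not shift your position, which they would if you counted lines from the end. If a call returns no lines while startOffset is still greater than zero, a single line was longer than the budget, so retry with a larger one.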


Answer 2:

The best option would be to use some kind of rotating log files, possibly with compression.

However, rsync is a unidirectional synchronisation tool that can transmit only the changed parts of a file: for a log, that is the newly appended end.

I am not sure whether it would perform well enough in your case, and SSH is a prerequisite.



Tags: java sftp