Reading from a frequently updated file

Posted 2019-01-02 20:47

I'm currently writing a program in Python on a Linux system. The objective is to read a log file and execute a bash command upon finding a particular string. The log file is constantly being written to by another program. My question is:

If I open the file using open(), will my Python file object see new data as the other program writes to the file, or will I have to reopen the file at timed intervals?

Thanks

Jim

UPDATE: Thanks for answers so far. I perhaps should have mentioned that the file is being written to by a Java EE app so I have no control over when data gets written to it. I've currently got a program that reopens the file every 10 seconds and tries to read from the byte position in the file that it last read up to. For the moment it just prints out the string that's returned. I was hoping that the file did not need to be reopened but the read command would somehow have access to the data written to the file by the Java app.

#!/usr/bin/python
import time

fileBytePos = 0  # position of the first byte not yet read
while True:
    # Reopen the log on every pass and resume where the last read stopped.
    with open('./server.log', 'r') as inFile:
        inFile.seek(fileBytePos)
        data = inFile.read()
        print(data)
        fileBytePos = inFile.tell()
        print(fileBytePos)
    time.sleep(10)

Thanks for the tips on pyinotify and generators. I'm going to have a look at these for a nicer solution.

6 answers
临风纵饮
#2 · 2019-01-02 20:50

Here is a slightly modified version of Jeff Bauer's answer that keeps working when the file is replaced: it reopens the file whenever its inode changes. Very useful if your file is being rotated by logrotate.

import os
import time

def follow(name):
    current = open(name, "r")
    curino = os.fstat(current.fileno()).st_ino
    while True:
        while True:
            line = current.readline()
            if not line:
                break
            yield line

        try:
            if os.stat(name).st_ino != curino:
                new = open(name, "r")
                current.close()
                current = new
                curino = os.fstat(current.fileno()).st_ino
                continue
        except (IOError, OSError):  # os.stat() raises OSError, which IOError alone misses on Python 2
            pass
        time.sleep(1)


if __name__ == '__main__':
    fname = "test.log"
    for l in follow(fname):
        print("LINE: {}".format(l))
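One gap in the generator above: a file that is truncated in place (for example by logrotate's copytruncate option) keeps its inode, so the inode check never fires. Below is a rough sketch of one way to cover both cases, assuming Python 3. The names drain and poll_lines are made up for illustration, and the file is opened in binary mode so that tell() is a plain byte offset that can be compared with st_size.

```python
import os

def drain(handle):
    """Read every line currently available from a binary file handle."""
    lines = []
    while True:
        line = handle.readline()
        if not line:
            return lines
        lines.append(line.decode("utf-8", errors="replace"))

def poll_lines(handle, path):
    """Return (handle, new_lines) without blocking (hypothetical helper).

    Reopens `path` if it was replaced (different inode) and rewinds if it
    was truncated in place (size dropped below our read offset).
    """
    lines = drain(handle)
    try:
        st = os.stat(path)
        if st.st_ino != os.fstat(handle.fileno()).st_ino:
            new = open(path, "rb")   # file was replaced: switch handles
            handle.close()
            handle = new
            lines += drain(handle)
        elif st.st_size < handle.tell():
            handle.seek(0)           # file was truncated: start over
            lines += drain(handle)
    except OSError:
        pass                         # path briefly missing mid-rotation
    return handle, lines
```

Call it in a loop with a time.sleep() between polls, passing back the handle it returns each time. If the file is truncated and then grows past the old offset before the next poll, the truncation goes unnoticed, so keep the poll interval short.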
谁念西风独自凉
#3 · 2019-01-02 20:51

Since you're targeting a Linux system, you can use pyinotify to notify you when the file changes.

There's also this trick, which may work fine for you. It uses file.seek to do what tail -f does.

梦该遗忘
#4 · 2019-01-02 20:55

If you have the code reading the file running in a while loop:

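import time

f = open('/tmp/workfile', 'r')
while True:
    line = f.readline()
    if not line:
        time.sleep(0.1)  # at end of file: wait for the writer instead of spinning
        continue
    if "ONE" in line:
        print("Got it")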

and another program is writing to that same file (in append mode), then "Got it" is printed as soon as a line containing "ONE" is appended. You can take whatever action you want at that point. In short, you don't have to reopen the file at regular intervals.

>>> f = open('/tmp/workfile', 'a')
>>> f.write("One\n")
>>> f.close()
>>> f = open('/tmp/workfile', 'a')
>>> f.write("ONE\n")
>>> f.close()
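
To tie this back to the original goal (running a shell command when the string shows up), the "Got it" branch can call out to the shell instead of printing. A rough sketch, assuming Python 3.5+ for subprocess.run. The name run_on_match and its parameters are made up for illustration, and command is assumed to be a trusted string because it is handed to the shell.

```python
import subprocess

def run_on_match(lines, needle, command):
    """Run `command` through the shell for each line containing `needle`.

    `lines` can be any iterable of strings, e.g. a generator that follows
    a log file. Returns the exit code of every command that was run.
    """
    exit_codes = []
    for line in lines:
        if needle in line:
            result = subprocess.run(command, shell=True)
            exit_codes.append(result.returncode)
    return exit_codes
```

If lines is an endless generator, the function never returns, which is fine for a long-running watcher. In that case the exit codes are better logged inside the loop than collected in a list.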
琉璃瓶的回忆
#5 · 2019-01-02 21:03

I am no expert here but I think you will have to use some kind of observer pattern to passively watch the file and then fire off an event that reopens the file when a change occurs. As for how to actually implement this, I have no idea.

I do not think that open() will open the file in realtime as you suggest.
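
For what it's worth, one simple way to get an observer-style callback with only the standard library is to poll os.stat() and fire when the result changes. This is a sketch under that assumption (Python 3, polling rather than kernel notifications). FileWatcher, check and the callback signature are invented names, not an existing API.

```python
import os

class FileWatcher:
    """Call `callback(path)` whenever the file's stat signature changes."""

    def __init__(self, path, callback):
        self.path = path
        self.callback = callback
        self._last = self._signature()

    def _signature(self):
        # Inode, size and mtime together change on append, truncate or replace.
        try:
            st = os.stat(self.path)
        except OSError:
            return None  # file does not exist right now
        return (st.st_ino, st.st_size, st.st_mtime_ns)

    def check(self):
        """Poll once; return True if a change was seen and the callback ran."""
        current = self._signature()
        if current == self._last:
            return False
        self._last = current
        self.callback(self.path)
        return True
```

Calling check() in a loop with a short sleep gives the "fire an event on change" behaviour. What the callback does (reopen, read on from the old offset, and so on) is up to the caller.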

余生无你
#6 · 2019-01-02 21:04

I would recommend looking at David Beazley's Generator Tricks for Python, especially Part 5: Processing Infinite Data. It will handle the Python equivalent of a tail -f logfile command in real-time.

# follow.py
#
# Follow a file like tail -f.

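import sys
import time

def follow(thefile):
    thefile.seek(0, 2)  # start at the end of the file, like tail -f
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)  # nothing new yet: sleep briefly, then retry
            continue
        yield line

if __name__ == '__main__':
    logfile = open("run/foo/access-log","r")
    loglines = follow(logfile)
    for line in loglines:
        sys.stdout.write(line)  # line already ends with a newline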
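
Because follow() is a generator, it can be chained with other generators to filter the stream before acting on it. A minimal sketch, assuming Python 3. The helper name grep and the example pattern are my own choices for illustration.

```python
import re

def grep(pattern, lines):
    """Yield only the lines that match the regular expression `pattern`."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line

if __name__ == '__main__':
    # Works on any iterable of strings; a plain list stands in for a log here.
    sample = ["GET /index.html 200\n", "GET /missing 404\n"]
    for line in grep(r" 404$", sample):
        print(line, end="")
```

With the follow() generator above, something like grep("ERROR", follow(logfile)) would block until a matching line is appended and yield each match as it arrives.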
荒废的爱情
#7 · 2019-01-02 21:15

"An interactive session is worth 1000 words"

>>> f1 = open("bla.txt", "wt")
>>> f2 = open("bla.txt", "rt")
>>> f1.write("bleh")
>>> f2.read()
''
>>> f1.flush()
>>> f2.read()
'bleh'
>>> f1.write("blargh")
>>> f1.flush()
>>> f2.read()
'blargh'

In other words - yes, a single "open" will do.
