watchman: I am missing file deletions happening be

Posted 2019-03-06 12:35

Question:

I am missing deletes in watchman. Version 4.9.0, inotify.

My test code:

#!/usr/bin/env python3

import pathlib
import pywatchman

w = pywatchman.client()

w.query('watch', '/tmp/z')
clock = w.query('clock', '/tmp/z')['clock']
print(clock)

q = w.query('subscribe', '/tmp/z', 'Buffy', {'expression':["since", clock],
"fields": ["name", "exists", "oclock", "ctime_ns", "new", "mode"]})
print(q)

f = pathlib.Path('/tmp/z/xx')

f.touch()
data = w.receive()
clock = data['clock']
print()
print('Touch file:')
print(data)
print('Clock:', clock)

f.unlink()
print()
print('Delete file:')
print(w.receive())
w.close()

w = pywatchman.client(timeout=99999)
q = w.query('subscribe', '/tmp/z', 'Buffy', {'expression':["since", clock],
"fields": ["name", "exists", "oclock", "ctime_ns", "new", "mode"]})
print(q)

print()
print('We request changes since', clock)
print(w.receive())
w.close()

What I am seeing:

  1. We create the file. We receive the notification of the new file and the directory change. GOOD. We take note of the "clock" of this notification.

  2. We delete the file. We get the notification of the file deletion. GOOD. We DO NOT get the notification of the directory change.

Now imagine that the process crashes BEFORE it can update its internal state, but it remembers the changes notified in step 1 (the directory update and the creation of a new file). That is, transaction 1 is processed, but the program crashes before transaction 2 is processed.

  1. We now open a new subscription to watchman (remember, we are simulating a crash) and request changes since step 1. I am simulating a recovery, where the program restarts, notices that transaction 1 was OK (the file is present) and requests further changes (it should get the deletion).

  2. I would expect to get a file deletion but I get... NOTHING. CATASTROPHIC.

Transcript:

$ ./watchman-bug.py 
c:1517109517:10868:3:23
{'clock': 'c:1517109517:10868:3:23', 'subscribe': 'Buffy', 'version': '4.9.0'}

Touch file:
{'unilateral': True, 'subscription': 'Buffy', 'root': '/tmp/z', 'files': [{'name': 'xx', 'exists': True, 'oclock': 'c:1517109517:10868:3:24', 'ctime_ns': 1517114230070245747, 'new': True, 'mode': 33188}], 'is_fresh_instance': False, 'version': '4.9.0', 'since': 'c:1517109517:10868:3:23', 'clock': 'c:1517109517:10868:3:24'}
Clock: c:1517109517:10868:3:24

Delete file:
{'unilateral': True, 'subscription': 'Buffy', 'root': '/tmp/z', 'files': [{'name': 'xx', 'exists': False, 'oclock': 'c:1517109517:10868:3:25', 'ctime_ns': 1517114230070245747, 'new': False, 'mode': 33188}], 'is_fresh_instance': False, 'version': '4.9.0', 'since': 'c:1517109517:10868:3:24', 'clock': 'c:1517109517:10868:3:25'}
{'clock': 'c:1517109517:10868:3:25', 'subscribe': 'Buffy', 'version': '4.9.0'}

We request changes since c:1517109517:10868:3:24

The process hangs expecting the deletion notification.

What am I doing wrong?

Thanks for your time and knowledge!

Answer 1:

The issue is that you're using a since expression term rather than telling watchman to use the since generator (the recency index).

What's the difference? You can think of this as the difference between the FROM and WHERE clauses in SQL. The expression field is similar in intent to the WHERE clause: it applies to the matched results and filters them down, but what you wanted to do is specify the FROM clause by setting the since field in the query spec. This is admittedly a subtle difference.
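
To make the distinction concrete, here is a minimal sketch that builds both query specs as plain Python dicts so they can be compared side by side. The helper names spec_with_since_expression and spec_with_since_generator are hypothetical and exist only for this illustration; the "expression", "since", and "fields" keys are the ones that appear in the question and in this answer.

```python
# Fields requested in the question's subscription.
FIELDS = ["name", "exists", "oclock", "ctime_ns", "new", "mode"]


def spec_with_since_expression(clock):
    """WHERE-style spec (hypothetical helper name).

    The since term lives inside "expression", so it only filters
    whatever results have already been matched.
    """
    return {"expression": ["since", clock], "fields": list(FIELDS)}


def spec_with_since_generator(clock):
    """FROM-style spec (hypothetical helper name).

    "since" is a top-level key of the query spec, which selects the
    since generator described in this answer.
    """
    return {"since": clock, "fields": list(FIELDS)}


if __name__ == "__main__":
    clock = "c:1517109517:10868:3:24"  # clock value taken from the transcript
    print(spec_with_since_expression(clock))
    print(spec_with_since_generator(clock))
```

The only structural difference is where the clock sits: nested inside "expression" in the first spec, and as a top-level "since" key in the second.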

The solution is to remove the expression term and add the generator term like this:

q = w.query('subscribe', '/tmp/z', 'Buffy', 
            {"since": clock,
             "fields": ["name", "exists", "oclock",
                        "ctime_ns", "new", "mode"]})
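
For the crash-recovery scenario in the question, one possible approach is to persist the last fully processed clock and feed it back into the "since" field on restart. The sketch below is an assumption about how such a program might be organised, not part of watchman or pywatchman: save_clock, load_clock, resume_spec, and the clock.json file name are all hypothetical. It covers only the bookkeeping and leaves the actual subscribe call out.

```python
import json
import os
import tempfile


def save_clock(path, clock):
    """Persist the last processed clock (hypothetical helper).

    Writes to a temporary file first and then renames it, so a crash
    mid-write does not leave a truncated state file behind.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump({"clock": clock}, fh)
    os.replace(tmp, path)


def load_clock(path):
    """Return the saved clock, or None if nothing was saved yet."""
    try:
        with open(path) as fh:
            return json.load(fh)["clock"]
    except FileNotFoundError:
        return None


def resume_spec(clock, fields):
    """Build a subscription spec that uses the since generator.

    When no clock was saved, the "since" key is simply omitted.
    """
    spec = {"fields": list(fields)}
    if clock is not None:
        spec["since"] = clock
    return spec


if __name__ == "__main__":
    state = os.path.join(tempfile.mkdtemp(), "clock.json")  # assumed location
    save_clock(state, "c:1517109517:10868:3:24")  # clock from the transcript
    print(resume_spec(load_clock(state), ["name", "exists"]))
```

The resulting dict would be passed as the last argument to w.query('subscribe', ...) in place of the literal spec, with save_clock called only after each notification has been fully processed.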

While we don't really have any documentation on the use of the pywatchman API, you can borrow the concepts from the slightly better documented nodejs API; here's a relevant snippet:

https://facebook.github.io/watchman/docs/nodejs.html#subscribing-only-to-changed-files



Tags: watchman