Updating CSV file (add/remove rows) with P

2019-09-17 01:26发布

G'day everyone,

I've got a Raspberry Pi system which keeps track of tools being checked out by various users. I've set it up such that a scan of the system is performed when a user checks in, as well as when they check out. By comparing the two scans, I can determine whether a tool has been taken/returned. However, I also have a Log.csv file which keeps track of which tools are currently checked out. I'm able to add to this log when a tool is checked out (no problems here), but I'm having trouble removing that row when the tool is returned.

I've searched SO for a solution to this, but haven't found anything concrete. From what I understand, you can't remove a single row from a CSV file? I would have to re-write the file, with that particular row being omitted?

Here's what I have so far, including both adding and remove rows on the Log.csv file:

with open('Log.csv', 'a+') as f:
    reader = csv.reader(f)
    if tools_taken not in reader:
        csv.writer(open('Log.csv', 'a+')).writerow([tools_taken])

with open('Log.csv', 'a+') as f:
    reader = csv.reader(f)
    if tools_returned in reader:
        ???

Bear in mind that the above code is simplified to keep it succinct. I'm thinking that the 'if tools_returned in reader' line is too vague. I might change it to:

for row in reader:
    for field in row:
        if field == tools_taken:
            ???

Am I on the right track? Any input here would be very much appreciated!

标签: python csv rows
2条回答
\"骚年 ilove
2楼-- · 2019-09-17 01:48

From what I understand, you can't remove a single row from a CSV file? I would have to re-write the file, with that particular row being omitted?

Exactly. In fact, that's true for files in general. In order to remove stuff from the middle of a file, you have to move the whole rest of the file upward, then truncate the cruft left behind. Which you generally don't want to do, so the csv module won't help you do it.


So, how do you create a new CSV file? Three ways:

  1. Open in read mode, read the whole file in, close, open in write mode, write the whole thing out, close.
  2. Rename to Log.csv.bak, open that in read mode, open Log.csv in write mode, and copy from one to the other.
  3. Open Log.csv in read mode, open a temporary file in write mode, copy from one to the other, then atomically rename the temporary file to Log.csv.

The third one is usually the best—but unfortunately, it's very hard to get it right in a cross-platform way, or even just for Windows. (It's very easy if you only care about Unix, however.) So, I'll show the second:

os.rename('Log.csv', 'Log.csv.bak')
with open('Log.csv.bak') as infile, open('Log.csv', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        if not supposed_to_be_removed(row):
            writer.writerow(row)

That's it.


This is similar to the way you'd write a copying algorithm in place of a mutating algorithm for a simple list:

newlist = [row for row in oldlist if not supposed_to_be_removed(row)]

Of course you could write that in terms of iterators instead of lists:

newlist = (row for row in oldlist if not supposed_to_be_removed(row))

And in fact, you can use the exact same iterator here:

os.rename('Log.csv', 'Log.csv.bak')
with open('Log.csv.bak') as infile, open('Log.csv', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    newrows = (row for row in reader if not_supposed_to_be_removed(row))
    writer.writerows(newrows)

You could even turn it into a one-liner, if you wanted:

    writer.writerows(row for row in reader if not supposed_to_be_removed(row))

Finally, it might be worth considering whether a csv file is really the right answer here. If you're doing a whole bunch of operations, continually re-reading and re-writing the file over and over is going to be a pain—and be slow. Maybe you could keep it in memory and just read and write and startup and shutdown, but then you have to make sure you don't lose data on errors. See my other answer for another alternative.

查看更多
霸刀☆藐视天下
3楼-- · 2019-09-17 02:04

I don't think csv is the right structure here. You want to be able to look up a given tool, find out whether its tools_taken is True, or change its tools_taken, or remove a tool from the file, or add a tool to the file, right?

That's what a database is for, such as shelve:

import contextlib
import shelve

tools = shelve.open('Log.db', 'c', writeback=True)
with contextlib.closing(tools):
    # Add a tool
    tools['hammer'] = {'name': 'Hammer', 'owner': 'Joe', 'tools_taken': False}
    # Check if a tool is taken
    if tools['screwdriver']['tools_taken']:
        print('The screwdriver is taken!')
    # Change a tool's taken status:
    tools['screwdriver']['tools_taken'] = True
    # Remove a tool
    del tools['jackhammer']

In other words, you can just it just like a dict (in this case, full of dicts), but it's automatically persistent across runs.

查看更多
登录 后发表回答