edit: see the bottom for my eventual solution
I have a directory of ~12,700 text files.
They have names like this:
1 - Re/ Report Novenator public call for bury - by Lizbett on Thu, 10 Sep 2009.txt
where the leading number increments with each file (e.g. the last file in the directory begins with "12,700 - ").
Unfortunately, the files are not sorted by time, and I need them to be. Luckily, I have a separate CSV file that maps each old ID number to its time-sorted one. For example, the 1 in the example above should really be 25 (since there are 24 messages before it), 2 should really be 8, 3 should be 1, and so forth. The CSV looks like this:
OLD_FILEID TIMESORT_FILEID
21 0
23 1
24 2
25 3
I don't need to change anything in the file name except for this single leading number, which I need to swap with its associated value. In my head, the way this would work is to read a file name, check the digits that appear before the first dash, look them up in the CSV, replace them with the associated value, rename the file accordingly, and then go on to the next file.
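To make that concrete, here is roughly what I picture for a single file name. This is only a sketch: the function name `swap_leading_id` and the tiny `id_map` dict are made up for illustration, and it assumes the ID is always followed by ' - ' and that IDs are kept as strings.

```python
def swap_leading_id(filename, id_map):
    """Return filename with its leading ID replaced via id_map.

    id_map is a hypothetical dict of old ID -> time-sorted ID (both strings).
    """
    # Split only on the first ' - ' so dashes inside the title are untouched.
    old_id, sep, rest = filename.partition(' - ')
    if not sep or old_id not in id_map:
        # No leading ID, or no mapping for it: leave the name unchanged.
        return filename
    return id_map[old_id] + sep + rest


id_map = {'21': '0', '23': '1'}  # made-up subset of the CSV mapping
name = '21 - Re/ Report Novenator public call for bury - by Lizbett on Thu, 10 Sep 2009.txt'
print(swap_leading_id(name, id_map))
```

Only the part before the first ' - ' is touched, so a number that happens to appear later in the title is left alone.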
What would be the best way to go about doing something like this? I'm a Python newbie but have played around enough to feel comfortable following most directions or suggestions. Thanks :)
Edit: following the instructions below as best I could, I wrote this, which doesn't work, and I'm not sure why:
import os
import csv
import sys
#open and store the csv file
with open('timesortmap.csv','rb') as csvfile:
    timeReader = csv.reader(csvfile, delimiter = ',', quotechar='"')
#get the list of files
for filename in os.listdir('DiggOutput-TIMESORT/'):
    oldID = filename.split(' - ')[0]
    newFilename = filename.replace(oldID, timeReader[oldID],1)
    os.rename(oldID, newFilename)
The error I get is:
TypeError: '_csv.reader' object is not subscriptable
I am not using DictReader here because when I use csv.reader and print the rows, they look like this:
['12740', '12738']
['12742', '12739']
['12738', '12740']
['12737', '12741']
['12739', '12742']
And when I use DictReader it looks like this:
{'FILEID-TS': '12738', 'FILEID-OLD': '12740'}
{'FILEID-TS': '12739', 'FILEID-OLD': '12742'}
{'FILEID-TS': '12740', 'FILEID-OLD': '12738'}
{'FILEID-TS': '12741', 'FILEID-OLD': '12737'}
{'FILEID-TS': '12742', 'FILEID-OLD': '12739'}
And I get this error in the terminal:
  File "TimeSorter.py", line 16, in <module>
    newFilename = filename.replace(oldID, timeReader[oldID],1)
AttributeError: DictReader instance has no attribute '__getitem__'
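As far as I can tell, both errors come from the same thing: csv.reader and csv.DictReader hand back rows one at a time and cannot be indexed by an ID, so the rows have to be collected into a dict first. Below is a minimal sketch of that step. It assumes Python 3, a comma-delimited file, and the header names FILEID-OLD and FILEID-TS shown above. The function name `load_id_map` and the in-memory sample standing in for timesortmap.csv are my own placeholders.

```python
import csv
import io


def load_id_map(csv_file):
    """Read rows from an open CSV file into a dict of old ID -> new ID."""
    reader = csv.DictReader(csv_file)
    # The reader yields one dict per row; collect them into one lookup table.
    return {row['FILEID-OLD']: row['FILEID-TS'] for row in reader}


# Stand-in for timesortmap.csv, using a few of the rows printed above.
sample = io.StringIO(
    'FILEID-OLD,FILEID-TS\n'
    '12740,12738\n'
    '12742,12739\n'
    '12738,12740\n'
)
id_map = load_id_map(sample)
print(id_map['12740'])
```

The keys are strings, which matches what filename.split(' - ')[0] returns, so no int conversion is needed for the lookup.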