I have about 1,500 PDFs, each a single page with the same structure (see http://files.newsnetz.ch/extern/interactive/downloads/BAG_15m_kzh_2012_de.pdf for an example).
What I am looking for is a way to iterate over all these files (locally, if possible) and extract the contents of the table in each one (as CSV, into a SQLite DB, or any similar format).
I would love to do this in Node.js, but I couldn't find any suitable libraries for parsing tables out of PDFs. Do you know of any?
If it isn't feasible in Node.js, I could also write it in Python, if better tools are available there.
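
To make the goal concrete, here is a rough Python sketch of the kind of pipeline I have in mind. It assumes that poppler's `pdftotext` command-line tool is installed and that, in its `-layout` output, table columns end up separated by at least two spaces; I have not verified that this holds for these PDFs. The function names (`pdf_to_text`, `split_columns`, `convert_folder`) are made up for illustration.

```python
import csv
import re
import subprocess
from pathlib import Path


def pdf_to_text(pdf_path):
    """Run poppler's pdftotext in layout mode and return the page text."""
    result = subprocess.run(
        ["pdftotext", "-layout", str(pdf_path), "-"],  # "-" writes to stdout
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout


def split_columns(text):
    """Split each non-empty line into cells on runs of two or more spaces.

    Assumption: cells never contain two consecutive spaces themselves.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}", line))
    return rows


def convert_folder(pdf_dir, csv_path):
    """Write the rows of every PDF in pdf_dir into one CSV, tagged by file name."""
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
            for row in split_columns(pdf_to_text(pdf_path)):
                writer.writerow([pdf_path.name, *row])
```

Calling `convert_folder("pdfs", "tables.csv")` would then produce one CSV with the source file name in the first column. Splitting on whitespace is fragile (empty cells and multi-line cells would break it), which is why I am asking whether a proper table-extraction library exists.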