I am trying to use tabula-py to read a PDF. I installed it with pip install tabula-py.
I have also installed the required dependencies:
requests
pandas
pytest
flake8
My code is currently as follows:
import tabula
import pandas as pd
df = tabula.read_pdf("report.pdf", pages=2)
print(df)
I am getting the following error:
Traceback (most recent call last):
  File "tabula_pdf_reader.py", line 1, in <module>
    import tabula
ImportError: No module named tabula
Any input on what I am missing here?
I faced this same issue in Ubuntu.
First, check the versions of the JDK and JRE installed on your machine by running java -version and javac -version (the --version spelling is only accepted by Java 9 and later). Each should report a version greater than 7.
Then use pip3 to install tabula-py. That is the package name; the module it provides is imported as tabula.
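Since tabula-py runs tabula-java under the hood, the Java check can also be done from Python. This is only a rough sketch for Python 3.5 or later: the helper name java_version_text is made up for illustration, and it assumes the runtime is reachable as java on your PATH.

```python
import shutil
import subprocess


def java_version_text(executable="java"):
    """Return the text printed by `<executable> -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # `java -version` writes its report to stderr, so merge stderr into stdout.
    result = subprocess.run(
        [path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    text = java_version_text()
    print(text if text is not None else "java was not found on PATH")
```

If this prints that java was not found, install a JRE/JDK first. If it prints a version, the remaining suspect is the Python side.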
I had the same issue when running from the terminal.
However, once I started 'ipython3' instead of 'ipython', it worked perfectly.
You have to make sure the tabula-py module is installed for Python 3, not Python 2.
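A quick way to see which interpreter is actually running your script, and whether that interpreter can see the module, is something like the sketch below. The helper name can_import is made up for illustration; the script sticks to the standard library and is written so that it should run under both Python 2 and Python 3.

```python
import sys


def can_import(name):
    """Return True if the running interpreter can import the named module."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # The interpreter path shows which Python installation is running the script.
    print("interpreter: %s" % sys.executable)
    print("version: %s" % sys.version.split()[0])
    print("tabula importable: %s" % can_import("tabula"))
```

Running it once with python and once with python3 should show which of the two has tabula-py installed.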
Use Camelot instead:
import camelot
tables = camelot.read_pdf('foo.pdf')
tables.export('foo.csv', f='csv', compress=True)