I have a SQLite database (approx. 11 GB) that contains multiple tables, including the tables `distance` and `vertices`. The table `distance` is pretty large (120 million rows); `vertices` is smaller (15,000 rows). I want to use sqlite3 in Python to update one column of `distance` with values from a column of `vertices`. The table `vertices` has an index on column `cat` and another index on `orig_cat`.
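For reference, the relevant parts of the schema look roughly like this (only the table names, column names, and indexed columns come from the description above; the column types and index names are assumptions):

```sql
-- Assumed schema sketch; types and index names are guesses.
CREATE TABLE vertices (cat INTEGER, orig_cat INTEGER /* , ... */);
CREATE INDEX idx_vertices_cat ON vertices (cat);
CREATE INDEX idx_vertices_orig_cat ON vertices (orig_cat);
CREATE TABLE distance (source INTEGER, from_orig_v INTEGER /* , ... */);
```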
What I am doing:
```python
import sqlite3

db_path = 'path/to/db.db'
conn = sqlite3.connect(db_path)
cur = conn.cursor()

# Correlated subquery: for every row of distance, look up orig_cat
# in vertices via the indexed cat column.
cur.execute('''UPDATE distance SET
    from_orig_v = (SELECT orig_cat FROM vertices WHERE cat = distance.source)''')
conn.commit()
```
However, running that UPDATE statement on such a large database causes a memory error: memory usage increases steadily until the process crashes. I am looking for advice on how to perform such a large update without running out of memory. Maybe the update could be processed in chunks (i.e. batches of `distance` rows), committing after e.g. 1000 updates to free memory? How would that be done in Python/sqlite3?
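Something like the sketch below is what I have in mind. It assumes `distance` is an ordinary rowid table (so its implicit `rowid` can be used to slice the work into ranges); the chunk size of 10,000 is an arbitrary guess:

```python
import sqlite3

db_path = 'path/to/db.db'
conn = sqlite3.connect(db_path)
cur = conn.cursor()

chunk_size = 10000  # rows per transaction; arbitrary, would need tuning

# Assumes distance is a normal rowid table (the SQLite default).
max_rowid = cur.execute('SELECT max(rowid) FROM distance').fetchone()[0]

for start in range(1, max_rowid + 1, chunk_size):
    # Update only the rows whose rowid falls in the current range;
    # gaps in the rowid sequence are harmless, BETWEEN just matches fewer rows.
    cur.execute(
        '''UPDATE distance SET
               from_orig_v = (SELECT orig_cat FROM vertices
                              WHERE cat = distance.source)
           WHERE distance.rowid BETWEEN ? AND ?''',
        (start, start + chunk_size - 1))
    conn.commit()  # end the transaction so the journal stays small

conn.close()
```

Would that work, or is there a better/more idiomatic way to chunk an UPDATE with sqlite3?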