SQLite3: Disabling primary key index while inserting

Published 2019-01-14 03:35

Question:

I have an SQLite3 database with a table whose primary key consists of two integers, and I'm trying to insert a lot of data into it (around 1 GB or so).

The issue I'm having is that creating a primary key also implicitly creates an index, which in my case bogs inserts down to a crawl after a few commits (and that would be because the database file is on NFS... sigh).

So, I'd like to somehow temporarily disable that index. My best plan so far involved dropping the primary key's automatic index, but SQLite doesn't allow that and throws an error if I attempt it.

My second-best plan would involve the application making transparent copies of the database off the network drive, making the modifications locally, and then merging the result back. Note that, as opposed to most SQLite/NFS questions, I don't need concurrent access.
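A minimal Python sketch of that copy-locally-then-copy-back approach (the paths, table name, and schema here are hypothetical stand-ins, and a temporary directory simulates the NFS mount):

```python
import os
import shutil
import sqlite3
import tempfile

# Hypothetical paths: a temp directory stands in for the NFS mount.
work = tempfile.mkdtemp()
nfs_db = os.path.join(work, "nfs_copy.db")      # pretend this lives on NFS
local_db = os.path.join(work, "local_copy.db")  # fast local scratch space

# Create the (empty) database file on the "NFS" side once.
sqlite3.connect(nfs_db).close()

# 1. Copy the database off the network drive.
shutil.copy2(nfs_db, local_db)

# 2. Do all the heavy inserting against the local copy.
conn = sqlite3.connect(local_db)
conn.execute("CREATE TABLE IF NOT EXISTS t "
             "(a INTEGER, b INTEGER, v REAL, PRIMARY KEY (a, b))")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(1, 2, 0.5), (3, 4, 1.5)])
conn.commit()
conn.close()

# 3. Copy the finished file back over the NFS copy in one sequential write.
shutil.copy2(local_db, nfs_db)
```

The win is that all the random-access index updates happen on local disk; NFS only ever sees one big sequential file copy.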

What would be a correct way to do something like that?

UPDATE:

I forgot to specify the flags I'm already using:

PRAGMA synchronous = OFF
PRAGMA journal_mode = OFF
PRAGMA locking_mode = EXCLUSIVE
PRAGMA temp_store = MEMORY

UPDATE 2: I'm in fact inserting items in batches, but each successive batch is slower to commit than the previous one (I'm assuming this has to do with the size of the index). I tried batches of between 10k and 50k tuples, each tuple being two integers and a float.
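For illustration, batched inserts of that shape (two integers and a float, 10k rows per commit) might look like the following sketch; the table name and schema are assumptions:

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")  # point this at your actual database file
conn.execute("CREATE TABLE t "
             "(a INTEGER, b INTEGER, v REAL, PRIMARY KEY (a, b))")

BATCH = 10_000  # batch sizes between 10k and 50k, as in the question

# One batch of (int, int, float) tuples with distinct key pairs.
rows = [(i // 100, i % 100, random.random()) for i in range(BATCH)]

with conn:  # one transaction per batch; commits automatically on exit
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
```

The slowdown the asker sees is consistent with the two-column primary-key index growing: each new batch touches index pages scattered across an ever-larger B-tree.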

Answer 1:

  1. You can't remove the implicit primary-key index, since it's the only address of a row.
  2. Merge your two integer keys into a single 64-bit key: key = (key1 << 32) + key2, and declare it as the INTEGER PRIMARY KEY in your schema (that way you will have only one index).
  3. Set the page size of the new DB to at least 4096.
  4. Remove ANY additional indexes except the primary key.
  5. Insert the data in SORTED order, so that the primary key is always growing.
  6. Reuse prepared statements; don't create them from strings each time.
  7. Set the page cache size to as much memory as you have left (remember that the cache size is in number of pages, not number of bytes).
  8. Commit every 50,000 items.
  9. If you need additional indexes, create them only AFTER ALL the data is in the table.

If you're able to merge the keys (I think you're using 32-bit keys, while SQLite uses 64-bit, so it's possible) and insert the data in sorted order, I bet you will fill your first GB with the same performance as the second, and both will be fast enough.
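The key-merging idea from the list above can be sketched as follows; in SQLite a single INTEGER PRIMARY KEY aliases the rowid, so no separate index is maintained at all (table and schema names are hypothetical, and this assumes both keys fit in unsigned 32 bits):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A lone INTEGER PRIMARY KEY becomes the rowid itself: the table's
# B-tree is the only structure updated on insert, with no side index.
conn.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v REAL)")

def merged(key1: int, key2: int) -> int:
    # Pack two 32-bit keys into one 64-bit rowid.
    return (key1 << 32) + key2

# Pairs already sorted by (key1, key2), so merged keys are increasing
# and every insert appends to the right edge of the B-tree.
pairs = [(0, 5), (0, 9), (1, 2), (2, 7)]
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(merged(a, b), 0.0) for a, b in pairs])
conn.commit()

# Recovering the original pair from a merged key:
k = conn.execute("SELECT k FROM t ORDER BY k LIMIT 1").fetchone()[0]
key1, key2 = k >> 32, k & 0xFFFFFFFF
```

One caveat worth noting: SQLite rowids are signed 64-bit, so key1 should stay below 2^31 for this packing to be safe.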



Answer 2:

Are you doing the INSERT of each new row as an individual transaction?

If you use BEGIN TRANSACTION and INSERT rows in batches, then I think the index writes will only hit the disk at the end of each transaction.
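A sketch of that explicit-transaction pattern in Python (the table is hypothetical; isolation_level=None puts the driver in autocommit mode so we can issue BEGIN/COMMIT ourselves):

```python
import sqlite3

# isolation_level=None disables the driver's implicit transaction
# handling, so BEGIN/COMMIT below are fully under our control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (a INTEGER PRIMARY KEY, v REAL)")

conn.execute("BEGIN")  # one transaction for the whole batch
for i in range(1000):
    conn.execute("INSERT INTO t VALUES (?, ?)", (i, i * 0.5))
conn.execute("COMMIT")  # accumulated page writes are flushed once, here
```

Without the explicit BEGIN, each INSERT would be its own transaction and incur its own round of synchronous file writes.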



Answer 3:

See faster-bulk-inserts-in-sqlite3.