I want to read all rows in a table using system.data.sqlite. As I have a very big Table (>450GB, with > 6 billion rows) I want to be sure that sqlite will use consequtive disk access. As you may know a random access to hard disk is slow. Due to memory restictions I can not load all the data at once. So the optimal way would be if sqlite reads some hundred MB (consecutive), then I work with this data and sqlite reads the next.
How can I be sure that sqlite will do disk access this way and not jumping from one position on hard disk to another?
Things I know (I think these suggestions will show up):
- It could be better to use an other DBMS. But I want/need to solve it wiht this one.
- I know that the disk head will be positioned by the OS on other data while I am workign with thhe data. This doesn't matter. It is just that some hundret MB will be read consecutively.
- I don't want to / can split the database file up into smaller pieces
I found this post, but it is not addressing my problem correctly:
Which is the fastest way to retrieve all items in SQLite?
In SQLite, table rows are stored sorted by the rowid, so the most efficient way to read the rows in this order is to sort by this column (or the alias you declared with INTEGER PRIMARY KEY):
For a WITHOUT ROWID table, you sort by the primary key column(s):
That's what clustered index are for. sqlite doesn't support them though.
The following is copied from: http://www.sqlite.org/cvstrac/wiki?p=PerformanceTuningWindows
Bottom line is you can re-order your records manually, but I imagine this would be practical for you only if you don't intend to write to the table to often.
From @CL. answer in this post: