I have the following questions regarding BLOBs in sqlite:
- Does sqlite keep track of sizes of BLOBs?
- I'm guessing that it does, but then, does the length function use it, or does it read the BLOB's content?
- If sqlite keeps track of the size of the BLOB and length doesn't use it, is the size accessible via some other functionality?
I'm asking this because I'm wondering if I should implement triggers that set BLOBs' sizes in additional columns, of if I can obtain the sizes dynamically without the performance hit of sqlite reading the BLOBs.
From the source:
** In an SQLite index record, the serial type is stored directly before
** the blob of data that it corresponds to. In a table record, all serial
** types are stored at the start of the record, and the blobs of data at
** the end. Hence these functions allow the caller to handle the
** serial-type and data blob seperately.
**
** The following table describes the various storage classes for data:
**
** serial type bytes of data type
** -------------- --------------- ---------------
** 0 0 NULL
** 1 1 signed integer
** 2 2 signed integer
** 3 3 signed integer
** 4 4 signed integer
** 5 6 signed integer
** 6 8 signed integer
** 7 8 IEEE float
** 8 0 Integer constant 0
** 9 0 Integer constant 1
** 10,11 reserved for expansion
** N>=12 and even (N-12)/2 BLOB
** N>=13 and odd (N-13)/2 text
In other words, the blob size is in the serial, and it's length is simply "(serial_type-12)/2".
This serial is stored before the actual blob, so you don't need to read the blob to get its size.
Call sqlite3_blob_open and then sqlite3_blob_bytes to get this value.
If you have access to the raw c api sqlite3_blob_bytes will do the job for you. If not please provide additional information.
Write a 1byte and a 10GB blob in a test database. If length()
takes the same time for both blobs, the blob's length is probably accessed. Otherwise the blob is probably read.
OR: download the source code and debug through it: http://www.sqlite.org/download.html. These are some relevant bits:
/*
** Implementation of the length() function
*/
static void lengthFunc(
sqlite3_context *context,
int argc,
sqlite3_value **argv
){
int len;
assert( argc==1 );
UNUSED_PARAMETER(argc);
switch( sqlite3_value_type(argv[0]) ){
case SQLITE_BLOB:
case SQLITE_INTEGER:
case SQLITE_FLOAT: {
sqlite3_result_int(context, sqlite3_value_bytes(argv[0]));
break;
}
case SQLITE_TEXT: {
const unsigned char *z = sqlite3_value_text(argv[0]);
if( z==0 ) return;
len = 0;
while( *z ){
len++;
SQLITE_SKIP_UTF8(z);
}
sqlite3_result_int(context, len);
break;
}
default: {
sqlite3_result_null(context);
break;
}
}
}
and then
/*
** Return the number of bytes in the sqlite3_value object assuming
** that it uses the encoding "enc"
*/
SQLITE_PRIVATE int sqlite3ValueBytes(sqlite3_value *pVal, u8 enc){
Mem *p = (Mem*)pVal;
if( (p->flags & MEM_Blob)!=0 || sqlite3ValueText(pVal, enc) ){
if( p->flags & MEM_Zero ){
return p->n + p->u.nZero;
}else{
return p->n;
}
}
return 0;
}
You can see that the length of text data is calculated on the fly. That of blobs... well, I'm not fluent enough in C... :-)