How to obtain the size of a BLOB without reading t

2019-04-08 14:35发布

问题:

I have the following questions regarding BLOBs in sqlite:

  • Does sqlite keep track of sizes of BLOBs?
  • I'm guessing that it does, but then, does the length function use it, or does it read the BLOB's content?
  • If sqlite keeps track of the size of the BLOB and length doesn't use it, is the size accessible via some other functionality?

I'm asking this because I'm wondering if I should implement triggers that set BLOBs' sizes in additional columns, of if I can obtain the sizes dynamically without the performance hit of sqlite reading the BLOBs.

回答1:

From the source:

** In an SQLite index record, the serial type is stored directly before
** the blob of data that it corresponds to. In a table record, all serial
** types are stored at the start of the record, and the blobs of data at
** the end. Hence these functions allow the caller to handle the
** serial-type and data blob seperately.
**
** The following table describes the various storage classes for data:
**
**   serial type        bytes of data      type
**   --------------     ---------------    ---------------
**      0                     0            NULL
**      1                     1            signed integer
**      2                     2            signed integer
**      3                     3            signed integer
**      4                     4            signed integer
**      5                     6            signed integer
**      6                     8            signed integer
**      7                     8            IEEE float
**      8                     0            Integer constant 0
**      9                     0            Integer constant 1
**     10,11                               reserved for expansion
**    N>=12 and even       (N-12)/2        BLOB
**    N>=13 and odd        (N-13)/2        text

In other words, the blob size is in the serial, and it's length is simply "(serial_type-12)/2".
This serial is stored before the actual blob, so you don't need to read the blob to get its size.
Call sqlite3_blob_open and then sqlite3_blob_bytes to get this value.



回答2:

If you have access to the raw c api sqlite3_blob_bytes will do the job for you. If not please provide additional information.



回答3:

Write a 1byte and a 10GB blob in a test database. If length() takes the same time for both blobs, the blob's length is probably accessed. Otherwise the blob is probably read.

OR: download the source code and debug through it: http://www.sqlite.org/download.html. These are some relevant bits:

/*
** Implementation of the length() function
*/
static void lengthFunc(
  sqlite3_context *context,
  int argc,
  sqlite3_value **argv
){
  int len;

  assert( argc==1 );
  UNUSED_PARAMETER(argc);
  switch( sqlite3_value_type(argv[0]) ){
    case SQLITE_BLOB:
    case SQLITE_INTEGER:
    case SQLITE_FLOAT: {
      sqlite3_result_int(context, sqlite3_value_bytes(argv[0]));
      break;
    }
    case SQLITE_TEXT: {
      const unsigned char *z = sqlite3_value_text(argv[0]);
      if( z==0 ) return;
      len = 0;
      while( *z ){
        len++;
        SQLITE_SKIP_UTF8(z);
      }
      sqlite3_result_int(context, len);
      break;
    }
    default: {
      sqlite3_result_null(context);
      break;
    }
  }
}

and then

/*
** Return the number of bytes in the sqlite3_value object assuming
** that it uses the encoding "enc"
*/
SQLITE_PRIVATE int sqlite3ValueBytes(sqlite3_value *pVal, u8 enc){
  Mem *p = (Mem*)pVal;
  if( (p->flags & MEM_Blob)!=0 || sqlite3ValueText(pVal, enc) ){
    if( p->flags & MEM_Zero ){
      return p->n + p->u.nZero;
    }else{
      return p->n;
    }
  }
  return 0;
}

You can see that the length of text data is calculated on the fly. That of blobs... well, I'm not fluent enough in C... :-)