REMARK: I completely rewrote the question because, while exploring options and gathering insights, I realized the origin of my problem was not at all what I thought.
I use Firebird as a database engine and the standard .NET provider (v5.11.0) to fetch data using the following code:
// myBlob is BLOB SUB_TYPE 1 (a text field); some rows are empty, but some hold
// VERY long content (XML) exceeding VARCHAR(32765). I removed those rows
// before running my tests!
var tick = Stopwatch.StartNew();
DataTable dataTable = new DataTable();
DbLive.Open();
FbCommand command = new FbCommand("SELECT myBlob FROM MY_TABLE", DbLive);
try
{
    dataTable.BeginLoadData();
    FbDataAdapter fda = new FbDataAdapter(command);
    fda.Fill(dataTable);
    dataTable.EndLoadData();
}
finally
{
    // cleanup runs even if Fill throws
    command.Dispose();
    DbLive.Close();
}
tick.Stop();
Console.WriteLine("Execution time: " + tick.ElapsedMilliseconds + " [ms]");
I am fetching about 30k rows with, in total, about 16 MB of data (according to Database Workbench's statistics), but only 20k rows have non-null data.
So I first fetched the whole thing using the naive approach above, then using a CAST to VARCHAR(8192) (note that I removed all rows with more than 8192 characters before doing the tests). Here are the results:
// Obtained when loading data over Wi-Fi, with a bandwidth of about 100 Mbps.
// Performance on the local machine did not make a big difference!
No casting: 73287.0788 ms
With casting: 2360.2244 ms
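For clarity, the cast variant only changes the query string; the surrounding C# stays the same. A minimal sketch (column and table names follow the example above, and 8192 is the length I tested):

// cast the text blob to VARCHAR so the value comes back inline with the row
FbCommand command = new FbCommand(
    "SELECT CAST(myBlob AS VARCHAR(8192)) AS myBlob FROM MY_TABLE", DbLive);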
There really is something bad going on with BLOBs here. I tested performance using Firebird 3 with compression, but the results weren't much better (a very small improvement; the difference is still of the same order of magnitude).
To the point:
1. Can I expect an improvement if I use another provider (e.g. www.ibprovider.com, as suggested, but I was not able to test it)?
2. Is this a known issue with other database engines, or is there hope for an improvement by changing to another engine?
I encountered a similar performance issue when using Firebird 3 embedded and FbDataAdapter to read BLOB text fields. The query itself didn't seem to be the problem: it executed quickly in FlameRobin. Then I noticed that Visual Studio was reporting a long series of first-chance exceptions in the debug output, although these exceptions are not seen by the calling .NET application.
It turned out that much of the slowdown was apparently due to the Visual Studio IDE trying to handle the many exceptions that are raised. There is a "quiet" exception raised for every BLOB field that is read. This occurs when reading "segments" of the BLOB via isc_get_segment, which is called repeatedly until all segments have been read; isc_get_segment apparently signals the end of data behind the scenes by throwing an exception. When I run under the Visual Studio IDE and debugger, this slows things way down, but performance is reasonable if I run the application on its own.
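If you want to confirm this without the debugger, one way (assuming the provider surfaces these as managed exceptions at all) is to hook AppDomain.FirstChanceException, which fires for every exception before any handler sees it:

// Minimal sketch: log every first-chance exception in the AppDomain
// (needs: using System; using System.Runtime.ExceptionServices;)
// Run the blob query afterwards; a burst of exceptions per row/segment
// would match the behavior described above.
AppDomain.CurrentDomain.FirstChanceException +=
    (sender, e) => Console.WriteLine(
        "First-chance: " + e.Exception.GetType().Name + ": " + e.Exception.Message);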
I got a bit confused about many things above, so let me clarify some insights:

1. The FbDataReader and Dapper approaches did, in fact, NOT load the blob fields, so their performance was irrelevant (a sketch of a fair FbDataReader test follows this list).
2. Using the CAST to VARCHAR(4096) did show significant performance improvements (note that here I verified that the text-blob's content fully matched, which it did):

DataTable, no cast: 73287.0788 ms
FbDataReader, cast: 2224.1387 ms
DataTable, cast: 2360.2244 ms

3. I even tried Firebird 3 with compression, without significant improvement. So it is indeed the fact that I use BLOB instead of a large VARCHAR that causes the problem. Note that I did not succeed in testing another provider.
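As promised above, a minimal sketch of what I mean by a fair FbDataReader test: the blob content must actually be materialized per row, since merely iterating rows never fetches it (command is the one from the question):

using (FbDataReader reader = command.ExecuteReader())
{
    long totalChars = 0;
    while (reader.Read())
    {
        // GetString forces the blob content to be transferred;
        // iterating rows without reading the field does not
        if (!reader.IsDBNull(0))
            totalChars += reader.GetString(0).Length;
    }
    Console.WriteLine("Total characters read: " + totalChars);
}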
Bottom line: using BLOB SUB_TYPE 1 kills my performance here; I should use VARCHAR instead.
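For completeness, a hedged sketch of the migration I have in mind. Firebird does not allow changing a BLOB column's type directly, so the usual route is add-copy-drop-rename; the table/column names are from the example above, the 8192 length is an assumption from my tests, and running each statement as its own auto-committed command keeps the new column visible to the UPDATE:

using (FbConnection con = new FbConnection(connectionString)) // connectionString: your own
{
    con.Open();
    string[] steps =
    {
        // 1. add the replacement VARCHAR column
        "ALTER TABLE MY_TABLE ADD myText VARCHAR(8192)",
        // 2. copy the blob content over (rows longer than 8192 chars would fail here)
        "UPDATE MY_TABLE SET myText = CAST(myBlob AS VARCHAR(8192))",
        // 3. drop the old blob column
        "ALTER TABLE MY_TABLE DROP myBlob",
        // 4. rename the new column to the old name
        "ALTER TABLE MY_TABLE ALTER COLUMN myText TO myBlob"
    };
    foreach (string sql in steps)
    {
        using (FbCommand cmd = new FbCommand(sql, con))
            cmd.ExecuteNonQuery();
    }
}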