When returning data from a DataReader
I would typically use the ordinal reference on the DataReader
to grab the relevant column:
if (dr.HasRows)
Console.WriteLine(dr[0].ToString());
or
if (dr.HasRows)
Console.WriteLine(dr.GetString(0));
or
if (dr.HasRows)
Console.WriteLine((string)dr[0]);
I have always done this because I was advised at an early stage that using dr["ColumnName"]
or a more elegant way of indexing causes a performance hit.
However, while all references to data entities are becoming increasingly strongly-typed I feel more uncomfortable with this. I'm also aware that the above does not check for DBNull
.
What is the most robust way to return data from a DataReader
?
The string name lookup is much more expensive than the ordinal call, but more maintainable and less "fragile" than hard coding the ordinals. So here's how I've been doing it. It's the best of both worlds. I don't have to remember the ordinal values or care if the column order changes, but I get the performance benefits of using ordinals.
Note: this only really makes sense for readers you expect to have a decent number of rows in. The
GetOrdinal()
calls are extra, and only pay for themselves if your combined savings from callingGetString(int ordinalNumber)
in the loop are greater than the cost of callingGetOrdinal
.Edit: missed the second part of this question. Regarding DBNull values, I've started writing extension methods that deal with that possibility. example:
dr.GetDatetimeSafely()
Within those extension methods, you can do whatever you need to feel confident you get the expected value back.The problem with ordinal is if the order of the columns change and you are forced to modify the consumer code for the DataReader, unlike using column names.
I dont think there is a performance gain when using ordinal or column names, it is more on best practice and coding standards and code maintainability really
I always go with the string name approach just because reading the code is cleaner. Having to mentally parse index to column name is horrible.
I do think that indexed fields is the better approach, if it would only be for avoiding field names changes from the underlying database, which would require a recompile of your application because you hardcoded the field name.
For each of the fields, you will need to manually check for nulls.
In addition to it, it would allow you to use reflection and make this "GetData()" method more generic providing a typeof(T) and getting the proper constructor for the proper type. The binding to the order of each columns is the only one thing some wish to avoid, but in this case, it becomes worthy.
It is possible to argue both sides in this situation. As already pointed out by others, using the name is more readable and will not break if someone changes the order of columns in the underlying database. But one might also argue the case that using an ordinal has the advantage of not breaking if someone changes the column name in the underlying database. I prefer the former argument, though, and think the readability argument for column names trumps the second argument in general. And an additional argument for names is that it is that it can “self-detect” errors. If someone does change a field name, then the code has a better chance of breaking rather than having the subtle bug of appearing to work while it reads the wrong field.
It seems obvious but maybe it is worth mentioning a usage case that has both the self-detecting error and the performance of ordinals. If you specify the SELECT list explicitly in the SQL, then using ordinals won’t be a problem because the statement in the code guarantees the order:
In this case, it would be fairly safe to use ordinals to access the data. It doesn’t matter if someone moves fields around in the table. And if someone changes a name, then the SQL statement produce an error when it runs.
And one final point. I just ran a test on a provider I helped write. The test read 1 million rows and accessed the “lastname” field on each record (compared against a value). The usage of
rdr[“lastname”]
took 3301 milliseconds to process whilerdr.GetString(1)
took 2640 milliseconds (approximately a 25% speedup). In this particular provider, the lookup of the name uses a sorted lookup to translate the name to ordinal.Indexing the data reader by name is slightly more expensive. There are two main reasons for this.
However, in most cases the performance penalty incurred by looking up a field by name is dwarfed by the amount of time it takes for the database to execute the command. Do not let the performance penalty dictate your choice between name and numeric indexing.
Despite the slight performance penalty I normally choose name indexing for two reasons.
If you feel like the performance penalty of name indexing is becoming problematic (maybe the command executes quickly, but returns a lot of rows) then lookup the numeric index once by name, save it away, and use it for the remaining rows.