Most efficient way to check for DBNull and then as

2019-01-02 19:15发布

站内文章 / 前端开发

35 0

裙下三千臣

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

This question comes up occasionally, but I haven't seen a satisfactory answer.

A typical pattern is (row is a DataRow):

 if (row["value"] != DBNull.Value)
 {
      someObject.Member = row["value"];
 }

My first question is which is more efficient (I've flipped the condition):

  row["value"] == DBNull.Value; // Or
  row["value"] is DBNull; // Or
  row["value"].GetType() == typeof(DBNull) // Or... any suggestions?

This indicates that .GetType() should be faster, but maybe the compiler knows a few tricks I don't?

Second question, is it worth caching the value of row["value"] or does the compiler optimize the indexer away anyway?

For example:

  object valueHolder;
  if (DBNull.Value == (valueHolder = row["value"])) {}

Notes:

row["value"] exists.
I don't know the column index of the column (hence the column name lookup).
I'm asking specifically about checking for DBNull and then assignment (not about premature optimization, etc.).

I benchmarked a few scenarios (time in seconds, 10,000,000 trials):

row["value"] == DBNull.Value: 00:00:01.5478995
row["value"] is DBNull: 00:00:01.6306578
row["value"].GetType() == typeof(DBNull): 00:00:02.0138757

Object.ReferenceEquals has the same performance as "=="

The most interesting result? If you mismatch the name of the column by case (for example, "Value" instead of "value", it takes roughly ten times longer (for a string):

row["Value"] == DBNull.Value: 00:00:12.2792374

The moral of the story seems to be that if you can't look up a column by its index, then ensure that the column name you feed to the indexer matches the DataColumn's name exactly.

Caching the value also appears to be nearly twice as fast:

No Caching: 00:00:03.0996622
With Caching: 00:00:01.5659920

So the most efficient method seems to be:

 object temp;
 string variable;
 if (DBNull.Value != (temp = row["value"]))
 {
      variable = temp.ToString();
 }

回答1:

I must be missing something. Isn't checking for DBNull exactly what the DataRow.IsNull method does?

I've been using the following two extension methods:

public static T? GetValue<T>(this DataRow row, string columnName) where T : struct
{
    if (row.IsNull(columnName))
        return null;

    return row[columnName] as T?;
}

public static string GetText(this DataRow row, string columnName)
{
    if (row.IsNull(columnName))
        return string.Empty;

    return row[columnName] as string ?? string.Empty;
}

Usage:

int? id = row.GetValue<int>("Id");
string name = row.GetText("Name");
double? price = row.GetValue<double>("Price");

If you didn't want Nullable<T> return values for GetValue<T>, you could easily return default(T) or some other option instead.

On an unrelated note, here's a VB.NET alternative to Stevo3000's suggestion:

oSomeObject.IntMember = If(TryConvert(Of Integer)(oRow("Value")), iDefault)
oSomeObject.StringMember = If(TryCast(oRow("Name"), String), sDefault)

Function TryConvert(Of T As Structure)(ByVal obj As Object) As T?
    If TypeOf obj Is T Then
        Return New T?(DirectCast(obj, T))
    Else
        Return Nothing
    End If
End Function

回答2:

You should use the method:

Convert.IsDBNull()

Considering it's built-in to the Framework, I would expect this to be the most efficient.

I'd suggest something along the lines of:

int? myValue = (Convert.IsDBNull(row["column"]) ? null : (int?) Convert.ToInt32(row["column"]));

And yes, the compiler should cache it for you.

回答3:

The compiler won't optimise away the indexer (i.e. if you use row["value"] twice), so yes it is slightly quicker to do:

object value = row["value"];

and then use value twice; using .GetType() risks issues if it is null...

DBNull.Value is actually a singleton, so to add a 4th option - you could perhaps use ReferenceEquals - but in reality, I think you're worrying too much here... I don't think the speed different between "is", "==" etc is going to be the cause of any performance problem you are seeing. Profile your entire code and focus on something that matters... it won't be this.

回答4:

I would use the following code in C# (VB.NET is not as simple).

The code assigns the value if it is not null/DBNull, otherwise it asigns the default which could be set to the LHS value allowing the compiler to ignore the assign.

oSomeObject.IntMemeber = oRow["Value"] as int? ?? iDefault;
oSomeObject.StringMember = oRow["Name"] as string ?? sDefault;

回答5:

I feel only very few approaches here doesn't risk the prospect OP the most worry (Marc Gravell, Stevo3000, Richard Szalay, Neil, Darren Koppand) and most are unnecessarily complex. Being fully aware this is useless micro-optimization, let me say you should basically employ these:

1) Don't read the value from DataReader/DataRow twice - so either cache it before null checks and casts/conversions or even better directly pass your record[X] object to a custom extension method with appropriate signature.

2) To obey the above, do not use built in IsDBNull function on your DataReader/DataRow since that calls the record[X] internally, so in effect you will be doing it twice.

3) Type comparison will be always slower than value comparison as a general rule. Just do record[X] == DBNull.Value better.

4) Direct casting will be faster than calling Convert class for converting, though I fear the latter will falter less.

5) Lastly, accessing record by index rather than column name will be faster again.

I feel going by the approaches of Szalay, Neil and Darren Koppand will be better. I particularly like Darren Koppand's extension method approach which takes in IDataRecord (though I would like to narrow it down further to IDataReader) and index/column name.

Take care to call it:

record.GetColumnValue<int?>("field");

and not

record.GetColumnValue<int>("field");

in case you need to differentiate between 0 and DBNull. For example, if you have null values in enum fields, otherwise default(MyEnum) risks first enum value being returned. So better call record.GetColumnValue<MyEnum?>("Field").

Since you're reading from a DataRow, I would create extension method for both DataRow and IDataReader by DRYing common code.

public static T Get<T>(this DataRow dr, int index, T defaultValue = default(T))
{
    return dr[index].Get<T>(defaultValue);
}

static T Get<T>(this object obj, T defaultValue) //Private method on object.. just to use internally.
{
    if (obj.IsNull())
        return defaultValue;

    return (T)obj;
}

public static bool IsNull<T>(this T obj) where T : class 
{
    return (object)obj == null || obj == DBNull.Value;
} 

public static T Get<T>(this IDataReader dr, int index, T defaultValue = default(T))
{
    return dr[index].Get<T>(defaultValue);
}

So now call it like:

record.Get<int>(1); //if DBNull should be treated as 0
record.Get<int?>(1); //if DBNull should be treated as null
record.Get<int>(1, -1); //if DBNull should be treated as a custom value, say -1

I believe this is how it should have been in the framework (instead of the record.GetInt32, record.GetString etc methods) in the first place - no run-time exceptions and gives us the flexibility to handle null values.

From my experience I had less luck with one generic method to read from the database. I always had to custom handle various types, so I had to write my own GetInt, GetEnum, GetGuid, etc. methods in the long run. What if you wanted to trim white spaces when reading string from db by default, or treat DBNull as empty string? Or if your decimal should be truncated of all trailing zeroes. I had most trouble with Guid type where different connector drivers behaved differently that too when underlying databases can store them as string or binary. I have an overload like this:

static T Get<T>(this object obj, T defaultValue, Func<object, T> converter)
{
    if (obj.IsNull())
        return defaultValue;

    return converter  == null ? (T)obj : converter(obj);
}

With Stevo3000's approach, I find the calling a bit ugly and tedious, and it will be harder to make a generic function out of it.

回答6:

There is the troublesome case where the object could be a string. The below extension method code handles all cases. Here's how you would use it:

    static void Main(string[] args)
    {
        object number = DBNull.Value;

        int newNumber = number.SafeDBNull<int>();

        Console.WriteLine(newNumber);
    }



    public static T SafeDBNull<T>(this object value, T defaultValue) 
    {
        if (value == null)
            return default(T);

        if (value is string)
            return (T) Convert.ChangeType(value, typeof(T));

        return (value == DBNull.Value) ? defaultValue : (T)value;
    } 

    public static T SafeDBNull<T>(this object value) 
    { 
        return value.SafeDBNull(default(T)); 
    }

回答7:

I personally favour this syntax, which uses the explicit IsDbNull method exposed by IDataRecord, and caches the column index to avoid a duplicate string lookup.

Expanded for readability, it goes something like:

int columnIndex = row.GetOrdinal("Foo");
string foo; // the variable we're assigning based on the column value.
if (row.IsDBNull(columnIndex)) {
  foo = String.Empty; // or whatever
} else { 
  foo = row.GetString(columnIndex);
}

Rewritten to fit on a single line for compactness in DAL code - note that in this example we're assigning int bar = -1 if row["Bar"] is null.

int i; // can be reused for every field.
string foo  = (row.IsDBNull(i  = row.GetOrdinal("Foo")) ? null : row.GetString(i));
int bar = (row.IsDbNull(i = row.GetOrdinal("Bar")) ? -1 : row.GetInt32(i));

The inline assignment can be confusing if you don't know it's there, but it keeps the entire operation on one line, which I think enhances readability when you're populating properties from multiple columns in one block of code.

回答8:

Not that I've done this, but you could get around the double indexer call and still keep your code clean by using a static / extension method.

Ie.

public static IsDBNull<T>(this object value, T default)
{
    return (value == DBNull.Value)
        ? default
        : (T)value;
}

public static IsDBNull<T>(this object value)
{
    return value.IsDBNull(default(T));
}

Then:

IDataRecord record; // Comes from somewhere

entity.StringProperty = record["StringProperty"].IsDBNull<string>(null);
entity.Int32Property = record["Int32Property"].IsDBNull<int>(50);

entity.NoDefaultString = record["NoDefaultString"].IsDBNull<string>();
entity.NoDefaultInt = record["NoDefaultInt"].IsDBNull<int>();

Also has the benefit of keeping the null checking logic in one place. Downside is, of course, that it's an extra method call.

Just a thought.

回答9:

I always use :

if (row["value"] != DBNull.Value)
  someObject.Member = row["value"];

Found it short and comprehensive.

回答10:

This is how I handle reading from DataRows

///<summary>
/// Handles operations for Enumerations
///</summary>
public static class DataRowUserExtensions
{
    /// <summary>
    /// Gets the specified data row.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="dataRow">The data row.</param>
    /// <param name="key">The key.</param>
    /// <returns></returns>
    public static T Get<T>(this DataRow dataRow, string key)
    {
        return (T) ChangeTypeTo<T>(dataRow[key]);
    }

    private static object ChangeTypeTo<T>(this object value)
    {
        Type underlyingType = typeof (T);
        if (underlyingType == null)
            throw new ArgumentNullException("value");

        if (underlyingType.IsGenericType && underlyingType.GetGenericTypeDefinition().Equals(typeof (Nullable<>)))
        {
            if (value == null)
                return null;
            var converter = new NullableConverter(underlyingType);
            underlyingType = converter.UnderlyingType;
        }

        // Try changing to Guid  
        if (underlyingType == typeof (Guid))
        {
            try
            {
                return new Guid(value.ToString());
            }
            catch

            {
                return null;
            }
        }
        return Convert.ChangeType(value, underlyingType);
    }
}

Usage example:

if (dbRow.Get<int>("Type") == 1)
{
    newNode = new TreeViewNode
                  {
                      ToolTip = dbRow.Get<string>("Name"),
                      Text = (dbRow.Get<string>("Name").Length > 25 ? dbRow.Get<string>("Name").Substring(0, 25) + "..." : dbRow.Get<string>("Name")),
                      ImageUrl = "file.gif",
                      ID = dbRow.Get<string>("ReportPath"),
                      Value = dbRow.Get<string>("ReportDescription").Replace("'", "\'"),
                      NavigateUrl = ("?ReportType=" + dbRow.Get<string>("ReportPath"))
                  };
}

Props to Monsters Got My .Net for ChageTypeTo code.

回答11:

I try to avoid this check as much as possible.

Obviously doesn't need to be done for columns that can't hold null.

If you're storing in a Nullable value type (int?, etc.), you can just convert using as int?.

If you don't need to differentiate between string.Empty and null, you can just call .ToString(), since DBNull will return string.Empty.

回答12:

I've done something similar with extension methods. Here's my code:

public static class DataExtensions
{
    /// <summary>
    /// Gets the value.
    /// </summary>
    /// <typeparam name="T">The type of the data stored in the record</typeparam>
    /// <param name="record">The record.</param>
    /// <param name="columnName">Name of the column.</param>
    /// <returns></returns>
    public static T GetColumnValue<T>(this IDataRecord record, string columnName)
    {
        return GetColumnValue<T>(record, columnName, default(T));
    }

    /// <summary>
    /// Gets the value.
    /// </summary>
    /// <typeparam name="T">The type of the data stored in the record</typeparam>
    /// <param name="record">The record.</param>
    /// <param name="columnName">Name of the column.</param>
    /// <param name="defaultValue">The value to return if the column contains a <value>DBNull.Value</value> value.</param>
    /// <returns></returns>
    public static T GetColumnValue<T>(this IDataRecord record, string columnName, T defaultValue)
    {
        object value = record[columnName];
        if (value == null || value == DBNull.Value)
        {
            return defaultValue;
        }
        else
        {
            return (T)value;
        }
    }
}

To use it, you would do something like

int number = record.GetColumnValue<int>("Number",0)

回答13:

if in a DataRow the row["fieldname"] isDbNull replace it with 0 otherwise get the decimal value:

decimal result = rw["fieldname"] as decimal? ?? 0;

回答14:

public static class DBH
{
    /// <summary>
    /// Return default(T) if supplied with DBNull.Value
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="value"></param>
    /// <returns></returns>
    public static T Get<T>(object value)
    {   
        return value == DBNull.Value ? default(T) : (T)value;
    }
}

use like this

DBH.Get<String>(itemRow["MyField"])

回答15:

I have IsDBNull in a program which reads a lot of data from a database. With IsDBNull it loads data in about 20 seconds. Without IsDBNull, about 1 second.

So I think it is better to use:

public String TryGetString(SqlDataReader sqlReader, int row)
{
    String res = "";
    try
    {
        res = sqlReader.GetString(row);
    }
    catch (Exception)
    { 
    }
    return res;
}

标签：

裙下三千臣

女 | 书童

私信

收藏的人(0)

Ta的文章更多文章

0条评论

还没有人评论过~