What's the best way to represent System.Double as a sortable string?

Posted 2019-02-18 18:17

Question:

In data formats where all underlying types are strings, numeric types must be converted to a standardized string format which can be compared alphabetically. For example, a short for the value 27 could be represented as 00027 if there are no negatives.
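As a minimal sketch of the padding idea, assuming non-negative values only (the class and method names here are hypothetical, not part of any library), a short can be padded to five digits so that ordinal string comparison agrees with numeric comparison:

```csharp
using System;
using System.Globalization;

static class PaddedShort
{
    // short.MaxValue is 32767, so five digits cover every non-negative short.
    public static string Encode(short value)
    {
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value), "Sketch assumes non-negative input.");
        return value.ToString("D5", CultureInfo.InvariantCulture);
    }

    // Leading zeros are ordinary digits, so a plain digits-only parse reverses Encode.
    public static short Decode(string text)
    {
        return short.Parse(text, NumberStyles.None, CultureInfo.InvariantCulture);
    }
}
```

PaddedShort.Encode(27) returns "00027". Compare the keys with an ordinal comparison such as string.CompareOrdinal, since culture-sensitive comparisons are not guaranteed to follow character-code order.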

What's the best way to represent a double as a string? In my case I can ignore negatives, but I'd be curious how you'd represent the double in either case.

UPDATE

Based on Jon Skeet's suggestion, I'm now using this, though I'm not 100% sure it'll work correctly:

static readonly string UlongFormatString = new string('0', ulong.MaxValue.ToString().Length);

public static string ToSortableString(this double n)
{
    return BitConverter.ToUInt64(BitConverter.GetBytes(BitConverter.DoubleToInt64Bits(n)), 0).ToString(UlongFormatString);
}

public static double DoubleFromSortableString(this string n)
{
    return BitConverter.Int64BitsToDouble(BitConverter.ToInt64(BitConverter.GetBytes(ulong.Parse(n)), 0));
}

UPDATE 2

I have confirmed what Jon suspected - negatives don't work using this method. Here is some sample code:

void Main()
{
    var a = double.MaxValue;
    var b = double.MaxValue/2;
    var c = 0d;
    var d = double.MinValue/2;
    var e = double.MinValue;
    Console.WriteLine(a.ToSortableString());
    Console.WriteLine(b.ToSortableString());
    Console.WriteLine(c.ToSortableString());
    Console.WriteLine(d.ToSortableString());
    Console.WriteLine(e.ToSortableString());
}

static class Test
{
    static readonly string UlongFormatString = new string('0', ulong.MaxValue.ToString().Length);
    public static string ToSortableString(this double n)
    {
        return BitConverter.ToUInt64(BitConverter.GetBytes(BitConverter.DoubleToInt64Bits(n)), 0).ToString(UlongFormatString);
    }
}

Which produces the following output:

09218868437227405311
09214364837600034815
00000000000000000000
18437736874454810623
18442240474082181119

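Clearly not sorted as expected: the two negative values come out larger than the positive ones, and in reverse order relative to each other.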

UPDATE 3

The accepted answer below is the correct one. Thanks guys!

Answer 1:

Padding is potentially rather awkward for doubles, given the enormous range (double.MaxValue is 1.7976931348623157E+308).

Does the string representation still have to be human-readable, or just reversible?

If it only needs to be reversible, converting the double's 64-bit pattern to a fixed-width integer string gives a reversible conversion and a reasonably short representation that preserves lexicographic ordering, but it wouldn't be at all obvious what the double value was just from the string.

EDIT: Don't use BitConverter.DoubleToInt64Bits alone. That reverses the ordering for negative values.
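To see the reversal concretely, here is a small sketch that prints the raw bit pattern as fixed-width hex (RawBitsOrdering and RawHex are hypothetical names used only for illustration):

```csharp
using System;

static class RawBitsOrdering
{
    // Formats the raw IEEE 754 bit pattern of d as 16 hex digits, with no reordering applied.
    public static string RawHex(double d)
    {
        return BitConverter.DoubleToInt64Bits(d).ToString("X16");
    }
}
```

RawHex(1.0) is 3FF0000000000000, RawHex(-1.0) is BFF0000000000000, and RawHex(-2.0) is C000000000000000. Compared as strings, -1.0 lands after 1.0 and before -2.0, which is wrong on both counts: the sign bit pushes negatives above positives, and a larger magnitude gives a larger pattern.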

I'm sure you can perform this conversion using DoubleToInt64Bits and then some bit-twiddling, but unfortunately I can't get it to work right now, and I have three kids who are desperate to go to the park...


In order to make everything sort correctly, negative numbers need to be stored in ones'-complement format instead of sign-magnitude (otherwise negatives and positives sort in opposite orders), and the sign bit needs to be flipped (so that negatives sort below positives). This code should do the trick:

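static ulong EncodeDouble(double d)
{
    long ieee = System.BitConverter.DoubleToInt64Bits(d);
    ulong widezero = 0;
    // Negative input: the mask is 0, so every bit of ieee ends up inverted.
    // Non-negative input: the mask is 0x7FFF..., so only the sign bit ends up flipped.
    return ((ieee < 0)? widezero: ((~widezero) >> 1)) ^ (ulong)~ieee;
}

static double DecodeDouble(ulong lex)
{
    ulong widezero = 0;
    // A clear top bit in lex means the original value was negative: invert every bit.
    // A set top bit means it was non-negative: clear the top bit again.
    long ieee = (long)(((0 <= (long)lex)? widezero: ((~widezero) >> 1)) ^ ~lex);
    return System.BitConverter.Int64BitsToDouble(ieee);
}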

Demonstration here: http://ideone.com/JPNPY

Here's the complete solution, to and from strings:

static string EncodeDouble(double d)
{
    long ieee = System.BitConverter.DoubleToInt64Bits(d);
    ulong widezero = 0;
    ulong lex = ((ieee < 0)? widezero: ((~widezero) >> 1)) ^ (ulong)~ieee;
    return lex.ToString("X16");
}

static double DecodeDouble(string s)
{
    ulong lex = ulong.Parse(s, System.Globalization.NumberStyles.AllowHexSpecifier);
    ulong widezero = 0;
    long ieee = (long)(((0 <= (long)lex)? widezero: ((~widezero) >> 1)) ^ ~lex);
    return System.BitConverter.Int64BitsToDouble(ieee);
}

Demonstration: http://ideone.com/pFciY
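For readers who find the XOR form hard to follow, here is an equivalent restatement with an explicit sign-bit mask. This is a sketch, not the answerer's code: SortableDouble and SignBit are hypothetical names, and it assumes the default unchecked arithmetic context for the casts.

```csharp
using System;
using System.Globalization;

static class SortableDouble
{
    const ulong SignBit = 0x8000000000000000UL;

    public static string Encode(double d)
    {
        ulong bits = unchecked((ulong)BitConverter.DoubleToInt64Bits(d));
        // Negative: invert every bit. Non-negative: set only the sign bit.
        ulong lex = (bits & SignBit) != 0 ? ~bits : bits | SignBit;
        return lex.ToString("X16", CultureInfo.InvariantCulture);
    }

    public static double Decode(string s)
    {
        ulong lex = ulong.Parse(s, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
        // Top bit set: the value was non-negative, so clear that bit. Otherwise undo the inversion.
        ulong bits = (lex & SignBit) != 0 ? lex & ~SignBit : ~lex;
        return BitConverter.Int64BitsToDouble(unchecked((long)bits));
    }
}
```

SortableDouble.Encode(0.0) is 8000000000000000, and keys for smaller values compare lower under string.CompareOrdinal. Note that -0.0 and 0.0 receive different keys, because their bit patterns differ.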



Answer 2:

I believe that a modified scientific notation, with the exponent first, and using underscore for positive, would sort lexically in the same order as numerically.

If you want, you can even append the normal representation, since a suffix won't affect sorting.

Examples

E000M3    +3.0
E001M2.7  +27.0

Unfortunately, it doesn't work for either negative numbers or negative exponents. You could introduce a bias for the exponent, like the IEEE format uses internally.
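A rough sketch of the biased-exponent variant might look like the following. It covers positive finite values only; the bias of 400, the three-digit exponent width, and the BiasedSci name are assumptions chosen for illustration, and negative numbers remain unhandled.

```csharp
using System;
using System.Globalization;

static class BiasedSci
{
    // Assumed bias: decimal exponents of doubles run from -324 to +308,
    // so adding 400 keeps every biased exponent within three digits (076..708).
    const int Bias = 400;

    public static string Encode(double d)
    {
        if (!(d > 0) || double.IsInfinity(d))
            throw new ArgumentOutOfRangeException(nameof(d), "Sketch handles positive finite values only.");
        // "E16" yields text such as "2.7000000000000000E+001".
        string sci = d.ToString("E16", CultureInfo.InvariantCulture);
        int e = sci.IndexOf('E');
        int exponent = int.Parse(sci.Substring(e + 1), NumberStyles.AllowLeadingSign, CultureInfo.InvariantCulture);
        // Exponent first, then mantissa, so the exponent dominates the comparison.
        return "E" + (exponent + Bias).ToString("D3", CultureInfo.InvariantCulture) + "M" + sci.Substring(0, e);
    }
}
```

With this bias, BiasedSci.Encode(27.0) gives E401M2.7000000000000000, and a value such as 0.027 gets a biased exponent of 398, so it sorts below 3.0 as it should.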



Answer 3:

As it turns out, the org.apache.solr.util package contains the NumberUtils class. This class has static methods that do everything needed to convert doubles (and other data values) to sortable strings (and back). The methods could not be easier to use. A few notes:

  1. Of course, NumberUtils is written in Java (not C#). My guess is that the code could be converted to C#; however, I am not well versed in C#. The source is readily available online.
  2. The resulting strings are not printable (at all).
  3. The comments in the code indicate that all exotic cases, including negative numbers and infinities, should work correctly.
  4. I haven't done any benchmarks... However, based on a quick scan of the code, it should be very fast.

The code below shows what needs to be done to use this library.

String key = NumberUtils.double2sortableStr(35.2);