P/Invoke with arrays of double - marshalling data

Posted 2020-03-18 07:25

Question:

I've read the various MSDN pages on C++ Interop with P/Invoke here and here but I am still confused.

I have some large arrays of doubles that I need to get into native code, and some resulting arrays that need to get back. I do not know the sizes of the output arrays in advance. For simplicity, I will use only a single array in the example. The platform is x64; I read that marshalling internals are quite different between 32- and 64-bit environments so this might be important.

C#

    [DllImport("NativeLib.dll")]
    public static extern void ComputeSomething(double[] inputs, int inlen, 
        [Out] out IntPtr outputs, [Out] out int outlen);

    [DllImport("NativeLib.dll")]
    public static extern void FreeArray(IntPtr outputs);

    public void Compute(double[] inputs, out double[] outputs)
    {           
        IntPtr output_ptr;
        int outlen;
        ComputeSomething(inputs, inputs.Length, out output_ptr, out outlen);

        outputs = new double[outlen];
        Marshal.Copy(output_ptr, outputs, 0, outlen);

        FreeArray(output_ptr);
    }

C++

extern "C" 
{
    void ComputeSomething(double* inputs, int input_length, 
        double** outputs, int* output_length)
    {
    //...
    *output_length = ...;
    *outputs = new double[*output_length];  // dereference: output_length is an int*
    //...
    }

    void FreeArray(double* outputs)
    {
        delete[] outputs;
    }
}

It works, that is, I can read out the doubles I wrote into the array on the C++ side. However, I wonder:

  • Is this really the right way to use P/Invoke?
  • Aren't my signatures needlessly complicated?
  • Can P/Invoke be used more efficiently to solve this problem?
  • I believe I read that marshalling for single dimensional arrays of built-in types can be avoided. Is there a way around Marshal.Copy?

Note that we have a working C++/CLI version, but there are some problems related to local statics in third-party library code that lead to crashes. Microsoft marked this issue as WONTFIX, which is why I am looking for alternatives.

Answer 1:

If it were practical to separate the code that determines the output length from the code that populates the output then you could:

  • Export a function that returned the output length.
  • Call that from the C# code and then allocate the output buffer.
  • Call the unmanaged code again, this time asking it to populate the output buffer.
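
As a rough sketch of what the native side of those three steps might look like: the names ComputeOutputLength and ComputeInto, and the placeholder compute helper, are made up for illustration and are not part of the question's library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real computation: doubles each input value.
static std::vector<double> compute(const double* inputs, int input_length)
{
    std::vector<double> result;
    for (int i = 0; i < input_length; ++i)
        result.push_back(2.0 * inputs[i]);
    return result;
}

extern "C"
{
    // First call: report how many doubles the caller has to allocate.
    int ComputeOutputLength(const double* inputs, int input_length)
    {
        return static_cast<int>(compute(inputs, input_length).size());
    }

    // Second call: fill the caller-owned buffer.
    // Returns the number of doubles written, or -1 if the buffer is too small.
    int ComputeInto(const double* inputs, int input_length,
                    double* outputs, int output_capacity)
    {
        const std::vector<double> result = compute(inputs, input_length);
        if (static_cast<int>(result.size()) > output_capacity)
            return -1;
        for (std::size_t i = 0; i < result.size(); ++i)
            outputs[i] = result[i];
        return static_cast<int>(result.size());
    }
}
```

Note that this naive version runs the computation twice, which is exactly the kind of cost that can make the split impractical.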

But I'm assuming that you have rejected this option because it is impractical. In which case your code is a perfectly reasonable way to solve your problem. In fact I would say that you've done a very good job.

The code will work just the same in x86 once you fix the calling convention mismatch. On the C++ side the calling convention is cdecl, but on the C# side it is stdcall. That doesn't matter on x64 since there is only one calling convention. But it would be a problem under x86.

Some comments:

  • You don't need to use [Out] as well as out. The latter implies the former.
  • You can avoid exporting the deallocator by allocating off a shared heap. For instance CoTaskMemAlloc on the C++ side, and then deallocate with Marshal.FreeCoTaskMem on the C# side.
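
A minimal sketch of the shared-heap idea on the C++ side, keeping the question's ComputeSomething signature. The shared_alloc helper is hypothetical, the malloc branch exists only so the sketch compiles off Windows, and the "computation" just copies the inputs. On Windows you would also need to link against ole32.lib.

```cpp
#include <cstddef>
#ifdef _WIN32
#include <objbase.h>   // CoTaskMemAlloc
#else
#include <cstdlib>     // malloc, only as a stand-in off Windows
#endif

// Hypothetical helper: allocate from the heap the managed side can free.
static void* shared_alloc(std::size_t bytes)
{
#ifdef _WIN32
    return CoTaskMemAlloc(bytes);
#else
    return std::malloc(bytes);  // assumption: stand-in so the sketch builds elsewhere
#endif
}

extern "C" void ComputeSomething(const double* inputs, int input_length,
                                 double** outputs, int* output_length)
{
    // Start from a well-defined "nothing returned" state.
    *outputs = nullptr;
    *output_length = 0;
    if (input_length <= 0)
        return;

    double* buffer = static_cast<double*>(
        shared_alloc(sizeof(double) * static_cast<std::size_t>(input_length)));
    if (buffer == nullptr)
        return;  // allocation failed; caller sees a null pointer and length 0

    // Placeholder computation: copy the inputs through unchanged.
    for (int i = 0; i < input_length; ++i)
        buffer[i] = inputs[i];

    *outputs = buffer;
    *output_length = input_length;
}
```

With this in place the C# side would call Marshal.FreeCoTaskMem(output_ptr) after Marshal.Copy, and the FreeArray export is no longer needed.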


Answer 2:

It is okayish. The complete lack of a way to return an error code is pretty bad, that's going to hurt when the arrays are large and the program runs out of memory. The hard crash you get is pretty undiagnosable.

The need to copy the arrays and to explicitly release them doesn't win any prizes of course. You solve that by letting the caller pass a pointer to its own array and you just write the elements. You however need a protocol to let the caller figure out how large the array needs to be, that is going to require calling the method twice. The first call returns the required size, the second call gets the job done.

A boilerplate example would be:

[DllImport("foo.dll")]
private static extern int ReturnData(double[] data, ref int dataLength);

And a sample usage:

int len = 0;
double[] data = null;
int err = ReturnData(data, ref len);
if (err == ERROR_MORE_DATA) {    // NOTE: expected
    data = new double[len];
    err = ReturnData(data, ref len);
}

No need to copy, no need to release memory, good thing. The native code can corrupt the GC heap if it doesn't pay attention to the passed len, not such a good thing. But of course easy to avoid.
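
For completeness, the native side of that protocol could look roughly like this. It is only a sketch: the fixed kResults data stands in for the real computation, and the constant names are invented here, with values chosen to match the Win32 codes ERROR_SUCCESS (0) and ERROR_MORE_DATA (234).

```cpp
#include <algorithm>

// Illustrative constants; the values mirror the Win32 error codes.
const int kErrorSuccess  = 0;    // ERROR_SUCCESS
const int kErrorMoreData = 234;  // ERROR_MORE_DATA

// Placeholder results; real code would compute these.
static const double kResults[] = {1.0, 2.0, 3.0};
static const int kResultCount = 3;

extern "C" int ReturnData(double* data, int* dataLength)
{
    // Buffer missing or too small: report the required size, write nothing.
    if (data == nullptr || *dataLength < kResultCount) {
        *dataLength = kResultCount;
        return kErrorMoreData;
    }

    // Buffer is large enough: never write more than kResultCount elements.
    std::copy(kResults, kResults + kResultCount, data);
    *dataLength = kResultCount;
    return kErrorSuccess;
}
```

The length check before the copy is what keeps the native code from writing past the end of the managed array.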



Answer 3:

If you knew the array size beforehand, you could write a C++/CLI DLL that takes the managed array as parameter, pins it, and calls the native C++ DLL on the pinned pointer it obtains.

But if it's output-only, I don't see any version without a copy. You can use a SAFEARRAY so P/Invoke does the copying instead of you, but that's all.