Pass an array to a wrapped function as pointer+siz

2019-01-07 23:50发布

问题:

Given a header like:

#include <iostream>
#include <algorithm>
#include <iterator>

inline void foo(const signed char *arr, size_t sz) {
  std::copy_n(arr, sz, std::ostream_iterator<int>(std::cout, "\n"));
}

inline void bar(const signed char *begin, const signed char *end) {
  std::copy(begin, end, std::ostream_iterator<int>(std::cout, "\n"));
}

(I used C++11 here for convenience, this could be either C or C++ if you changed the implementations though)

How can I wrap these functions to take just an array on the Java side and use the (known) size of the array to provide the second parameter for these functions?

回答1:

The crux of this is that to wrap either of these functions you'll want to use a multi-argument typemap.

The preamble is pretty standard for SWIG. I used my personal favourite prgama to automatically load the shared library without the user of the interface needing to know:

%module test

%{
#include "test.hh"
%}

%pragma(java) jniclasscode=%{
  static {
    try {
        System.loadLibrary("test");
    } catch (UnsatisfiedLinkError e) {
      System.err.println("Native code library failed to load. \n" + e);
      System.exit(1);
    }
  }
%}

First though you'll need use a few Java typemaps to instruct SWIG to use byte[] as the type of both parts of the Java interface - the JNI and the wrapper that calls it. In the generate module file we'll be using the JNI type jbyteArray. We're passing the input directly from the SWIG interface to the JNI it generates.

%typemap(jtype) (const signed char *arr, size_t sz) "byte[]"
%typemap(jstype) (const signed char *arr, size_t sz) "byte[]"
%typemap(jni) (const signed char *arr, size_t sz) "jbyteArray"
%typemap(javain) (const signed char *arr, size_t sz) "$javainput"

When this is done we can write a multi-argument typemap:

%typemap(in,numinputs=1) (const signed char *arr, size_t sz) {
  $1 = JCALL2(GetByteArrayElements, jenv, $input, NULL);
  $2 = JCALL1(GetArrayLength, jenv, $input);
}

The job of the in typemap is to convert from what we're given by the JNI call to what the real function really expects as an input. I used numinputs=1 to indicate that the two real function arguments only take one input on the Java side, but this is the default value anyway, so it's not required to state that explicitly.

In this typemap $1 is the first argument of the typemap, i.e. the first argument of our function in this case. We set that by asking for a pointer to the underlying storage of the Java array (which may or may not be a copy really). We set $2, the second typemap argument to be the size of the array.

The JCALLn macros here make sure that the typemap can compile with both C and C++ JNI. It expands to the appropriate call for the language.

We need another typemap to clean up once the real function call has returned:

%typemap(freearg) (const signed char *arr, size_t sz) {
  // Or use  0 instead of ABORT to keep changes if it was a copy
  JCALL3(ReleaseByteArrayElements, jenv, $input, $1, JNI_ABORT); 
}

This calls ReleaseByteArrayElements to tell the JVM we're done with the array. It needs the pointer and the Java array object we obtained it from. In addition it takes a parameter that indicates if the contents should be copied back iff they were modified and the pointer we got was a copy in the first place. (The argument we passed NULL is an optional pointer to a jboolean which indicates if we've been given a copy).

For the second variant the typemaps are substantially similar:

%typemap(in,numinputs=1) (const signed char *begin, const signed char *end) {
  $1 = JCALL2(GetByteArrayElements, jenv, $input, NULL);
  const size_t sz = JCALL1(GetArrayLength, jenv, $input);
  $2 = $1 + sz;
}

%typemap(freearg) (const signed char *begin, const signed char *end) {
  // Or use  0 instead of ABORT to keep changes if it was a copy
  JCALL3(ReleaseByteArrayElements, jenv, $input, $1, JNI_ABORT);
}

%typemap(jtype) (const signed char *begin, const signed char *end) "byte[]"
%typemap(jstype) (const signed char *begin, const signed char *end) "byte[]"
%typemap(jni) (const signed char *begin, const signed char *end) "jbyteArray"
%typemap(javain) (const signed char *begin, const signed char *end) "$javainput"

The only difference being the use of the local variable sz to compute the end arugment using the begin pointer.

The only thing left to do is to tell SWIG to wrap the header file itself, using the typemaps we've just written:

%include "test.hh"

I tested both these functions with:

public class run {
  public static void main(String[] argv) {
    byte[] arr = {0,1,2,3,4,5,6,7};
    System.out.println("Foo:");
    test.foo(arr);
    System.out.println("Bar:");
    test.bar(arr);
  }
}

Which worked as expected.

For convenience I've shared the files I used in writing this on my site. Every line of every file in that archive can be reconstructed by following this answer sequentially.


For reference we could have done the whole thing without any JNI calls, using %pragma(java) modulecode to generate an overload that we use convert the input (in pure Java) into the form expected by the real functions. For that the module file would have been:

%module test

%{
#include "test.hh"
%}

%include <carrays.i>
%array_class(signed char, ByteArray);

%pragma(java) modulecode = %{
  // Overload foo to take an array and do a copy for us:
  public static void foo(byte[] array) {
    ByteArray temp = new ByteArray(array.length);
    for (int i = 0; i < array.length; ++i) {
      temp.setitem(i, array[i]);
    }
    foo(temp.cast(), array.length);
    // if foo can modify the input array we'll need to copy back to:
    for (int i = 0; i < array.length; ++i) {
      array[i] = temp.getitem(i);
    }
  }

  // How do we even get a SWIGTYPE_p_signed_char for end for bar?
  public static void bar(byte[] array) {
    ByteArray temp = new ByteArray(array.length);
    for (int i = 0; i < array.length; ++i) {
      temp.setitem(i, array[i]);
    }
    bar(temp.cast(), make_end_ptr(temp.cast(), array.length));
    // if bar can modify the input array we'll need to copy back to:
    for (int i = 0; i < array.length; ++i) {
      array[i] = temp.getitem(i);
    }
  }
%}

// Private helper to make the 'end' pointer that bar expects
%javamethodmodifiers make_end_ptr "private";
%inline {
  signed char *make_end_ptr(signed char *begin, int sz) {
    return begin+sz;
  }
}

%include "test.hh"

%pragma(java) jniclasscode=%{
  static {
    try {
        System.loadLibrary("test");
    } catch (UnsatisfiedLinkError e) {
      System.err.println("Native code library failed to load. \n" + e);
      System.exit(1);
    }
  }
%}

Besides the obvious (two) copies required to get the data into the right type (there's no trivial way to go from byte[] to SWIGTYPE_p_signed_char) and back this has another disadvantage - it's specific to the functions foo and bar, whereas the typemaps we wrote earlier are not specific to a given function - they'll be applied anywhere they match, even multiple times on the same function if you happen to have a function that takes two ranges or two pointer+length combinations. The one advantage of doing it this way is that if you happen to have other wrapped functions that are giving you SWIGTYPE_p_signed_char back then you'll still have the overloads available to use if you desire. Even in the case where you have a ByteArray from the %array_class you still can't do the pointer arithmetic in Java needed to generate end for you.

The original way shown gives a cleaner interface in Java, with the added advantages of not making excessive copies and being more reusable.


Yet another alternative approach to wrapping would be to write a few %inline overloads for foo and bar:

%inline {
  void foo(jbyteArray arr) {
    // take arr and call JNI to convert for foo
  }
  void bar(jbyteArray arr) {
    // ditto for bar
  }
}

These are presented as overloads in the Java interface, but they're still module specific and additionally the JNI required here is more complex than it would otherwise need to be - you need to arrange to get hold of jenv somehow, which isn't accessible by default. The options are a slow call to get it, or a numinputs=0 typemap that fills the parameter in automatically. Either way the multi-argument typemap seems far nicer.



标签: java c++ c swig