Cannot encode to UTF-8 in JNI

Posted 2019-07-31 18:59

Question:

I'm trying to return a jstring encoded as UTF-8, but the app crashes and JNI reports this error:

JNI DETECTED ERROR IN APPLICATION: input is not valid Modified UTF-8: illegal continuation byte 0x30

My snippet:

jstring Java_tgio_rncryptor_RNCryptorNative_generateKey(JNIEnv *env, jobject instance, const jstring salt_, const jstring password_)
 {
    const char *salt = env->GetStringUTFChars(salt_, 0);
    const char *password = env->GetStringUTFChars(password_, 0);
    RNCryptor *cryptor = new RNCryptor();
    string value = (char * )cryptor->generateKey(salt, password).data();
    delete cryptor;
    env->ReleaseStringUTFChars(salt_, salt);
    env->ReleaseStringUTFChars(password_, password);
    return env->NewStringUTF(value.c_str());
}

Also tried:

const char *returning = env->GetStringUTFChars(env->NewStringUTF(value.c_str()), 0);
return env->NewStringUTF(returning);

Any suggestions?

Answer 1:

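cryptor->generateKey returns a SecByteBlock, which is a sequence of bytes. Casting it to (char *) and constructing a std::string compiles because neither type is restricted to text. A jstring, however, holds text: characters from the Unicode character set in the UTF-16 encoding. NewStringUTF builds that text from Modified UTF-8 input, and arbitrary key bytes are usually not valid Modified UTF-8, hence the error.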

Your code tries to convert non-text bytes to a Java string. If you really want to do that, you have to use a character set and encoding in which every sequence of values 0-255 is valid. (CP437 would be one.)
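To see why the runtime complains about byte 0x30, it helps to look at the byte layout that Modified UTF-8 requires. The sketch below is a simplified structural check in plain C++, not the check the Android runtime actually performs; the function name firstInvalidModifiedUtf8Byte is made up for illustration. It assumes the classic format described in the JNI specification: one-, two- and three-byte sequences only, and no raw NUL byte.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first byte that breaks the Modified UTF-8
// structure, or -1 if the buffer is structurally valid.
// Simplified sketch: it checks lead and continuation byte patterns only.
long firstInvalidModifiedUtf8Byte(const std::string &bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t continuationCount = 0;
        if (lead == 0x00) {
            return static_cast<long>(i);      // raw NUL is not allowed
        } else if (lead < 0x80) {
            continuationCount = 0;            // 0xxxxxxx: single byte
        } else if ((lead & 0xE0) == 0xC0) {
            continuationCount = 1;            // 110xxxxx: two-byte sequence
        } else if ((lead & 0xF0) == 0xE0) {
            continuationCount = 2;            // 1110xxxx: three-byte sequence
        } else {
            return static_cast<long>(i);      // illegal start byte
        }
        for (std::size_t k = 1; k <= continuationCount; ++k) {
            if (i + k >= bytes.size()) {
                return static_cast<long>(i);  // truncated: report the lead byte
            }
            const unsigned char next = static_cast<unsigned char>(bytes[i + k]);
            if ((next & 0xC0) != 0x80) {
                return static_cast<long>(i + k);  // not 10xxxxxx
            }
        }
        i += continuationCount + 1;
    }
    return -1;
}
```

For the two bytes C3 30 the function returns 1: 0xC3 announces a two-byte sequence, but 0x30 does not have the 10xxxxxx continuation pattern. That is the same kind of failure the error message in the question describes.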

But instead, you could keep the data in a datatype closer to what it is: return a Java byte[]. Then, on the Java side, you could convert the sequence of bytes to Base64 if you want to pass the key around as a string.

In general, cryptographic algorithms operate on byte sequences or blocks. It's only the application or wrapper functions that deal with text. You'll have to check whether RNCryptor does that for you, but it doesn't look like it to me.



Answer 2:

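Try this code for UTF-8:

    // Needs java.net.URLEncoder and java.io.UnsupportedEncodingException.
    try {
        // Percent-encodes the String using the UTF-8 charset.
        String encoded = URLEncoder.encode(yourValue, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }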


Answer 3:

Not every byte is a printable character, and arbitrary byte sequences usually do not form valid Unicode text. Standard practice when a byte array has to be represented as a string is to encode it as Base64 or hexadecimal.
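A minimal sketch of the hexadecimal option in plain C++ follows. The helper name toHex is made up for illustration and is not part of RNCryptor or JNI. It maps every byte to two ASCII characters, so the result contains only the digits 0-9 and letters a-f.

```cpp
#include <cstddef>
#include <string>

// Encodes a raw byte buffer as lowercase hexadecimal text.
// Every input byte becomes exactly two ASCII characters.
std::string toHex(const unsigned char *data, std::size_t length) {
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    hex.reserve(length * 2);
    for (std::size_t i = 0; i < length; ++i) {
        hex.push_back(digits[data[i] >> 4]);    // high nibble
        hex.push_back(digits[data[i] & 0x0F]);  // low nibble
    }
    return hex;
}
```

Because the output is pure ASCII, it is also valid Modified UTF-8, so passing toHex(...).c_str() to NewStringUTF should not trigger the error from the question. This assumes the key block exposes its length (for example through a size() member) so that the whole buffer is encoded, including any zero bytes.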



Answer 4:

I solved the problem by returning a jcharArray instead of a jstring:

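     env->ReleaseStringUTFChars(salt_, salt);
     env->ReleaseStringUTFChars(password_, password);
     // Widen each byte to its own 16-bit jchar. Casting the char buffer to
     // jchar * would read two bytes per element and run past the end of it.
     jchar array[1024];
     jsize length = 0;
     while (length < 1024 && length < (jsize) value.size()) {
         array[length] = (unsigned char) value[length];
         ++length;
     }
     jcharArray charArr = env->NewCharArray(length);
     env->SetCharArrayRegion(charArr, 0, length, array);
     return charArr;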

On the Java side I just called String.valueOf(arr) and printed the result.
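Passing key bytes through a char array only round-trips if every byte becomes its own 16-bit code unit, which is the same mapping ISO-8859-1 uses for values 0-255. A minimal sketch of that widening step in plain C++ is below; widenBytes is a hypothetical helper, and std::uint16_t stands in for jchar, which JNI defines as an unsigned 16-bit type.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Widens each byte to one 16-bit code unit (byte value N becomes U+00NN).
// The explicit length means zero bytes inside the data are preserved.
std::vector<std::uint16_t> widenBytes(const unsigned char *data, std::size_t length) {
    std::vector<std::uint16_t> units;
    units.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        units.push_back(static_cast<std::uint16_t>(data[i]));
    }
    return units;
}
```

In a JNI function the resulting buffer could be handed to SetCharArrayRegion with units.size() as the length. Reading through unsigned char matters here: on platforms where plain char is signed, bytes of 0x80 and above would otherwise sign-extend to values like 0xFF80.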