We are using protobuf v3 to transfer messages from a C# client to a Java server over HTTP.
The message proto looks like this:
message CLIENT_MESSAGE {
    string message = 1;
}
Both client and server use UTF-8 character encoding for strings.
Everything works fine when we use short string values like "abc", but when we try to transfer a string with 198 characters in it, we catch an exception:
com.google.protobuf.InvalidProtocolBufferException:
While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
We even compared the byte arrays containing the protobuf data, but didn't find a solution. For the string "aaa", the byte array starts with these bytes:
10 3 97 97 97
Where 10 is the protobuf field tag (field number 1, wire type 2), 3 is the string length, and 97 97 97 is "aaa".
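To double-check our reading of the first byte, here is a minimal sketch that splits a single-byte protobuf tag into its field number and wire type. The class and method names are ours, made up for illustration; it only covers tags that fit in one byte (field numbers below 16).

```java
// Sketch: split a single-byte protobuf tag into field number and wire type.
// TagDemo, fieldNumber and wireType are illustrative names, not protobuf API.
final class TagDemo {
    // The upper bits of the tag hold the field number.
    static int fieldNumber(int tag) {
        return tag >>> 3;
    }

    // The lowest three bits hold the wire type (2 = length-delimited).
    static int wireType(int tag) {
        return tag & 0x07;
    }

    public static void main(String[] args) {
        int tag = 10; // first byte of our message (0x0A)
        System.out.println("field number = " + fieldNumber(tag)
                + ", wire type = " + wireType(tag));
    }
}
```

For the byte 10 this gives field number 1 and wire type 2, which matches "string message = 1" in our proto.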
For the string
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
which contains 198 characters, the byte array starts with this:
10 198 1 97 97 97....
Where 10 is the protobuf field tag, 198 looks like the string length, and 1 seems to be some kind of string identifier, or what?
And why can't protobuf parse this message?
We have already spent almost a day looking for a solution to this problem; any help is appreciated.
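For reference, protobuf writes the length of a string field as a base-128 varint, so the two bytes 198 and 1 may simply be one length value spread over two bytes. A minimal sketch of decoding such a prefix, assuming the length starts right after the tag byte (the class and method names are ours, not protobuf API):

```java
// Sketch: decode a base-128 varint, as used for protobuf length prefixes.
// VarintDemo and decodeVarint are illustrative names, not protobuf API.
final class VarintDemo {
    static int decodeVarint(byte[] data, int offset) {
        int result = 0;
        int shift = 0;
        while (true) {
            if (offset >= data.length || shift > 28) {
                throw new IllegalArgumentException("truncated or oversized varint");
            }
            int b = data[offset++] & 0xFF;
            result |= (b & 0x7F) << shift;   // the low 7 bits carry the payload
            if ((b & 0x80) == 0) {
                return result;               // high bit clear: this was the last byte
            }
            shift += 7;
        }
    }

    public static void main(String[] args) {
        byte[] shortMsg = {10, 3, 97, 97, 97};
        byte[] longMsgStart = {10, (byte) 198, 1, 97, 97, 97};
        System.out.println(decodeVarint(shortMsg, 1));      // length of "aaa"
        System.out.println(decodeVarint(longMsgStart, 1));  // length of the long string
    }
}
```

Under that reading, 198 1 decodes to (198 & 0x7F) + (1 << 7) = 70 + 128 = 198, so the client-side bytes look like a valid length prefix rather than a length followed by an extra marker.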
UPDATE:
We made dumps on both the client and the server, and what is weird is that the dumps are different!
Protobuf dump from the client, before sending to the server:
00000000 0A C6 01 61 61 61 61 61 61 61 61 61 61 61 61 61 ·Æ·aaaaaaaaaaaaa
00000010 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000020 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000030 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000040 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000050 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000060 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000070 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000080 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00000090 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
000000A0 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
000000B0 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
000000C0 61 61 61 61 61 61 61 61 61 aaaaaaaaa
Protobuf dump that the server receives:
0000: 0A EF BF BD 01 61 61 61 61 61 61 61 61 61 61 61 .....aaaaaaaaaaa
0010: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0020: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0030: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0040: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0050: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0060: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0070: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0080: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
0090: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00A0: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00B0: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa
00C0: 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaa
As you can see, the protobuf data headers are different: the client sends 0A C6 01, but the server receives 0A EF BF BD 01. That is totally blowing my mind; how could that happen?
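One thing we noticed is that EF BF BD is the UTF-8 encoding of U+FFFD, the Unicode replacement character. That makes us suspect (we have not confirmed where) that the binary body is being decoded as UTF-8 text and re-encoded somewhere between client and server. The sketch below reproduces the same byte change in plain Java; the class and helper names are ours, and it is not our actual transport code.

```java
import java.nio.charset.StandardCharsets;

// Sketch: what happens when binary protobuf bytes are treated as UTF-8 text.
// Utf8RoundTripDemo and its helpers are illustrative names only.
final class Utf8RoundTripDemo {
    // Decode the bytes as a UTF-8 string, then encode that string back to bytes.
    static byte[] roundTripAsText(byte[] binary) {
        String text = new String(binary, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Format bytes as space-separated upper-case hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First bytes of the client dump: tag, two length bytes, then "aa".
        byte[] sent = {0x0A, (byte) 0xC6, 0x01, 0x61, 0x61};
        System.out.println(toHex(sent));
        System.out.println(toHex(roundTripAsText(sent)));
    }
}
```

The byte C6 is not valid UTF-8 when followed by 01, so the decoder substitutes U+FFFD, and re-encoding yields 0A EF BF BD 01, the same header the server sees. The message also grows by two bytes, which matches the two extra bytes at the end of the server dump.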
UPDATE2: we did some research and found that this problem happens only with strings longer than 128 characters. If the string consists of 128 characters or fewer, there is no problem.
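This threshold seems related to how the length prefix is encoded. A minimal sketch, assuming the length is written as a base-128 varint (names are ours, not protobuf API), shows which lengths produce a byte with the high bit set, that is, a byte outside plain ASCII:

```java
import java.io.ByteArrayOutputStream;

// Sketch: encode a length as a base-128 varint and check whether it stays ASCII.
// VarintLengthDemo and its methods are illustrative names only.
final class VarintLengthDemo {
    static byte[] encodeVarint(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // more bytes follow: set the high bit
            value >>>= 7;
        }
        out.write(value);                     // last byte: high bit clear
        return out.toByteArray();
    }

    // True if no byte has its high bit set (every byte is below 0x80).
    static boolean isPlainAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int length : new int[] {3, 127, 128, 198}) {
            byte[] prefix = encodeVarint(length);
            System.out.println(length + " -> " + prefix.length + " byte(s), ASCII only: "
                    + isPlainAscii(prefix));
        }
    }
}
```

By this sketch, lengths up to 127 fit in a single ASCII-range byte, while 128 and above need a first byte of 0x80 or higher. If that is the cause, the exact boundary would be 127 rather than 128, so our count above may be off by one.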