I'm trying to deserialize a message using protobuf in Java and I'm getting the exception below.

Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
    at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:86)
    at com.google.protobuf.CodedInputStream$ArrayDecoder.readRawLittleEndian64(CodedInputStream.java:1179)
    at com.google.protobuf.CodedInputStream$ArrayDecoder.readFixed64(CodedInputStream.java:791)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:534)
    at com.google.protobuf.GeneratedMessageV3.parseUnknownFieldProto3(GeneratedMessageV3.java:305)

1 Answer

I've manually decoded your string, and I agree with the library: your message is truncated. I'm guessing that this is because you're using string-based APIs and there is a zero byte in the data. Many text APIs treat a zero byte (NUL in ASCII terms) as the end of the string.

Here's the breakdown:

\n=10=field 1, length prefix - I'm assuming this is a string
\x14=20
"id:article:v1:964000"
(22 bytes used for field 1)

\x12=18=field 2, length prefix - I'm assuming this is a sub-message
$=36
  \n=10=field 1, length prefix - I'm assuming this is a string
  \x10=16
  "predicted_topics"
  (18 bytes used for field 2.1)

  \x12=18=field 2, length prefix - I'm assuming this is a string
  \x06=6
  "IS/biz"
  (8 bytes used for field 2.2)

  \x1a=26=field 3, length prefix - I'm assuming this is "bytes"
  \x08=8
    \xf0
    l
    \x8f
    \xde
    p
    \x9f
    \xe4

    (unexpected EOF)

At the end, we're trying to read the 8 bytes declared for the innermost field (2.3), and we've only got 7 bytes left. I know this isn't a sub-message because that would result in an invalid tag, and it doesn't look like UTF-8, so I'm assuming that this is a bytes field (but frankly it doesn't matter: we need 8 bytes, and we only have 7).
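For anyone wanting to reproduce the tag decoding by hand, here is a minimal sketch in plain Java (no protobuf dependency). The class and method names are made up for illustration, and it only handles single-byte tags, which is all this message uses. A tag packs the field number into the upper bits and the wire type into the lowest three bits; wire type 2 means "length-delimited".

```java
public class WireTag {
    // Field number is stored in the bits above the 3-bit wire type.
    static int fieldNumber(int tag) {
        return tag >>> 3;
    }

    // Wire type is the lowest three bits (2 = length-delimited).
    static int wireType(int tag) {
        return tag & 0x07;
    }

    public static void main(String[] args) {
        // The three tag bytes seen in the breakdown: \n, \x12, \x1a
        int[] tags = {0x0A, 0x12, 0x1A};
        for (int tag : tags) {
            System.out.printf("tag 0x%02X -> field %d, wire type %d%n",
                    tag, fieldNumber(tag), wireType(tag));
        }
    }
}
```

All three tags come out as wire type 2, with field numbers 1, 2 and 3, which matches the breakdown.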

My guess is that the last byte in the bytes field was zero; if we assume a missing \x00 at the end, then field 2.3 is 10 bytes, and we've accounted for 18+8+10=36 bytes, which would make the sub-message (field 2) complete. There may well be more missing data after the outer sub-message - I have no way of knowing.
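The byte accounting can be checked mechanically. The following is only a sketch, with hypothetical names, and it assumes what holds for this particular message: every field is length-delimited with a one-byte tag and a one-byte length. It rebuilds the bytes from the breakdown and reports how many bytes the declared lengths overshoot the data by, with and without the guessed trailing \x00.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class TruncationCheck {
    // Walks the top-level fields and returns how many bytes are missing,
    // or 0 if every declared length fits inside the data.
    static int missingBytes(byte[] data) {
        int pos = 0;
        while (pos < data.length) {
            if (pos + 2 > data.length) {
                return pos + 2 - data.length; // tag/length header cut off
            }
            int length = data[pos + 1] & 0xFF; // one-byte length (assumption)
            int end = pos + 2 + length;
            if (end > data.length) {
                return end - data.length;
            }
            pos = end;
        }
        return 0;
    }

    private static void putField(ByteArrayOutputStream out, int tag, byte[] payload) {
        out.write(tag);
        out.write(payload.length);
        out.write(payload, 0, payload.length);
    }

    // Rebuilds the message from the breakdown; the guessed \x00 is optional.
    static byte[] sample(boolean withTrailingZero) {
        ByteArrayOutputStream inner = new ByteArrayOutputStream();
        putField(inner, 0x0A, "predicted_topics".getBytes(StandardCharsets.US_ASCII));
        putField(inner, 0x12, "IS/biz".getBytes(StandardCharsets.US_ASCII));
        inner.write(0x1A);
        inner.write(0x08); // declared length of field 2.3
        int[] seen = {0xF0, 'l', 0x8F, 0xDE, 'p', 0x9F, 0xE4};
        for (int b : seen) {
            inner.write(b);
        }
        if (withTrailingZero) {
            inner.write(0x00);
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        putField(out, 0x0A, "id:article:v1:964000".getBytes(StandardCharsets.US_ASCII));
        out.write(0x12);
        out.write(0x24); // declared length of field 2 (36), as in the data
        byte[] innerBytes = inner.toByteArray();
        out.write(innerBytes, 0, innerBytes.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("as received:   missing " + missingBytes(sample(false)));
        System.out.println("with \\x00 end: missing " + missingBytes(sample(true)));
    }
}
```

As received, the data is one byte short of what field 2 declares; with the extra zero byte appended, the declared lengths fit exactly.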

So: make sure you're not using text-based APIs with binary data.
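To illustrate the general point in Java, here is a small sketch with made-up names. Note that a Java String does not itself stop at NUL (that kind of truncation usually happens in a C-style API somewhere along the way), but decoding arbitrary bytes as UTF-8 is lossy in its own way because malformed sequences get replaced. If the payload really has to travel through a text channel, Base64 is one common way to keep it intact; whether that applies to your pipeline is an assumption on my part.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class BinaryVsText {
    // The bytes of field 2.3 from the breakdown, plus the guessed trailing zero.
    static final byte[] PAYLOAD = {
        (byte) 0xF0, 'l', (byte) 0x8F, (byte) 0xDE, 'p', (byte) 0x9F, (byte) 0xE4, 0x00
    };

    // Lossy: bytes that are not valid UTF-8 are replaced while decoding.
    static byte[] viaString(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Lossless: Base64 maps any byte sequence to plain ASCII text and back.
    static byte[] viaBase64(byte[] raw) {
        String text = Base64.getEncoder().encodeToString(raw);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 round trip intact:  "
                + Arrays.equals(PAYLOAD, viaString(PAYLOAD)));
        System.out.println("Base64 round trip intact: "
                + Arrays.equals(PAYLOAD, viaBase64(PAYLOAD)));
    }
}
```

The simplest fix is still to keep the data as byte[] (or a stream) all the way from the producer to the protobuf parse call.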

Answered 2018-08-28T11:28:21.877