“Failed to parse input” from Google protocol buffe

Posted 2019-03-06 12:46

Question:

I have a Google protobuf file from OpenStreetMap: the 1.4 MB Liechtenstein country extract from Geofabrik. According to its help text, protoc's --decode_raw option will "write the raw tag/values to stdout". However, I keep getting this error:

$ cat liechtenstein-latest.osm.pbf | protoc --decode_raw
Failed to parse input.

I have compiled and installed the protobuf library directly from Google, version 2.6.1, which is the current one.

The file is valid: various OpenStreetMap tools that read pbf files (osm2pgsql, osmosis) read it without problems.

What could be wrong? How can I get --decode_raw to work? Am I doing something wrong?

Answer 1:

The OpenStreetMap .osm.pbf format is not a raw protocol buffer. The format is documented here:

http://wiki.openstreetmap.org/wiki/PBF_Format

Key quote:

The format is a repeating sequence of:

  • int4: length of the BlobHeader message in network byte order
  • serialized BlobHeader message
  • serialized Blob message (size is given in the header)

So you need to read four bytes first, interpret them as an integer (big-endian), then read that many bytes and parse as a BlobHeader, and that in turn will tell you how many bytes to read and parse as a Blob.
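The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a full reader: the helper names (read_varint, parse_blob_header, iter_blobs) are made up for this example, and the BlobHeader field numbers (type = 1, indexdata = 2, datasize = 3) are an assumption taken from the fileformat.proto shown on the wiki page, so check them against the current definition. The header is decoded by hand here only to avoid depending on generated protobuf classes.

```python
import struct


def read_varint(buf, pos):
    """Decode a protobuf base-128 varint at buf[pos]; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def parse_blob_header(buf):
    """Return (type, datasize) from a serialized BlobHeader.

    Assumed field numbers (from the wiki's fileformat.proto):
    type = 1 (string), indexdata = 2 (bytes), datasize = 3 (int32).
    """
    blob_type, datasize, pos = None, None, 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint field
            value, pos = read_varint(buf, pos)
            if field == 3:
                datasize = value
        elif wire_type == 2:  # length-delimited field
            length, pos = read_varint(buf, pos)
            if field == 1:
                blob_type = buf[pos:pos + length].decode("utf-8")
            pos += length  # skip the payload (also skips indexdata)
        else:
            raise ValueError(f"unexpected wire type {wire_type}")
    return blob_type, datasize


def iter_blobs(stream):
    """Yield (type, header_bytes, blob_bytes) for each block in the stream."""
    while True:
        prefix = stream.read(4)
        if not prefix:
            return  # clean end of file
        if len(prefix) < 4:
            raise ValueError("truncated length prefix")
        (header_len,) = struct.unpack(">I", prefix)  # 4 bytes, big-endian
        header = stream.read(header_len)
        blob_type, datasize = parse_blob_header(header)
        if datasize is None:
            raise ValueError("BlobHeader has no datasize field")
        blob = stream.read(datasize)
        yield blob_type, header, blob
```

Called as `iter_blobs(open("liechtenstein-latest.osm.pbf", "rb"))`, this should yield one tuple per block. Each yielded header or blob is an ordinary serialized message, so writing one of them to its own file and piping that file into protoc --decode_raw should parse where the whole .osm.pbf file does not.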

The protoc tool will not do this automatically since it doesn't know this format. Probably there is an OSM-specific tool out there that you can use.