I am trying to parse Well Known Binary a binary encoding of geometry objects used in Geographic Information Systems (GIS). I am using this spec from ESRI (same results here from esri). I have input data from Osmosis a tool to parse OpenStreetMap data, specifically the pgsimp-dump format which gives the hex represenation of the binary.
The ESRI docs say that there should only be 21 bytes for a Point
, 1 byte for byte order, 4 for uint32 for typeid, and 8 for double x and 8 for double y.
An example from osmosis is this (hex) example: 0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40
, which is 25 bytes long.
Shapely a python programme to parse WKB (etc), which is based on the popular C library GEOS is able to parse this string:
>>> import shapely.wkb
>>> shapely.wkb.loads("0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40", hex=True)
<shapely.geometry.point.Point object at 0x7f221f2581d0>
When I ask Shapely to parse from then convert to WKB I get a 21 bytes.
>>> shapely.wkb.loads("0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40", hex=True).wkb.encode("hex").upper()
'0101000000DB81DF2B5F7822C0DFBB7262B4744A40'
The difference is the 4 bytes in the middle, which appear 3 bytes into the uint32 for the typeif=d
01010000**20E61000**00DB81DF2B5F7822C0DFBB7262B4744A40
Why can shapely/geos parse this WKB when it's invalid WKB? What do these bytes mean?
GEOS / Shapely use an Extended variant of WKT/WKB called EWKT / EWKB, which is documented by PostGIS. If you have access to PostGIS, you can see what's going on here:
Returns the EWKT
SRID=4326;POINT(-9.2351011 52.9117549)
. So the extra data was the spatial reference identifier, or SRID. Specifically EPSG:4326 for WGS 84.Shapely does not support SRIDs, however there are a few hacks, e.g.:
should now make
wkb
orwkb_hex
output the EWKB, which includes the SRID. The default isFalse
, which would output ISO WKB for 2D geometries (but not for 3D).So it seems your objective is to convert EWKB to ISO WKB, which you can do with GEOS / Shapely for 2D geometries only. If you have 3D (Z or M) or 4D (ZM) geometries, then only PostGIS is able to do this conversion.