I'm very new to mapping, and to Altair/Vega. There's an example in the Altair documentation for how to make a map starting with an outline of US states, which is created basically with:
states = alt.topo_feature(data.us_10m.url, feature='states')
# US states background
background = alt.Chart(states).mark_geoshape(
fill='lightgray',
stroke='white'
)
but I want to plot points in the British Isles, instead. Since there are only US and World maps in the vega data collections, I would have to create my own GeoJSON, no?
So I tried getting GeoJSON for the British Isles from a world map, by running some of the command-line commands from this blog post, namely,
ogr2ogr -f GeoJSON -where "adm0_a3 IN ('GBR','IRL','IMN','GGY','JEY','GBA')" subunits.json ne_10m_admin_0_map_subunits/ne_10m_admin_0_map_subunits.shp
This seems to have created a GeoJSON file, subunits.json, which probably represents the British Isles. But how can I get this into Altair? Or is there another way to make a map of the British Isles using Altair?
The example you refer to is using topojson
structured data, while you have geojson
structured data. So you probably need:
# remote geojson data object
url_geojson = 'https://raw.githubusercontent.com/mattijn/datasets/master/two_polygons.geo.json'
data_geojson_remote = alt.Data(url=url_geojson, format=alt.DataFormat(property='features',type='json'))
# chart object
alt.Chart(data_geojson_remote).mark_geoshape(
).encode(
color="properties.name:N"
).properties(
projection={'type': 'identity', 'reflectY': True}
)
For more insight read on
Explaining the differences between geojson
and topojson
structured json
files and their usage within Altair
import geojson
import topojson
import pprint
import altair as alt
Inline GeoJSON
We start with creating a collection containing two features, namely two adjacent polygons.
Example of the two polygons that we will create in the GeoJSON data format.:
feature_1 = geojson.Feature(
geometry=geojson.Polygon([[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]),
properties={"name":"abc"}
)
feature_2 = geojson.Feature(
geometry=geojson.Polygon([[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]]),
properties={"name":"def"}
)
var_geojson = geojson.FeatureCollection([feature_1, feature_2])
Inspect the created GeoJSON by pretty print the variable var_geojson
pprint.pprint(var_geojson)
{'features': [{'geometry': {'coordinates': [[[0, 0],
[1, 0],
[1, 1],
[0, 1],
[0, 0]]],
'type': 'Polygon'},
'properties': {'name': 'abc'},
'type': 'Feature'},
{'geometry': {'coordinates': [[[1, 0],
[2, 0],
[2, 1],
[1, 1],
[1, 0]]],
'type': 'Polygon'},
'properties': {'name': 'def'},
'type': 'Feature'}],
'type': 'FeatureCollection'}
As can be seen, the two Polygon
Features
are nested within the features
object and the geometry
is part of each feature
.
Altair has the capability to parse nested json
objects using the property
key within format
. The following is an example of such:
# inline geojson data object
data_geojson = alt.InlineData(values=var_geojson, format=alt.DataFormat(property='features',type='json'))
# chart object
alt.Chart(data_geojson).mark_geoshape(
).encode(
color="properties.name:N"
).properties(
projection={'type': 'identity', 'reflectY': True}
)
Inline TopoJSON
TopoJSON is an extension of GeoJSON, where the geometry
of the features
are referred to from a top-level object named arcs
. This makes it possible to apply a hash function on the geometry, so each shared arc
should only be stored once.
We can convert the var_geojson
variable into a topojson
file format structure:
var_topojson = topojson.Topology(var_geojson, prequantize=False).to_json()
var_topojson
{'arcs': [[[1.0, 1.0], [0.0, 1.0], [0.0, 0.0], [1.0, 0.0]],
[[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0]],
[[1.0, 1.0], [1.0, 0.0]]],
'objects': {'data': {'geometries': [{'arcs': [[-3, 0]],
'properties': {'name': 'abc'},
'type': 'Polygon'},
{'arcs': [[1, 2]],
'properties': {'name': 'def'},
'type': 'Polygon'}],
'type': 'GeometryCollection'}},
'type': 'Topology'}
Now the nested geometry
objects are replaced by arcs
and refer by index to the top-level arcs
object. Instead of having a single FeatureCollection
we now can have multiple objects
, where our converted FeatureCollection
is stored within the key data
as a GeometryCollection
.
NOTE: the key-name data
is arbitrary and differs in each dataset.
Altair has the capability to parse the nested data
object in the topojson
formatted structure using the feature
key within format
, while declaring it is a topojson
type
. The following is an example of such:
# inline topojson data object
data_topojson = alt.InlineData(values=var_topojson, format=alt.DataFormat(feature='data',type='topojson'))
# chart object
alt.Chart(data_topojson).mark_geoshape(
).encode(
color="properties.name:N"
).properties(
projection={'type': 'identity', 'reflectY': True}
)
TopoJSON from URL
There also exist a shorthand to extract the objects from a topojson
file if this file is accessible by URL:
alt.topo_feature(url, feature)
Altair example where a topojson
file is referred by URL
# remote topojson data object
url_topojson = 'https://raw.githubusercontent.com/mattijn/datasets/master/two_polygons.topo.json'
data_topojson_remote = alt.topo_feature(url=url_topojson, feature='data')
# chart object
alt.Chart(data_topojson_remote).mark_geoshape(
).encode(
color="properties.name:N"
).properties(
projection={'type': 'identity', 'reflectY': True}
)
GeoJSON from URL
But for geojson
files accessible by URL there is no such shorthand and should be linked as follows:
alt.Data(url, format)
Altair example where a geojson
file is referred by URL
# remote geojson data object
url_geojson = 'https://raw.githubusercontent.com/mattijn/datasets/master/two_polygons.geo.json'
data_geojson_remote = alt.Data(url=url_geojson, format=alt.DataFormat(property='features',type='json'))
# chart object
alt.Chart(data_geojson_remote).mark_geoshape(
).encode(
color="properties.name:N"
).properties(
projection={'type': 'identity', 'reflectY': True}
)