I'd like to be able to generate anchors in the YAML generated by PyYAML's dump() function. Is there a way to do this? Ideally the anchors would have the same name as the YAML nodes.
Example:
>>> import yaml
>>> yaml.dump({'a': [1,2,3]})
'a: [1, 2, 3]\n'
What I'd like to be able to do is generate YAML like:
>>> import yaml
>>> yaml.dump({'a': [1,2,3]})
'a: &a [1, 2, 3]\n'
Can I write a custom emitter or dumper to do this? Is there another way?
This is not so easy, unless the data that you want to use for the anchor is inside the node itself. The anchor gets attached to the node contents (in your example '[1,2,3]'), and the node doesn't know that this value is associated with the key 'a'.
Gives you:
So far I haven't found a way to get the key 'a' given the node...
By default, PyYAML only emits an anchor when the dumper detects a second reference to an object it has already seen.
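A minimal sketch of that default behaviour, assuming a stock PyYAML install (the variable names are only for illustration). The same list object is placed under two keys, so the dumper anchors the first occurrence and aliases the second. Two equal but separate lists get no anchor at all.

```python
import yaml

shared = [1, 2, 3]                          # one list object, referenced twice
data = {'a': shared, 'b': shared}
copies = {'a': [1, 2, 3], 'b': [1, 2, 3]}   # equal values, distinct objects

aliased = yaml.dump(data)    # contains an auto-generated anchor and an alias
plain = yaml.dump(copies)    # no shared object, so no anchor is emitted

print(aliased)
print(plain)
```

The anchor name here is generated by the dumper (of the form id001), which is why it has no relation to the key 'a'.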
If you want to override how the anchor is named, you'll have to customize the Dumper class, specifically the generate_anchor() function. ANCHOR_TEMPLATE may also be useful.
In your example, the node name is simple, but you need to take into account the many possibilities for YAML values, i.e. it could be a sequence rather than a single value.
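As a rough sketch of both hooks, assuming the pure-Python yaml.Dumper: the class names PrefixDumper and KindDumper are made up for this example, and last_anchor_id and node.id are internals of PyYAML's serializer rather than documented API, so treat this as illustrative. Note that this only renames anchors the dumper already decided to emit. It does not force an anchor onto an unshared node.

```python
import yaml

class PrefixDumper(yaml.Dumper):
    # Hypothetical subclass: keep the default numbering, change the prefix.
    ANCHOR_TEMPLATE = 'shared%03d'

class KindDumper(yaml.Dumper):
    # Hypothetical subclass: name the anchor after the node kind
    # ('scalar', 'sequence' or 'mapping') plus a running counter.
    def generate_anchor(self, node):
        self.last_anchor_id += 1
        return '%s%03d' % (node.id, self.last_anchor_id)

shared = [1, 2, 3]
data = {'a': shared, 'b': shared}   # shared object, so an anchor is emitted

by_template = yaml.dump(data, Dumper=PrefixDumper)
by_kind = yaml.dump(data, Dumper=KindDumper)

print(by_template)
print(by_kind)
```

generate_anchor() receives only the node being anchored, not its parent or key, which is why naming the anchor after the key needs more than this hook.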
I wrote a custom anchor class to force an anchor value for top-level nodes. It does not simply override the anchor string (using generate_anchor), but actually forces the anchor to be emitted, even if the node is not referenced later:
Note that I override the node name to be suffixed with "_ALIAS", but you could strip that line to leave the node name and anchor name the same, or change it to something else.
E.g. dumping {'FOO': 'BAR'} results in:
FOO_ALIAS: &FOO BAR
Also, I only wrote it to deal with a single top-level key/value pair at a time, and it will only force an anchor for the top-level key. If you want to turn a dict into a YAML file with all the keys being top-level YAML nodes, you will need to iterate over the dict and dump each key/value pair as {key: value}, or rewrite this class to handle a dict with multiple keys.
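For comparison, here is a different, minimal sketch of forcing anchors, not the class described above. It assumes the pure-Python yaml.Dumper and relies on serializer internals (anchor_node() and the anchors dict) that are not documented API. KeyAnchorDumper and _ANCHOR_SAFE are hypothetical names. It keeps the key unchanged, uses the key as the anchor name, and handles several top-level keys in one dump.

```python
import re
import yaml

# The emitter only accepts anchors made of letters, digits, '-' and '_'.
_ANCHOR_SAFE = re.compile(r'^[0-9A-Za-z_-]+$')

class KeyAnchorDumper(yaml.Dumper):
    """Hypothetical dumper: anchor each top-level value with its key name."""

    def anchor_node(self, node):
        # self.anchors is empty only when we are called for the root node.
        is_root = not self.anchors
        super().anchor_node(node)
        if not (is_root and isinstance(node, yaml.MappingNode)):
            return
        for key, value in node.value:
            if (isinstance(key, yaml.ScalarNode)
                    and _ANCHOR_SAFE.match(key.value)
                    # Leave values that already got a generated anchor alone.
                    and self.anchors.get(value) is None):
                self.anchors[value] = key.value

# default_flow_style=None asks for the inline list style used in the question;
# PyYAML 5.1 and later default to block style otherwise.
out = yaml.dump({'a': [1, 2, 3]}, Dumper=KeyAnchorDumper,
                default_flow_style=None)
scalar_out = yaml.dump({'FOO': 'BAR'}, Dumper=KeyAnchorDumper)

print(out)
print(scalar_out)
```

Keys that are not valid anchor names (for example, keys containing spaces) are skipped rather than rewritten, and a key that happens to match a generated name such as id001 could collide with it. Both are limitations of this sketch.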