I need to get the line numbers of certain keys of a YAML file.
Please note, this answer does not solve the issue: I do use ruamel.yaml, and the answers do not work with ordered maps.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from ruamel import yaml
data = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")
print(data)
As a result I get this:
CommentedMap([('key1', CommentedOrderedMap([('key2', 'item2'), ('key3', 'item3'), ('key4', CommentedOrderedMap([('key5', 'item5'), ('key6', 'item6')]))]))])
which does not allow access to the line numbers, except for the !!omap keys:
print(data['key1'].lc.line) # output: 1
print(data['key1']['key4'].lc.line) # output: 4
but:
print(data['key1']['key2'].lc.line) # output: AttributeError: 'str' object has no attribute 'lc'
Indeed, data['key1']['key2'] is a str.
I've found a workaround:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from ruamel import yaml
DATA = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")
def get_line_nb(data):
    if isinstance(data, dict):
        offset = data.lc.line
        for i, key in enumerate(data):
            if isinstance(data[key], dict):
                get_line_nb(data[key])
            else:
                print('{}|{} found in line {}\n'
                      .format(key, data[key], offset + i + 1))
get_line_nb(DATA)
output:
key2|item2 found in line 2
key3|item3 found in line 3
key5|item5 found in line 5
key6|item6 found in line 6
but this looks a little bit "dirty". Is there a more proper way of doing it?
EDIT: this workaround is not only dirty, but only works for simple cases like the one above, and will give wrong results as soon as there are nested lists in the way
The issue is not that you are using !!omap and that it doesn't give you the line numbers as with "normal" mappings. That should be clear from the fact that you get 4 from doing print(data['key1']['key4'].lc.line) (where key4 is a key in the outer !!omap).
As this answer indicates, the lc attribute is available on collection items. The value for data['key1']['key4'] is a collection item (another !!omap), but the value for data['key1']['key2'] is not a collection item but a built-in Python string, which has no slot to store the lc attribute.
To get an .lc attribute on a non-collection like a string, you have to subclass the RoundTripConstructor to use something like the classes in scalarstring.py (with __slots__ adjusted to accept the lc attribute), and then transfer the line and column information available in the nodes to that attribute:
Please note that the output of the last call to print is 6, as the literal scalar string starts with the |.
If you also want to dump data, you'll need to make a Representer aware of those My.... types.
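
The core of the idea can be illustrated without ruamel.yaml at all. The following is only a minimal sketch using the standard library, not the library's implementation: LineCol, Str, scalar_with_position and the fake node are hypothetical stand-ins, and the node attributes (value, start_mark.line, start_mark.column) are assumed to mirror what a scalar node exposes. It shows that a plain str rejects a new attribute, while a str subclass with an extra slot can carry the position copied from the node.

```python
from types import SimpleNamespace


class LineCol:
    """Hypothetical stand-in for the object stored as .lc on collections."""
    __slots__ = ('line', 'col')

    def __init__(self, line, col):
        self.line = line
        self.col = col


class Str(str):
    """str subclass with one extra slot so that an lc attribute fits."""
    __slots__ = ('lc',)


def scalar_with_position(node):
    """Build a Str from a node and copy the node's start position onto it."""
    ret_val = Str(node.value)
    ret_val.lc = LineCol(node.start_mark.line, node.start_mark.column)
    return ret_val


# A built-in str has no place to store extra attributes.
plain = 'item2'
try:
    plain.lc = LineCol(2, 10)
    plain_str_accepts_lc = True
except AttributeError:
    plain_str_accepts_lc = False

# Fake scalar node with made-up position values (assumption: real nodes
# expose value and start_mark.line / start_mark.column in the same way).
node = SimpleNamespace(value='item2',
                       start_mark=SimpleNamespace(line=2, column=10))
item = scalar_with_position(node)

print(plain_str_accepts_lc)              # False
print(item, item.lc.line, item.lc.col)   # item2 2 10
```

In a real setup, the body of scalar_with_position would presumably live in the scalar-construction method of the RoundTripConstructor subclass, with one such subclass per scalar string style that needs to be preserved.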