Parsing YAML, return with line number

Posted 2020-02-22 03:06

I'm writing a document generator driven by YAML data, and I want it to report which line of the YAML file each item was generated from. What is the best way to do this? For example, if the YAML file looks like this:

- key1: item 1
  key2: item 2
- key1: another item 1
  key2: another item 2

I want something like this:

[
     {'__line__': 1, 'key1': 'item 1', 'key2': 'item 2'},
     {'__line__': 3, 'key1': 'another item 1', 'key2': 'another item 2'},
]

I'm currently using PyYAML, but any other library is OK if I can use it from Python.

3 answers
一夜七次
Reply #2 · 2020-02-22 03:44

If you use ruamel.yaml >= 0.9 (of which I am the author) with the RoundTripLoader, collection items have an lc property that gives the line and column where they start in the source YAML:

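import textwrap
from ruamel.yaml import YAML

def load(text):
    # `load` is not defined in the excerpt. This stand-in assumes it dedents
    # the literal, drops its leading newline, and parses in round-trip mode
    # (the YAML() default). On old releases without YAML(), pass
    # Loader=RoundTripLoader to ruamel.yaml.load() instead.
    return YAML().load(textwrap.dedent(text).lstrip('\n'))

def test_item_04():
    data = load("""
     # testing line and column based on SO
     # http://stackoverflow.com/questions/13319067/
     - key1: item 1
       key2: item 2
     - key3: another item 1
       key4: another item 2
        """)
    assert data[0].lc.line == 2
    assert data[0].lc.col == 2
    assert data[1].lc.line == 4
    assert data[1].lc.col == 2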

(line and column start counting at 0).

This answer shows how to add the lc attribute to string types during loading.

看我几分像从前
Reply #3 · 2020-02-22 03:53

Here's an improved version of puzzlet's answer:

import yaml
from yaml.loader import SafeLoader

class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node, deep=False):
        mapping = super(SafeLineLoader, self).construct_mapping(node, deep=deep)
        # Add 1 so line numbering starts at 1
        mapping['__line__'] = node.start_mark.line + 1
        return mapping

You can use it like this:

data = yaml.load(whatever, Loader=SafeLineLoader)
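
As a quick check, here is the loader applied to the YAML from the question. This is a sketch that assumes PyYAML is installed. The class is repeated so the snippet runs on its own, and SOURCE is just the question's sample inlined as a string:

```python
import yaml
from yaml.loader import SafeLoader

class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        # start_mark.line is 0-based, so add 1 for editor-style numbering
        mapping['__line__'] = node.start_mark.line + 1
        return mapping

# The sample from the question, inlined as a string.
SOURCE = """\
- key1: item 1
  key2: item 2
- key1: another item 1
  key2: another item 2
"""

data = yaml.load(SOURCE, Loader=SafeLineLoader)
for item in data:
    print(item['__line__'], item['key1'])
# prints:
# 1 item 1
# 3 another item 1
```

Because the hook sits in construct_mapping, every mapping in the document receives a '__line__' key, nested ones included, so code that iterates over the keys may need to skip it.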
做自己的国王
Reply #4 · 2020-02-22 03:54

I got this working by hooking Composer.compose_node and Constructor.construct_mapping:

import yaml
from yaml.composer import Composer
from yaml.constructor import Constructor

def main():
    # Read the whole file up front so the handle is closed before parsing
    with open('data.yml') as f:
        loader = yaml.Loader(f.read())
    def compose_node(parent, index):
        # the line number where the previous token has ended (plus empty lines)
        line = loader.line
        node = Composer.compose_node(loader, parent, index)
        node.__line__ = line + 1
        return node
    def construct_mapping(node, deep=False):
        mapping = Constructor.construct_mapping(loader, node, deep=deep)
        mapping['__line__'] = node.__line__
        return mapping
    # Instance attributes shadow the class methods for this loader only
    loader.compose_node = compose_node
    loader.construct_mapping = construct_mapping
    data = loader.get_single_data()
    print(data)

if __name__ == '__main__':
    main()