PyYAML and using quotes for strings only

Posted 2019-02-22 05:47

Question:

I have the following YAML file:

---
my_vars:
  my_env: "dev"
  my_count: 3

When I read it with PyYAML and dump it again, I get the following output:

---
my_vars:
  my_env: dev
  my_count: 3

The code in question:

import yaml

with open(env_file) as f:
    env_dict = yaml.safe_load(f)  # yaml.load() without a Loader is deprecated
    print(yaml.dump(env_dict, indent=4, default_flow_style=False, explicit_start=True))

I tried using the default_style parameter:

with open(env_file) as f:
    env_dict = yaml.safe_load(f)  # yaml.load() without a Loader is deprecated
    print(yaml.dump(env_dict, indent=4, default_flow_style=False, explicit_start=True, default_style='"'))

But now I get:

---
"my_vars":
  "my_env": "dev"
  "my_count": !!int "3"

What do I need to do to keep the original formatting, without making any assumptions about the variable names in the YAML file?

Answer 1:

I suggest switching to the ruamel.yaml package, which supports YAML 1.2 (released in 2009) and is backwards compatible, instead of PyYAML, which implements most of YAML 1.1 (2005). (Disclaimer: I am the author of that package.)

Then you just enable preserve_quotes when loading, so that the YAML file round-trips:

import sys
import ruamel.yaml

yaml_str = """\
---
my_vars:
  my_env: "dev"    # keep "dev" quoted
  my_count: 3
"""

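# newer ruamel.yaml releases use the YAML() API instead of
# round_trip_load() / round_trip_dump()
yaml = ruamel.yaml.YAML()  # round-trip mode is the default
yaml.preserve_quotes = True
yaml.explicit_start = True
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)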

which outputs (including the preserved comment):

---
my_vars:
  my_env: "dev"    # keep "dev" quoted
  my_count: 3

After loading, the string scalars are instances of a subclass of str, so that they can carry the quoting information, but they behave like normal strings for all other purposes. If you want to replace such a string, though (e.g. dev with fgw), you have to wrap the new value in that subclass (DoubleQuotedScalarString from ruamel.yaml.scalarstring).

When round-tripping, ruamel.yaml by default preserves the (insertion) order of the keys.



Answer 2:

Right, so borrowing heavily from this answer, you can do something like this:

import yaml

# define a custom representer for strings
def quoted_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')

yaml.add_representer(str, quoted_presenter)


env_file = 'input.txt'
with open(env_file) as f:
    env_dict = yaml.safe_load(f)  # yaml.load() without a Loader is deprecated
    print(yaml.dump(env_dict, default_flow_style=False))

However, this overrides the representer for every string in the dictionary, so it quotes the keys as well, not just the values.

It prints:

"my_vars":
  "my_count": 3
  "my_env": "dev"

Is this what you want? I'm not sure what you mean by variable names; do you mean the keys?
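If the goal is to quote only the values, one possible workaround with plain PyYAML is to register the representer on a marker subclass of str and wrap just the string values before dumping. This is a minimal sketch, not a built-in PyYAML feature: QuotedStr and quote_values are hypothetical names made up for this example, and it quotes every string value, because PyYAML does not remember which ones were quoted in the input.

```python
import yaml


class QuotedStr(str):
    """Marker type (hypothetical name): strings to be dumped double-quoted."""


def quoted_representer(dumper, data):
    # Same representer idea as above, but registered only for QuotedStr.
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style='"')


yaml.add_representer(QuotedStr, quoted_representer)


def quote_values(node):
    """Recursively wrap string values (never mapping keys) in QuotedStr."""
    if isinstance(node, dict):
        return {key: quote_values(value) for key, value in node.items()}
    if isinstance(node, list):
        return [quote_values(item) for item in node]
    if isinstance(node, str):
        return QuotedStr(node)
    return node


env_dict = yaml.safe_load('my_vars:\n  my_env: "dev"\n  my_count: 3\n')
output = yaml.dump(quote_values(env_dict), default_flow_style=False, explicit_start=True)
print(output)
```

The keys stay plain str objects, so they go through the default representer and come out unquoted, while integers such as my_count are left untouched.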