I'd like to be able to dump a dictionary containing long strings that I'd like to have in the block style for readability. For example:
foo: |
  this is a
  block literal
bar: >
  this is a
  folded block
PyYAML supports the loading of documents with this style but I can't seem to find a way to dump documents this way. Am I missing something?
This can be done relatively easily. The only "hurdle" is indicating which of the spaces in a string that is to be represented as a folded scalar should become a fold. A literal scalar has explicit newlines carrying that information, but this cannot be used for folded scalars, as they can contain explicit newlines themselves (e.g. when there is leading whitespace) and also need a newline at the end in order not to be represented with a stripping chomping indicator (>-).
which gives:
The fold_pos attribute expects a reversible iterable, representing the positions of the spaces at which to fold.
If you never have pipe characters ('|') in your strings, you could have done something like:
which also gives exactly the output you expect
The result:
For completeness, one should also have str implementations, but I'm going to be lazy :-)
pyyaml does support dumping literal or folded blocks.
Using Representer.add_representer
defining types:
Then you can define the representers for those types. Please note that while Gary's solution works great for unicode, you may need some more work to get strings to work right (see implementation of represent_str).
Then you can add those representers to the default dumper:
... and test it:
result:
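The three steps above (define marker types, define representers, register them) can be put together in a minimal sketch for Python 3, where every str is already unicode. The names folded_str, literal_str, represent_folded and represent_literal are illustrative and not part of pyyaml. Passing default_flow_style=False explicitly is an assumption to keep the output in block style on older pyyaml versions.

```python
import yaml


class folded_str(str):
    """Marker type: dump this string as a folded block scalar (>)."""


class literal_str(str):
    """Marker type: dump this string as a literal block scalar (|)."""


def represent_folded(dumper, data):
    # Same tag as a plain string; only the requested style differs.
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style=">")


def represent_literal(dumper, data):
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")


# Register on the default Dumper, which yaml.dump() uses.
yaml.add_representer(folded_str, represent_folded)
yaml.add_representer(literal_str, represent_literal)

data = {
    "foo": literal_str("this is a\nblock literal\n"),
    "bar": folded_str("this is a folded block\n"),
}
dumped = yaml.dump(data, default_flow_style=False)
print(dumped)
```

Only values wrapped in one of the marker types are affected; plain strings and keys keep whatever style pyyaml would choose on its own.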
Using default_style
If you are interested in having all your strings follow a default style, you can also use the default_style keyword argument, e.g.:
or for folded literals:
or for double-quoted literals:
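As a rough sketch of the three variants, the snippet below dumps the same dictionary with each default_style value and checks that every result loads back to the original data. The sample data and variable names are illustrative, and default_flow_style=False is passed explicitly as an assumption to keep block style on older pyyaml versions.

```python
import yaml

data = {"foo": "this is a\nblock literal\n"}

# One dump per style: literal (|), folded (>), double-quoted (").
literal = yaml.dump(data, default_style="|", default_flow_style=False)
folded = yaml.dump(data, default_style=">", default_flow_style=False)
quoted = yaml.dump(data, default_style='"', default_flow_style=False)

for text in (literal, folded, quoted):
    print(text)
    # Whatever the style, the document should load back unchanged.
    assert yaml.safe_load(text) == data
```

Note that the default style applies to every scalar, keys included. Since a block scalar cannot be used as a simple mapping key, the keys come out double-quoted.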
Caveats:
Here is an example of something you may not expect:
results in:
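One surprise of this kind, offered as an illustration and not necessarily the case the answer had in mind, is that default_style is applied to non-string scalars too. In the sketch below (sample data is made up) the integer is written in a string style, so pyyaml has to add an explicit !!int tag to keep its type, and the keys are double-quoted.

```python
import yaml

data = {"count": 1, "name": "short"}

# default_style affects every scalar, not only the long strings.
dumped = yaml.dump(data, default_style="|", default_flow_style=False)
print(dumped)

# The document still round-trips, but it is noisier than expected.
assert yaml.safe_load(dumped) == data
```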
1) non-printable characters
See the YAML spec for escaped characters (Section 5.7):
If you want to preserve non-printable characters (e.g. TAB), you need to use double-quoted scalars. If you are able to dump a scalar with literal style, and there is a non-printable character (e.g. TAB) in there, your YAML dumper is non-compliant.
E.g. pyyaml detects the non-printable character \t and uses the double-quoted style even though a default style is specified:
2) leading and trailing white spaces
Another bit of useful information in the spec is:
This means that if your string does have leading or trailing white space, it would not be preserved in scalar styles other than double-quoted. As a consequence, pyyaml tries to detect what is in your scalar and may force the double-quoted style.
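Both caveats can be observed with a small sketch. The helper dump_literal and the sample strings are illustrative only. It requests the literal style for three values: one containing a TAB, one with a space directly before a line break, and one without either.

```python
import yaml


def dump_literal(data):
    """Dump with literal style requested for every scalar."""
    return yaml.dump(data, default_style="|", default_flow_style=False)


with_tab = {"k": "col1\tcol2\nrow\n"}
with_trailing_space = {"k": "line one \nline two\n"}
clean = {"k": "line one\nline two\n"}

for data in (with_tab, with_trailing_space, clean):
    dumped = dump_literal(data)
    print(dumped)
    # The content survives in every case; only the chosen style differs.
    assert yaml.safe_load(dumped) == data
```

Only the last value is emitted as a literal block. For the first two, pyyaml falls back to the double-quoted style, writing the TAB as an escape sequence.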