Using Python 2 (atm) and ruamel.yaml 0.13.14 (RedHat EPEL)
I'm currently writing some code to load YAML definitions, but they are split up over multiple files. The user-editable part contains e.g.
    users:
      xxxx1:
        timestamp: '2018-10-22 11:38:28.541810'
        << : *userdefaults
      xxxx2:
        << : *userdefaults
        timestamp: '2018-10-22 11:38:28.541810'
The defaults are stored in another file, which is not editable:
    userdefaults: &userdefaults
      # Default values for user settings
      fileCountQuota: 1000
      diskSizeQuota: "300g"
I can process these together by loading both, concatenating the strings, and then running them through
    merged_data = list(yaml.load_all("{}\n{}".format(defaults_data, user_data), Loader=yaml.RoundTripLoader))
which correctly resolves everything. (When not using RoundTripLoader I get errors that the references cannot be resolved, which is normal.)
Now, I want to do some updates via Python code (e.g. update the timestamp), and for that I need to write back just the user part. And that's where things get hairy. So far I haven't found a way to write just that YAML document, not both.
First of all, unless there are multiple documents in your defaults file, you don't have to use load_all, as you don't concatenate two documents into a multiple-document stream. If you had, by using a format string with a document-end marker ("{}\n...\n{}") or with a directives-end marker ("{}\n---\n{}"), your aliases would not carry over from one document to another, as per the YAML specification: the anchor has to be in the document, not just in the stream (which can consist of multiple documents).
I tried some hocus pocus, pre-populating the already represented dictionary of anchored nodes:
Since the PyYAML-based API requires a class instead of an object, you need to use a class generator that actually adds the data elements to pre-populate on the fly from within yaml.load().
But this doesn't work, as a node only gets written out with an anchor once it is determined that the anchor is used (i.e. there is a second reference). So actually the first merge key gets written out as an anchor. And although I am quite familiar with the code base, I could not get this to work properly in a reasonable amount of time.
So instead, I would just rely on the fact that there is only one key that matches the first key of users.yaml at the root level of the dump of the combined updated file and strip anything before that.
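A minimal sketch of that stripping step, using only string handling so it runs on Python 2.7 as well as Python 3. The function name strip_before_root_key and the variable combined_dump are made up for illustration, and the sample text is hand-written to resemble a dump of the combined data rather than actual ruamel.yaml output. It assumes the root-level key starts at column 0 and appears only once at that level:

```python
def strip_before_root_key(dumped, root_key):
    """Return `dumped` starting at the line where `root_key` begins at column 0.

    Hypothetical helper: assumes block-style YAML in which root-level keys
    are not indented and `root_key` occurs only once at the root level.
    """
    marker = root_key + ":"
    lines = dumped.splitlines(True)  # True keeps the line endings
    for index, line in enumerate(lines):
        if line.startswith(marker):
            return "".join(lines[index:])
    raise ValueError("root-level key {!r} not found".format(root_key))


# Hand-written stand-in for the dump of the combined, updated data.
combined_dump = """\
userdefaults: &userdefaults
  # Default values for user settings
  fileCountQuota: 1000
  diskSizeQuota: 300g
users:
  xxxx1:
    timestamp: '2018-10-22 11:38:28.541810'
    <<: *userdefaults
  xxxx2:
    <<: *userdefaults
    timestamp: '2018-10-22 11:38:28.541810'
"""

user_part = strip_before_root_key(combined_dump, "users")
print(user_part)
```

The returned text still contains the *userdefaults aliases but no longer the anchor, so it is only loadable again after being concatenated with the defaults file, which is exactly the shape the user-editable file had to begin with.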
I had to make a virtualenv to make sure I could run the above with ruamel.yaml==0.13.14. That version is from the time I was still young (I won't claim to have been innocent). There have been over 85 releases of the library since then.
I can understand that you might not be able to run anything but Python 2 at the moment and cannot compile/use a newer version. But what you really should do is install virtualenv (can be done using EPEL, but also without further "polluting" your system installation), make a virtualenv for the code you are developing and install the latest version of ruamel.yaml (and your other libraries) in there. You can also do that if you need to distribute your software to other systems, just install virtualenv there as well.
I have all my utilities under /opt/util, managed by virtualenvutils, a wrapper around virtualenv.
For writing the user part, you will have to manually split the multi-file output of yaml.dump() and write the appropriate part back to the users YAML file.