When using ruamel.yaml version 0.15.92 with Python 3.6.6 on CentOS 7, I cannot seem to update the value of an anchored scalar in a sequence without either destroying the anchor itself or producing invalid YAML on the next dump.
I have attempted to recreate the original node type with the new value (old PlainScalarString -> new PlainScalarString, old FoldedScalarString -> new FoldedScalarString, etc.), copying the anchor to it. While this restores the anchor to the updated scalar value, it also creates invalid YAML because the first alias later in the YAML file duplicates the same anchor name and assigns to it the old value of the scalar I'm trying to update.
I then attempted to replace all of the affected aliases with actual alias text -- like *anchor_name -- but that causes the value to become quoted like '*anchor_name', rendering the alias useless.
I reverted that and then attempted to suppress the duplicate anchor name (by setting always_dump=False on every affected alias). While that does suppress the duplicate anchor name, it unfortunately just dumps the old value of the anchored scalar.
My entire test data is as follows; assume this is named test.yaml:
# Header comment
---
# Post-header comment
# Reusable aliases
aliases:
- &plain_value This is unencrypted
- &string_password ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAYnFbMveZGBgd9aw7h4VV+M202zRdcP96UQs1q+ViznJK2Ee08hoW9jdIqVhNaecYALUihKjVYijJa649VF7BLZXV0svLEHD8LZeduoLS3iC9uszdhDFB2Q6R/Vv/ARjHNoWc6/D0nFN9vwcrQNITnvREl0WXYpR9SmW0krUpyr90gSAxTxPNJVlEOtA0afeJiXOtQEu/b8n+UDM3eXXRO+2SEXM4ub7fNcj6V9DgT3WwKBUjqzQ5DicnB19FNQ1cBGcmCo8qRv0JtbVqZ4+WJFGc06hOTcAJPsAaWWUn80ChcTnl4ELNzpJFoxAxHgepirskuIvuWZv3h/PL8Ez3NDBMBgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBBSuVIsvWXMmdFJtJmtJxXxgCAGFCioe/zdphGqynmj6vVDnCjA3Xc0VPOCmmCl/cTKdg==]
- &block_password >
  ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIBHQIBADAFMAACAQEw
  DQYJKoZIhvcNAQEBBQAEggEAojErrxuNcdX6oR+VA/I3PyuV2CwXx166nIUp
  asEHo1/CiCIoE3qCnjK2FJF8vg+l3AqRmdb7vYrqQ+30RFfHSlB9zApSw8NW
  tnEpawX4hhKAxnTc/JKStLLu2k7iZkhkor/UA2HeVJcCzEeYAwuOQRPaolmQ
  TGHjvm2w6lhFDKFkmETD/tq4gQNcOgLmJ+Pqhogr/5FmGOpJ7VGjpeUwLteM
  er3oQozp4l2bUTJ8wk9xY6cN+eeOIcWXCPPdNetoKcVropiwrYH8QV4CZ2Ky
  u0vpiybEuBCKhr1EpfqhrtuG5s817eOb7+Wf5ctR0rPuxlTUqdnDY31zZ3Kb
  mcjqHDBMBgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBBATq6BjaxU2bfcLL5S
  bxzsgCDsWzggzxsCw4Dp0uYLwvMKjJEpMLeFXGrLHJzTF6U2Nw==]
top_key: unencrypted value
top_alias: *plain_value
top::hash:
  ignore: more
  # This pulls its string-form value from above
  stringified_alias: *string_password
  sub:
    ignore: value
    key: unencrypted subbed-value
    # This pulls its block-form value from above
    blocked_alias: *block_password
  sub_more:
    # This is a stringified EYAML value, NOT an alias
    inline_string: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAafmyrrae2kx8HdyPmn/RHQRcTPhqpx5Idm12hCDCIbwVM++H+c620z4EN2wlugz/GcLaiGsybaVWzAZ+3r+1+EwXn5ec4dJ5TTqo7oxThwUMa+SHliipDJwGoGii/H+y2I+3+irhDYmACL2nyJ4dv4IUXwqkv6nh1J9MwcOkGES2SKiDm/WwfkbPIZc3ccp1FI9AX/m3SVqEcvsrAfw6HtkolM22csfuJREHkTp7nBapDvOkWn4plzfOw9VhPKhq1x9DUCVFqqG/HAKv++v4osClK6k1MmSJWaMHrW1z3n7LftV9ZZ60E0Cgro2xSaD+itRwBp07H0GeWuoKB4+44TBMBgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBCRv9r2lvQ1GJMoD064EtdigCCw43EAKZWOc41yEjknjRaWDm1VUug6I90lxCsUrxoaMA==]
    # Also NOT an alias, in block form
    block_string: >
      ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIBHQIBADAFMAACAQEw
      DQYJKoZIhvcNAQEBBQAEggEAafmyrrae2kx8HdyPmn/RHQRcTPhqpx5Idm12
      hCDCIbwVM++H+c620z4EN2wlugz/GcLaiGsybaVWzAZ+3r+1+EwXn5ec4dJ5
      TTqo7oxThwUMa+SHliipDJwGoGii/H+y2I+3+irhDYmACL2nyJ4dv4IUXwqk
      v6nh1J9MwcOkGES2SKiDm/WwfkbPIZc3ccp1FI9AX/m3SVqEcvsrAfw6Htko
      lM22csfuJREHkTp7nBapDvOkWn4plzfOw9VhPKhq1x9DUCVFqqG/HAKv++v4
      osClK6k1MmSJWaMHrW1z3n7LftV9ZZ60E0Cgro2xSaD+itRwBp07H0GeWuoK
      B4+44TBMBgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBCRv9r2lvQ1GJMoD064
      EtdigCCw43EAKZWOc41yEjknjRaWDm1VUug6I90lxCsUrxoaMA==]
# Signature line
There are two forms of this issue, so here are two code examples for reproducing the conditions:
First, "How can we most simply update the value of an anchored scalar in a sequence without destroying the anchor or its aliases?" This looks like:
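import sys
from ruamel.yaml import YAML

yaml = YAML()  # assumed setup: a default round-trip instance
with open('test.yaml', 'r') as f:
    yaml_data = yaml.load(f)
yaml_data['aliases'][1] = "New string password"
yaml.dump(yaml_data, sys.stdout)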
Note that this destroys the anchor. I would very much prefer the solution look as similar to this first snippet as possible; perhaps something like yaml_data['aliases'][1].set_value("New string password"), a hypothetical method that would change only the scalar value while preserving the original anchor, comments, position, et al.
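A plausible reading of why the anchor vanishes in this first form, assuming the round-trip loader hands back one Python object for an anchored scalar and reuses that same object wherever its alias appears: plain assignment rebinds only the list slot, so the other reference keeps pointing at the old, still-anchored object. The sketch below uses only plain Python containers; AnchoredStr is a hypothetical stand-in and not a ruamel.yaml type.

```python
class AnchoredStr(str):
    """Hypothetical stand-in for a round-trip scalar that carries an anchor name."""

    def __new__(cls, value, anchor=None):
        obj = super().__new__(cls, value)
        obj.anchor = anchor
        return obj


# One object referenced from two places, as an anchor and its alias would be.
shared = AnchoredStr("Old value", anchor="some_anchor")
data = {"aliases": [shared], "usage": shared}
assert data["usage"] is data["aliases"][0]

# Plain assignment rebinds only the list slot; the other reference is untouched.
data["aliases"][0] = "NEW VALUE"

print(repr(data["aliases"][0]))             # the new str, which has no anchor
print(data["usage"], data["usage"].anchor)  # the old object, anchor still attached
```

If that assumption holds, it would also explain why the old value and the original anchor name resurface at the first former alias position when the data is dumped.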
Second, "If we must instead wrap the new value in some object to preserve the anchor (and other attributes of the entry being replaced), what is the simplest approach which also preserves all aliases that refer to it (such that they adopt the updated value) when dumped?" My attempt to solve this requires quite a lot more code including recursive functions. Since SO guidelines advise against dumping large code, I will offer the relevant bits. Please assume the unlisted code is working perfectly well.
import sys
from collections import deque
from ruamel.yaml.scalarstring import FoldedScalarString, PlainScalarString

### <snip def FindEYAMLPaths(...) returns lists of paths through the YAML to every value starting with 'ENC['>
### <snip def GetYAMLValue(...) returns the node -- as a PlainScalarString, FoldedScalarString, et al. -- identified by a path from FindEYAMLPaths>
### <snip def DisableAnchorDump(...) sets `anchor.always_dump=False` if the node has an anchor attribute>

def ReplaceYAMLValue(value, data, path=None):
    if path is None:
        return

    # Walk down to the parent container of the target node
    ref = data
    last_ref = path.pop()
    for p in path:
        ref = ref[p]

    # All I'm trying to do here is change the scalar value without disrupting its comments, anchor, positioning, or any of its aliases.
    # This succeeds in changing the scalar value and preserving its original anchor, but disrupts its aliases which insist on preserving the old value.
    if isinstance(ref[last_ref], PlainScalarString):
        ref[last_ref] = PlainScalarString(value, anchor=ref[last_ref].anchor.value)
    elif isinstance(ref[last_ref], FoldedScalarString):
        ref[last_ref] = FoldedScalarString(value, anchor=ref[last_ref].anchor.value)
    else:
        ref[last_ref] = value

# `yaml` is the same YAML() instance used in the first example
with open('test.yaml', 'r') as f:
    yaml_data = yaml.load(f)

seen_anchors = []
for path in FindEYAMLPaths(yaml_data):
    if path is None:
        continue

    node = GetYAMLValue(yaml_data, deque(path))
    if hasattr(node, 'anchor'):
        test_anchor = node.anchor.value
        if test_anchor is not None:
            if test_anchor in seen_anchors:
                # This is expected to just be an alias, pointing at the newly updated anchor
                DisableAnchorDump(node)
                continue
            seen_anchors.append(test_anchor)

    ReplaceYAMLValue("New string password", yaml_data, path)

yaml.dump(yaml_data, sys.stdout)
Note that this produces valid YAML except that all of the affected aliases are gone, replaced instead by the old value of the anchored scalar.
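One direction that might avoid this, sketched under the assumption that an alias is simply a second reference to the same Python object as its anchored scalar: build a single replacement object and install it at every position that referenced the old one, comparing by identity rather than by value. The helper replace_shared_scalar and the AnchoredStr class are hypothetical names for illustration, and the sketch runs on plain dict and list only; whether ruamel.yaml's own container types can be walked the same way is something I have not verified.

```python
class AnchoredStr(str):
    """Hypothetical stand-in for a round-trip scalar that carries an anchor name."""

    def __new__(cls, value, anchor=None):
        obj = super().__new__(cls, value)
        obj.anchor = anchor
        return obj


def replace_shared_scalar(node, old, new):
    """Swap every reference to `old` (compared by identity) for `new`, in place."""
    if isinstance(node, dict):
        for key, value in list(node.items()):
            if value is old:
                node[key] = new
            else:
                replace_shared_scalar(value, old, new)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            if item is old:
                node[index] = new
            else:
                replace_shared_scalar(item, old, new)


old = AnchoredStr("Old value", anchor="some_anchor")
data = {"aliases": [old], "usage": old, "nested": {"deep": [old, "other"]}}

# Build ONE replacement of the same type, carrying the same anchor name,
# then install it everywhere the old object was referenced.
new = type(old)("NEW VALUE", anchor=old.anchor)
replace_shared_scalar(data, old, new)

print(data["usage"] is data["aliases"][0])  # both positions share one object again
```

The point of the design is that no position ever receives its own copy of the new value, so the shared identity that an alias relies on is kept intact.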
I expect to be able to change the value of an aliased scalar in a sequence without disrupting any other part of the YAML content. Based on other posts I've seen about ruamel.yaml, I fully accept that I may need to dump the updated YAML to file and reload it for the in-memory aliases to update to the new value. I simply expect to change:
Input File
aliases:
- &some_anchor Old value
usage: *some_anchor
to:
Output File
aliases:
- &some_anchor NEW VALUE
usage: *some_anchor
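The expectation above can be checked mechanically on dumped text. The following is only a rough, text-level sanity check and not a YAML parser: it counts anchor definitions and alias uses with regular expressions, skipping whole-line comments. The function name anchor_report is made up for this illustration, and the check would be fooled by `&` or `*` appearing after whitespace inside ordinary string values.

```python
import re
from collections import Counter


def anchor_report(yaml_text):
    """Return (duplicated anchor names, alias names with no anchor); text-level only."""
    anchors, aliases = Counter(), Counter()
    for line in yaml_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip whole-line comments
        anchors.update(re.findall(r"(?:^|\s)&([\w-]+)", line))
        aliases.update(re.findall(r"(?:^|\s)\*([\w-]+)", line))
    duplicated = sorted(name for name, count in anchors.items() if count > 1)
    dangling = sorted(name for name in aliases if name not in anchors)
    return duplicated, dangling


# The output I expect: one anchor definition, one alias that refers to it.
expected = "aliases:\n- &some_anchor NEW VALUE\nusage: *some_anchor\n"
# The kind of output I want to avoid: the same anchor name defined twice.
broken = "aliases:\n- &some_anchor NEW VALUE\nusage: &some_anchor Old value\n"

print(anchor_report(expected))
print(anchor_report(broken))
```

Two empty lists mean the text passes this limited check; a name in the first list signals the duplicate-anchor problem described earlier.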
Instead, here's the output from the above two examples:
First, notice that the original anchor was destroyed and the value for top::hash:stringified_alias: now carries the original anchor and old value instead of the alias to the newly updated scalar value at ['aliases'][1]:
---
# Post-header comment
# Reusable aliases
aliases:
- &plain_value This is unencrypted
- New string password
- &block_password >
  ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIBHQIBADAFMAACAQEw
  DQYJKoZIhvcNAQEBBQAEggEAojErrxuNcdX6oR+VA/I3PyuV2CwXx166nIUp
  asEHo1/CiCIoE3qCnjK2FJF8vg+l3AqRmdb7vYrqQ+30RFfHSlB9zApSw8NW
  tnEpawX4hhKAxnTc/JKStLLu2k7iZkhkor/UA2HeVJcCzEeYAwuOQRPaolmQ
  TGHjvm2w6lhFDKFkmETD/tq4gQNcOgLmJ+Pqhogr/5FmGOpJ7VGjpeUwLteM
  er3oQozp4l2bUTJ8wk9xY6cN+eeOIcWXCPPdNetoKcVropiwrYH8QV4CZ2Ky
  u0vpiybEuBCKhr1EpfqhrtuG5s817eOb7+Wf5ctR0rPuxlTUqdnDY31zZ3Kb
  mcjqHDBMBgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBBATq6BjaxU2bfcLL5S
  bxzsgCDsWzggzxsCw4Dp0uYLwvMKjJEpMLeFXGrLHJzTF6U2Nw==]
# ... snip ...
top::hash:
  ignore: more
  # This pulls its string-form value from above
  stringified_alias: &string_password ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAYnFbMveZGBgd9aw7h4VV+M202zRdcP96UQs1q+ViznJK2Ee08hoW9jdIqVhNaecYALUihKjVYijJa649VF7BLZXV0svLEHD8LZeduoLS3iC9uszdhDFB2Q6R/Vv/ARjHNoWc6/D0nFN9vwcrQNITnvREl0WXYpR9SmW0krUpyr90gSAxTxPNJVlEOtA0afeJiXOtQEu/b8n+UDM3eXXRO+2SEXM4ub7fNcj6V9DgT3WwKBUjqzQ5DicnB19FNQ1cBGcmCo8qRv0JtbVqZ4+WJFGc06hOTcAJPsAaWWUn80ChcTnl4ELNzpJFoxAxHgepirskuIvuWZv3h/PL8Ez3NDBMBgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBBSuVIsvWXMmdFJtJmtJxXxgCAGFCioe/zdphGqynmj6vVDnCjA3Xc0VPOCmmCl/cTKdg==]
# ... snip ...
Second, notice that ['aliases'][1] now looks correct -- it is the new value with the original anchor -- but where I expect to see aliases to it, I instead see the old value. I expect to see *string_password instead of ENC[...].
---
# Post-header comment
# Reusable aliases
aliases:
- &plain_value This is unencrypted
- &string_password New string password
- &block_password >-
  New string password
# ... snip ...
top::hash:
  ignore: more
  # This pulls its string-form value from above
  stringified_alias: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAYnFbMveZGBgd9aw7h4VV+M202zRdcP96UQs1q+ViznJK2Ee08hoW9jdIqVhNaecYALUihKjVYijJa649VF7BLZXV0svLEHD8LZeduoLS3iC9uszdhDFB2Q6R/Vv/ARjHNoWc6/D0nFN9vwcrQNITnvREl0WXYpR9SmW0krUpyr90gSAxTxPNJVlEOtA0afeJiXOtQEu/b8n+UDM3eXXRO+2SEXM4ub7fNcj6V9DgT3WwKBUjqzQ5DicnB19FNQ1cBGcmCo8qRv0JtbVqZ4+WJFGc06hOTcAJPsAaWWUn80ChcTnl4ELNzpJFoxAxHgepirskuIvuWZv3h/PL8Ez3NDBMBgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBBSuVIsvWXMmdFJtJmtJxXxgCAGFCioe/zdphGqynmj6vVDnCjA3Xc0VPOCmmCl/cTKdg==]
# ... snip ...