
I'd like to generate anchors in the YAML produced by PyYAML's dump() function. Is there a way to do this? Ideally, each anchor would have the same name as its YAML node.

Example:

import yaml
yaml.dump({'a': [1,2,3]})
'a: [1, 2, 3]\n'

What I'd like to do instead is generate YAML like this:

import yaml
yaml.dump({'a': [1,2,3]})
'a: &a [1, 2, 3]\n'

Can I write a custom emitter or dumper to do this? Is there another way?


5 Answers


By default, PyYAML emits an anchor only when it detects a reference to a previously seen object:

>>> import yaml
>>>
>>> foo = {'a': [1,2,3]}
>>> doc = (foo,foo)
>>>
>>> print(yaml.safe_dump(doc, default_flow_style=False))
- &id001
  a:
  - 1
  - 2
  - 3
- *id001
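
To make the "previously seen object" rule concrete, here is a minimal sketch (the variable names are mine, not from the question). Two lists that are equal but separate objects get no anchor, while one list object referenced twice gets an anchor and an alias:

```python
import yaml

shared = [1, 2, 3]

# Two equal but distinct list objects: nothing is shared, so no anchor.
copies_text = yaml.safe_dump([[1, 2, 3], [1, 2, 3]])

# The same list object referenced twice: anchor on first use, alias on second.
shared_text = yaml.safe_dump([shared, shared])

print(copies_text)
print(shared_text)
```

So the dumper keys off object identity, not equality.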

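If you want to override how the anchor is named, you'll have to customize the Dumper class, specifically the generate_anchor() function. ANCHOR_TEMPLATE may also be useful.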
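
As a rough sketch of both hooks (the class names RefDumper and KindDumper and the naming schemes are made up for illustration): changing ANCHOR_TEMPLATE only alters the format of the running counter, while overriding generate_anchor() lets you build the name from the node itself.

```python
import yaml


class RefDumper(yaml.SafeDumper):
    # PyYAML's default template is 'id%03d'; the counter fills the placeholder.
    ANCHOR_TEMPLATE = "ref%02d"


class KindDumper(yaml.SafeDumper):
    def generate_anchor(self, node):
        # Illustrative scheme: prefix the default id with the node kind
        # ('scalar', 'sequence' or 'mapping').
        return "{}_{}".format(node.id, super().generate_anchor(node))


shared = {"a": [1, 2, 3]}
doc = [shared, shared]

template_text = yaml.dump(doc, Dumper=RefDumper, default_flow_style=False)
kind_text = yaml.dump(doc, Dumper=KindDumper, default_flow_style=False)
print(template_text)
print(kind_text)
```

Note that generate_anchor() is still called only for nodes that are referenced more than once, so neither variant forces an anchor onto an unshared node.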

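In your example the node name is straightforward, but you need to take into account the many possibilities for YAML values; for instance, a key could be a sequence rather than a single scalar: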

>>> import yaml
>>>
>>> foo = {('a', 'b', 'c'): [1,2,3]}
>>> doc = (foo,foo)
>>>
>>> print(yaml.dump(doc, default_flow_style=False))
!!python/tuple
- &id001
  ? !!python/tuple
  - a
  - b
  - c
  : - 1
    - 2
    - 3
- *id001
answered 2014-04-02T21:52:28.333

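This is not easy, unless the data you want to use for the anchor is inside the node itself. The anchor is attached to the node's contents, "[1,2,3]" in your example, and the node does not know that this value is associated with the key "a".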

import yaml

l = [1, 2, 3]
foo = {'a': l, 'b': l}

class SpecialAnchor(yaml.Dumper):

    def generate_anchor(self, node):
        # Only called for nodes that are referenced more than once.
        print('Generating anchor for {}'.format(str(node)))
        anchor = super().generate_anchor(node)
        print('Generated "{}"'.format(anchor))
        return anchor

y1 = yaml.dump(foo, Dumper=SpecialAnchor)
print(y1)

This gives you:

Generating anchor for SequenceNode(tag='tag:yaml.org,2002:seq', value=[ScalarNode(tag='tag:yaml.org,2002:int', value='1'), ScalarNode(tag='tag:yaml.org,2002:int', value='2'), ScalarNode(tag='tag:yaml.org,2002:int', value='3')])
Generated "id001"
a: &id001 [1, 2, 3]
b: *id001

So far I haven't found a way to get the key "a" from the given node...
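
For what it's worth, the key does exist one level up, in the parent mapping node. The sketch below is only an illustration (it uses yaml.compose() on a literal string rather than a dumper): a MappingNode stores its entries as (key_node, value_node) pairs, while the value node itself carries no reference to its key.

```python
import yaml

# Build the node graph for a small document without constructing Python objects.
root = yaml.compose("a: [1, 2, 3]\n", Loader=yaml.SafeLoader)

# A MappingNode keeps its entries as a list of (key_node, value_node) pairs.
key_node, value_node = root.value[0]

print(type(root).__name__)
print(key_node.value)
print(type(value_node).__name__)
```

So a dumper that wants key-based names would probably have to inspect the mapping in anchor_node() instead of relying on the node passed to generate_anchor().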

answered 2015-09-08T03:21:22.330

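I wrote a custom anchor class to force an anchor value for top-level nodes. It does not simply override the anchor string (using generate_anchor); it actually forces the anchor to be emitted, even if the node is not referenced later: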

class CustomAnchor(yaml.Dumper):
    def __init__(self, *args, **kwargs):
        super(CustomAnchor, self).__init__(*args, **kwargs)
        self.depth = 0
        self.basekey = None
        self.newanchors = {}

    def anchor_node(self, node):
        self.depth += 1
        if self.depth == 2:
            assert isinstance(node, yaml.ScalarNode), "yaml node not a string: %s" % node
            self.basekey = str(node.value)
            node.value = self.basekey + "_ALIAS"
        if self.depth == 3:
            assert self.basekey, "could not find base key for value: %s" % node
            self.newanchors[node] = self.basekey
        super(CustomAnchor, self).anchor_node(node)
        if self.newanchors:
            self.anchors.update(self.newanchors)
            self.newanchors.clear()

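Note that I override the node name so that it is suffixed with "_ALIAS", but you can remove that line to keep the node name and anchor name the same, or change it to something else.

For example, dumping {'FOO': 'BAR'} results in:

FOO_ALIAS: &FOO BAR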

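Also, I only wrote it to handle a single top-level key/value pair at a time, and it only forces an anchor for the top-level key. If you want to turn a dict into a YAML file in which all keys are top-level YAML nodes, you will need to iterate over the dict and dump each key/value pair as {key: value}, or rewrite this class to handle a dict with multiple keys.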
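
The per-pair approach could look roughly like this. It is a sketch, not a tested general solution: KeyAnchorDumper is a trimmed copy of the class above without the "_ALIAS" renaming, and dump_with_key_anchors is a hypothetical helper name.

```python
import yaml


class KeyAnchorDumper(yaml.Dumper):
    """Trimmed variant of the class above; key names are left unchanged."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.depth = 0
        self.basekey = None
        self.newanchors = {}

    def anchor_node(self, node):
        self.depth += 1
        if self.depth == 2:  # second node visited: the single top-level key
            self.basekey = str(node.value)
        if self.depth == 3:  # third node visited: the value of that key
            self.newanchors[node] = self.basekey
        super().anchor_node(node)
        if self.newanchors:
            self.anchors.update(self.newanchors)
            self.newanchors.clear()


def dump_with_key_anchors(data):
    # Hypothetical helper: one dump() call, and so one fresh dumper, per pair.
    return "".join(
        yaml.dump({key: value}, Dumper=KeyAnchorDumper)
        for key, value in data.items()
    )


text = dump_with_key_anchors({"FOO": "BAR", "BAZ": [1, 2]})
print(text)
```

Because every pair is dumped separately, an object shared between two top-level values is written out in full each time instead of becoming an alias. Keys are used as anchor names as-is, so keys with characters that are not valid in an anchor name would need sanitizing first.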

answered 2016-03-29T22:06:45.713

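This question is quite old, and aaa90210 already gave some good pointers in his answer, but the class provided there didn't really do what I wanted, and I don't think it generalizes well.

I tried to come up with a dumper that allows adding anchors and makes sure the corresponding alias is created when the key appears again later in the file.

It is by no means feature-complete and could probably be made safer, but I hope it can serve as inspiration for others: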

import yaml
from typing import Dict


class CustomAnchor(yaml.Dumper):
    """Custom Dumper class to create anchors for keys throughout the YAML file.

    Attributes:
        added_anchors: mapping of key names to the node objects representing their value, for nodes that have an anchor
    """

    def __init__(self, *args, **kwargs):
        """Initialize class.

        We call the constructor of the parent class.
        """
        super().__init__(*args, **kwargs)
        self.filter_keys = ['a', 'b']
        self.added_anchors: Dict[str, yaml.ScalarNode] = {}

    def anchor_node(self, node):
        """Override method from parent class.

        This method first checks whether the node contains any of the keys of interest. If an anchor already exists
        for such a key, the value node is replaced with the one the anchor points to. If no anchor exists yet for
        the key, one is created and the value node is stored in the ``added_anchors`` attribute.

        Args:
            node (yaml.Node): the node being processed by the dumper
        """
        if isinstance(node, yaml.MappingNode):
            # let's check through the mapping to find keys which are of interest
            for i, (key_node, value_node) in enumerate(node.value):
                if (
                    isinstance(key_node, yaml.ScalarNode)
                    and key_node.value in self.filter_keys
                ):
                    if key_node.value in self.added_anchors:  # anchor exists
                        # replace value node to tell the dumper to create an alias
                        node.value[i] = (key_node, self.added_anchors[key_node.value])
                    else:  # no anchor yet exists but we need to create one
                        self.anchors.update({value_node: key_node.value})
                        self.added_anchors[key_node.value] = value_node
        super().anchor_node(node)

answered 2020-01-20T12:47:49.823

I couldn't get @beeb's answer to work at all, so I went ahead and tried to generalize @aaa90210's answer:

import yaml

class _CustomAnchor(yaml.Dumper):
    anchor_tags = {}

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.new_anchors = {}
        self.anchor_next = None

    def anchor_node(self, node):
        # The previously visited node matched anchor_tags, so anchor this one
        # (normally the value that belongs to the matching key).
        if self.anchor_next is not None:
            self.new_anchors[node] = self.anchor_next
            self.anchor_next = None
        if isinstance(node.value, str) and node.value in self.anchor_tags:
            self.anchor_next = self.anchor_tags[node.value]

        super().anchor_node(node)

        # Set the anchors only after the default traversal has registered the node.
        if self.new_anchors:
            self.anchors.update(self.new_anchors)
            self.new_anchors.clear()

def CustomAnchor(tags):
    return type('CustomAnchor', (_CustomAnchor,), {'anchor_tags': tags})

foo = {'a': [1, 2, 3]}  # sample data, as in the question
print(yaml.dump(foo, Dumper=CustomAnchor({'a': 'a_name'})))

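This doesn't provide a way to distinguish between two nodes that have the same name value. That would need a YAML equivalent of XML's XPath, which I don't see in PyYAML :(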


The class factory CustomAnchor lets you pass in a dict of anchors keyed by node value: {value: anchor_name}.
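
One possible workaround for the same-name limitation is to key the anchors on the path of mapping keys instead of on a single value. This is only a sketch and not a PyYAML feature: the path_anchors attribute, the _apply helper and the PathAnchor factory are names I made up, and it only descends through mappings with scalar keys, not through sequences.

```python
import yaml


class _PathAnchor(yaml.Dumper):
    # Hypothetical attribute: maps a tuple of mapping keys to an anchor name.
    path_anchors = {}

    def anchor_node(self, node):
        is_root = not self.anchors  # nothing registered yet, so this is the root
        super().anchor_node(node)   # default traversal registers every node
        if is_root:
            self._apply(node, (), set())

    def _apply(self, node, path, seen):
        # Walk nested mappings and force anchors for the configured paths.
        if id(node) in seen or not isinstance(node, yaml.MappingNode):
            return
        seen.add(id(node))  # guards against recursive structures
        for key_node, value_node in node.value:
            if not isinstance(key_node, yaml.ScalarNode):
                continue
            child_path = path + (key_node.value,)
            if child_path in self.path_anchors:
                self.anchors[value_node] = self.path_anchors[child_path]
            self._apply(value_node, child_path, seen)


def PathAnchor(paths):
    return type("PathAnchor", (_PathAnchor,), {"path_anchors": paths})


data = {"x": {"a": [1, 2]}, "y": {"a": [3, 4]}}
text = yaml.dump(data, Dumper=PathAnchor({("x", "a"): "x_a"}))
print(text)
```

Here only the list under x/a gets an anchor, even though y/a uses the same key name.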

answered 2020-02-17T20:49:08.297