I have two XML files that contain information about a document. I need to create a DOT graph from the information in these files.
layout.xml
<layout>
<segmentation>
<layout-unit id="lay-1.01" xref="u-1.01 u-1.02 u-1.03"/>
<layout-unit id="lay-1.02" xref="u-1.04 u-1.05 u-1.06 u-1.07 u-1.08"/>
<layout-unit id="lay-1.03" xref="u-1.09"/>
<layout-unit id="lay-1.04" xref="u-1.10 u-1.11 u-1.12"/>
<layout-unit id="lay-1.05" xref="u-1.13 u-1.14 u-1.15 u-1.16"/>
</segmentation>
</layout>
rhetoric.xml
<rhetoric>
<segmentation>
<segment id="s-1.01" xref="u-1.01"/>
<segment id="s-1.02" xref="u-1.02"/>
<segment id="s-1.03" xref="u-1.03"/>
<segment id="s-1.04" xref="u-1.04"/>
<segment id="s-1.05" xref="u-1.05"/>
<segment id="s-1.06" xref="u-1.06"/>
<segment id="s-1.07" xref="u-1.07"/>
<segment id="s-1.08" xref="u-1.08"/>
<segment id="s-1.09" xref="u-1.09"/>
<segment id="s-1.10" xref="u-1.10"/>
<segment id="s-1.11" xref="u-1.11"/>
<segment id="s-1.12" xref="u-1.12"/>
<segment id="s-1.13" xref="u-1.13"/>
<mini-segment id="s-1.14" xref="u-1.14"/>
<mini-segment id="s-1.15" xref="u-1.15"/>
<mini-segment id="s-1.16" xref="u-1.16"/>
</segmentation>
<rst-structure root="s-1.01">
<span id="span-1.01" nucleus="s-1.01" satellites="span-1.02" relation="elaboration"><title xref="s-1.09"></title></span>
<span id="span-1.02" nucleus="s-1.02" satellites="s-1.03" relation="elaboration"/>
<span id="span-1.03" nucleus="s-1.01" satellites="span-1.04" relation="enablement"/>
<span id="span-1.04" nucleus="s-1.04" satellites="span-1.05" relation="enablement"/>
<multi-span id="span-1.05" nuclei="span-1.08 span-1.06" relation="sequence"/>
<span id="span-1.06" nucleus="s-1.06" satellites="span-1.07" relation="elaboration"></span>
<multi-span id="span-1.07" nuclei="s-1.07 s-1.08" relation="restatement"></multi-span>
<span id="span-1.08" nucleus="s-1.05" satellites="s-1.10 span-1.09" relation="elaboration"/>
<span id="span-1.09" nucleus="s-1.11" satellites="span-1.10" relation="nonvolitional-result"/>
<span id="span-1.10" nucleus="s-1.12" satellites="span-1.11" relation="elaboration"/>
</rst-structure>
<mini-structure>
<mini-span id="span-1.11" attribute="s-1.14 s-1.15 s-1.16" attribuend="s-1.13" relation="class-ascription"/>
</mini-structure>
</rhetoric>
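To create the DOT graph, I have an XQuery script that takes the data in rhetoric.xml, converts it to DOT, and groups the graph into subgraphs according to layout.xml.
The resulting graph looks like this.
I use the @xref attribute to select the related data in the two files, as follows: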
declare function local:add-subgraphs($rhetoric, $layout) {
for $layout-unit-id in $layout/segmentation/layout-unit/@id
let $layout-unit-xrefs := tokenize($layout/segmentation/layout-unit[@id = $layout-unit-id]/@xref, " ")
let $rst-id := $rhetoric/segmentation/segment/@id
let $segment := $rhetoric/segmentation/segment[@xref = $layout-unit-xrefs and @id = $rst-id]/@id
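To illustrate what this @xref join produces, here is a minimal, self-contained sketch. The inlined, trimmed copies of the two files and the "id: segments" output format are assumptions made for illustration only; they are not part of the actual script.

```xquery
(: Trimmed copies of layout.xml and rhetoric.xml, inlined so the sketch runs on its own. :)
let $layout :=
  <layout>
    <segmentation>
      <layout-unit id="lay-1.01" xref="u-1.01 u-1.02 u-1.03"/>
      <layout-unit id="lay-1.03" xref="u-1.09"/>
    </segmentation>
  </layout>
let $rhetoric :=
  <rhetoric>
    <segmentation>
      <segment id="s-1.01" xref="u-1.01"/>
      <segment id="s-1.02" xref="u-1.02"/>
      <segment id="s-1.03" xref="u-1.03"/>
      <segment id="s-1.09" xref="u-1.09"/>
    </segmentation>
  </rhetoric>
for $layout-unit in $layout/segmentation/layout-unit
(: Split the space-separated @xref list into individual unit ids. :)
let $xrefs := tokenize($layout-unit/@xref, " ")
(: A segment belongs to the layout unit if its @xref equals any of those ids. :)
let $segment := $rhetoric/segmentation/segment[@xref = $xrefs]/@id
return concat($layout-unit/@id, ": ", string-join($segment, " "))
```

With these inputs the query should return the two strings "lay-1.01: s-1.01 s-1.02 s-1.03" and "lay-1.03: s-1.09", which is the per-subgraph list of segment ids that $segment holds in the excerpt above.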
Then I start populating the DOT graph by iterating over the different elements under rhetoric/rst-structure:
let $add-edges-nucleus := for $span-id in $rhetoric/rst-structure/span[@nucleus = $segment]/@id
let $nucleus := tokenize($rhetoric/rst-structure/span[@id = $span-id]/@nucleus, " ")
return concat('"', $nucleus, '" ', $arrow, ' "', $span-id, '"', ';', $newline)
As you can see, the $segment variable is used to define which spans belong to a given subgraph.
The problem arises with this instance in rhetoric.xml:
<multi-span id="span-1.07" nuclei="s-1.07 s-1.08" relation="restatement"></multi-span>
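In this case, I cannot use the $segment variable to select the spans to include in the subgraph, because a multi-span element is structured differently from a span element: it has a @nuclei attribute holding a space-separated list of ids instead of a single @nucleus.
For example, consider segments s-1.07 and s-1.08: they should be included under lay-1.02, but they remain outside the subgraphs in the graph above.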
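To make the mismatch concrete, here is a minimal sketch under stated assumptions: the rhetoric.xml fragment is trimmed and inlined, and $segment is replaced by a hypothetical hard-coded list of the segment ids that belong to lay-1.02. It compares the raw @nuclei value with $segment, then the tokenized value, and finally tries one possible selection predicate modeled on span[@nucleus = $segment].

```xquery
(: Trimmed fragment of rhetoric.xml, inlined so the sketch runs on its own. :)
let $rhetoric :=
  <rhetoric>
    <rst-structure root="s-1.01">
      <span id="span-1.06" nucleus="s-1.06" satellites="span-1.07" relation="elaboration"/>
      <multi-span id="span-1.07" nuclei="s-1.07 s-1.08" relation="restatement"/>
    </rst-structure>
  </rhetoric>
(: Hypothetical stand-in for $segment: the segment ids that belong to lay-1.02. :)
let $segment := ("s-1.04", "s-1.05", "s-1.06", "s-1.07", "s-1.08")
let $multi-span := $rhetoric/rst-structure/multi-span[@id = "span-1.07"]
return (
  (: The whole string "s-1.07 s-1.08" is compared with each id, so nothing matches. :)
  $multi-span/@nuclei = $segment,
  (: After tokenizing, each nucleus is compared on its own, so the multi-span matches. :)
  tokenize($multi-span/@nuclei, " ") = $segment,
  (: One possible criterion: select multi-spans with at least one nucleus in $segment. :)
  $rhetoric/rst-structure/multi-span[tokenize(@nuclei, " ") = $segment]/@id/string()
)
```

This should evaluate to false, true, and "span-1.07". Note that concat() accepts at most one value per argument, so emitting edges for a multi-span would need one iteration per tokenized nucleus rather than passing the whole tokenized sequence at once.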
Any ideas on how to define additional criteria for handling multi-span elements, so that they are placed under the correct subgraph?