I'm trying to strip tags from some XML that looks like this:
<vocabularyModel>
<conceptDomain name="ActAccountType">
<annotations>
<documentation>
<definition>
<text>
<p>
<b>Description: </b>more txt here </p>
<p>
<i>Examples: </i>
</p>
<p/>
<ul>
<li>
<p>Patient billing accounts</p>
</li>
<li>
<p>Cost center</p>
</li>
<li>
<p>Cash</p>
</li>
</ul>
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationInformationCode">
<annotations>
<documentation>
<definition>
<text>
<p>long text.</p>
<p>long text.</p>
<p>long text.</p>
<p>long text.</p>
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationType">
<annotations>
<documentation>
<definition>
<text>
<p>
<b>Description: </b>more text.</p>
<p>
<i>Examples: </i>
</p>
<p/>
<ul>
<li>
<p>adjudicated with adjustments</p>
</li>
<li>
<p>adjudicated as refused</p>
</li>
<li>
<p>adjudicated as submitted</p>
</li>
</ul>
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
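All child tags below text should be stripped, while the surrounding XML and the text content are kept. The desired result looks like this: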
<vocabularyModel>
<conceptDomain name="ActAccountType">
<annotations>
<documentation>
<definition>
<text>
Description: more txt here
Examples:
Patient billing accounts
Cost center
Cash
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationInformationCode">
<annotations>
<documentation>
<definition>
<text>
long text.
long text.
long text.
long text.
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
<conceptDomain name="ActAdjudicationReason">
<annotations>
<documentation>
<definition>
<text>
long text.
long text.
long text.
long text.
</text>
</definition>
</documentation>
</annotations>
<specializesDomain name="ActReason"/>
</conceptDomain>
<conceptDomain name="ActAdjudicationType">
<annotations>
<documentation>
<definition>
<text>
Description: more text.
Examples:
adjudicated with adjustments
adjudicated as refused
adjudicated as submitted
</text>
</definition>
</documentation>
</annotations>
</conceptDomain>
I have tried the following, which I found elsewhere and modified:
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p | b | li | ul | i">
<xsl:apply-templates/>
</xsl:template>
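But this does not strip any of the elements, even when I restrict the match to elements. I have also tried a few variations of the following: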
<xsl:output method="xml" indent="yes"/>
<xsl:template name="strip-tags">
<xsl:param name="html"/>
<xsl:choose>
<xsl:when test="contains($html, '&lt;')">
<xsl:value-of select="substring-before($html, '&lt;')"/>
<xsl:call-template name="strip-tags">
<xsl:with-param name="html" select="substring-after($html, '>')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$html"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="definition">
<xsl:call-template name="strip-tags">
<xsl:with-param name="html" select="text"/>
</xsl:call-template>
</xsl:template>
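If I leave out the identity transform, this strips all the tags, but otherwise it just copies the contents of the original XML. Any help is appreciated. -Scott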
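
For reference, here is a minimal sketch of the first approach written out as a complete XSLT 1.0 stylesheet. It assumes the elements are in no namespace, exactly as in the sample above. If the real file declares a default namespace, unprefixed names such as p or text will not match, and each name would need a prefix bound to that namespace. The explicit priority and the line break after each paragraph are additions for illustration, not part of the stylesheets quoted above, and the exact whitespace in the result depends on the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity transform: copy every node and attribute that no
       more specific template handles. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element below <text> (p, b, i, ul, li, ...): drop the tag
       itself but keep processing its children, so the text survives. -->
  <xsl:template match="text//*">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Non-empty <p> below <text>: same as above, followed by a line
       break. The explicit priority avoids a conflict with the
       template above, which has the same default priority. -->
  <xsl:template match="text//p[normalize-space()]" priority="1">
    <xsl:apply-templates/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The idea is that the identity template keeps vocabularyModel, conceptDomain, annotations, documentation, definition and text intact, while the two overriding templates unwrap everything nested inside text.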