
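I am trying to add rich snippet data to my web page according to the http://schema.org/Article standard. One of the properties is articleBody, which I expect should contain the entire body text that makes up the article.

Unfortunately, the HTML representation of the article is occasionally interrupted by buttons, ads, and other prompts, which contain text that should not end up in articleBody.

For example: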

<div itemscope itemtype="http://schema.org/Article">
  <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
    <a>A few useful links for my users</a>
    <p>3rd paragraph</p>
    <div>A few text ads</div>
    <p>4th paragraph</p>
  </div>
</div>

Is there a way to exclude the text in the ads/links from the article itself?


1 Answer


No, Microdata does not provide a way to exclude content.

The articleBody value will be the textContent of the element, so the text of any links and ads nested inside it is included.
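
To illustrate, here is a rough Python sketch of how a consumer might read the property from the markup in the question. It uses only the standard library's html.parser; ItempropText and article_body_values are hypothetical names made up for this example, and a real Microdata parser handles many more cases (nested items, itemref, URL-valued elements).

```python
from html.parser import HTMLParser

# Elements that never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ItempropText(HTMLParser):
    """Hypothetical helper: gather the text content of elements with a given itemprop."""

    def __init__(self, prop):
        super().__init__()
        self.prop = prop
        self.depth = 0    # > 0 while inside a matching element
        self.values = []  # one text string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested element inside the property
        elif dict(attrs).get("itemprop") == self.prop:
            self.depth = 1
            self.values.append("")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.values[-1] += data  # all descendant text counts, ads included


def article_body_values(html):
    parser = ItempropText("articleBody")
    parser.feed(html)
    parser.close()
    # Collapse whitespace only to make the result easier to read.
    return [" ".join(v.split()) for v in parser.values]


QUESTION_HTML = """
<div itemscope itemtype="http://schema.org/Article">
  <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
    <a>A few useful links for my users</a>
    <p>3rd paragraph</p>
    <div>A few text ads</div>
    <p>4th paragraph</p>
  </div>
</div>
"""

print(article_body_values(QUESTION_HTML))
```

The single value printed contains the link and ad text alongside the four paragraphs, because nothing in the markup tells the parser to skip them.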


An ugly "hack" would be to specify several articleBody properties for this item:

<div itemscope itemtype="http://schema.org/Article">
  <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
  </div>
  <a>A few useful links for my users</a>
  <p itemprop="articleBody">3rd paragraph</p>
  <div>A few text ads</div>
  <p itemprop="articleBody">4th paragraph</p>
</div>

But note that Microdata does not define how multiple values like these are to be interpreted, so it is up to the consumer.
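
As a small Python sketch of what "up to the consumer" can mean in practice: the list below holds the three values a parser would report for the markup above (whitespace collapsed), and the two functions are hypothetical consumers invented for illustration, not the behavior of any particular search engine or tool.

```python
# The three articleBody values from the markup above, whitespace collapsed.
values = ["1st Paragraph 2nd paragraph", "3rd paragraph", "4th paragraph"]


def join_all(values):
    """Hypothetical consumer A: treat the values as consecutive parts of one body."""
    return " ".join(values)


def first_only(values):
    """Hypothetical consumer B: keep only the first value and ignore the rest."""
    return values[0] if values else ""


print(join_all(values))
print(first_only(values))
```

Both readings are compatible with Microdata, which is why this approach gives no guarantee that the full article text is picked up.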


Another ugly way:

Duplicate the information and include it in a meta element:

<div itemscope itemtype="http://schema.org/Article">
  <div>
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
    <a>A few useful links for my users</a>
    <p>3rd paragraph</p>
    <div>A few text ads</div>
    <p>4th paragraph</p>
  </div>
  <meta itemprop="articleBody" content="1st Paragraph. 2nd paragraph. 3rd paragraph. 4th paragraph." />
</div>
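
If the page is generated server-side, the duplicated text could be built from the same paragraph list that produces the visible article, so the two do not drift apart. A minimal Python sketch, assuming the paragraphs are available as plain strings; article_body_meta is a hypothetical helper name:

```python
from html import escape


def article_body_meta(paragraphs):
    """Build a meta element whose content is the non-empty paragraphs joined by spaces."""
    content = " ".join(p.strip() for p in paragraphs if p.strip())
    # quote=True also escapes quotes so the attribute value stays well-formed.
    return '<meta itemprop="articleBody" content="{}" />'.format(
        escape(content, quote=True)
    )


print(article_body_meta(
    ["1st Paragraph.", "2nd paragraph.", "3rd paragraph.", "4th paragraph."]
))
```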

answered 2014-02-01T22:09:16.187