html - Loading an Hpricot element with a chunk of HTML

Question

Is there a way to load a chunk of HTML into an Hpricot::Doc object?

I am trying to parse various blocks of HTML that sit inside custom tags on a page.

So if I have:

<foo>
  <b>here is some stuff</b>
  <table>
    <tr>
      <td>one</td>
      <td>two</td>
    </tr>
    <tr>
      <td>three</td>
      <td>four</td>
    </tr>
  </table>
</foo>

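I'd like to be able to get foo and its contents into an Hpricot::Doc object, because I need to do some additional processing and eventually swap() it, so that foo and all of its children are replaced in the document.

I know I could iterate over foo's children, but I'm hoping there is a way to grab them all at once to keep things clean. Also, foo may or may not have attributes. There will be many such items, each containing a chunk of HTML, but no foo item will contain another foo item.

Is this possible? Finally, I started with Hpricot, but I'm open to Nokogiri if it would make a difference.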

score 1 · Accepted Answer

I'm not clear on what you are having trouble with.

You can pass Hpricot your HTML as a plain string, however you obtain it.

From the README:

doc = Hpricot("<p>A simple <b>test</b> string.</p><foo>foo content</foo>")

You can then search for foo and swap it:

doc.search("//foo").first.swap "<blink>not foo</blink>"
