
I have a question. I have this HTML source:

<td class="specs_title">
Processortype
<a href="#" class="info-link">
<img src="x.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Intel Core i3
</td>
<td class="specs_title">
Spec
<a href="#" class="info-link">
<img src="y.jpg" title="" height="16" alt="" width="16" />
<span class="info-popup">
<span class="hd">Processortype</span>
<span class="bd">Text</span>
</span>
</a>
</td>
<td class="specs_descr">
Other Spec
</td>

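I need to extract "Intel Core i3" from this page using PHP and XPath, and I want to do it with a query that searches for the text "Processortype" so that I can then work with the result. Is this even possible, and if so, how? Thanks for any reply!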


2 Answers


One way is to use Symfony's DomCrawler component:

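require __DIR__ . '/vendor/autoload.php'; // assumes symfony/dom-crawler was installed via Composer

use Symfony\Component\DomCrawler\Crawler;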

$html = <<<EOF
<td class="specs_title">
    Processortype
    <a href="#" class="info-link">
        <img src="x.jpg" title="" height="16" alt="" width="16" />
        <span class="info-popup">
            <span class="hd">Processortype</span>
            <span class="bd">Text</span>
        </span>
    </a>
</td>
<td class="specs_descr">
    Intel Core i3
</td>
<td class="specs_title">
    Spec
    <a href="#" class="info-link">
        <img src="y.jpg" title="" height="16" alt="" width="16" />
        <span class="info-popup">
            <span class="hd">Processortype</span>
            <span class="bd">Text</span>
        </span>
    </a>
</td>
<td class="specs_descr">
    Other Spec
</td>
EOF;

$crawler = new Crawler();
$crawler->addContent($html);
$nodes = $crawler->filterXPath("//td[@class='specs_descr']");
// Takes the first specs_descr cell; trim() drops any whitespace around the cell text.
echo trim($nodes->first()->text()); // prints "Intel Core i3"
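
The query above picks the first specs_descr cell whatever its label is. Here is a minimal sketch of a lookup keyed on the "Processortype" label, assuming PHP's built-in DOMDocument and DOMXPath rather than DomCrawler. The markup is a shortened copy of the question's, and the <table><tr> wrapper is my own assumption, added so the parser sees well-formed table cells:

```php
<?php
// Shortened sample markup; the <table><tr> wrapper is an assumption, not part of the question.
$html = <<<'HTML'
<table><tr>
<td class="specs_title">
    Processortype
    <a href="#" class="info-link">
        <span class="info-popup">
            <span class="hd">Processortype</span>
            <span class="bd">Text</span>
        </span>
    </a>
</td>
<td class="specs_descr">
    Intel Core i3
</td>
<td class="specs_title">
    Spec
    <a href="#" class="info-link">
        <span class="info-popup">
            <span class="hd">Processortype</span>
            <span class="bd">Text</span>
        </span>
    </a>
</td>
<td class="specs_descr">
    Other Spec
</td>
</tr></table>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match the title cell whose OWN text node is exactly "Processortype"
// (ignoring surrounding whitespace), then step to the next specs_descr cell.
$query = "//td[@class='specs_title'][text()[normalize-space(.)='Processortype']]"
       . "/following-sibling::td[@class='specs_descr'][1]";

$cell  = $xpath->query($query)->item(0);
$value = $cell === null ? null : trim($cell->textContent);

echo $value, PHP_EOL; // Intel Core i3
```

Testing the cell's own text node, rather than its whole string value, keeps the "Processortype" heading inside the second cell's popup from matching.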
answered 2013-10-15T18:32:20.337
$XML = '
<root>
    <td class="specs_title">
        Processortype
        <a href="#" class="info-link">
            <img src="x.jpg" title="" height="16" alt="" width="16" />
            <span class="info-popup">
                <span class="hd">Processortype</span>
                <span class="bd">Text</span>
            </span>
        </a>
    </td>
    <td class="specs_descr">
        Intel Core i3
    </td>
    <td class="specs_title">
        Spec
        <a href="#" class="info-link">
            <img src="y.jpg" title="" height="16" alt="" width="16" />
            <span class="info-popup">
                <span class="hd">Processortype</span>
                <span class="bd">Text</span>
            </span>
        </a>
    </td>
    <td class="specs_descr">
        Other Spec
    </td>
</root>';
$sxe = new SimpleXMLElement($XML);
var_dump(array_map('strval',$sxe->xpath("
    //td[@class='specs_title' and contains(.,'Processortype')]
    /following-sibling::td[@class='specs_descr'][1]")));

Output:

array(2) {
  [0] =>
  string(27) "
        Intel Core i3
    "
  [1] =>
  string(24) "
        Other Spec
    "
}
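
Both cells come back because contains(., 'Processortype') tests each title cell's whole string value, and the second title cell also carries "Processortype" inside its span.hd popup. Here is a sketch of a stricter query, assuming the label always sits in the cell's own text node rather than in a nested element. The XML is a reduced copy of the sample above:

```php
<?php
// Reduced copy of the sample XML; the second title cell still nests "Processortype" in its popup.
$xml = <<<'XML'
<root>
    <td class="specs_title">
        Processortype
        <a href="#" class="info-link">
            <span class="info-popup"><span class="hd">Processortype</span></span>
        </a>
    </td>
    <td class="specs_descr">
        Intel Core i3
    </td>
    <td class="specs_title">
        Spec
        <a href="#" class="info-link">
            <span class="info-popup"><span class="hd">Processortype</span></span>
        </a>
    </td>
    <td class="specs_descr">
        Other Spec
    </td>
</root>
XML;

$sxe = new SimpleXMLElement($xml);

// text()[contains(...)] only looks at text nodes that are direct children of the td,
// so the nested span.hd in the "Spec" cell no longer triggers a match.
$matches = $sxe->xpath(
    "//td[@class='specs_title'][text()[contains(., 'Processortype')]]"
    . "/following-sibling::td[@class='specs_descr'][1]"
);

// Cast each node to string and strip the indentation around the value.
$values = array_map(function ($node) {
    return trim((string) $node);
}, $matches);

var_dump($values); // a single entry: "Intel Core i3"
```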
answered 2013-10-15T18:30:23.337