php - Find text between two selector blocks

Question

So, here is my code:

<div id="first">
 <div id="third">Lorem</div>
   Lorem Ipsum Dolorez [...]
 <script></script>
  ....

 <div id="second">
  Lorem Ipsum[...]
  <a href=""/>
 </div>
  ....
</div>

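I need to get Lorem Ipsum Dolorez [...], which sits between ~~two divs~~ two blocks, a div block and a script block, and Lorem Ipsum[...], which is inside a div, but without the hyperlink.

I tried using simple_html_dom.php, but I don't know how to do it.

Edit: this is a website, so I cannot change this code.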

score 1 · Accepted Answer

You can use the DOM library and XPath to select these nodes (explanations are embedded in the comments):

$html = '
    <div id="first">
 <div id="third">Lorem</div>
    Lorem Ipsum Dolorez [...]
     <script></script>
    this never gets picked up
   <div id="second">
     Lorem Ipsum[...]
       <a href=""></a>
        <span> this span is extracted since its not an anchor element </span>
    </div>
  </div>';

$doc = new DOMDocument;
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$first_lorem = $xpath->query('//div[@id="first"]/div[@id="third"]/following-sibling::text()[following::script]');
// first, find the div#first and inside that a div#third ...
// ... and take text node siblings of that div ...
// ... if those siblings have a script node following them (so if there's a <script> after them)

$first_lorem_html = '';
// loop the results and concat the html output
foreach ($first_lorem as $node) {
    $first_lorem_html .= $doc->saveHTML($node);
}
print $first_lorem_html;

// get every child node of div#second except the elements named 'a'
$second_lorem = $xpath->query('//div[@id="second"]/node()[name() != "a"]');
$second_lorem_html = '';
foreach ($second_lorem as $node) {
    $second_lorem_html .= $doc->saveHTML($node);
}
print $second_lorem_html;
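
If only the plain text is needed rather than serialized HTML, the same idea can be combined with textContent and trim(). The following is a minimal sketch, assuming markup shaped like the question's; the helper name collectText is hypothetical, and the following-sibling::script predicate is a slightly stricter variant of the following::script test used above.

```php
<?php
// Sample markup modelled on the question; the real page may differ.
$html = '
<div id="first">
  <div id="third">Lorem</div>
  Lorem Ipsum Dolorez [...]
  <script></script>
  this never gets picked up
  <div id="second">
    Lorem Ipsum[...]
    <a href=""></a>
  </div>
</div>';

$doc = new DOMDocument;
// Real pages are often not valid HTML; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Hypothetical helper: join the trimmed text of every node matched by $query.
function collectText(DOMXPath $xpath, string $query): string
{
    $parts = [];
    foreach ($xpath->query($query) as $node) {
        $text = trim($node->textContent);
        if ($text !== '') {
            $parts[] = $text;
        }
    }
    return implode(' ', $parts);
}

// Text nodes after div#third that still have a <script> sibling after them.
$between = collectText($xpath, '//div[@id="third"]/following-sibling::text()[following-sibling::script]');

// Direct text-node children of div#second; the <a> element is not a text node, so it is skipped.
$inside = collectText($xpath, '//div[@id="second"]/text()');

echo $between, PHP_EOL; // Lorem Ipsum Dolorez [...]
echo $inside, PHP_EOL;  // Lorem Ipsum[...]
```

Selecting text() children directly also leaves out other element children such as a span, which differs from the node()[name() != "a"] query; which behaviour is wanted depends on the page.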

score 0 · Accepted Answer

Try using the strip_tags PHP function. Example:

echo strip_tags('<div id="second">Lorem Ipsum[...]<a href=""/></div>');

Returns:

Lorem Ipsum[...]

http://php.net/manual/en/function.strip-tags.php
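
Note that strip_tags only removes markup; it has no notion of which element a piece of text came from. The sketch below, which uses a shortened copy of the question's markup as an assumed input, suggests that applying it to the whole fragment also keeps the text of div#third, so it is mainly useful once the fragment of interest has already been isolated.

```php
<?php
// Shortened copy of the question's markup (assumed input).
$fragment = '<div id="first"><div id="third">Lorem</div> Lorem Ipsum Dolorez [...] '
          . '<script></script><div id="second">Lorem Ipsum[...]<a href=""/></div></div>';

// strip_tags() drops the tags but keeps every piece of text,
// including "Lorem" from div#third.
$all = strip_tags($fragment);
echo $all, PHP_EOL;

// Applied to just the second div, it yields only that div's text.
$second = strip_tags('<div id="second">Lorem Ipsum[...]<a href=""/></div>');
echo $second, PHP_EOL; // Lorem Ipsum[...]
```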

score 0 · Accepted Answer

According to the simple_html_dom reference: http://simplehtmldom.sourceforge.net/

You can do it like this:

   $html->find('div[id=third]', 0)->plaintext
