php - Extracting some HTML with preg_match

Question

I'm using preg_match to extract some HTML (I tried DOMDocument, but I ran into some problems with line breaks). Here is my code:

1.html

<body>


            <!-- icon and title -->
            <div class="smallfont">
                <img class="inlineimg" src="images/icons/icon1.gif" alt="" border="0" />
                <strong>qrtoobah 3nwan</strong>
            </div>
            <hr size="1" style="color:#CCCCCC; background-color:#CCCCCC" />
            <!-- / icon and title -->


        <div id="post_message_14142536">

            <font size="7"><font color="red">msaha 700</font></font><br />
<font size="7"><font color="red">shamali 20</font></font><br />
<font size="7"><font color="red"> 1700 almetr</font></font><br />
<font size="7"><font color="#ff0000">sooom bs</font></font><br />
<font size="7"><font color="#ff0000">albee3 qreeb</font></font>
        </div>
        <!-- message -->


</body>

extract.php

<?php 
$html = file_get_contents("1.html");
$pattern = '/<([!]+)([^]+).*>([^]+)(message\ \-\-\>)/';
   preg_match($pattern, $html, $matches);
 print_r($matches);


?>

I want to get everything between the opening comment and the `message -->` comment, but this is the array I get:

Array ( [0] => [1] => ! [2] => -- [3] => message --> )

score 0 · Accepted Answer

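Use strpos to find the position of the first tag, then use strpos again to find the end tag. I mean: if you know where the content you are looking for starts and ends, and those markers are unique, why do you need the preg_* functions at all?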

So I think something like this would work well (I kept the code as explicit as possible so you can follow the idea step by step):

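// $text holds the HTML, e.g. $text = file_get_contents("1.html");
$tag_begin = "<!-- icon and title -->";
$tag_end   = "<!-- message -->";
// strpos() takes the haystack first; skip past the opening marker itself
$begin     = strpos($text, $tag_begin) + strlen($tag_begin);
$end       = strpos($text, $tag_end, $begin);
// substr() takes (string, start, length), so pass a length, not an end offset
$result    = substr($text, $begin, $end - $begin);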
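
The same idea can be wrapped in a small function that also copes with a missing marker, since strpos returns false when nothing is found. This is only a sketch: the name extract_between and the choice to return null on failure are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text between two unique markers,
// or null if either marker cannot be found.
function extract_between(string $text, string $tag_begin, string $tag_end): ?string
{
    $begin = strpos($text, $tag_begin);
    if ($begin === false) {
        return null; // opening marker missing
    }
    $begin += strlen($tag_begin); // start right after the opening marker

    $end = strpos($text, $tag_end, $begin); // search only after the opening marker
    if ($end === false) {
        return null; // closing marker missing
    }

    return substr($text, $begin, $end - $begin);
}

$html = '<!-- icon and title --><strong>qrtoobah 3nwan</strong><!-- message -->';
echo extract_between($html, '<!-- icon and title -->', '<!-- message -->');
// prints: <strong>qrtoobah 3nwan</strong>
```

The strict `=== false` comparison matters here, because a marker at position 0 is a valid result that a loose comparison would treat as "not found".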

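If you want to find and store every block that sits between an opening and a closing comment, you can do exactly the same thing.
The only change you need to make is to first find the names of all the opening comments with preg_match_all. For example: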

$result_cnt = preg_match_all('#<!-- [^/].*-->#', $text , $openings);

// Output for your example HTML is:
$openings = 
array (
  0 => 
  array (
    0 => '<!-- icon and title -->',
    1 => '<!-- message -->',
  ),
)

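Then loop over $openings and, for each entry, run the strpos/substr lookup from the first snippet. The closing marker is simply the opening one with the "/" character added in the right place.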
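
A minimal sketch of that loop follows. It assumes closing comments have the form `<!-- / name -->`, as in the sample HTML. The function name extract_sections and the returned array shape are illustrative assumptions. The pattern is a slight variation of the one above that captures just the name, using a lazy `.*?` so two comments on one line are not merged.

```php
<?php
// Sketch: collect the content of every "<!-- name --> ... <!-- / name -->" pair.
// Openings without a matching closing comment are skipped.
function extract_sections(string $text): array
{
    $sections = [];

    // Group 1 captures the name of each opening comment (names not starting with "/").
    preg_match_all('#<!-- ([^/].*?) -->#', $text, $openings);

    foreach ($openings[1] as $name) {
        $tag_begin = "<!-- $name -->";
        $tag_end   = "<!-- / $name -->"; // same marker with "/" added

        // The opening marker is known to exist, because the regex just matched it.
        $begin = strpos($text, $tag_begin) + strlen($tag_begin);
        $end   = strpos($text, $tag_end, $begin);
        if ($end === false) {
            continue; // no closing comment for this name
        }

        $sections[$name] = substr($text, $begin, $end - $begin);
    }

    return $sections;
}

// Usage with the question's file:
// print_r(extract_sections(file_get_contents("1.html")));
```

With the sample 1.html, only "icon and title" has a closing comment in the excerpt, so "message" would be skipped.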
