php - Removing unwanted characters when a regex-matched string is preceded by a space/whitespace

Question

I'm currently experimenting with a web crawler and have run into a problem with a regular expression.

The text I want to capture from the string below is "09:00 AM":

<td style="border: #080707 1px solid;" lang="lang" valign="top" scope="scope"> 09:00 AM</td>

Here is the regex part of my code:

preg_match_all ('/<td .+ scope="scope">(.*)<\/td>/i',$link_string,$details);

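The resulting output is Â 09:00 AM, and I don't want the Â. I know it is caused by the whitespace before the time, but I have tried a few different approaches, such as: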

    preg_match_all ('/<td .+ scope="scope">\s(.*)<\/td>/i',$link_string,$details);

    preg_match_all ('/<td .+ scope="scope">(\w+)<\/td>/i',$link_string,$details);

    preg_match_all ('/<td .+ scope="scope"> (.*)<\/td>/i',$link_string,$details);

However, these return false and the text I want is not matched.

I'd appreciate some insight into the best way to write this regex.

score 0 · Accepted Answer

If you can't trim() the contents of the td tag yourself, why not use substr() on the output to cut off the first character:

$time = substr($details[0][1], 1); // [0][1] is a placeholder; change it to the index of the actual output
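As a rough sketch of the trimming idea, the snippet below assumes the stray character is a UTF-8 non-breaking space (bytes 0xC2 0xA0), which is a common reason for a "Â" to show up when the page is read in a single-byte encoding. The question does not confirm this, so treat the byte values as an assumption. If it holds, note that substr($raw, 1) would remove only the first of the two bytes.

```php
<?php
// Assumption: the cell starts with a UTF-8 non-breaking space (0xC2 0xA0).
$link_string = '<td style="border: #080707 1px solid;" lang="lang" valign="top" scope="scope">'
    . "\xC2\xA0" . '09:00 AM</td>';

preg_match_all('/<td .+ scope="scope">(.*)<\/td>/i', $link_string, $details);

// With the default PREG_PATTERN_ORDER, $details[1] holds the captures of group 1.
$raw = $details[1][0];

// trim() works byte by byte: strip ASCII whitespace plus the two NBSP bytes.
$time = trim($raw, " \t\n\r\0\x0B\xC2\xA0");

echo $time, PHP_EOL;
```

Because trim() strips individual bytes, this character list could also clip a different multibyte character at either end of the string, so it is only reasonable when the captured text is known to be plain ASCII apart from the leading space.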

score 0 · Accepted Answer

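You have to add the u modifier. With this flag, the regex engine treats both the pattern and your string as UTF-8 (Unicode) rather than as raw bytes. Example: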

preg_match_all ('/<td .+ scope="scope">(.*)<\/td>/iu',$link_string,$details);
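Building on the u modifier, one possible variant consumes the leading whitespace outside the capture group so it never reaches the result. This is only a sketch: it assumes the input is valid UTF-8 and that the stray character is a non-breaking space (U+00A0), neither of which the question confirms. The tighter `[^>]*` and the lazy `(.*?)` are my own changes to the pattern, not part of the original answer.

```php
<?php
// Assumption: the cell starts with U+00A0, encoded in UTF-8 as 0xC2 0xA0.
$link_string = '<td style="border: #080707 1px solid;" lang="lang" valign="top" scope="scope">'
    . "\xC2\xA0" . '09:00 AM</td>';

// Under /u, \x{A0} matches the non-breaking space as a single character.
// [\s\x{A0}]* swallows leading whitespace before the capture group starts.
$pattern = '/<td [^>]*scope="scope">[\s\x{A0}]*(.*?)<\/td>/iu';

$count = preg_match_all($pattern, $link_string, $details);

if ($count === false) {
    // With /u, preg_match_all() fails if the subject is not valid UTF-8.
    echo 'Regex error code: ', preg_last_error(), PHP_EOL;
} else {
    // $details[1] lists the captured cell contents, one entry per matched <td>.
    foreach ($details[1] as $time) {
        echo $time, PHP_EOL;
    }
}
```

If the page is actually served in another encoding such as ISO-8859-1, the u modifier will make the match fail instead, so it is worth checking the return value as shown.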
