php - $ does not match the position before a newline that is the last character

Question

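$ does not match the position before a newline that is the last character of the string.

Ideally /1...$/ should match, but a match only happens with the pattern /1....$/, which seems wrong.

What could be the reason?

The PHP documentation also says that the dollar character ($) is an assertion that is TRUE only if the current matching point is at the end of the subject string, or immediately before a newline character that is the last character in the string (by default).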

$subject = 'abc#
123#
';
$pattern = '/1...$/';
preg_match_all($pattern,$subject,$matches); // no match

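Update: I suspect the extra dot is needed because the line breaks are in \r\n format. I ran the following experiment and found a clue.

$pattern = '/1...(.)$/';
preg_match_all($pattern, $subject, $matches);
echo bin2hex($matches[1][0]); // 0d

0d is the hex code of \r (CR), so $ matches before \n but not before \r\n, which is probably the cause of my problem.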

[Image: the file with non-printable characters made visible]

score 3 · Accepted Answer

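The problem is caused by the different line-break representations in Windows files and Linux files.

Why it happened:

I created the PHP file on Windows and transferred it to the Linux machine where PHP is installed.
Windows uses \r\n for a line break while Linux uses \n, so the subject string ended in \r\n. That is why the pattern initially needed an extra dot to match: the extra dot consumed the \r.

The experiment below confirms this: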

$subject = 'abc#
123#
';
$pattern = '/1...(.)$/';
preg_match_all($pattern,$subject,$matches);
echo bin2hex($matches[1][0]); // 0d
// 0d is the hex code of \r, i.e. CR (carriage return)

I then created a new file on the Linux system, and /1...$/ matched :)

I hope this saves someone time if they run into the same problem.
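
For anyone who cannot simply re-save the file, here is a minimal sketch of the same effect with the line endings written as escapes, so the result does not depend on the editor. The helper normalizeNewlines() is a hypothetical name introduced for illustration, not something from the original post; it assumes that converting CRLF and lone CR to LF is acceptable for your data.

```php
<?php
// Escapes make the line endings explicit, so the result does not depend
// on how this source file was saved.
$crlf = "abc#\r\n123#\r\n";   // Windows-style subject

// Hypothetical helper (not from the original post): CRLF and lone CR become LF.
function normalizeNewlines($text)
{
    return str_replace(array("\r\n", "\r"), "\n", $text);
}

$pattern = '/1...$/';

// No match: "\r" sits between "#" and the final "\n".
$before = preg_match($pattern, $crlf);
// Match: after normalization "$" is right before the final "\n".
$after = preg_match($pattern, normalizeNewlines($crlf));

var_dump($before, $after);   // int(0) and int(1)
```

Normalizing once, before any matching, keeps the patterns themselves unchanged.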

score 2 · Accepted Answer

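Your string is multi-line. By default, a regular expression does not work in multi-line mode; you have to add the m modifier for that to happen.

For example: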

/1...$/m
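
A rough sketch of what the m modifier changes, assuming an LF-only subject for the first part. The variable names are made up for illustration. The second part suggests that with CRLF data the \r still sits in front of the \n, so the pattern would additionally need something like \r? before the $.

```php
<?php
// LF-only subject, written with escapes so the line endings are unambiguous.
$lf = "abc#\n123#\nxyz";

// Without /m, "$" matches only at the very end of the subject
// (or before a newline that is its last character).
preg_match_all('/.$/', $lf, $withoutM);
// With /m, "$" also matches before every internal "\n".
preg_match_all('/.$/m', $lf, $withM);

print_r($withoutM[0]);   // one element: "z"
print_r($withM[0]);      // three elements: "#", "#", "z"

// CRLF subject: "$" needs the position right before "\n", but "\r" is in the way.
$crlf = "abc#\r\n123#\r\n";
$plain  = preg_match('/1...$/m', $crlf);      // 0
$withCr = preg_match('/1...\r?$/m', $crlf);   // 1

var_dump($plain, $withCr);
```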

score 0 · Accepted Answer

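I was stuck on this problem for two days. I ran a lot of tests to find some logic behind it, because it all depends on where your data comes from (internal and controlled versus external and uncontrolled). In my case it is an input field on my website (<textarea>), reachable from various browsers (and various operating systems), and there is no such problem with pattern testing/matching/checking in JavaScript. For those trying to solve (or at least work around) the problem of correctly matching a pattern at the end ($) of any line in multiline mode (/m), here is a hint.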

<?php 
// Various OS-es have various end line (a.k.a line break) chars:
// - Windows uses CR+LF (\r\n);
// - Linux LF (\n);
// - classic Mac OS (before OS X) CR (\r); OS X itself uses LF.
// And that's why single dollar meta assertion ($) sometimes fails with multiline modifier (/m) mode - possible bug in PHP 5.3.8 or just a "feature"(?).
$str="ABC ABC\n\n123 123\r\ndef def\rnop nop\r\n890 890\nQRS QRS\r\r~-_ ~-_";
//          C          3                   p          0                   _
$pat1='/\w$/mi';    // This works excellent in JavaScript (Firefox 7.0.1+)
$pat2='/\w\r?$/mi'; // Slightly better
$pat3='/\w\R?$/mi'; // Somehow disappointing according to php.net and pcre.org when used improperly
$pat4='/\w(?=\R)/i';    // Much better with allowed lookahead assertion (just to detect without capture) without multiline (/m) mode; note that with alternative for end of string ((?=\R|$)) it would grab all 7 elements as expected
$pat5='/\w\v?$/mi';
$pat6='/(*ANYCRLF)\w$/mi';  // Excellent but undocumented on php.net at the moment (described on pcre.org and en.wikipedia.org)
$n=preg_match_all($pat1, $str, $m1);
$o=preg_match_all($pat2, $str, $m2);
$p=preg_match_all($pat3, $str, $m3);
$r=preg_match_all($pat4, $str, $m4);
$s=preg_match_all($pat5, $str, $m5);
$t=preg_match_all($pat6, $str, $m6);
echo $str."\n1 !!! $pat1 ($n): ".print_r($m1[0], true)
    ."\n2 !!! $pat2 ($o): ".print_r($m2[0], true)
    ."\n3 !!! $pat3 ($p): ".print_r($m3[0], true)
    ."\n4 !!! $pat4 ($r): ".print_r($m4[0], true)
    ."\n5 !!! $pat5 ($s): ".print_r($m5[0], true)
    ."\n6 !!! $pat6 ($t): ".print_r($m6[0], true);
// Note the difference among the three very helpful escape sequences in $pat2 (\r), $pat3 and $pat4 (\R), $pat5 (\v) and altered newline option in $pat6 ((*ANYCRLF)) - for some applications at least.

/* The code above results in the following output:
ABC ABC

123 123
def def
nop nop
890 890
QRS QRS

~-_ ~-_
1 !!! /\w$/mi (3): Array
(
    [0] => C
    [1] => 0
    [2] => _
)

2 !!! /\w\r?$/mi (5): Array
(
    [0] => C
    [1] => 3
    [2] => p
    [3] => 0
    [4] => _
)

3 !!! /\w\R?$/mi (5): Array
(
    [0] => C

    [1] => 3
    [2] => p
    [3] => 0
    [4] => _
)

4 !!! /\w(?=\R)/i (6): Array
(
    [0] => C
    [1] => 3
    [2] => f
    [3] => p
    [4] => 0
    [5] => S
)

5 !!! /\w\v?$/mi (5): Array
(
    [0] => C

    [1] => 3
    [2] => p
    [3] => 0
    [4] => _
)

6 !!! /(*ANYCRLF)\w$/mi (7): Array
(
    [0] => C
    [1] => 3
    [2] => f
    [3] => p
    [4] => 0
    [5] => S
    [6] => _
)
 */
?>

Unfortunately, I don't have access to a server with the latest PHP version: my local PHP is 5.3.8, and my public host runs PHP 5.2.17.
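
As a shorter illustration of the (*ANYCRLF) option from $pat6, here is a sketch that applies it to the subject from the question, plus a small mixed-ending string. It assumes a PCRE build that supports the (*ANYCRLF) prefix; the variable names are chosen here for illustration only.

```php
<?php
// Subject from the question, with Windows line endings written as escapes.
$crlf = "abc#\r\n123#\r\n";

// Default newline convention: "$" only recognizes "\n", so "\r" blocks the match.
$default = preg_match('/1...$/', $crlf);
// (*ANYCRLF): CR, LF and CRLF all count as a newline for "$".
$anyCrlf = preg_match('/(*ANYCRLF)1...$/', $crlf);

var_dump($default, $anyCrlf);   // int(0) and int(1)

// Mixed line endings, as might arrive from a <textarea>.
$mixed = "a1\r\nb2\nc3\rd4";
preg_match_all('/(*ANYCRLF)\w$/m', $mixed, $m);
print_r($m[0]);                 // "1", "2", "3", "4"
```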
