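I'm trying to write a regular expression that matches these conditions:
- At most 8000 characters (any character, including "\r\n").
- At most 10 lines (separated by \r\n).
- Extract only the first 4 lines from the matched text.
I can't find a good way to do it... :/
Thanks!!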
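Regular expressions aren't what you need here. They are meant for matching a pattern, not a length. If you hold the data in a string, a check such as myString.length <= 8000 is all you need to count the characters (using the correct syntax for your language, of course). For the number of lines, count the \r\n sequences in the string (this can be done iteratively). To get the first four lines, find the 4th \r\n and use a substring method to take everything before it.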
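The steps above can be sketched in PHP roughly as follows. This is only a sketch under a few assumptions: the function name firstFourLines is made up for illustration, lines are separated by "\r\n" with no trailing separator, and the length is counted in bytes.

```php
<?php
// Hypothetical helper (not from the question): returns the first four lines
// of $text, or null when the text breaks the length or line-count limit.
function firstFourLines(string $text): ?string
{
    // strlen() counts bytes; mb_strlen($text, 'UTF-8') would count characters.
    if (strlen($text) > 8000) {
        return null;
    }

    // Assumption: separators sit between lines, so 10 lines hold 9 of them.
    if (substr_count($text, "\r\n") > 9) {
        return null;
    }

    // Walk to the 4th "\r\n" and keep everything before it.
    $pos = -2; // so that the first search starts at offset 0
    for ($i = 0; $i < 4; $i++) {
        $pos = strpos($text, "\r\n", $pos + 2);
        if ($pos === false) {
            return $text; // fewer than four separators: keep the whole text
        }
    }

    return substr($text, 0, $pos);
}

// Prints the lines a, b, c and d; the fifth line is dropped.
echo firstFourLines("a\r\nb\r\nc\r\nd\r\ne");
```

Counting separators rather than splitting the string avoids building an array when the only question is whether the limit is exceeded.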
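This expression does the following:
\A(?=.{0,8000}\z)(?=(?:[^\r\n]*(?:\r\n?|\n|\z)){0,10}\z)(?:[^\r\n]*(?:\r\n?|\n|\z)){0,4}
It requires the s option (dot matches all characters, including line breaks).
\A
anchors to the start of the string.
(?=.{0,8000}\z)
looks ahead and verifies that the whole string is 0 to 8000 characters long. With the s option the dot also counts \r and \n.
(?=(?:[^\r\n]*(?:\r\n?|\n|\z)){0,10}\z)
looks ahead and verifies that there are no more than 10 newline-delimited lines. [^\r\n]* cannot cross a line break, so each repetition consumes exactly one line.
(?:[^\r\n]*(?:\r\n?|\n|\z)){0,4}
matches the first 4 lines of text.
You didn't specify a language, so I've included this PHP example to show how it works, along with sample output.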
Input text
This test input is a string of 8 newline-delimited lines, 1779 characters in total.
Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of
the Italic Mountains, she had a last view back on the skyline of her hometown Bookmarksgrove, the headline of Alphabet Village and the subline of her own road, the Line Lane. Pityful a rethoric question ran over her cheek, then
she continued her way. On her way she met a copy. The copy warned the Little Blind Text, that where it came from it would have been rewritten a thousand times and everything that was left from its origin would be the word "and"
and the Little Blind Text should turn around and return to its own, safe country. But nothing the copy said could convince her and so it didn’t take long until a few insidious Copy Writers ambushed her, made her drunk with Longe
and Parole and dragged her into their agency, where they abused her for their projects again and again. And if she hasn’t been rewritten, then they are still using her.
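Code
<?php
$sourcestring = "your source string";
// s: the dot also matches line breaks, so the length check spans the whole text.
// [^\r\n]* keeps each repetition on a single line, whatever the line ending.
$pattern = '/\A(?=.{0,8000}\z)(?=(?:[^\r\n]*(?:\r\n?|\n|\z)){0,10}\z)(?:[^\r\n]*(?:\r\n?|\n|\z)){0,4}/s';
preg_match($pattern, $sourcestring, $matches);
echo "<pre>" . print_r($matches, true) . "</pre>";
?>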
Matches
$matches Array:
(
[0] => Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of
)
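To see the two limits reject input, here is a small sketch that runs a pattern of the same shape (start anchor, two lookaheads, then up to four lines) against text that is within the limits, has too many lines, or is too long. The names LIMITED_TEXT and matchLimited are hypothetical, and the pattern assumes each line is written as [^\r\n]* so that one repetition cannot span several lines.

```php
<?php
// Assumed pattern: length lookahead, line-count lookahead, then up to 4 lines.
const LIMITED_TEXT = '/\A(?=.{0,8000}\z)(?=(?:[^\r\n]*(?:\r\n?|\n|\z)){0,10}\z)(?:[^\r\n]*(?:\r\n?|\n|\z)){0,4}/s';

// Hypothetical wrapper: the matched first lines, or null when a limit is broken.
function matchLimited(string $text): ?string
{
    return preg_match(LIMITED_TEXT, $text, $m) === 1 ? $m[0] : null;
}

$tenLines    = implode("\r\n", array_fill(0, 10, 'line'));
$elevenLines = $tenLines . "\r\nline";

var_dump(matchLimited($tenLines) !== null);              // within both limits
var_dump(matchLimited($elevenLines) === null);           // one line too many
var_dump(matchLimited(str_repeat('x', 8001)) === null);  // one byte too long
```

Without the u modifier the dot counts bytes, so multibyte text reaches the 8000 limit sooner than its character count suggests.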