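I'm trying to write a regular expression that matches these conditions:

  • At most 8000 characters (any character, including "\r\n")
  • At most 10 lines (separated by \r\n)
  • Extract only the first 4 lines from the matched text

I can't find a good way to do it... :/

Thanks!!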

2 Answers

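Regular expressions are not what you need here. They are meant for matching a pattern, not a length. If you have the data in a string, you only need to count its characters, e.g. myString.length <= 8000 (using the correct syntax for your language, of course). For the line count, count the number of \r\n sequences in the string (this can be done iteratively). To get the first four lines, find the 4th \r\n and use the substring method to take everything before it.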
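The steps above can be sketched without any regular expression. This is only a minimal PHP sketch, assuming that counting bytes is acceptable for "characters" and that lines are separated strictly by \r\n. The function name firstFourLines is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the first four lines of $text,
// or null when the input breaks the length or line-count limit.
function firstFourLines(string $text): ?string
{
    // strlen() counts bytes; mb_strlen() would count characters instead.
    if (strlen($text) > 8000) {
        return null;
    }

    // N separators delimit N + 1 lines, so more than 9 means more than 10 lines.
    // Note: a trailing "\r\n" is counted as starting one more (empty) line.
    if (substr_count($text, "\r\n") > 9) {
        return null;
    }

    // Walk to the 4th "\r\n"; everything before it is the first four lines.
    $pos = -2;
    for ($i = 0; $i < 4; $i++) {
        $pos = strpos($text, "\r\n", $pos + 2);
        if ($pos === false) {
            return $text; // fewer than four separators: return the whole text
        }
    }
    return substr($text, 0, $pos);
}
```

For example, firstFourLines("a\r\nb\r\nc\r\nd\r\ne") returns "a\r\nb\r\nc\r\nd", without the trailing separator.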

answered 2013-06-20T21:44:30.717

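Description

This expression does the following:

  • Validates that the input string is between 0 and 8,000 characters long
  • Validates that there are at most 10 newline-delimited lines of text
  • Then captures the first 4 newline-delimited lines of text

\A(?=.{0,8000}\Z)(?=(?:^.*?(?:\r|\n|\Z)){0,10}\Z)(?:^.*?[\r\n\Z]+){0,4}

This requires the options m (multiline) and s (dot matches all characters).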

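Expanded

  • \A anchors to the start of the string; using this anchor makes it possible to use the s option, which lets . match carriage returns and line feeds
  • (?=.{0,8000}\Z) looks ahead and validates that there are between 0 and 8000 characters
  • (?=(?:^.*?(?:\r|\n|\Z)){0,10}\Z) looks ahead and validates that there are no more than 10 newline-delimited lines
  • (?:^.*?[\r\n\Z]+){0,4} matches the first 4 lines of text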
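Two details of the expression above are worth checking before relying on it. Inside a character class, \Z is not an end-of-string anchor. And with the s option, . can cross line breaks, so the second lookahead can succeed with a single repetition no matter how many lines the input has. The following is a hedged sketch of a stricter variant, not the answer's own pattern. It assumes PCRE as used by PHP's preg_match, lines separated strictly by \r\n, and a byte-based length limit (no u modifier). The function name firstFourLinesRegex and the $line fragment are hypothetical.

```php
<?php
// Hypothetical helper: same three rules, with the line structure enforced.
function firstFourLinesRegex(string $text): ?string
{
    // One line: any characters that do not start a "\r\n",
    // followed by either the separator or the very end of the string.
    $line = '(?:(?!\r\n).)*(?:\r\n|\z)';

    $pattern = '/\A'
        . '(?=.{0,8000}\z)'               // at most 8000 bytes in total
        . '(?=(?:' . $line . '){1,10}\z)' // at most 10 lines up to the end
        . '(?:' . $line . '){1,4}'        // consume the first 4 lines
        . '/s';                           // s: . also matches \r and \n

    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}
```

Unlike a substring-based approach, the match here keeps the \r\n that ends the fourth line; rtrim($result, "\r\n") would drop it if that is not wanted.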

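PHP Code Example:

You didn't specify a language, so I'm including this PHP example to show how it works, along with sample output.

Input Text

This test input is a string of 8 newline-delimited lines. There are only 1779 characters here.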

Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small 
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about 
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were 
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of 
the Italic Mountains, she had a last view back on the skyline of her hometown Bookmarksgrove, the headline of Alphabet Village and the subline of her own road, the Line Lane. Pityful a rethoric question ran over her cheek, then 
she continued her way. On her way she met a copy. The copy warned the Little Blind Text, that where it came from it would have been rewritten a thousand times and everything that was left from its origin would be the word "and" 
and the Little Blind Text should turn around and return to its own, safe country. But nothing the copy said could convince her and so it didn’t take long until a few insidious Copy Writers ambushed her, made her drunk with Longe 
and Parole and dragged her into their agency, where they abused her for their projects again and again. And if she hasn’t been rewritten, then they are still using her.

Code

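<?php
$sourcestring = "your source string";
// Modifiers: i = case-insensitive, m = multiline (^ matches at line starts),
// s = . also matches line breaks.
preg_match('/\A(?=.{0,8000}\Z)(?=(?:^.*?(?:\r|\n|\Z)){0,10}\Z)(?:^.*?[\r\n\Z]+){0,4}/ims', $sourcestring, $matches);
// $matches[0] holds the matched text, i.e. the leading lines of the input.
echo "<pre>" . print_r($matches, true);
?>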

Matches

$matches Array:
(
    [0] => Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small 
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about 
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were 
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of 

)
answered 2013-06-21T03:24:25.783