php - Using a regular expression to find a phrase of at most 16 characters? (php)

Question

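I'm facing a small problem. I have a long string containing many words that I'm trying to split up. Most parts of the string have a static start and end marker that I can reference, but this part only has an end marker. The bit of the string I actually want is dynamic: it is at most 16 characters long (it may be shorter), and the number of words in the phrase is unknown.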

Example:

Name: John Smith Occupation: Doctor Currently Busy Gender: Male

I want to get "Currently Busy" on its own, without also getting the end of the preceding value.

But I also want to use the same code to get "Not Yet Here" from this string:

Name: John Smith Occupation: Doctor Not Yet Here Gender: Male

I can't find a way to do this, and I don't even know whether it's possible, so I'm hoping someone here can help.

score 1 · Accepted Answer

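Your problem is one that RegEx probably can't solve. If the value of "Occupation" can be one or more words, and it is directly followed by another value that can also be one or more words, how would you, as a human, tell the two phrases apart?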

I would hope that, at the very least, you have a known set of Occupation values. If that's the case, you can build your expression like this:

(?<=Doctor |Nurse ).*(?= Gender)

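The (?<=...) and (?=...) bits are lookbehind and lookahead assertions, which essentially say "make sure the expression Doctor |Nurse occurs before the matched phrase (but don't match that part), and the expression Gender occurs after the matched phrase (but don't match that part either)."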

Check it out: http://regexr.com?34buq
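For PHP specifically, the expression above could be applied with preg_match. The following is only a minimal sketch: the helper name extractStatus and the list of occupations are hypothetical, and .{1,16} replaces .* on the assumption that the 16-character limit from the question should be enforced by the pattern itself.

```php
<?php
// Hypothetical helper: returns the phrase between a known occupation and " Gender",
// or null when nothing matches.
function extractStatus(string $line, array $occupations): ?string
{
    // Build top-level lookbehind alternatives such as "Doctor |Nurse ".
    // Each alternative has a fixed length, which is what PCRE lookbehind requires.
    $alternatives = implode('|', array_map(function ($occupation) {
        return preg_quote($occupation, '/') . ' ';
    }, $occupations));

    // .{1,16} caps the phrase at 16 characters (assumption taken from the question).
    $pattern = '/(?<=' . $alternatives . ').{1,16}(?= Gender)/';

    return preg_match($pattern, $line, $match) === 1 ? $match[0] : null;
}

$occupations = ['Doctor', 'Nurse']; // assumed to be the known set of values

echo extractStatus('Name: John Smith Occupation: Doctor Currently Busy Gender: Male', $occupations), "\n";
// Currently Busy
echo extractStatus('Name: John Smith Occupation: Doctor Not Yet Here Gender: Male', $occupations), "\n";
// Not Yet Here
```

The alternatives are kept at the top level of the lookbehind rather than wrapped in a group, because older PCRE versions reject a single lookbehind branch that can match different lengths.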

score 0 · Accepted Answer

Not the most elegant way, but here's a solution:

$string = 'Name: John Smith Occupation: Doctor Currently Busy Gender: Male';
$groups = array_filter(preg_split('/\s?\w+:\s?/', $string));
// Split by [\s? => optional space][\w+ => characters a-zA-Z0-9_][:][\s? => optional space]

// $groups[2] contains 'Doctor Currently Busy'
$pieces = explode(' ', $groups[2]);
$pieces = array_reverse($pieces);
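// Loop state: running length, words visited so far, total word count, collected words
$length = 0;
$i = 0;
$c = count($pieces);
$result = array();
// $c and $i keep the first word (the occupation itself) out of the result,
// even when all the words together are no longer than 16 characters.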

foreach($pieces as $piece){
    $length += strlen($piece);
    $i++;
    if($length <= 16 && $c != $i){
        $result[] = $piece;
    }else{
        break;
    }
}

$result = array_reverse($result);
$final_result = implode(' ', $result);
echo $final_result; // Currently Busy
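The same idea can be wrapped in a function so that it can be run against both example strings from the question. This is only a sketch under a few assumptions: the name trailingPhrase is made up, the Occupation value is taken to be the second labelled field, the occupation is a single word, and the separating spaces are counted towards the 16-character limit (the loop above counts only the letters).

```php
<?php
// Hypothetical helper: walk the words of the Occupation value from the end
// and keep them for as long as the phrase still fits within $maxLength.
function trailingPhrase(string $line, int $maxLength = 16): string
{
    // Split on "Label:" markers. array_filter drops the empty leading element
    // but preserves keys, so index 2 holds the Occupation value.
    $groups = array_filter(preg_split('/\s?\w+:\s?/', $line));
    $words = explode(' ', $groups[2]);

    $kept = [];
    $length = 0;
    // Stop before index 0 so the first word (the occupation) is never kept.
    for ($i = count($words) - 1; $i > 0; $i--) {
        // Count the separating space too (assumption about how the limit is measured).
        $length += strlen($words[$i]) + ($kept ? 1 : 0);
        if ($length > $maxLength) {
            break;
        }
        array_unshift($kept, $words[$i]);
    }

    return implode(' ', $kept);
}

echo trailingPhrase('Name: John Smith Occupation: Doctor Currently Busy Gender: Male'), "\n";
// Currently Busy
echo trailingPhrase('Name: John Smith Occupation: Doctor Not Yet Here Gender: Male'), "\n";
// Not Yet Here
```

As with the original loop, a multi-word occupation followed by a short phrase would still leak part of the occupation into the result, so this only works when the first word alone is the occupation.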
