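I am trying to split a typical Google search string into its parts. For example, the string could be: "how to" engine -fuel

So I would like to get "how to", engine, and -fuel separately.

I tried the following preg_match_all, but I also get "how and to" as separate matches, which could become unnecessarily hard to process.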

preg_match_all(
     '=(["]{1}[^"]{1,}["]{1})'
    .'|([-]{1}[^ ]{1,}[ ]{1})'
    .'|([^-"]{1}[^ ]{1,}[ ]{1})=si', 
  $filter, 
  $matches,
  PREG_PATTERN_ORDER);

Does anyone know how to do this properly?

2 Answers

Try:

$q = '"how to" engine -fuel';
preg_match_all('/"[^"]*"|\S+/', $q, $matches);
print_r($matches);

This will print:

Array
(
    [0] => Array
        (
            [0] => "how to"
            [1] => engine
            [2] => -fuel
        )

)

Meaning:

"[^"]*"    # match a quoted string
|          # OR
\S+        # 1 or more non-space chars
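
Once the tokens are split this way, they still need to be told apart. The following is a minimal sketch of one way to do that, not part of the original answer: the function name parse_search_query and the result keys phrases, excluded and terms are made up for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): split a search string
// into quoted phrases, excluded terms, and plain terms.
function parse_search_query(string $query): array
{
    $result = ['phrases' => [], 'excluded' => [], 'terms' => []];

    // Same pattern as above: a quoted string, or a run of non-space characters.
    preg_match_all('/"[^"]*"|\S+/', $query, $matches);

    foreach ($matches[0] as $token) {
        if (strlen($token) >= 2 && $token[0] === '"' && substr($token, -1) === '"') {
            // Quoted phrase: strip the surrounding quotes.
            $result['phrases'][] = substr($token, 1, -1);
        } elseif (strlen($token) > 1 && $token[0] === '-') {
            // A leading minus marks a term to exclude: strip the minus.
            $result['excluded'][] = substr($token, 1);
        } else {
            // Everything else is treated as a plain search term.
            $result['terms'][] = $token;
        }
    }

    return $result;
}

$parsed = parse_search_query('"how to" engine -fuel');
print_r($parsed);
```

Note that this sketch does not handle an excluded phrase such as -"how to": the pattern would split it at the space, so that case would need an extra alternative in the regex.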
answered 2012-06-01T07:12:18.903

Try this

(?i)("[^"]+") +([a-z]+) +(\-[a-z]+)\b

Code

if (preg_match('/("[^"]+") +([a-z]+) +(-[a-z]+)\b/i', $subject, $regs)) {
    $howto = $regs[1];
    $engine = $regs[2];
    $fuel = $regs[3];
} else {
    $result = "";
}

Explanation

"
(?i)        # Match the remainder of the regex with the options: case insensitive (i)
(           # Match the regular expression below and capture its match into backreference number 1
   \"           # Match the character “\"” literally
   [^\"]        # Match any character that is NOT a “\"”
      +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   \"           # Match the character “\"” literally
)
\           # Match the character “ ” literally
   +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(           # Match the regular expression below and capture its match into backreference number 2
   [a-z]       # Match a single character in the range between “a” and “z”
      +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
\           # Match the character “ ” literally
   +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(           # Match the regular expression below and capture its match into backreference number 3
   \-          # Match the character “-” literally
   [a-z]       # Match a single character in the range between “a” and “z”
      +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
\b          # Assert position at a word boundary
"
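
As the breakdown shows, this pattern expects exactly one quoted phrase, then one word, then one minus-prefixed word, in that order. A small sketch of what that means in practice (the variable names are made up for illustration):

```php
<?php
// The fixed-shape pattern from this answer, tried against two inputs.
$pattern = '/("[^"]+") +([a-z]+) +(-[a-z]+)\b/i';

// Same layout as the example in the question: phrase, word, excluded word.
$same_order = preg_match($pattern, '"how to" engine -fuel', $regs);

// Same three parts in a different order: the pattern no longer applies.
$other_order = preg_match($pattern, 'engine -fuel "how to"', $unused);

var_dump($same_order);   // int(1)
var_dump($other_order);  // int(0)
print_r($regs);
```

So this approach fits only when the search string always has that shape. For free-form input, a tokenizing pattern used with preg_match_all is likely the safer choice.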

Hope this helps.

answered 2012-06-01T07:10:57.897