
I'm trying to learn a bit about regular expressions.
Here is what I need to match:

/parent/child  
/parent/child?  
/parent/child?firstparam=abc123  
/parent/child?secondparam=def456  
/parent/child?firstparam=abc123&secondparam=def456  
/parent/child?secondparam=def456&firstparam=abc123  
/parent/child?thirdparam=ghi789&secondparam=def456&firstparam=abc123  
/parent/child?secondparam=def456&firstparam=abc123&thirdparam=ghi789  
/parent/child?thirdparam=ghi789  
/parent/child/  
/parent/child/?  
/parent/child/?firstparam=abc123  
/parent/child/?secondparam=def456  
/parent/child/?firstparam=abc123&secondparam=def456  
/parent/child/?secondparam=def456&firstparam=abc123  
/parent/child/?thirdparam=ghi789&secondparam=def456&firstparam=abc123  
/parent/child/?secondparam=def456&firstparam=abc123&thirdparam=ghi789  
/parent/child/?thirdparam=ghi789

My expression should "grab" abc123 and def456.
And here is an example of what I do NOT want to match (the question mark is missing):

/parent/child/firstparam=abc123&secondparam=def456

So I built the following expression:

^(?:/parent/child){1}(?:^(?:/\?|\?)+(?:firstparam=([^&]*)|secondparam=([^&]*)|[^&]*)?)?

But it doesn't work.
Could you help me understand what I'm doing wrong?
Thanks in advance.

UPDATE 1

OK, I ran some more tests. I'm trying to fix the previous version with the following:

/parent/child(?:(?:\?|/\?)+(?:firstparam=([^&]*)|secondparam=([^&]*)|[^&]*)?)?$

Let me explain my thinking.
The string must start with /parent/child:

/parent/child

The following group is optional:

(?: ... )?

That optional group must start with ? or /?:

(?:\?|/\?)+

Optional parameters (if the specified parameters are part of the query string, I capture their values):

(?:firstparam=([^&]*)|secondparam=([^&]*)|[^&]*)?

End of line:

$
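
Running this version against a few of the inputs above shows where it falls short. The sketch below uses Python only as an assumption, since no language has been chosen here, and the name try_match is made up for illustration. A single parameter is captured, but once a second parameter follows, nothing in the pattern consumes the &... tail before $, because the optional parameter group is applied at most once.

```python
import re

# The Update 1 pattern, copied unchanged.
UPDATE_1 = re.compile(
    r"/parent/child(?:(?:\?|/\?)+(?:firstparam=([^&]*)|secondparam=([^&]*)|[^&]*)?)?$"
)

def try_match(url):
    """Return the two capture groups, or None when the pattern does not match."""
    m = UPDATE_1.search(url)
    return m.groups() if m else None

for url in (
    "/parent/child",
    "/parent/child?firstparam=abc123",
    "/parent/child?firstparam=abc123&secondparam=def456",
):
    print(url, "->", try_match(url))
```

The first two inputs match; the third returns None even though it is one of the strings that should be accepted.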

Any suggestions?

UPDATE 2

My solution must be based on regular expressions only. As an example, I previously wrote the following one:

/parent/child(?:[?&/]*(?:firstparam=([^&]*)|secondparam=([^&]*)|[^&]*))*$

This works well. But it also matches the following input:

/parent/child/firstparam=abc123&secondparam=def456

How can I modify the expression so that it does NOT match the previous string?

5 Answers

You didn't specify a language, so I'll just use Perl. Basically, instead of matching everything, I matched exactly what I think you need. Correct me if I'm wrong.

while ($subject =~ m/(?<==)\w+?(?=&|\W|$)/g) {
    # matched text = $&
}

(?<=        # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
   =        # Match the character “=” literally
)
\w          # Match a single character that is a "word character" (letters, digits, and underscores)
   +?       # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
(?=         # Assert that the regex below can be matched, starting at this position (positive lookahead)
            # Match either the regular expression below (attempting the next alternative only if this one fails)
      &     # Match the character "&" literally
   |        # Or match regular expression number 2 below (attempting the next alternative only if this one fails)
      \W    # Match a single character that is a "non-word character"
   |        # Or match regular expression number 3 below (the entire group fails if this one fails to match)
      $     # Assert position at the end of the string (or before the line break at the end of the string, if any)
)
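
For readers who are not using Perl, here is a minimal sketch of the same idea in Python (the choice of Python and the helper name values are assumptions for illustration only). It shows what the pattern actually extracts: every value that follows an "=", regardless of the parameter name.

```python
import re

# Same pattern as above: a lazy run of word characters that directly follows "=".
VALUE = re.compile(r"(?<==)\w+?(?=&|\W|$)")

def values(url):
    """Return every parameter value found in url, in order of appearance."""
    return VALUE.findall(url)

print(values("/parent/child?firstparam=abc123&secondparam=def456"))
print(values("/parent/child/firstparam=abc123&secondparam=def456"))
```

Both calls print ['abc123', 'def456']. The second input has no question mark, so this pattern extracts values but does not validate the shape of the URL; that check would need to be done separately.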


answered 2012-08-02T08:25:23.777

As long as you know what the parameter names will be, and you're sure they won't change, this regex will work.

\/parent\/child\/?\?(?:(?:firstparam|secondparam|thirdparam)\=([\w]+)&?)(?:(?:firstparam|secondparam|thirdparam)\=([\w]+)&?)?(?:(?:firstparam|secondparam|thirdparam)\=([\w]+)&?)?

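Although a regex is not the best solution here (the code examples in the other answers are more efficient, since plain string functions are typically faster than regular expressions), this will work if you need a regex-only solution with up to 3 parameters. Out of interest, why must the solution use only regex?

In any case, this regex will match the following strings: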

/parent/child?firstparam=abc123  
/parent/child?secondparam=def456  
/parent/child?firstparam=abc123&secondparam=def456  
/parent/child?secondparam=def456&firstparam=abc123  
/parent/child?thirdparam=ghi789&secondparam=def456&firstparam=abc123  
/parent/child?secondparam=def456&firstparam=abc123&thirdparam=ghi789  
/parent/child?thirdparam=ghi789  
/parent/child/?firstparam=abc123  
/parent/child/?secondparam=def456  
/parent/child/?firstparam=abc123&secondparam=def456  
/parent/child/?secondparam=def456&firstparam=abc123  
/parent/child/?thirdparam=ghi789&secondparam=def456&firstparam=abc123  
/parent/child/?secondparam=def456&firstparam=abc123&thirdparam=ghi789  
/parent/child/?thirdparam=ghi789

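It now matches only the strings that contain query string parameters, and puts their values into capture groups.

What language are you using to process your matches?

If you are using preg_match with PHP, you can get the whole match as well as the capture groups in an array: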

preg_match($regex, $string, $matches);

You can then access the whole match with $matches[0], and the captured values with $matches[1], $matches[2], and so on.
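
One detail worth checking is which value lands in which group. The following is a rough Python sketch of the same three-parameter pattern (Python and the names NAMES, ONE and captured are assumptions for illustration; the unnecessary backslashes before / and = are dropped). It suggests that the groups are positional: group 1 holds whichever parameter comes first in the URL, not necessarily firstparam.

```python
import re

NAMES = "firstparam|secondparam|thirdparam"
# One "name=value" unit with an optional trailing "&"; the value is captured.
ONE = rf"(?:(?:{NAMES})=(\w+)&?)"
# First unit is required, the second and third are optional, as in the regex above.
PATTERN = re.compile(rf"/parent/child/?\?{ONE}{ONE}?{ONE}?")

def captured(url):
    """Return the tuple of captured values, or None if the URL does not match."""
    m = PATTERN.search(url)
    return m.groups() if m else None

print(captured("/parent/child/?secondparam=def456&firstparam=abc123"))
print(captured("/parent/child/firstparam=abc123&secondparam=def456"))
```

The first call returns ('def456', 'abc123', None) and the second returns None, because the question mark is missing. If you need to know which name each value belongs to, the name would have to be captured as well.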

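If you want to add more parameters, you will also need to add them to the regex, and add another optional section to capture the extra value. For example, if you had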

/parent/child/?secondparam=def456&firstparam=abc123&fourthparam=jkl01112&thirdparam=ghi789

the regex would become

\/parent\/child\/?\?(?:(?:firstparam|secondparam|thirdparam|fourthparam)\=([\w]+)&?)(?:(?:firstparam|secondparam|thirdparam|fourthparam)\=([\w]+)&?)?(?:(?:firstparam|secondparam|thirdparam|fourthparam)\=([\w]+)&?)?(?:(?:firstparam|secondparam|thirdparam|fourthparam)\=([\w]+)&?)?

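However, this becomes more and more cumbersome as you add parameters.

If the multi-line flag is enabled, you can optionally include ^ and $ at the start and end. If you also need to match whole lines that have no query string, wrap this entire regex in a non-capturing group (including ^ and $) and add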

|(?:^\/parent\/child\/?\??$)

to the end.

answered 2016-11-07T11:57:44.520

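For starters, you aren't escaping the /s in your regex (which matters whenever / is also the pattern delimiter), and {1} is unnecessary for a single repetition; use braces only when you want multiple repetitions or a range of repetitions.

Part of what you're trying to do simply isn't a good use of regex. I'll show you an easier way to handle it: use something like split and put the information into a hash whose contents you can check later. Because you didn't specify a language, I'll use Perl for my example, but every language I know of that has regexes also has easy access to hashes and something like split, so this should be easy to port: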

 # I picked an example to show how this works.
 my $route = '/parent/child/?first=123&second=345&third=678';
 my %params;  # I'm going to put those URL parameters in this hash.

 # Perl has a way to let me avoid escaping the /s, but I wanted an example that
 # works in other languages too.
 if ($route =~ m/\/parent\/child\/?\?(.*)/) {  # Use the regex for this part (\/? makes the trailing slash optional)
   print "Matched route.\n";
   # But NOT for this part. 
   my $query = $1;  # $1 is a Perl thing.  It contains what (.*) matched above.
   my @items = split '&', $query;  # Each item is something like param=123
   foreach my $item (@items) {
     my ($param, $value) = split '=', $item;
     $params{$param} = $value;  # Put the parameters in a hash for easy access.
     print "$param set to $value \n";
   }
 }

 # Now you can check the parameter values and do whatever you need to with them.
 # And you can add new parameters whenever you want, etc.
 if ($params{'first'} eq '123') {
   # Do whatever
 }
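
As an illustration of how this ports, here is a minimal sketch of the same approach in Python (the language and the function name parse_route are assumptions, not part of the original example). The regex only checks the route and isolates the query string; the parameters are then split with plain string methods. The route pattern allows an optional trailing slash so that both forms from the question are accepted.

```python
import re

def parse_route(route):
    """Return a dict of query parameters, or None if the route does not match."""
    # Regex for the route only: /parent/child, optional "/", then a literal "?".
    m = re.match(r"/parent/child/?\?(.*)", route)
    if m is None:
        return None
    params = {}
    # Plain string splitting for the query string, as in the Perl version.
    for item in m.group(1).split("&"):
        if not item:
            continue  # skip empty pieces, e.g. when the query string is empty
        name, _, value = item.partition("=")
        params[name] = value
    return params

print(parse_route("/parent/child/?first=123&second=345&third=678"))
print(parse_route("/parent/child/first=123&second=345"))
```

The first call returns {'first': '123', 'second': '345', 'third': '678'}; the second returns None because the question mark is missing.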
answered 2012-08-02T08:46:20.187

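This script will help you.
First, I check whether the line contains a ? symbol.
Then I strip the first part of the line (everything up to and including the ?).
Next, I split the line on &, and split each resulting piece on =.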

use feature 'say';  # required for say() below

my $r = q"/parent/child  
/parent/child?  
/parent/child?firstparam=abc123  
/parent/child?secondparam=def456  
/parent/child?firstparam=abc123&secondparam=def456  
/parent/child?secondparam=def456&firstparam=abc123  
/parent/child?thirdparam=ghi789&secondparam=def456&firstparam=abc123  
/parent/child?secondparam=def456&firstparam=abc123&thirdparam=ghi789  
/parent/child?thirdparam=ghi789  
/parent/child/  
/parent/child/?  
/parent/child/?firstparam=abc123  
/parent/child/?secondparam=def456  
/parent/child/?firstparam=abc123&secondparam=def456  
/parent/child/?secondparam=def456&firstparam=abc123  
/parent/child/?thirdparam=ghi789&secondparam=def456&firstparam=abc123  
/parent/child/?secondparam=def456&firstparam=abc123&thirdparam=ghi789  
/parent/child/?thirdparam=ghi789";


for my $string(split /\n/, $r){
        if (index($string,'?')!=-1){
            substr($string, 0, index($string,'?')+1,"");
            #say "string = ".$string;
            if (index($string,'=')!=-1){
                my @params = map{$_ = [split /=/, $_];}split/\&/, $string;
                $"="\n";
                say "$_->[0] === $_->[1]" for (@params);
                say "######next########";
                }
            else{
                #print "there is no params!"
            }       

        }
        else{
            #say "there is no params!";
        }       
    }
answered 2012-08-02T10:30:58.497

My solution:
/(?:\w+/)*(?:(?:\w+)?\?(?:\w+=\w+(?:&\w+=\w+)*)?|\w+|)

Explanation:
/(?:\w+/)* matches /parent/child/ or /parent/

(?:\w+)?\?(?:\w+=\w+(?:&\w+=\w+)*)? matches child?firstparam=abc123 or ?firstparam=abc123 or ?

\w+ matches text such as child

..|) matches nothing (empty)
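
To check how these pieces behave together, here is a small Python sketch (Python and the helper name accepts are assumptions for illustration). It uses a full match, so the whole string has to be covered by the pattern.

```python
import re

# The pattern from this answer, unchanged.
PATTERN = re.compile(r"/(?:\w+/)*(?:(?:\w+)?\?(?:\w+=\w+(?:&\w+=\w+)*)?|\w+|)")

def accepts(url):
    """True when the entire url is matched by the pattern."""
    return PATTERN.fullmatch(url) is not None

for url in (
    "/parent/child",
    "/parent/child/?firstparam=abc123&secondparam=def456",
    "/parent/child/firstparam=abc123&secondparam=def456",
    "/foo/bar?x=1",
):
    print(url, "->", accepts(url))
```

The first two inputs are accepted and the third (no question mark) is rejected. The last line is accepted too, which shows that the pattern is generic: it is not tied to /parent/child, so add that prefix explicitly if only that route should match.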

If you only need the query string, the pattern reduces to something like:
/(?:\w+/)*(?:\w+)?\?(\w+=\w+(?:&\w+=\w+)*)

If you want to get each parameter from the query string, here is a Ruby example:

re = /\/(?:\w+\/)*(?:\w+)?\?(\w+=\w+(?:&\w+=\w+)*)/
s = '/parent/child?secondparam=def456&firstparam=abc123&thirdparam=ghi789'
if m = s.match(re)
    query_str = m[1] # now, you can 100% trust this string
    query_str.scan(/(\w+)=(\w+)/) do |param,value| #grab parameter
        printf("%s, %s\n", param, value)
    end
end

Output:

secondparam, def456
firstparam, abc123
thirdparam, ghi789
answered 2012-08-02T08:42:40.087