
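I need some help with a regular expression. Please see the example below. I am trying to capture specific rid values that sit between this:

","children":[

and this terminator:

}]}]}

as shown below.

My problem is that the block shown below repeats several times, and I want only the rids between the start of each block, ","children":[ and its end, }]}]}.

I know I can capture an individual rid value with: rid":"([\w\d\-\."]+)

But I don't know how to capture every rid":"([\w\d\-\."]+) match that occurs between the opening ","children":[ and the closing }]}]}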

Example:

     ","children":[{"type":"stub","context":"","rid":"b1c4922237ce.ee6a3644443fe.10711226e93.d0af7aadbd0-4be3-4353ddd.8b47.f2f4aaf2474f","metaclass":" ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.290c6e93.91c15f91-a1c-4c36.9939.4ab7b94a39ad","metaclass":"ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.27c3ee93.22e90c22-7406-463a.8bff.f6ea88f6ffcc","metaclass":"ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6a182e93.5c0e7d5c-ff65-451d.afc0.cfc7fbcfc02d","metaclass":"ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6970ae93.8ea3978e-112b-4bbb.8405.d17071d105d2","metaclass":"ASAPModel.BarrierCategory"}]} ]},

     ","children":[{"type":"stub","context":"","rid":"b1c4922237ce.ee6a3644443fe.10711226e93.d0af7aadbd0-4be3-4353ddd.8b47.f2f4aaf2474f","metaclass":" ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.290c6e93.91c15f91-a1c-4c36.9939.4ab7b94a39ad","metaclass":"ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.27c3ee93.22e90c22-7406-463a.8bff.f6ea88f6ffcc","metaclass":"ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6a182e93.5c0e7d5c-ff65-451d.afc0.cfc7fbcfc02d","metaclass":"ASAPModel.BarrierCategory"},
{"type":"stub","context":"","rid":"b1c497ce.ee6a64fe.6970ae93.8ea3978e-112b-4bbb.8405.d17071d105d2","metaclass":"ASAPModel.BarrierCategory"}]} ]},

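My problem is that I don't understand how to specify the start and end values of a non-capturing group, or how to say "match one or more of these capture groups", somewhat like []+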


3 Answers

6

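This looks like JSON (although your sample data is incomplete, so it isn't valid as it stands).

If so, the JSON module from CPAN is probably the best approach: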

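    use strict;
    use warnings;
    use feature 'say';    # enables say(), which the loop below relies on
    use JSON qw( from_json );

    # my example data: an array of records, each with a "children" array
    my $data = q( [ 
        {"children":[ {"type":"stub","rid":"aa"}, {"type":"stub2","rid":"bb"} ] }, 
        {"children":[ {"type":"stub","rid":"cc"}, {"type":"stub2","rid":"dd"} ] } ]
    );

    # decode the JSON text into a Perl data structure (an array reference)
    my $json = from_json( $data );

    # walk every record, then every child, and print its rid
    for my $rec ( @$json ) {
        for my $child ( @{ $rec->{children} } ) {
            say "rid: ", $child->{rid};
        }
    }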

This prints:

rid: aa
rid: bb
rid: cc
rid: dd
answered 2009-08-20T14:26:58.317
1

You need to break it into two steps:

  1. Get the span of data
  2. Get the rids

    # Step 1: grab the first "children" span (list assignment keeps only the first capture)
    my ( $child ) = $record =~ m/"children":\[([^\]]+)\]/g;
    # Step 2: get every rid in that span - the /g modifier makes the match 'global'
    my @rids     = $child =~ m/"rid":"([^"]+)"/g; # <-- /g modifier
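
Since the question has several such blocks, the two steps can be repeated once per block. The following is only a sketch under assumptions: the contents of $record are made-up sample data, @rids_per_block is a name introduced here for illustration, and the [^\]]+ pattern only works while no nested ] appears inside a children array.

```perl
use strict;
use warnings;

# Hypothetical sample: two records, each with its own "children" array.
my $record = join '',
    '{"name":"a","children":[{"type":"stub","rid":"aa.1"},{"type":"stub","rid":"bb.2"}]},',
    '{"name":"b","children":[{"type":"stub","rid":"cc.3"}]}';

my @rids_per_block;

# Step 1: in scalar context, m//g resumes where the previous match ended,
# so each pass of the loop sees the next "children" span.
while ( $record =~ m/"children":\[([^\]]+)\]/g ) {
    my $span = $1;    # copy the capture before running another match

    # Step 2: in list context, m//g returns every captured rid in the span.
    my @rids = $span =~ m/"rid":"([^"]+)"/g;
    push @rids_per_block, \@rids;
}

for my $i ( 0 .. $#rids_per_block ) {
    print "block $i: @{ $rids_per_block[$i] }\n";
}
```

With the sample above this prints "block 0: aa.1 bb.2" and "block 1: cc.3", keeping the rids of each block separate.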
    

But it looks like JSON to me, and you can parse data like this with JSON::Syck.

answered 2009-08-20T14:20:40.903
0

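Something like \",\"children\":(.*)(?=\\]\\}\\]\\})

Play around with it.

The forum is eating some of my backslashes, so be warned that anyone else will need to double them.

In response to the edit

Try splitting the data into the bracketed groups first, then run one search per group in a for loop. You can use regex groups to get all the groups at once.

answered 2009-08-20T14:08:18.423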
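
A minimal sketch of that split-then-loop idea, under some assumptions: $data is invented sample text shaped like the question's blocks, the variable names are hypothetical, and the pattern is written with single backslashes as Perl expects. It also swaps the greedy (.*) for a non-greedy (.*?), because the greedy form runs on to the last terminator and lumps every block into one group.

```perl
use strict;
use warnings;

# Hypothetical sample data: two blocks, each ending in the ]}]} terminator.
my $data = join '',
    '{"n":"x","children":[{"rid":"aa"},{"rid":"bb"}]}]},',
    '{"n":"y","children":[{"rid":"cc"}]}]}';

# Greedy: .* extends to the LAST place where ]}]} follows, spanning all blocks.
my @greedy = $data =~ m/","children":(.*)(?=\]\}\]\})/gs;

# Non-greedy: .*? stops at the FIRST ]}]} after each opener, one group per block.
my @lazy = $data =~ m/","children":(.*?)(?=\]\}\]\})/gs;

# One rid search per group, as suggested above.
my @rids_by_group;
for my $group (@lazy) {
    my @rids = $group =~ m/"rid":"([^"]+)"/g;
    push @rids_by_group, \@rids;
}

printf "greedy: %d group(s), lazy: %d group(s)\n", scalar @greedy, scalar @lazy;
print "group $_: @{ $rids_by_group[$_] }\n" for 0 .. $#rids_by_group;
```

On this sample the greedy pattern yields a single group while the non-greedy one yields two, with rids "aa bb" and "cc" respectively. The terminator in the question's own data contains a space (]} ]}), so the lookahead would need adjusting for real input.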