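I'm trying to use a regular expression to match text that is preceded by certain words (in this case "slide" and "title"). I've been trying to use lookbehind assertions and nest them inside one another, but it doesn't seem to work. A further complication is that there is other text between the "slide" and "title" words I'm matching on.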

I've tried:

(?<=slide(?<=title\\s=\\s)).*(?=\\\")
(?<=title\\s=\\s(?<=slide)).*(?=\\\")

Any suggestions on how to do this? The extra backslashes are for escaping; I'm using this in Objective-C, though I don't know whether that matters.

Here is part of the JSON I'm scraping, for better context (I want to get the title that follows "title" in each slide):

slide =                 {
                createdAt = "2013-06-18T20:06:50Z";
                description = "<p>Due to the amount of attention paid to each organization's top prospects and early-round draft picks, many of the game's underrated prospects are perpetually obscured. Most of the time, these prospects are younger players who are housed in the low minors and still require considerable physical projection. At the same time, there are countless prospects on the older side of the age curve who have dipped off the radar due to injury.</p><p>Here's a look at one \"hidden gem\" from each organization who could make a push for the major leagues in the coming years.</p>";
                embedCode = "";
                externalId = "<null>";
                id = 3219926;
                photoHasCropExact = 1;
                photoHasCropNorth = 0;
                primaryPhoto =                     {
                    url = "http://img.bleacherreport.net/img/slides/photos/003/219/926/hi-res-5382332_crop_north.jpg";
                };
                title = "Each MLB Team's 'Hidden Gem' Prospect Fans May Not Know About";
                updatedAt = "2013-06-18T22:26:30Z";
                url = "<null>";

1 Answer

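I agree with @MartinR on this, but to answer the regex question: it fails because you are literally specifying an impossible condition. What you meant was: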

(?<=(?<=title\\s=\\s)slide).*(?=\\\")
(?<=(?<=slide)title\\s=\\s).*(?=\\\")

To see why, consider the following regex:

(?<=foo(?<=bar)).

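This looks for a character that is preceded by "foo", where the position right after "foo" must also be preceded by "bar". Because the inner lookbehind is checked at the very spot where "foo" ends, the same three characters would have to be both "foo" and "bar". Since "foo" and "bar" are never the same, this condition can never match. If you want "bar" to come before "foo", you have to write: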

(?<=(?<=bar)foo).
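
The difference between the two nestings can be checked directly. The following is a minimal sketch in Python rather than Objective-C (an assumption made only so the snippet is easy to run; the lookbehind syntax is the same), and the test string "barfoox" is a made-up example:

```python
import re

# Hypothetical test string: "bar", then "foo", then one more character.
text = "barfoox"

# The inner lookbehind is checked where "foo" ends, so the three characters
# before that position would have to be "foo" and "bar" at once.
impossible = re.compile(r"(?<=foo(?<=bar)).")

# The inner lookbehind is checked where "foo" starts, so "bar" must sit
# immediately before "foo".
nested = re.compile(r"(?<=(?<=bar)foo).")

impossible_match = impossible.search(text)
nested_match = nested.search(text)

print(impossible_match)      # None
print(nested_match.group())  # x
```

Note that the working form still requires "bar" to be directly adjacent to "foo"; it does not allow arbitrary text between them.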

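Also, keep in mind that most regex engines do not support variable-width lookbehinds. Your examples contain only fixed-width lookbehinds, but if your real implementation is more complex, that could be another reason your regex isn't working.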
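
Since the question mentions that other text sits between "slide" and "title", a lookbehind covering that gap would have to be variable-width. One common workaround, which is not part of the answer above and is offered only as a sketch, is to match the prefix normally and pull the title out with a capture group. The example below uses Python for illustration; the names `SAMPLE`, `TITLE_IN_SLIDE`, and `slide_titles` are hypothetical, and the sample is a shortened stand-in for the dump in the question:

```python
import re

# Hypothetical, shortened sample in the same shape as the dump in the question.
SAMPLE = r'''slide =                 {
                createdAt = "2013-06-18T20:06:50Z";
                description = "<p>A look at one \"hidden gem\" from each organization.</p>";
                id = 3219926;
                title = "Each MLB Team's 'Hidden Gem' Prospect Fans May Not Know About";
                updatedAt = "2013-06-18T22:26:30Z";
'''

TITLE_IN_SLIDE = re.compile(
    r'slide\s*=\s*\{'       # the "slide = {" opener
    r'.*?'                  # lazily skip the keys in between
    r'\btitle\s*=\s*"'      # the "title = " key and its opening quote
    r'((?:[^"\\]|\\.)*)'    # capture up to the next unescaped quote
    r'"',
    re.DOTALL,              # let "." cross line breaks
)


def slide_titles(text):
    """Return the title captured from each "slide = {" block in text."""
    return TITLE_IN_SLIDE.findall(text)


titles = slide_titles(SAMPLE)
print(titles)
```

The capture group sidesteps the fixed-width restriction because nothing before the title has to live inside a lookbehind. This sketch assumes that no earlier value inside a slide contains the literal text `title = "`; if the data is available as a parsed dictionary, reading the key directly is more robust than any regex.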

answered 2013-06-19T21:42:05.367