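I can't seem to get the regex right when trying to grab the phrases between quotes. For example, the bold parts (note: the input has more text before and after):

"I can quite understand your thinking so." **I said.** "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"

"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"**--I picked up the morning paper from the ground--**"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."

I'm trying to get the text before and after each quote, but I can't get the desired output. There must be some way to group the regex so that I can capture the string between two quotes as well as the two surrounding quotes.

Tried: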

import re

def get_quotes(paragraph):
    quote_rx = r'''([""])(?:(?=(\\?))\2.)*?\1'''
    return [i.group(0) for i in \
           re.finditer(quote_rx, paragraph, re.S)]

def get_said(paragraph, quote):
    quote_start = paragraph.index(quote)
    quote_end = quote_start + len(quote)
    before = paragraph[:quote_start]
    after = paragraph[quote_end:]
    return before, after


paragraphs = ['''I smiled and shook my head. "I can quite understand your thinking so." I said. "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"--I picked up the morning paper from the ground--"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."''', 
'''Such was the remarkable narrative to which I listened on that April evening -- a narrative which would have been utterly incredible to me had it not been confirmed by the actual sight of the tall, spare figure and the keen, eager face, which I had never thought to see again. In some manner he had learned of my own sad bereavement, and his sympathy was shown in his manner rather than in his words. "Work is the best antidote to sorrow, my dear Watson," said he, "and I have a piece of work for us both to-night which, if we can bring it to a successful conclusion, will in itself justify a man's life on this planet." In vain I begged him to tell me more. "You will hear and see enough before morning," he answered. "We have three years of the past to discuss. Let that suffice until half-past nine, when we start upon the notable adventure of the empty house."''']

for p in paragraphs:
    for quote in get_quotes(p):
        # Text before and after the first occurrence of this quote
        before, after = get_said(p, quote)
        print(before)
        print(after)
        print()

Desired output:

in-btw: I said.
quotes: ["I can quite understand your thinking so.","Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"]
section: "I can quite understand your thinking so." **I said.** "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"


in-btw: --I picked up the morning paper from the ground--
quotes: ['''"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"''', '''"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."''']
section: "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"**--I picked up the morning paper from the ground--**"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."

1 Answer


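It's simple. The regex you need is r'^("[^"]+")([^"]+)("[^"]+")': its three groups capture the first quote, the text between the quotes, and the second quote. Because of the ^ anchor, each segment has to start with a quote.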

import re

s = """
"I can quite understand your thinking so." I said. "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"

"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"--I picked up the morning paper from the ground--"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."
"""

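for segment in s.splitlines():
    if not segment:
        continue
    match = re.match(r'^("[^"]+")([^"]+)("[^"]+")', segment)
    if match is None:
        # Skip segments that do not start with quote, text, quote
        continue
    first, said, second = match.groups()
    print(first)
    print(said)
    print(second)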

>>> 
"I can quite understand your thinking so."
 I said. 
"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"
"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"
--I picked up the morning paper from the ground--
"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."
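
The regex above expects each segment to begin with a quote, while the paragraphs in the question have narration before and after. One possible way to handle a whole paragraph, offered only as a sketch, is to find every quote with re.finditer and then pair up neighbouring matches. The names QUOTE_RX and quote_sections are made up for this example, and it assumes straight double quotes that are balanced within the paragraph.

```python
import re

# Assumption: quotes use straight double quotes and are balanced.
QUOTE_RX = re.compile(r'"[^"]+"')


def quote_sections(paragraph):
    """Yield (first_quote, in_between, second_quote) for each pair of neighbouring quotes."""
    matches = list(QUOTE_RX.finditer(paragraph))
    # Pair every quote with the next one and slice out the text between them.
    for left, right in zip(matches, matches[1:]):
        between = paragraph[left.end():right.start()]
        yield left.group(0), between, right.group(0)


# First paragraph from the question, unchanged.
paragraph = '''I smiled and shook my head. "I can quite understand your thinking so." I said. "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"--I picked up the morning paper from the ground--"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."'''

for first, between, second in quote_sections(paragraph):
    print('in-btw:', between.strip())
    print('quotes:', [first, second])
    print('section:', first + between + second)
    print()
```

Slicing with the match positions avoids paragraph.index(quote), which only ever finds the first occurrence of a repeated quote. Note that every neighbouring pair is reported, so a long stretch of narration between two quotes (as in the second paragraph of the question) also shows up as an "in-btw" and may need filtering.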
Answered 2013-10-21T12:23:06.147