5

I split a string into words and punctuation and then tried to put it back together with join(), but join() also puts a space between each word and the punctuation mark that follows it.

b = ['Hello', ',', 'who', 'are', 'you', '?']
c = " ".join(b)

But that returns:
c = 'Hello , who are you ?'

and I want:
c = 'Hello, who are you?'

4 Answers

5

You can attach the punctuation to the preceding word first:

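def join_punctuation(seq, characters='.,;?!'):
    """Yield the items of seq with each punctuation token appended to the item before it."""
    characters = set(characters)
    seq = iter(seq)
    # Use a sentinel so an empty input yields nothing instead of raising
    # RuntimeError (a StopIteration leaking out of a generator, Python 3.7+).
    sentinel = object()
    current = next(seq, sentinel)
    if current is sentinel:
        return

    for nxt in seq:
        if nxt in characters:
            # Punctuation: glue it onto the pending word.
            current += nxt
        else:
            # A new word: emit the pending one and start collecting again.
            yield current
            current = nxt

    yield current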

c = ' '.join(join_punctuation(b))

The join_punctuation generator yields strings that already have any following punctuation attached:

>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> list(join_punctuation(b))
['Hello,', 'who', 'are', 'you?']
>>> ' '.join(join_punctuation(b))
'Hello, who are you?'
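
The same idea can also be written without a generator. The following is only a sketch under that assumption: attach_punctuation is a hypothetical name, not part of the answer above, and it builds a list instead of yielding items one at a time.

```python
def attach_punctuation(tokens, characters='.,;?!'):
    """Merge each punctuation token into the token before it (list-based sketch)."""
    characters = set(characters)
    merged = []
    for token in tokens:
        # Glue a punctuation token onto the previous element, if there is one.
        if token in characters and merged:
            merged[-1] += token
        else:
            merged.append(token)
    return merged

b = ['Hello', ',', 'who', 'are', 'you', '?']
print(' '.join(attach_punctuation(b)))  # Hello, who are you?

# Runs of punctuation tokens are all attached to the same word.
print(' '.join(attach_punctuation(['Wait', '.', '.', '.', 'what', '?', '!'])))  # Wait... what?!
```

Because it checks that merged is non-empty before appending, this version also copes with an empty list or with input that starts with a punctuation token.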
answered 2013-04-11T14:02:50.843
2

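Do this after you have the joined result. It is not a complete solution (it removes the space before any non-alphanumeric character, including opening brackets and quotes), but it works for this input...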

import re

c = re.sub(r' ([^A-Za-z0-9])', r'\1', c)

Output:

>>> import re
>>> c = 'Hello , who are you ?'
>>> c = re.sub(r' ([^A-Za-z0-9])', r'\1', c)
>>> c
'Hello, who are you?'
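
If the broad character class is a concern, one possible refinement is to list the closing punctuation marks explicitly. This is a sketch, not the answer's own code: tighten_punctuation is a hypothetical helper name, and the set of marks in the pattern is an assumption you would adjust to your text.

```python
import re

def tighten_punctuation(text):
    # Remove whitespace only when it directly precedes a closing punctuation mark.
    return re.sub(r'\s+([.,;:?!])', r'\1', text)

print(tighten_punctuation('Hello , who are you ?'))        # Hello, who are you?
print(tighten_punctuation('He said ( quietly ) hello .'))  # He said ( quietly ) hello.
```

With the original pattern the second string would become 'He said( quietly) hello.', because a space followed by a bracket also matches [^A-Za-z0-9].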
answered 2013-04-11T14:03:03.867
2

Maybe something like this:

>>> from string import punctuation
>>> punc = set(punctuation) # or whatever special chars you want
>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> ''.join(w if set(w) <= punc else ' '+w for w in b).lstrip()
'Hello, who are you?'

This adds a space before every word in b that does not consist entirely of punctuation.
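
The test set(w) <= punc is a subset comparison: it is true when every character of w is a punctuation character. A small sketch of how it behaves, where is_punctuation is a hypothetical helper name introduced only for illustration:

```python
from string import punctuation

punc = set(punctuation)

def is_punctuation(token):
    # True when every character of token is in punc.
    # An empty string also passes, since the empty set is a subset of any set.
    return set(token) <= punc

print(is_punctuation('?'))      # True
print(is_punctuation('...'))    # True
print(is_punctuation("don't"))  # False

b = ['Hello', ',', 'who', 'are', 'you', '?']
print(''.join(w if is_punctuation(w) else ' ' + w for w in b).lstrip())  # Hello, who are you?
```

Multi-character tokens such as '...' count as punctuation here, while a word that merely contains an apostrophe does not.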

answered 2013-04-11T14:07:52.783
-1

How about this? Note that it only handles the comma; the space before the question mark remains.

c = " ".join(b).replace(" ,", ",")
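
A single replace() call only covers one punctuation mark. As a sketch of how the same approach could be extended, the loop below applies replace() once per mark; the string of marks is an assumption, not something given in the answer.

```python
b = ['Hello', ',', 'who', 'are', 'you', '?']
c = " ".join(b)

# Remove the space in front of each punctuation mark, one mark at a time.
for mark in ".,;?!":
    c = c.replace(" " + mark, mark)

print(c)  # Hello, who are you?
```

This stays simple for a handful of marks, though it scans the string once per mark.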
answered 2013-04-11T14:01:58.667