
OK, so I have the following small function:

def swap(inp):
    inp = inp.split()
    out = ""

    for item in inp:
        ind  = inp.index(item)
        item = item.replace("i am",    "you are")
        item = item.replace("you are", "I am")
        item = item.replace("i'm",     "you're")
        item = item.replace("you're",  "I'm")
        item = item.replace("my",      "your")
        item = item.replace("your",    "my")
        item = item.replace("you",     "I")
        item = item.replace("my",      "your")
        item = item.replace("i",       "you")
        inp[ind] = item

    for item in inp:
        ind  = inp.index(item)
        item = item + " "
        inp[ind] = item

    return out.join(inp)

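While it isn't particularly efficient, it gets the job done for shorter sentences. Basically, all it does is swap the perspective of pronouns and similar words. This works fine when I throw a string like "I love you" at it (it returns "you love me"), but when I throw something like this at it: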

you love your version of my couch because I love you, and you're a couch-lover.

I get:

I love your versyouon of your couch because I love I, and I'm a couch-lover. 

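I'm confused as to why this is happening. I explicitly split the string into a list to avoid this. Why does it match part of a list item, rather than only an exact match?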

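Also, slightly off-topic, to avoid having to post another very similar question: if a solution to this breaks the function's current behaviour, what happens to commas, full stops, and other punctuation?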

It made some very surprising mistakes. My expected output is:

I love my version of your couch because you love I, and I'm a couch-lover.

The reason I structured it this way is that I eventually want to be able to replace the item.replace(x, y) arguments with words from a database.


3 Answers

2

For this specific problem you need regular expressions. Basically, along the lines of:

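import re

table = [
    ("I am", "you are"),
    ("I'm",  "you're"),
    ("my",   "your"),
    ("I",    "you"),
]

def swap(s):
    # Build a two-way mapping so every pair is swapped in both directions.
    dct = dict(table)
    dct.update((y, x) for x, y in table)
    # Try longer phrases first so "I am" wins over "I", whatever the dict order.
    keys = sorted(dct, key=len, reverse=True)
    # \b forces whole-word matches; re.escape keeps the keys literal.
    pattern = '|'.join(r'(?:\b%s\b)' % re.escape(x) for x in keys)
    # One pass over the string, so a swapped word is never swapped back.
    return re.sub(pattern, lambda m: dct[m.group(0)], s)

print(swap("you love your version of my couch because I love you, and you're a couch-lover."))
# I love my version of your couch because you love I, and I'm a couch-lover.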

But in general, natural language processing by means of string/re functions is naive at best (note "you love I" above).
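
The question mentions eventually pulling the word pairs from a database. Here is a minimal sketch of how the table above might be built from rows instead of being hard-coded. It assumes a hypothetical SQLite table named swaps with columns src and dst; none of these names come from the original post, and an in-memory database stands in for the real one.

```python
import re
import sqlite3

def load_pairs(conn):
    # "swaps", "src" and "dst" are hypothetical names chosen for this sketch.
    return conn.execute("SELECT src, dst FROM swaps").fetchall()

def build_swapper(pairs):
    # Two-way mapping: each pair is swapped in both directions.
    mapping = dict(pairs)
    mapping.update((dst, src) for src, dst in pairs)
    # Longest keys first so "I am" is tried before "I".
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(r"\b%s\b" % re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)

# In-memory database standing in for the real one (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE swaps (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO swaps VALUES (?, ?)",
    [("I am", "you are"), ("I'm", "you're"), ("my", "your"), ("I", "you")],
)

swap_from_db = build_swapper(load_pairs(conn))
print(swap_from_db("I am sure you love my couch"))
```

Compiling the pattern once and returning a closure avoids rebuilding the regular expression on every call, which may matter if the table grows.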

answered 2012-12-16T11:06:21.487
1

Here is some simple code:

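def swap(inp):
    inp = inp.split()
    out = []
    # d1[i] is replaced by d2[i]. After split() every item is a single word,
    # so the two-word entries ('i am', 'you are') can never match.
    d1 = ['i am', 'you are', 'i\'m', 'you\'re', 'my', 'your', 'I', 'you']
    d2 = ['you are', 'I am', 'you\'re', 'I\'m', 'your', 'my', 'you', 'I']
    for item in inp:
        # Compare without trailing punctuation, then put it back afterwards.
        itm = item.rstrip(',.')
        if itm not in d1:
            out.append(item)
        else:
            out.append(d2[d1.index(itm)] + item[len(itm):])
    return ' '.join(out)

print(swap('you love your version of my couch because I love you, and you\'re a couch-lover.'))
# I love my version of your couch because you love I, and I'm a couch-lover.
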
answered 2012-12-16T11:49:24.617
0

The problem is that both index() and replace() work on substrings (in your case, sub-words).
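
A quick illustration using words from the question. str.replace() has no notion of word boundaries, chained replacements can undo each other, and list.index() only ever reports the first item that compares equal:

```python
# str.replace() replaces every occurrence of the substring, even inside a word.
word = "version"
replaced = word.replace("i", "you")
print(replaced)                      # versyouon

# Chained replacements can cancel out: "my" -> "your", then "your" -> "my".
item = "my"
item = item.replace("my", "your")
item = item.replace("your", "my")
print(item)                          # my

# list.index() returns the position of the first equal item only,
# so a repeated word always maps back to its first occurrence.
words = "you love you".split()
first = words.index("you")
print(first)                         # 0
```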

Take a look at my answer to another question: String replacement with dictionary, complications with punctuation

The code in that answer can be used to solve your problem.
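
The linked answer is not reproduced here. As a rough sketch of the general idea only (not that answer's code), one option is to replace whole tokens in a single pass and leave punctuation untouched. The name swap_tokens and the PAIRS table are made up for this example.

```python
import re

# Hypothetical lookup table: single words only, swapped in both directions.
PAIRS = {"I": "you", "you": "I", "my": "your", "your": "my",
         "I'm": "you're", "you're": "I'm"}

def swap_tokens(text):
    # [\w']+ matches a run of letters, digits, underscores or apostrophes,
    # so commas, full stops and hyphens are never part of a token.
    return re.sub(r"[\w']+", lambda m: PAIRS.get(m.group(0), m.group(0)), text)

result = swap_tokens("you love your version of my couch because I love you, and you're a couch-lover.")
print(result)
# I love my version of your couch because you love I, and I'm a couch-lover.
```

Because each token is looked up exactly once, a word that has already been swapped is never swapped back, and "version" is left alone since it is not a key in the table.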

answered 2012-12-16T10:54:30.617