I've run into a logic problem.

I have two strings declared as follows:

fruits = "banana grapes apple"
vegetables = "potatoes cucumber carrot"

Now I have some sentences, and in each one I need to find the word that comes immediately before the pattern <vegetables> <fruits>.

I ate carrot grapes ice cream for dessert.

Answer: ate

Dad and mom brought banana cucumber and milk.

Answer: brought

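My idea is to split the sentence into an array of words and then look for the sequence. I can split the sentence, but matching the sequence is where I'm stuck.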

wd = sentence.strip().split()
for x in wd:
    pass  # now I will have to look for the sequence

Now I need to find the word that comes before that pattern.


3 Answers


You're using the wrong data structure here. Convert the fruits and vegetables to sets, and the problem becomes easy to solve:

>>> fruits = set("banana grapes apple".split())
>>> vegetables = set("potatoes cucumber carrot".split())
>>> fruits_vegs = fruits | vegetables
>>> from string import punctuation
>>> def solve(text):
...     spl = text.split()
...     # pair each word with the word that follows it
...     # (on Python 2, itertools.izip avoids building an intermediate list)
...     for x, y in zip(spl, spl[1:]):
...         # strip punctuation marks from both ends of each word
...         x, y = x.strip(punctuation), y.strip(punctuation)
...         if y in fruits_vegs and x not in fruits_vegs:
...             return x
...
>>> solve('I ate carrot grapes ice cream for dessert.')
'ate'
>>> solve('Dad and mom brought banana cucumber and milk.')
'brought'
>>> solve('banana cucumber and carrot.')
'and'
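
If the match has to be the two-word sequence from the question rather than any single fruit or vegetable, one possible sketch is to slide a three-word window over the sentence. The helper name `word_before_pair` is hypothetical, and accepting either order (vegetable then fruit, or fruit then vegetable, as in the second example sentence) is an assumption on my part, not something the question states outright.

```python
from string import punctuation

FRUITS = set("banana grapes apple".split())
VEGETABLES = set("potatoes cucumber carrot".split())


def word_before_pair(text):
    """Return the word right before the first fruit/vegetable pair, or None."""
    # Strip punctuation from both ends of every token before comparing.
    words = [w.strip(punctuation) for w in text.split()]
    # Slide a window of three consecutive words: (candidate, first, second).
    for prev, first, second in zip(words, words[1:], words[2:]):
        veg_then_fruit = first in VEGETABLES and second in FRUITS
        fruit_then_veg = first in FRUITS and second in VEGETABLES
        if veg_then_fruit or fruit_then_veg:  # assumed: either order counts
            return prev
    return None


print(word_before_pair("I ate carrot grapes ice cream for dessert."))
print(word_before_pair("Dad and mom brought banana cucumber and milk."))
```

Unlike `solve`, this version returns `None` for a sentence such as 'banana cucumber and carrot.', because no word precedes a complete pair there.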
answered 2013-08-13T03:37:42.503
fruits = "banana grapes apple".split(" ")
vegetables = "potatoes cucumber carrot".split(" ")

sentence = 'Dad and mom brought banana cucumber and milk.'

wd = sentence.split(' ')
for i, x in enumerate(wd):
    if (x in fruits or x in vegetables) and i > 0:
        print(wd[i-1])
        break
answered 2013-08-13T03:33:11.483

You can do this with a regular expression:

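import re

fruits = "banana grapes apple"
vegetables = "potatoes cucumber carrot"
sentence = 'I ate carrot grapes ice cream for dessert.'

def to_group(l):
    ''' make a non-capturing regex group from a string of space-separated words '''
    return '(?:%s)' % ('|'.join(l.split()))

pattern = r'(\w+) %s %s' % (to_group(vegetables), to_group(fruits))
print(re.findall(pattern, sentence))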
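
A slightly more defensive take on the same regex idea might look like the sketch below. The names `alternation`, `PAIR_RE` and `words_before_pairs` are hypothetical, and matching both orders (vegetable then fruit, and fruit then vegetable) is an assumption based on the second example sentence in the question. It uses `re.escape` in case a word ever contains a regex metacharacter, and `\b` so that a listed word only matches as a whole word.

```python
import re

fruits = "banana grapes apple"
vegetables = "potatoes cucumber carrot"


def alternation(words):
    """Build a non-capturing group such as (?:a|b|c) from space-separated words."""
    return "(?:%s)" % "|".join(re.escape(w) for w in words.split())


veg, fruit = alternation(vegetables), alternation(fruits)
# Capture one word, then require vegetable+fruit or fruit+vegetable after it.
# The trailing \b keeps "grapes" from matching inside a longer word
# such as "grapeseed".
PAIR_RE = re.compile(
    r"\b(\w+)\s+(?:%s\s+%s|%s\s+%s)\b" % (veg, fruit, fruit, veg)
)


def words_before_pairs(sentence):
    """Return every word that directly precedes a matching pair."""
    return PAIR_RE.findall(sentence)


print(words_before_pairs("I ate carrot grapes ice cream for dessert."))
print(words_before_pairs("Dad and mom brought banana cucumber and milk."))
```

Compiling the pattern once is only worthwhile if many sentences are searched; for a one-off search, calling `re.findall` directly as in the snippet above is just as reasonable.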
answered 2013-08-13T03:33:54.773