When run on input with more than one line, the following code gives me the error "no such attribute _ParseResults__tokdict".

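With a single-line file there is no error. If I comment out either the second or the third line shown here, I don't get the error either, no matter how long the file is.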

for line in input:
    final = delimitedList(expr).parseString(line)
    notid = delimitedList(notid).parseString(line)
    dash_tags = ', '.join(format_tree(notid))

    print final.lineId + ": " + dash_tags

Does anyone know what is going on here?

EDIT: As suggested, I am adding the full code so that others can reproduce the error.

from pyparsing import *

#first are the basic elements of the expression

#number at the beginning of the line, unique for each line
#top-level category for a sentiment
#semicolon should eventually become a line break

lineId = Word(nums)
topicString = Word(alphanums+'-'+' '+"'")
semicolon = Literal(';')

#call variable early to allow for recursion
#recursive function allowing for a line id at first, then the topic,
#then any subtopics, and so on. Finally, optional semicolon and repeat.
#set results name lineId.lineId here
expr = Forward()
expr << Optional(lineId.setResultsName("lineId")) + topicString.setResultsName("topicString") + \
Optional(nestedExpr(content=delimitedList(expr))).setResultsName("parenthetical") + \
Optional(Suppress(semicolon).setResultsName("semicolon") + expr.setResultsName("subsequentlines"))

notid = Suppress(lineId) + topicString + \
Optional(nestedExpr(content=delimitedList(expr))) + \
Optional(Suppress(semicolon) + expr)



#naming the parenthetical portion for independent reference later
parenthetical = nestedExpr(content=delimitedList(expr))


#open files for read and write
input = open('parserinput.txt')
output = open('parseroutput.txt', 'w')

#defining functions

#takes nested list output of parser grammar and translates it into
#strings suited for the final output
def format_tree(tree):
    prefix = ''
    for node in tree:
        if isinstance(node, basestring):
            prefix = node
            yield node
        else:
            for elt in format_tree(node):
                yield prefix + '_' + elt

#function for passing tokens from setResultsName
def id_number(tokens):
    #print tokens.dump()
    lineId = tokens
    lineId["lineId"] = lineId.lineId

def topic_string(tokens):
    topicString = tokens
    topicString["topicString"] = topicString.topicString

def parenthetical_fun(tokens):
    parenthetical = tokens
    parenthetical["parenthetical"] = parenthetical.parenthetical

#function for splitting line at semicolon and appending numberId
#not currently in use
def split_and_prepend(tokens):
    return '\n' + final.lineId


#setting parse actions
lineId.setParseAction(id_number)
topicString.setParseAction(topic_string)
parenthetical.setParseAction(parenthetical)


#reads each line in the input file
#calls the grammar expressed in 'expr' and uses it to read the line and assign names to the tokens for later use
#calls the 'notid' variant to easily return the other elements in the line aside from the lineId
#applies the format tree function and joins the tokens in a comma-separated string
#prints the lineId + the tokens from that line
for line in input:
    final = delimitedList(expr).parseString(line)
    notid = delimitedList(notid).parseString(line)
    dash_tags = ', '.join(format_tree(notid))

    print final.lineId + ": " + dash_tags

The input file is a txt document containing the following two lines:

1768    dummy; data
1768    dummy data; price

1 Answer

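Reassigning notid breaks the second iteration, where it is passed to delimitedList. Your third line overwrites the notid expression defined earlier in the code with the parse results, so the grammar only works on the first iteration. Use a different name for the result of the notid parse.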
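
A minimal sketch of the renaming, assuming a much simpler grammar than the one in the question: the recursive expr is replaced by a flat semicolon-separated list, and the names line_id, topic, notid_tokens and parse_lines are hypothetical, chosen only for illustration. The point is that the parser expression and its parse result live under different names, so the expression is still intact on the second pass through the loop.

```python
from pyparsing import Suppress, Word, alphanums, delimitedList, nums

# Simplified stand-ins for the question's grammar (recursion omitted).
line_id = Word(nums)
topic = Word(alphanums + "-'")
expr = line_id.setResultsName("lineId") + delimitedList(topic, delim=";")
notid = Suppress(line_id) + delimitedList(topic, delim=";")


def parse_lines(lines):
    """Parse each line with both expressions without rebinding either one."""
    formatted = []
    for line in lines:
        final = expr.parseString(line)
        # Store the result under a new name; `notid` remains a parser expression.
        notid_tokens = notid.parseString(line)
        formatted.append(final.lineId + ": " + ", ".join(notid_tokens.asList()))
    return formatted


sample = ["1768 dummy; data", "1769 dummy; price"]
for entry in parse_lines(sample):
    print(entry)
```

In the original loop the same idea would apply to the line that assigns notid: keep the grammar under notid and bind the parsed tokens to some other name before passing them to format_tree.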

Answered 2012-04-29T00:05:24.817