https://github.com/peldszus/arg-microtexts/blob/master/corpus/en/micro_b001.xml
I only want to extract the information from this tag:
<arggraph id="micro_b001" topic_id="waste_separation" stance="pro">
That is, the values "micro_b001" and "waste_separation".
I want to save them in a list.
This is what I tried:
import os
import xml.etree.ElementTree as ET

myList = []
myEdgesList = []
# read the whole text from
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith('.xml'):
            with open(os.path.join(root, file), encoding="UTF-8") as content:
                tree = ET.parse(content)
                myList.append(tree)
The code above works. It gives me one parsed tree per file, for example:
<xml.etree.ElementTree.ElementTree at 0x21c893e34c0>,
But this next part does not look right:
for k in myList:
    arg = [e.attrib['stance'] for e in k.findall('.//arggraph')]
    print(arg)
The second snippet does not give me the values I need.
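
One possible explanation, assuming `<arggraph>` is the root element of each file (as the tag shown above suggests): in `xml.etree.ElementTree`, `findall('.//arggraph')` searches only below the root, so the root itself is never matched and the list comprehension stays empty. The snippet also reads `stance`, not `id` and `topic_id`. Below is a minimal sketch that reads the attributes directly from `getroot()`. The helper name `collect_arggraph_attrs`, its `keys` parameter, and the placeholder `<edu>` child in the sample string are made up for illustration and are not part of the original code.

```python
import os
import xml.etree.ElementTree as ET


def collect_arggraph_attrs(path, keys=("id", "topic_id")):
    """Walk `path` and return one list of attribute values per .xml file.

    Assumption: <arggraph> is the root element of each file.
    Files with a different root element are skipped.
    """
    rows = []
    for root_dir, _dirs, files in os.walk(path):
        for name in sorted(files):
            if not name.endswith(".xml"):
                continue
            tree = ET.parse(os.path.join(root_dir, name))
            root = tree.getroot()
            if root.tag != "arggraph":
                continue
            # .get() returns None instead of raising if an attribute is missing
            rows.append([root.get(k) for k in keys])
    return rows


# In-memory demonstration with the tag from the question.
# The <edu> child is an invented placeholder, not real corpus content.
sample = (
    '<arggraph id="micro_b001" topic_id="waste_separation" stance="pro">'
    '<edu id="e1">placeholder</edu>'
    '</arggraph>'
)
root = ET.fromstring(sample)
print(root.findall(".//arggraph"))  # searches descendants only, not the root itself
print([root.get("id"), root.get("topic_id")])
```

With the original variables, `collect_arggraph_attrs(path)` would return one `[id, topic_id]` pair per file. If the trees are already stored in `myList`, `[k.getroot().get('id') for k in myList]` should work the same way under the same assumption.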