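I have a long string containing sets of information that need to be separated and extracted for evaluation.

As a beginner programmer, I know about various parsing operations such as split(), append(), remove(), etc., but I'm struggling to come up with a logical way of combining them to extract the relevant data.

The long string: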

"<Sets X="s"><B s="1" e="2176" t="-2.0774E4" r="1" /><B s="2177" e="8982" t="-1.8597E4" r="1" /><B s="8983" e="10393" t="-150.22" r="1" /></Sets>"

contains 3 sets of data that need to be stored as float values:

[Set1] s=1 e=2176 t=-20774 r=1

[Set2] s=2177 e=8982 t=-18597 r=1

[Set3] s=8983 e=10393 t=-150.22 r=1

I would like to store each set of data as a list:

Set1 = [1,2176,-20774,1]
Set2 = [2177,8982,-18597,1]
Set3 = [8983,10393,-150.22,1]

Note: the number of sets may vary.

2 Answers

Use the built-in ElementTree library to extract the data from the XML:

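import xml.etree.ElementTree as ET
from operator import itemgetter

data = '<Sets X="s"><B s="1" e="2176" t="-2.0774E4" r="1" /><B s="2177" e="8982" t="-1.8597E4" r="1" /><B s="8983" e="10393" t="-150.22" r="1" /></Sets>'

# Parse the XML string; fromstring returns the root <Sets> element
tree = ET.fromstring(data)
for b in tree.findall('.//B'):
    # itemgetter(*'setr') fetches the 's', 'e', 't' and 'r' attributes in that order
    print(list(map(float, itemgetter(*'setr')(b.attrib))))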

Prints:

[1.0, 2176.0, -20774.0, 1.0]
[2177.0, 8982.0, -18597.0, 1.0]
[8983.0, 10393.0, -150.22, 1.0]
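
For readers unfamiliar with `itemgetter`, the loop above can be unpacked into more explicit steps. The following is a minimal sketch rather than part of the original answer; the helper name `parse_sets` and the `KEYS` tuple are hypothetical names chosen for illustration. `itemgetter(*'setr')` expands the string into the four keys 's', 'e', 't', 'r' and returns their values as a tuple, which is roughly what the list comprehension here does by hand:

```python
import xml.etree.ElementTree as ET

# Attribute names to extract, in output order (same order as 'setr').
KEYS = ('s', 'e', 't', 'r')

def parse_sets(xml_text):
    """Return one list of floats per <B> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # float() accepts scientific notation such as "-2.0774E4".
    return [[float(b.attrib[k]) for k in KEYS] for b in root.findall('.//B')]

data = '<Sets X="s"><B s="1" e="2176" t="-2.0774E4" r="1" /><B s="2177" e="8982" t="-1.8597E4" r="1" /><B s="8983" e="10393" t="-150.22" r="1" /></Sets>'

sets = parse_sets(data)
for row in sets:
    print(row)
```

Because the comprehension iterates over whatever `findall` returns, it should cope with any number of `<B>` elements, which matters here since the number of sets may vary.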
answered 2013-09-25T11:33:48.533

Note: this is an extension of the previous answer...

(Props to @alecxe and @Jon Clements)

To label each dataset and store the results in an easily accessible format:

import xml.etree.ElementTree as ET
import operator

data = '<Sets X="s"><B s="1" e="2176" t="-2.0774E4" r="1" /><B s="2177" e="8982" t="-1.8597E4" r="1" /><B s="8983" e="10393" t="-150.22" r="1" /></Sets>'

dataDictionary = {}

# Parse the XML string; fromstring returns the root <Sets> element
tree = ET.fromstring(data)
setNumber = 0

for b in tree.findall('.//B'):
    setNumber = setNumber + 1
    # Convert the s, e, t, r attributes to floats; list() is needed on Python 3
    dataSet = list(map(float, operator.itemgetter(*'setr')(b.attrib)))
    dataDictionary[setNumber] = dataSet
    print("This is dataset " + str(setNumber))
    print(dataSet)

print("")
print("This is the Dictionary of datasets")
print(dataDictionary)

This produces the following output, which is easy to use in later operations :)

This is dataset 1
[1.0, 2176.0, -20774.0, 1.0]
This is dataset 2
[2177.0, 8982.0, -18597.0, 1.0]
This is dataset 3
[8983.0, 10393.0, -150.22, 1.0]

This is the Dictionary of datasets
{1: [1.0, 2176.0, -20774.0, 1.0], 2: [2177.0, 8982.0, -18597.0, 1.0], 3: [8983.0, 10393.0, -150.22, 1.0]}
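
As an illustration of what such later operations might look like, here is a short sketch. It assumes the dictionary has the shape shown above (keys are 1-based set numbers, values are `[s, e, t, r]` lists); the names `data_dictionary`, `get_fields` and `t_values` are made up for this example. It also uses `enumerate(..., start=1)` as one possible alternative to the manual counter:

```python
import xml.etree.ElementTree as ET
from operator import itemgetter

data = '<Sets X="s"><B s="1" e="2176" t="-2.0774E4" r="1" /><B s="2177" e="8982" t="-1.8597E4" r="1" /><B s="8983" e="10393" t="-150.22" r="1" /></Sets>'

root = ET.fromstring(data)
get_fields = itemgetter('s', 'e', 't', 'r')

# enumerate(..., start=1) numbers the sets 1, 2, 3, ... without a manual counter.
data_dictionary = {
    number: [float(v) for v in get_fields(b.attrib)]
    for number, b in enumerate(root.findall('.//B'), start=1)
}

# Look up one set by its number and unpack its four fields.
s, e, t, r = data_dictionary[2]
print(s, e, t, r)

# Collect one field across all sets, e.g. every t value in set order.
t_values = [data_dictionary[n][2] for n in sorted(data_dictionary)]
print(t_values)
```

Keying by set number keeps lookups simple when the number of sets varies; a plain list of lists would work equally well if only the order matters.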
answered 2013-09-25T14:54:53.753