I have an XML file with several thousand records in the following format:
<custs>
<record cust_ID="B123456@Y1996" l_name="Jungle" f_name="George" m_name="OfThe" city="Fairbanks" zip="00010" current="1" />
<record cust_ID="Q975697@Z2000" l_name="Freely" f_name="I" m_name="P" city="Yellow River" zip="03010" current="1" />
<record cust_ID="M7803@J2323" l_name="Jungle" f_name="Jim" m_name="" city="Fallen Arches" zip="07008" current="0" />
</custs>
# (I know it's not normalized. This is just sample data)
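How can I convert this to a CSV or tab-delimited file? I know I could hard-code it in Python with re.compile() statements, but there has to be something simpler and more portable across different XML file layouts.
I found a couple of threads here about attributes ("Beautifulsoup unable to extract data using attrs=class" and "Extracting an attribute value with beautifulsoup"), and they got me almost there: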
    # Python 3.30
    #
    from bs4 import BeautifulSoup
    import fileinput

    Input = open("C:/Python/XML Tut/MinGrp.xml", encoding = "utf-8", errors = "backslashreplace")
    OutFile = open('C:/Python/XML Tut/MinGrp_Out.ttxt', 'w', encoding = "utf-8", errors = "backslashreplace")

    soup = BeautifulSoup(Input, features="xml")

    results = soup.findAll('custs', attrs={})

    # output = results [0]#[0]

    for each_tag in results:
        cust_attrb_value = results[0]
        # print (cust_attrb_value)
        OutFile.write(cust_attrb_value)

    OutFile.close()
What is the next (last?) step?
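
For reference, here is one possible direction, offered as a minimal sketch rather than a definitive answer. It swaps BeautifulSoup for the standard-library xml.etree.ElementTree and csv modules, and it assumes every row of interest is a <record> element whose data lives entirely in attributes, as in the sample above. The function name records_to_delimited, its parameters, and the SAMPLE constant are hypothetical names made up for this illustration. Column names are collected from the attributes themselves, so nothing about the layout is hard-coded.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample data copied from the question, so the sketch runs without any files.
SAMPLE = """<custs>
<record cust_ID="B123456@Y1996" l_name="Jungle" f_name="George" m_name="OfThe" city="Fairbanks" zip="00010" current="1" />
<record cust_ID="Q975697@Z2000" l_name="Freely" f_name="I" m_name="P" city="Yellow River" zip="03010" current="1" />
<record cust_ID="M7803@J2323" l_name="Jungle" f_name="Jim" m_name="" city="Fallen Arches" zip="07008" current="0" />
</custs>"""


def records_to_delimited(xml_text, out, tag="record", delimiter=","):
    """Write the attributes of every <tag> element to `out` as delimited text.

    Hypothetical helper: `out` is any writable text file object.
    Returns the number of data rows written.
    """
    root = ET.fromstring(xml_text)
    # One dict of attribute name -> value per matching element.
    rows = [dict(el.attrib) for el in root.iter(tag)]

    # Build the header from the attribute names, in first-seen order,
    # so records with missing or extra attributes still line up.
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)

    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        delimiter=delimiter,
        restval="",            # fill attributes a record lacks with an empty field
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Tab-delimited output into an in-memory buffer.
buf = io.StringIO()
count = records_to_delimited(SAMPLE, buf, delimiter="\t")
print(buf.getvalue())
```

To work with real files, the same function could be given the text read from the XML file and an output file opened with newline="" (which the csv module recommends), for example open(path, "w", newline="", encoding="utf-8"). For very large inputs, ET.iterparse may be a better fit than loading the whole document at once.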