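Bear with me, as I'm very new to Python (and to the larger programming community), but I've been getting guidance from a coworker who has more experience than I do. We're trying to write a Python script that reads in an XML file, picks out certain parts of the data, edits some variable values, and then reassembles the XML. The issue we're running into is the way the data is formatted when it's written back out of the new doc using toprettyxml().
Basically, the top half of the file has a bunch of elements that we don't need to modify at all, so we're trying to grab those elements in their entirety and append them to the root when we reassemble. Some of the lower elements on the same level of the same file are picked apart into smaller items in memory and reassembled at the lowest child level. The ones that are manually reassembled and appended work fine.
So here are roughly the relevant bits of code: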
import xml.dom.minidom
from xml.dom.minidom import Document

def __handleElemsWithAtrributes(elem):
    # Returns an empty element with all attributes of the source element.
    # `elem` is a NodeList; only its first item is used.
    tmpDoc = Document()
    result = tmpDoc.createElement(elem.item(0).tagName)
    attr_map = elem.item(0).attributes
    for i in range(attr_map.length):
        result.setAttribute(attr_map.item(i).name, attr_map.item(i).value)
    return result

def __getWholeElement(elems):
    # Returns element with all attributes of source element and all contents.
    # Returns 0 when the NodeList is empty.
    if len(elems) == 0:
        return 0
    temp = Document()
    for e in elems:
        result = temp.createElement(e.tagName)
        attr_map = e.attributes
        for i in range(attr_map.length):
            result.setAttribute(attr_map.item(i).name, attr_map.item(i).value)
        # Rebinds result to the original parsed node, so the copy built
        # above is discarded and the node keeps all of its parsed children.
        result = e
    return result

def __init__():
    ## A bunch of other stuff I'm leaving out...
    ## (pathToFile and the output file object o are set up there)
    f = xml.dom.minidom.parse(pathToFile)
    doc = Document()
    modules = f.getElementsByTagName("Module")
    descriptions = f.getElementsByTagName("Description")
    steptree = f.getElementsByTagName("StepTree")
    reference = f.getElementsByTagName("LessonReference")
    mod_val = __handleElemsWithAtrributes(modules)
    des_val = __getWholeElement(descriptions)
    step_val = __getWholeElement(steptree)
    ref_val = __getWholeElement(reference)
    if des_val != 0 and mod_val != 0 and step_val != 0 and ref_val != 0:
        mod_val.appendChild(des_val)
        mod_val.appendChild(step_val)
        mod_val.appendChild(ref_val)
    doc.appendChild(mod_val)
    o.write(doc.toprettyxml())
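No, the tabbing here isn't preserved exactly, since I copied from a few different areas, but I'm sure you get the gist.
Basically, the input I'm using looks something like this: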
<Module aatribute="" attribte2="" attribute3="" >
    <Description>
        <Title>SomeTitle</Title>
        <Objective>An objective</Objective>
        <Action>
            <Familiarize>familiarize text</Familiarize>
        </Action>
        <Condition>
            <Familiarize>Condition text</Familiarize>
        </Condition>
        <Standard>
            <Familiarize>Standard text</Familiarize>
        </Standard>
        <PerformanceMeasures>
            <Measure>COL text</Measure>
        </PerformanceMeasures>
        <TMReferences>
            <Reference>Reference text</Reference>
        </TMReferences>
    </Description>
Then when it's reassembled, it looks like this, only with extra blank lines throughout:
<Module aatribute="" attribte2="" attribute3="" >
    <Description>
        <Title>SomeTitle</Title>
        <Objective>An objective</Objective>
        <Action>
            <Familiarize>familiarize text</Familiarize>
        </Action>
        <Condition>
            <Familiarize>Condition text</Familiarize>
        </Condition>
        <Standard>
            <Familiarize>Standard text</Familiarize>
        </Standard>
        <PerformanceMeasures>
            <Measure>COL text</Measure>
        </PerformanceMeasures>
        <TMReferences>
            <Reference>Reference text</Reference>
        </TMReferences>
    </Description>
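For what it's worth, the symptom seems to reproduce in just a few lines. This is only a minimal sketch, and it assumes the blank lines come from the whitespace-only text nodes that minidom keeps when it parses an indented file; the SAMPLE string and the variable names are made up for illustration and are not from our real script.

```python
from xml.dom import minidom

# Hypothetical, cut-down stand-in for the real input file.
SAMPLE = """<Module>
  <Description>
    <Title>SomeTitle</Title>
  </Description>
</Module>"""

parsed = minidom.parseString(SAMPLE)
description = parsed.getElementsByTagName("Description")[0]

# The parser keeps the newline/indent runs between tags as Text nodes.
whitespace_nodes = [
    node for node in description.childNodes
    if node.nodeType == node.TEXT_NODE and not node.data.strip()
]
print("whitespace-only children of <Description>:", len(whitespace_nodes))

# toprettyxml() adds its own indent and newline around every node,
# including those whitespace-only Text nodes, which yields empty lines.
pretty = parsed.toprettyxml(indent="  ")
blank_lines = [line for line in pretty.splitlines() if not line.strip()]
print("blank lines in pretty output:", len(blank_lines))
```

If that assumption holds, the elements we append wholesale carry their original whitespace with them, which would explain why only those parts come out with blank lines while the manually rebuilt ones look fine.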
How do I get it to stop producing all the extra blank lines? Any ideas?
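One approach that might work, sketched here under the assumption that the extra lines come from whitespace-only text nodes carried over from the parsed file, is to remove those nodes before calling toprettyxml(). The helper name strip_whitespace_nodes and the SAMPLE string are hypothetical, not part of minidom or of the script above.

```python
from xml.dom import minidom

def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy because removeChild() mutates childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
            child.unlink()
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace_nodes(child)

# Hypothetical, cut-down stand-in for the real input file.
SAMPLE = """<Module>
  <Description>
    <Title>SomeTitle</Title>
  </Description>
</Module>"""

doc = minidom.parseString(SAMPLE)
strip_whitespace_nodes(doc.documentElement)

# With the leftover whitespace gone, only toprettyxml()'s own
# indentation and newlines remain in the output.
cleaned = doc.toprettyxml(indent="  ")
print(cleaned)
```

In a script like the one above, the same helper could be run on each element returned by __getWholeElement before it is appended. Note that this drops every whitespace-only text node, so it is only safe when that whitespace carries no meaning in the document.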