python - 如何强制 ElementTree 每次更改都写入磁盘？

Question

我必须处理从大型数据库创建的巨大 XML 文件。

从数据库结构中我生成了一个 XML 树，但它最终吃掉了我所有的内存（现在超过 10GB），所以这个过程永远不会结束，因为系统无法处理任何新的查询。

我认为解决方案是避免将所有 XML 结构保存到内存中，并在我添加新内容时将其直接转储到磁盘中。

这真的是我能做到的吗？如何？

谢谢！

score 1 · Accepted Answer

我不确定您的代码究竟是如何工作的，但这绝对是可以做到的。根据其结构，您可以创建时间戳并将传入数据与旧数据进行比较，或者您可以扫描文件以查看当前 XML 是否已添加。之后（假设它的新代码），您可以执行以下操作：

path = "path/"
name = "fileName"

xmlRoot = Element("root")#create a root element for the xml structure
xmlSub = SubElement(xmlRoot,"sub")
subName = SubElement(xmlCard,"name")
subName.text = "element text"

saveName = path + name + ".xml" #constructs location of xml file (path/fileName.xml)
tree = ElementTree(xmlRoot) #compiles the tree
tree.append(saveName) #appends to specified file

这将在名为“fileName.xml”的文件夹“path”中的文档中输出以下内容

<root>
    <sub>
        <name>element text</name>
    </sub>
</root>

如果要扫描 XML 文档中的对象，请执行以下操作：

xml = parse("path/fileName.xml")
nameList = xml.findall("sub/name") #find all objects in <name> brackets
for i in nameList:
    i.text #convert item in the list to a readable string
    #do comparison here

我希望这有帮助！快乐编码！

python - 如何强制 ElementTree 每次更改都写入磁盘？

1 回答 1

Related

Reference