I am successfully retrieving the odt xml file in Python, but I have no idea how to pull data out of the xml file.

Are there any techniques for pulling data out of the odt xml file?

Here is my code for extracting the odt xml file:

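#!/usr/bin/env python

import sys, zipfile

# Two arguments are needed: the input .odt and the output .xml path.
if len(sys.argv) < 3:
    print("usage: input.odt output.xml")
    sys.exit(1)

# An .odt file is a zip archive; the document body is stored in content.xml.
myfile = zipfile.ZipFile(sys.argv[1])
listoffiles = myfile.infolist()
for s in listoffiles:
    if s.orig_filename == 'content.xml':
        content = myfile.read(s.orig_filename)
        # Binary mode keeps the XML bytes unchanged on every platform.
        fd = open(sys.argv[2], 'wb')
        fd.write(content)
        fd.close()
myfile.close()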

1 Answer

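You asked whether there are techniques for pulling data out of the odt xml file. I assume you are curious about parsing the contents of this xml file. If so, I recommend BeautifulSoup. BS is meant for HTML parsing, but it can be told to accept XML data: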

BS4:

from bs4 import BeautifulSoup

soup = BeautifulSoup(<xml file contents>, 'xml')

BeautifulSoup 3:

from BeautifulSoup import BeautifulStoneSoup

soup = BeautifulStoneSoup(<xml file contents>)

From here you can parse the data according to the documentation (linked above).
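If you would rather not install anything, the same idea can be sketched with the standard library's xml.etree.ElementTree instead of BeautifulSoup. This is only a rough sketch: the function names are made up for illustration, and it assumes the text you want sits in text:p paragraph elements of content.xml, which is the usual case for a text document but may not cover headings, tables, or lists.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace that ODF uses for text elements such as <text:p> and <text:span>.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def paragraphs_from_content_xml(xml_bytes):
    """Return the text of each <text:p> element in a content.xml payload.

    Hypothetical helper, not part of any library.
    """
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for p in root.iter("{%s}p" % TEXT_NS):
        # itertext() also picks up text nested in children like <text:span>.
        paragraphs.append("".join(p.itertext()))
    return paragraphs


def paragraphs_from_odt(odt_path):
    """Read content.xml straight out of the .odt archive and parse it.

    Hypothetical helper; odt_path may be a path or a file-like object.
    """
    with zipfile.ZipFile(odt_path) as archive:
        return paragraphs_from_content_xml(archive.read("content.xml"))
```

With this in place, something like paragraphs_from_odt(sys.argv[1]) should give you a list of paragraph strings without writing content.xml to disk first. Other element types would need their own lookups under the same namespace.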

answered 2013-03-26T04:54:21.417