I have this gzipped XML file: http://cdon.com/xml_files/cdon_games_SE.xml.gz
According to the lxml documentation (http://lxml.de/parsing.html), lxml can parse gzipped XML files: "lxml can parse from a local file, an HTTP URL or an FTP URL. It also auto-detects and reads gzip-compressed XML files (.gz)."
This code:
import urllib
from lxml import etree

# urlopen returns a file-like object for the remote gzipped file
tree = urllib.urlopen('http://cdon.com/xml_files/cdon_games_SE.xml.gz')
parser = etree.XMLParser(recover=True)
tree = etree.parse(tree, parser)
tree = tree.xpath('//product')
Gives error:
tree = tree.xpath('//product')
File "lxml.etree.pyx", line 2038, in lxml.etree._ElementTree.xpath (src/lxml\lxml.etree.c:47529)
File "lxml.etree.pyx", line 1709, in lxml.etree._ElementTree._assertHasRoot (src/lxml\lxml.etree.c:44508)
AssertionError: ElementTree not initialized, missing root
What is wrong? Can't lxml parse gzipped XML files? If I save the file uncompressed (plain XML, without gzip) on the local server, it works.
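One thing that may be worth trying is decompressing the data yourself before handing it to lxml. This is only a sketch under assumptions: it assumes the gzip auto-detection quoted above does not apply when lxml is given an already-open file-like object (as urlopen returns), and it assumes Python 3 for gzip.decompress. The helper names decompress_to_file and parse_gzipped_xml are hypothetical, and the demo uses small in-memory data rather than the real feed.

```python
import gzip
import io


def decompress_to_file(raw_bytes):
    """Decompress gzip bytes and wrap the result in a file-like object."""
    return io.BytesIO(gzip.decompress(raw_bytes))


def parse_gzipped_xml(raw_bytes):
    """Parse gzip-compressed XML bytes with lxml (hypothetical helper)."""
    # Imported lazily so decompress_to_file stays usable without lxml.
    from lxml import etree

    parser = etree.XMLParser(recover=True)
    return etree.parse(decompress_to_file(raw_bytes), parser)


# Self-contained demo with in-memory data instead of the real URL.
sample_xml = b"<products><product>A</product><product>B</product></products>"
compressed = gzip.compress(sample_xml)
restored = decompress_to_file(compressed).read()
```

With the real feed, raw_bytes would be the body of the HTTP response, for example urllib.request.urlopen(url).read() on Python 3, and the tree returned by parse_gzipped_xml could then be queried with tree.xpath('//product').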