I have one large file (about 400 MB) that contains hundreds of XML documents concatenated together, each with its own XML declaration. I am trying to parse each document with ElementTree in Python, but I am having trouble splitting the file into separate documents so that each one can be parsed on its own. Here is an example of what the file looks like:
<?xml version="1.0"?>
<data>
<more>
<p></p>
</more>
</data>
<?xml version="1.0"?>
<different data>
<etc>
<p></p>
</etc>
</different data>
<?xml version="1.0"?>
<continues.....>
Ideally, I would like to split the file at each XML declaration, parse that document, and then move on to the next one. Any suggestions would be appreciated.
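For reference, here is a minimal sketch of the approach described above. It is not a definitive solution. It assumes every XML declaration starts on its own line, as in the sample, and that each individual document is small enough to hold in memory. The names `iter_documents`, `parse_documents`, and `combined.xml` are hypothetical. The file is read in binary mode so that ElementTree can honor any encoding given in each declaration.

```python
import io
import xml.etree.ElementTree as ET


def iter_documents(lines, marker=b"<?xml"):
    """Yield each concatenated XML document as one bytes object.

    `lines` is any iterable of bytes lines, e.g. a file opened with "rb".
    Assumption: every XML declaration begins a line.
    """
    buffer = []
    for line in lines:
        # A new declaration ends the document collected so far.
        if line.lstrip().startswith(marker) and buffer:
            yield b"".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield b"".join(buffer)


def parse_documents(lines):
    """Yield the root Element of each document, one at a time."""
    for raw in iter_documents(lines):
        raw = raw.strip()  # the declaration must be the first bytes parsed
        if not raw:
            continue  # skip blank chunks, e.g. empty lines before the first declaration
        yield ET.fromstring(raw)  # raises ET.ParseError on malformed input


# Hypothetical usage; only one document is held in memory at a time:
#
# with open("combined.xml", "rb") as handle:
#     for root in parse_documents(handle):
#         print(root.tag)
```

Because both functions are generators, the 400 MB file is never loaded in full. If a declaration can appear in the middle of a line, the line-based split above would not be enough, and the splitting step would need to search for the marker inside each line instead.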