I have a few XML files in a format like this:
...
<section>
<header>Headline</header>
<par>Some text <em>here</em> and more text.</par>
</section>
....
At first I used PHP's pull parser, XMLReader, which walks through the file and lets me respond to whatever node type the reader comes across. This worked quite well, but the code felt bloated, and I needed some context information that forced me to drag additional state around. For example, a <header> tag can be a child of either a section or a subsection tag.
So I switched to SimpleXML because it represents the XML document as a recursive data structure and gives me XPath functionality. No more state (I can query the context for, e.g., header tags), and the code is much more compact.
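To illustrate the kind of context query meant here, a minimal sketch follows. The sample document, the <doc> root element, and the variable names are assumptions made up for the example, not taken from the real files.

```php
<?php
// Hypothetical sample document, modeled on the snippet above.
$xml = <<<XML
<doc>
  <section>
    <header>Headline</header>
    <par>Some text <em>here</em> and more text.</par>
    <subsection>
      <header>Sub headline</header>
    </subsection>
  </section>
</doc>
XML;

$doc = simplexml_load_string($xml);

// Find every <header>, then ask XPath for its parent to learn the context.
$contexts = [];
foreach ($doc->xpath('//header') as $header) {
    $parent = $header->xpath('..')[0];
    $contexts[(string) $header] = $parent->getName();
}

print_r($contexts);
```

Each header text is mapped to the name of its enclosing tag, so no parser state has to be carried around by hand.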
However, how do I access the "Some text " and " and more text." child nodes of the <par> tag? Is that possible, or can SimpleXML not handle this kind of mixed content ("soup")? Would I be better off using a DOMDocument implementation, and if so, which one? How does it differ from SimpleXML?
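To make the problem concrete, here is a minimal sketch of what SimpleXML exposes for such a <par> element. The standalone one-line document and the variable names are assumptions for illustration only.

```php
<?php
// Hypothetical one-element document, using the <par> line from the sample.
$par = simplexml_load_string(
    '<par>Some text <em>here</em> and more text.</par>'
);

// Casting to string joins the direct text nodes and skips the <em> element,
// so the two pieces of text can no longer be told apart.
$text = (string) $par;

// children() only yields element nodes, never the surrounding text nodes.
$childNames = [];
foreach ($par->children() as $child) {
    $childNames[] = $child->getName();
}

var_dump($text);       // string(25) "Some text  and more text."
var_dump($childNames); // array with the single entry "em"
```

The two text fragments come back merged into one string, and only the <em> element shows up as a child.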