
I've come across a problem when serializing special characters such as TAB, linefeed, and carriage return in an attribute value.

According to http://www.w3.org/TR/1999/WD-xml-c14n-19991109.html#charescaping, these should be encoded as &#x9;, &#xA;, and &#xD; respectively. But running this in Chrome:

var root = new DOMParser().parseFromString('<root></root>', 'text/xml').documentElement;
root.setAttribute('a', 'first\nsecond');
var serialized = new XMLSerializer().serializeToString(root);

gives the string <root a="first\nsecond"/>, with the linefeed not escaped.

When loading that again:

var loaded = new DOMParser().parseFromString(serialized, 'text/xml').documentElement;
loaded.getAttribute('a');

returns "first second": the linefeed has been replaced by a single space. Has anyone faced this issue before? Any help would be appreciated.

Thanks,

Viktor


1 Answer


I ran into this problem, and solved it by writing a function removeInvalidCharacters(xmlNode) that removes invalid characters (from nodeValues) in the XML tree. You can use it before serializing to ensure you don't get invalid characters.

You can find removeInvalidCharacters() in my Stack Overflow question on the same topic.

You can use removeInvalidCharacters() like this:

var stringWithSTX = "Bad" + String.fromCharCode(2) + "News";
// [0] unwraps the jQuery object; XMLSerializer.serializeToString() needs a DOM node.
var xmlNode = $("<myelem/>").attr("badattr", stringWithSTX)[0];

var serializer = new XMLSerializer();
var invalidXML = serializer.serializeToString(xmlNode);

// Now cleanse it:
removeInvalidCharacters(xmlNode);
var validXML = serializer.serializeToString(xmlNode);
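The linked function is not reproduced here. As a rough idea of what "removing invalid characters" can mean for a single string, the sketch below drops every code point outside the XML 1.0 Char production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The name stripInvalidXmlChars is hypothetical and this is not the removeInvalidCharacters() implementation; a tree-walking version would presumably apply something like it to each attribute value and text node.

```javascript
// Hypothetical helper (not the removeInvalidCharacters() from the answer):
// removes code points that are not allowed by the XML 1.0 "Char" production.
// The "u" flag makes the regex work on code points, so lone surrogates
// are also matched and removed while valid astral characters are kept.
var INVALID_XML_CHARS =
  /[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/gu;

function stripInvalidXmlChars(value) {
  return String(value).replace(INVALID_XML_CHARS, "");
}

var stringWithSTX = "Bad" + String.fromCharCode(2) + "News";
var cleaned = stripInvalidXmlChars(stringWithSTX); // STX (U+0002) is dropped
```

Note that TAB, linefeed, and carriage return are valid XML characters, so a filter like this leaves them alone; it does not address the attribute whitespace problem from the question.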

I've also filed an issue report against Chrome, but it's worth noting that IE9 has its own bugs in this department, so a fix without a workaround is probably a long time coming.
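For the linefeed case in the question specifically: XML attribute-value normalization turns a literal TAB, linefeed, or carriage return into a space when the document is parsed, while a character reference such as &#xA; survives. One possible workaround, sketched here under the assumption that you build or post-process the attribute markup yourself, is to escape the value the way the canonicalization draft describes. The name escapeAttributeValue is made up for this example.

```javascript
// Hypothetical helper: escapes a string for use inside a double-quoted
// XML attribute, writing TAB / LF / CR as character references so they
// are not normalized to spaces when the markup is parsed again.
function escapeAttributeValue(value) {
  return String(value)
    .replace(/&/g, "&amp;") // must run first so later references stay intact
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;")
    .replace(/\t/g, "&#x9;")
    .replace(/\n/g, "&#xA;")
    .replace(/\r/g, "&#xD;");
}

// Build the markup by hand instead of relying on XMLSerializer.
var markup = '<root a="' + escapeAttributeValue("first\nsecond") + '"/>';
```

Passing the escaped string to setAttribute() would not help, because the serializer would then escape the ampersand again and emit &amp;#xA;. The helper is only useful where the attribute text is written out directly.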

Answered 2013-02-11T15:41:30.300