<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">

  <title></title>

</head>

<body>

You are right. This is exactly what LibXML2 does. However, the C14N spec

says:<br>

<br>

&nbsp;&nbsp;&nbsp; <a class="moz-txt-link-freetext" href="http://www.w3.org/TR/2001/REC-xml-c14n-20010315#Terminology">http://www.w3.org/TR/2001/REC-xml-c14n-20010315#Terminology</a><br>

<br>

&nbsp;&nbsp;&nbsp; -&nbsp; All whitespace in character content is retained (excluding

characters     removed during <br>

&nbsp;&nbsp;&nbsp; line feed normalization)&nbsp;&nbsp; <br>

&nbsp;&nbsp;&nbsp; <br>

&nbsp;&nbsp;&nbsp; <a class="moz-txt-link-freetext" href="http://www.w3.org/TR/2001/REC-xml-c14n-20010315#DataModel">http://www.w3.org/TR/2001/REC-xml-c14n-20010315#DataModel</a>

<p>&nbsp;&nbsp;&nbsp; If an XML document must be converted to a node-set, XPath

REQUIRES that <br>

&nbsp;&nbsp;&nbsp; an XML processor be used to create the nodes of its data model to

fully represent <br>

&nbsp;&nbsp;&nbsp; the document. The XML processor performs the following tasks in

order:</p>

<ol>

  <li>normalize line feeds</li>

  <li>...<br>

  </li>

</ol>

And unless I misunderstood something, this means that literal \r characters

MUST be removed<br>

during C14N. LibXML2 does not do this. It inserts &amp;#xD; and later

converts all entities back,<br>

including &amp;#xD; -&gt; '\r'. This means that at the c14n level I cannot

tell the difference between a '\r'<br>

that came from \r-&gt;&amp;#xD;-&gt;\r and a '\r' that came from

&amp;#xD;-&gt;\r. I need to kill the former and <br>

keep the latter. And as I said, fixing the LibXML2 parser might be

tricky.<br>
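<br>

To make the ordering concrete, here is a minimal sketch in Python (not LibXML2 code) of how I read the spec:<br>
line feed normalization runs on the raw input first, and character references such as the one for<br>
carriage return (&amp;#xD;) are resolved only afterwards. The helper normalize_line_feeds is a hypothetical<br>
name used for illustration; the parser is the expat-based one behind xml.etree.ElementTree.<br>

```python
import xml.etree.ElementTree as ET


def normalize_line_feeds(raw: str) -> str:
    # XML 1.0 section 2.11: a CR LF pair and any lone CR both become a single LF.
    # Hypothetical helper for illustration only; a conforming parser does this itself.
    return raw.replace("\r\n", "\n").replace("\r", "\n")


# A literal CR LF pair in the input, followed by a CR written as a character reference.
raw = "<doc>one\r\ntwo&#xD;three</doc>"

# Step 1: normalization only sees the literal CR; the reference is still plain text here.
normalized = normalize_line_feeds(raw)

# Step 2: the parser resolves the reference afterwards, so that CR survives into the tree.
text = ET.fromstring(normalized).text

# Expat applies the same normalization on its own, so parsing the raw input agrees.
text_from_raw = ET.fromstring(raw).text
```

If the parser behaves this way, the only '\r' left in the tree is the one that was written as a<br>
character reference, which is the distinction the c14n code needs.<br>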

<br>

I have written only one c14n implementation, but I hate c14n code. It is just too

complicated, with too many<br>

corner cases.<br>

<br>

Aleksey<br>

<br>

<br>

</body>

</html>