c# - PrinceXML: "Input is not proper UTF-8"

Question

I'm generating HTML from a database and then sending it to PrinceXML to be converted to PDF. The code I use to do this is:

string _htmlTemplate = @"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd""><html lang=""en-GB"" xml:lang=""en-GB"" xmlns=""http://www.w3.org/1999/xhtml""><head><meta http-equiv=""Content-type"" content=""text/html;charset=UTF-8"" /><title>Generated PDF Contract</title></head><body>{0}</body></html>";

string _pgeContent = string.Format(_htmlTemplate, sb.ToString());
writer.Write(sb.ToString());
Byte[] arrBytes = UTF8Encoding.Default.GetBytes(_pgeContent);
Stream s = new MemoryStream(arrBytes);

Prince princeConverter = new Prince(ConfigurationManager.AppSettings["PrinceXMLInstallLoc"].ToString());
princeConverter.SetLog(ConfigurationManager.AppSettings["PrinceXMLLogLoc"]);
princeConverter.AddStyleSheet(Server.MapPath(ConfigurationManager.AppSettings["FormsDocGenCssLocl"]));
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.BufferOutput = true;

However, the conversion fails with the following error:

Input is not proper UTF-8, indicate encoding ! Bytes: 0xA0 0x77 0x65 0x62

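I've taken the generated HTML and uploaded it to the W3C validator. It validates the markup as UTF-8 encoded XHTML 1.0 Transitional with no errors or warnings.

I've also gone through the file with a fine-tooth comb looking for invalid characters. Nothing so far.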
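A visual search like this can miss the problem, because the invalid sequence only exists in the encoded bytes, not in the string or in a re-saved copy of the file. A minimal sketch of a mechanical check follows, assuming you can get hold of the exact byte array handed to Prince. The class and method names (`Utf8Check`, `FirstInvalidUtf8Offset`) are hypothetical; the strict decoder is the standard `UTF8Encoding(bool, bool)` constructor with `throwOnInvalidBytes` set.

```csharp
using System;
using System.Text;

static class Utf8Check
{
    // Hypothetical helper: returns the offset at which strict UTF-8 decoding
    // fails, or -1 if the whole buffer decodes cleanly.
    public static int FirstInvalidUtf8Offset(byte[] bytes)
    {
        // Second argument (throwOnInvalidBytes) makes the decoder throw
        // instead of silently substituting U+FFFD for bad input.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(bytes);
            return -1;
        }
        catch (DecoderFallbackException ex)
        {
            // Index is the position in the input buffer that caused the failure.
            return ex.Index;
        }
    }

    static void Main()
    {
        // The bytes quoted in the error message: a lone 0xA0 followed by "web".
        byte[] fromError = { 0xA0, 0x77, 0x65, 0x62 };
        // The same text as valid UTF-8: U+00A0 encodes as 0xC2 0xA0.
        byte[] valid = { 0xC2, 0xA0, 0x77, 0x65, 0x62 };

        Console.WriteLine(FirstInvalidUtf8Offset(fromError));
        Console.WriteLine(FirstInvalidUtf8Offset(valid));
    }
}
```

In the code above, this would be run against `arrBytes` rather than against `_pgeContent`, since the string itself cannot contain an invalid byte sequence.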

Can anyone suggest something else I could try?

score 2 · Accepted Answer

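After an afternoon of muttered curses and tearing out what remains of my hair, I found the solution to my particular problem.

It seems that System.Text.UTF8Encoding does not emit the UTF-8 identifier bytes by default, so in my case I needed to use the constructor that takes a boolean parameter to control this.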

UTF8Encoding u8enc = new UTF8Encoding(true);//Ensures a UTF8 identifier is emitted.

After this, everything was fine. Hope this helps someone :-)
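For anyone applying this fix, two details of the .NET API are worth keeping in mind. First, `UTF8Encoding.Default` in the question is just the inherited static `Encoding.Default`, which on .NET Framework is the system ANSI code page rather than UTF-8; a lone `0xA0` (a non-breaking space in such code pages) in the error message is consistent with that, although the thread does not confirm it. Second, `GetBytes` never prepends the identifier, even on `new UTF8Encoding(true)`; the identifier comes from `GetPreamble()` or from a `StreamWriter` created with that encoding. The sketch below illustrates both points. `EncodingDemo` and `ToUtf8WithIdentifier` are hypothetical names, and ISO-8859-1 is used only as a stand-in for a single-byte ANSI code page.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: encodes text as UTF-8 and prepends the identifier (BOM)
    // by letting StreamWriter write the encoding's preamble.
    public static byte[] ToUtf8WithIdentifier(string html)
    {
        var u8enc = new UTF8Encoding(true); // true: preamble is EF BB BF
        using (var ms = new MemoryStream())
        using (var sw = new StreamWriter(ms, u8enc))
        {
            sw.Write(html);
            sw.Flush();
            return ms.ToArray();
        }
    }

    static void Main()
    {
        string sample = "\u00A0web"; // non-breaking space followed by "web"

        // Assumption: ISO-8859-1 stands in for the machine's ANSI code page.
        byte[] ansi = Encoding.GetEncoding("iso-8859-1").GetBytes(sample);
        // GetBytes encodes the text only; no identifier is added here.
        byte[] utf8 = new UTF8Encoding(true).GetBytes(sample);
        byte[] withBom = ToUtf8WithIdentifier(sample);

        Console.WriteLine(BitConverter.ToString(ansi));    // A0-77-65-62
        Console.WriteLine(BitConverter.ToString(utf8));    // C2-A0-77-65-62
        Console.WriteLine(BitConverter.ToString(withBom)); // EF-BB-BF-C2-A0-77-65-62
    }
}
```

The first line of output matches the bytes in the Prince error, which suggests that replacing `UTF8Encoding.Default` with an actual `UTF8Encoding` instance may be what resolves the error, whether or not the identifier is also emitted.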
