c# - Convert HTML to PDF in .NET

Question

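I want to generate a PDF by passing HTML content to a function. I have been using iTextSharp for this, but it does not perform well when it encounters tables, and the layout gets messy.

Is there a better way?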

score 244 · Accepted Answer

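EDIT: New suggestion: HTML Renderer for PDF using PdfSharp

(after trying wkhtmltopdf and suggesting to avoid it)

HtmlRenderer.PdfSharp is 100% fully managed C# code, easy to use, thread safe and, most importantly, free (New BSD License).

Usage

Download the HtmlRenderer.PdfSharp NuGet package.

Use the example method.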

public static Byte[] PdfSharpConvert(String html)
{
    Byte[] res = null;
    using (MemoryStream ms = new MemoryStream())
    {
        var pdf = TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf(html, PdfSharp.PageSize.A4);
        pdf.Save(ms);
        res = ms.ToArray();
    }
    return res;
}

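A very good alternative is a free version of iTextSharp

Up to version 4.1.6, iTextSharp was licensed under the LGPL, and versions up to 4.1.6 (there may also be forks) are available as packages and can be used freely. Of course, you can also use the continued, paid 5+ versions.

I tried to integrate wkhtmltopdf-based solutions into my project and ran into a lot of hurdles.

I would personally avoid using wkhtmltopdf-based solutions in hosted enterprise applications, for the following reasons.

First of all, wkhtmltopdf is implemented in C++, not C#, and you will run into various problems embedding it in C# code, especially when switching between 32-bit and 64-bit builds of your project. I had to try several workarounds, including conditional project builds, to avoid "invalid format exceptions" on different machines.
If you manage your own virtual machine, it is fine. But if your project runs in a constrained environment such as Azure (where, as the TuesPechkin author mentions, it is actually impossible) or Elastic Beanstalk, configuring that environment just to get wkhtmltopdf working is a nightmare.
wkhtmltopdf creates files on your server, so you have to manage user permissions and grant "write" access to the location where wkhtmltopdf runs.
wkhtmltopdf runs as a standalone application, so it is not managed by your IIS application pool. You therefore either have to host it as a service on another machine, or accept processing spikes and memory consumption on your production server.
It uses temp files to generate the PDF, and in environments like AWS EC2, where disk I/O is really slow, this is a big performance problem.
The most hated "Unable to load DLL 'wkhtmltox.dll'" error, reported by many users.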

--- PRE-EDIT section ---

For anyone who wants to generate PDF from HTML in simpler applications/environments, I am leaving my old post as a suggestion.

TuesPechkin

https://www.nuget.org/packages/TuesPechkin/

or, especially for MVC web applications (though I think you can use it in any .NET application)

Rotativa

https://www.nuget.org/packages/Rotativa/

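They both use the wkhtmltopdf binary to convert HTML to PDF. It uses the WebKit engine to render the pages, so it can also parse CSS style sheets.

They provide easy-to-use, seamless integration with C#.

Rotativa can also generate PDFs directly from any Razor view.

Additionally, for real-world web applications they also take care of thread safety and so on.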

score 103 · Accepted Answer

Last updated: October 2020

This is the list of options for HTML to PDF conversion in .NET that I have put together (some free, some paid)

GemBox.Document
- https://www.nuget.org/packages/GemBox.Document/
- Free (up to 20 paragraphs)
- $680 - https://www.gemboxsoftware.com/document/pricelist
- https://www.gemboxsoftware.com/document/examples/c-sharp-convert-html-to-pdf/307
PDF Metamorphosis .Net
HtmlRenderer.PdfSharp
- https://www.nuget.org/packages/HtmlRenderer.PdfSharp/1.5.1-beta1
- BSD-UNSPECIFIED License
PuppeteerSharp
- https://www.puppeteersharp.com/examples/index.html
- MIT License
- https://github.com/kblok/puppeteer-sharp
EO.PDF
- https://www.nuget.org/packages/EO.Pdf/
- $799 - https://www.essentialobjects.com/Purchase.aspx?f=3
WnvHtmlToPdf_x64
- https://www.nuget.org/packages/WnvHtmlToPdf_x64/
- $750 - $1600 - http://www.winnovative-software.com/Buy.aspx
- Demo - http://www.winnovative-software.com/demo/default.aspx
IronPdf
- https://www.nuget.org/packages/IronPdf/
- $399 - $1599 - https://ironpdf.com/licensing/
- https://ironpdf.com/examples/using-html-to-create-a-pdf/
Spire.PDF
- https://www.nuget.org/packages/Spire.PDF/
- Free (up to 10 pages)
- $599 - $1799 - https://www.e-iceblue.com/Buy/Spire.PDF.html
- https://www.e-iceblue.com/Tutorials/Spire.PDF/Spire.PDF-Program-Guide/Convert-HTML-to-PDF-Customize-HTML-to-PDF-Conversion-by-Yourself.html
Aspose.HTML
- https://www.nuget.org/packages/Aspose.Html/
- $599 - $1797 - https://purchase.aspose.com/pricing/html/net
- https://docs.aspose.com/html/net/html-to-pdf-conversion/
EvoPDF
- https://www.nuget.org/packages/EvoPDF/
- $450 - $1200 - http://www.evopdf.com/buy.aspx
ExpertPdfHtmlToPdf
- https://www.nuget.org/packages/ExpertPdfHtmlToPdf/
- $550 - $1200 - https://www.html-to-pdf.net/Pricing.aspx
Zetpdf
- https://zetpdf.com
- $299 - $599 - https://zetpdf.com/pricing/
- Not a well-known or supported library - ZetPDF - Does anyone know the background of this product?
PDFtron
- https://www.pdftron.com/documentation/samples/cs/HTML2PDFTes
- $4000/year - https://www.pdftron.com/licensing/
WkHtmlToXSharp
- https://github.com/pruiz/WkHtmlToXSharp
- Free
- Concurrent conversion is implemented as a processing queue.
SelectPdf
- https://www.nuget.org/packages/Select.HtmlToPdf/
- Free (up to 5 pages)
- $499 - $799 - https://selectpdf.com/pricing/
- https://selectpdf.com/pdf-library-for-net/

If none of the options above help you, you can always search the NuGet packages:
https://www.nuget.org/packages?q=html+pdf

score 28 · Accepted Answer

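Most HTML to PDF converters rely on IE to do the HTML parsing and rendering. This can break when the user updates their IE. Here is one that does not rely on IE.

The code is something like this: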

EO.Pdf.HtmlToPdf.ConvertHtml(htmlText, pdfFileName);

Like many other converters, you can pass text, a file name, or a URL. The result can be saved to a file or to a stream.

score 27 · Accepted Answer

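I highly recommend NReco, seriously. It has a free and a paid version, and it is really worth it. It uses wkhtmltopdf in the background, but you only need one assembly. Fantastic.

Example of use:

Install via NuGet.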

var htmlContent = String.Format("<body>Hello world: {0}</body>", DateTime.Now);
var pdfBytes = (new NReco.PdfGenerator.HtmlToPdfConverter()).GeneratePdf(htmlContent);

Disclaimer: I'm not the developer, just a fan of the project :)

score 12 · Accepted Answer

You can use Google Chrome's print-to-PDF feature in headless mode. I found this to be the simplest yet most robust method.

var url = "https://stackoverflow.com/questions/564650/convert-html-to-pdf-in-net";
var chromePath = @"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe";
var output = Path.Combine(Environment.CurrentDirectory, "printout.pdf");
using (var p = new Process())
{
    p.StartInfo.FileName = chromePath;
    // Quote the output path and the URL so that spaces in either do not split the argument
    p.StartInfo.Arguments = $"--headless --disable-gpu --print-to-pdf=\"{output}\" \"{url}\"";
    p.Start();
    p.WaitForExit();
}
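
If the output path or the URL can contain spaces, assembling the argument string by hand gets fragile. The following is a minimal sketch, assuming .NET Core 2.1 or later, where `ProcessStartInfo.ArgumentList` is available. The `HeadlessChrome` class and its method names are made up for illustration; the Chrome flags are the same ones used in the snippet above.

```csharp
using System.Diagnostics;

public static class HeadlessChrome
{
    // Builds the start info without launching anything. Each ArgumentList
    // entry is handed to the process as one argument, so no manual quoting.
    public static ProcessStartInfo CreateStartInfo(string chromePath, string outputPdf, string url)
    {
        var startInfo = new ProcessStartInfo(chromePath) { UseShellExecute = false };
        startInfo.ArgumentList.Add("--headless");
        startInfo.ArgumentList.Add("--disable-gpu");
        startInfo.ArgumentList.Add("--print-to-pdf=" + outputPdf);
        startInfo.ArgumentList.Add(url);
        return startInfo;
    }

    // Runs Chrome and returns its exit code, or null if it could not be
    // started or did not finish within the timeout.
    public static int? PrintToPdf(string chromePath, string outputPdf, string url, int timeoutMs = 60000)
    {
        using var process = Process.Start(CreateStartInfo(chromePath, outputPdf, url));
        if (process == null) return null;
        if (!process.WaitForExit(timeoutMs))
        {
            process.Kill();
            return null;
        }
        return process.ExitCode;
    }
}
```

Keeping `CreateStartInfo` separate from `PrintToPdf` makes it possible to inspect the arguments without having Chrome installed.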

score 6 · Accepted Answer

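Update for 2018: let's use the standard HTML+CSS=PDF equation!

There is good news for HTML-to-PDF needs. As this answer showed, the W3C standard css-break-3 will solve the problem... It is a Candidate Recommendation, with a plan to become a final Recommendation in 2017 or 2018, after testing.

As print-css.rocks shows, there are some not-so-standard solutions, with plugins for C#.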

score 6 · Accepted Answer

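For everyone looking for a working solution in .NET 5, here you go.

Here are my working solutions.

Using `wkhtmltopdf`:

Download and install the latest version of wkhtmltopdf.
Use the code below.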

public static string HtmlToPdf(string outputFilenamePrefix, string[] urls,
    string[] options = null,
    string pdfHtmlToPdfExePath = @"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe")
{
    string urlsSeparatedBySpaces = string.Empty;
    try
    {
        //Determine inputs
        if ((urls == null) || (urls.Length == 0))
            throw new Exception("No input URLs provided for HtmlToPdf");
        else
            urlsSeparatedBySpaces = String.Join(" ", urls); //Concatenate URLs

        string outputFilename = outputFilenamePrefix + "_" + DateTime.Now.ToString("yyyy-MM-dd-HH-mm-ss-fff") + ".PDF"; // assemble destination PDF file name (HH = 24-hour clock, so AM and PM names cannot collide)

        var p = new System.Diagnostics.Process()
        {
            StartInfo =
            {
                FileName = pdfHtmlToPdfExePath,
                Arguments = ((options == null) ? "" : string.Join(" ", options)) + " " + urlsSeparatedBySpaces + " " + outputFilename,
                UseShellExecute = false, // needs to be false in order to redirect output
                RedirectStandardOutput = true,
                RedirectStandardError = true,
                RedirectStandardInput = true, // redirect all 3, as it should be all 3 or none
                WorkingDirectory = Path.Combine(Path.GetDirectoryName(Assembly.GetEntryAssembly().Location))
            }
        };

        p.Start();

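        // read the output here; stderr is drained asynchronously so that a full
        // stderr pipe cannot block the child while we are still reading stdout
        var errorOutputTask = p.StandardError.ReadToEndAsync();
        var output = p.StandardOutput.ReadToEnd();
        var errorOutput = errorOutputTask.Result;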

        // ...then wait n milliseconds for exit (as after exit, it can't read the output)
        p.WaitForExit(60000);

        // read the exit code, close process
        int returnCode = p.ExitCode;
        p.Close();

        // if 0 or 2, it worked so return path of pdf
        if ((returnCode == 0) || (returnCode == 2))
            return outputFilename;
        else
            throw new Exception(errorOutput);
    }
    catch (Exception exc)
    {
        throw new Exception("Problem generating PDF from HTML, URLs: " + urlsSeparatedBySpaces + ", outputFilename: " + outputFilenamePrefix, exc);
    }
}

Call the above method as HtmlToPdf("test", new string[] { "https://www.google.com" }, new string[] { "-s A5" });
If you need to convert an HTML string to PDF, tweak the above method and replace the Arguments of the Process StartInfo with $@"/C echo | set /p=""{htmlText}"" | ""{pdfHtmlToPdfExePath}"" {((options == null) ? "" : string.Join(" ", options))} - ""C:\Users\xxxx\Desktop\{outputFilename}""";
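
Piping the HTML through `cmd /C echo` tends to break as soon as the markup contains quotes or other shell metacharacters. The command above already passes `-` as the input, which makes wkhtmltopdf read the page from standard input, so the HTML can also be written to the process directly. This is a rough sketch under that assumption, targeting .NET 5; `WkHtmlToPdfStdin` and its members are illustrative names, not part of any library, and unlike the method above it treats only exit code 0 as success.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class WkHtmlToPdfStdin
{
    public static ProcessStartInfo CreateStartInfo(string exePath, string outputPdf)
    {
        var startInfo = new ProcessStartInfo(exePath)
        {
            UseShellExecute = false,
            RedirectStandardInput = true,
            RedirectStandardError = true,
            // UTF-8 without a byte order mark, so the HTML arrives unmodified
            StandardInputEncoding = new UTF8Encoding(false)
        };
        startInfo.ArgumentList.Add("-");       // read the HTML from stdin
        startInfo.ArgumentList.Add(outputPdf); // write the PDF to this path
        return startInfo;
    }

    public static void Convert(string exePath, string html, string outputPdf, int timeoutMs = 60000)
    {
        using var process = Process.Start(CreateStartInfo(exePath, outputPdf))
            ?? throw new InvalidOperationException("Could not start " + exePath);

        // Drain stderr in the background so a full pipe cannot stall the child
        var stderrTask = process.StandardError.ReadToEndAsync();

        process.StandardInput.Write(html);
        process.StandardInput.Close(); // closing stdin signals end of input

        if (!process.WaitForExit(timeoutMs))
        {
            process.Kill();
            throw new TimeoutException("wkhtmltopdf did not exit within " + timeoutMs + " ms");
        }
        if (process.ExitCode != 0)
            throw new InvalidOperationException("wkhtmltopdf failed: " + stderrTask.Result);
    }
}
```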

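Drawbacks of this approach:

The latest version of wkhtmltopdf at the time of posting this answer does not support the latest HTML5 and CSS3. So if you try to export any HTML that uses CSS Grid, the output will not be as expected.
You need to handle concurrency issues yourself.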
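
One simple way to deal with the concurrency point is to let only one conversion run at a time. The sketch below is one possible approach, not something the answer itself prescribes: `SerializedConverter` is a hypothetical wrapper, and the delegate passed to it stands in for whatever actually produces the PDF bytes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SerializedConverter
{
    // A semaphore with a single slot: callers queue up and run one by one
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Func<string, byte[]> _convert;

    public SerializedConverter(Func<string, byte[]> convert)
    {
        _convert = convert ?? throw new ArgumentNullException(nameof(convert));
    }

    public async Task<byte[]> ConvertAsync(string html, CancellationToken cancellationToken = default)
    {
        await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            // Run the (blocking) conversion off the calling thread
            return await Task.Run(() => _convert(html), cancellationToken).ConfigureAwait(false);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Raising the initial count of the semaphore would allow a bounded number of parallel conversions instead of strictly one.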

Using `chrome headless`:

Download and install the latest Chrome browser.
Use the code below.

var p = new System.Diagnostics.Process()
{
    StartInfo =
    {
        FileName = "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe",
        Arguments = @"/C --headless --disable-gpu --run-all-compositor-stages-before-draw --print-to-pdf-no-header --print-to-pdf=""C:/Users/Abdul Rahman/Desktop/test.pdf"" ""C:/Users/Abdul Rahman/Desktop/grid.html""",
    }
};

p.Start();

// ...then wait n milliseconds for exit (as after exit, it can't read the output)
p.WaitForExit(60000);

// read the exit code, close process
int returnCode = p.ExitCode;
p.Close();

This converts an HTML file to a PDF file.
If you need to convert a URL to PDF, use the following as the Arguments of the Process StartInfo

@"/C --headless --disable-gpu --run-all-compositor-stages-before-draw --print-to-pdf-no-header --print-to-pdf=""C:/Users/Abdul Rahman/Desktop/test.pdf"" ""https://www.google.com""",

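Drawbacks of this approach:

This works as expected with the latest HTML5 and CSS3 features, and the output is the same as what you see in the browser. However, when running it under IIS you need to run your application's application pool under the LocalSystem identity, or you need to grant read/write access to IIS_IUSRS.

Using `Selenium WebDriver`:

Install the NuGet packages Selenium.WebDriver and Selenium.WebDriver.ChromeDriver.
Use the code below.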

public async Task<byte[]> ConvertHtmlToPdf(string html)
{
    var directory = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.CommonDocuments), "ApplicationName");
    Directory.CreateDirectory(directory);
    var filePath = Path.Combine(directory, $"{Guid.NewGuid()}.html");
    await File.WriteAllTextAsync(filePath, html);

    var driverOptions = new ChromeOptions();
    // In headless mode, PDF writing is enabled by default (tested with driver major version 85)
    driverOptions.AddArgument("headless");
    using var driver = new ChromeDriver(driverOptions);
    driver.Navigate().GoToUrl(filePath);

    // Output a PDF of the first page in A4 size at 90% scale
    var printOptions = new Dictionary<string, object>
    {
        { "paperWidth", 210 / 25.4 },
        { "paperHeight", 297 / 25.4 },
        { "scale", 0.9 },
        { "pageRanges", "1" }
    };
    var printOutput = driver.ExecuteChromeCommandWithResult("Page.printToPDF", printOptions) as Dictionary<string, object>;
    var pdf = Convert.FromBase64String(printOutput["data"] as string);

    File.Delete(filePath);

    return pdf;
}
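
The paper dimensions in the code above are divided by 25.4 because the DevTools `Page.printToPDF` command takes `paperWidth` and `paperHeight` in inches, while A4 is defined in millimetres (210 x 297). A small helper can make that conversion explicit. This is only a sketch; `PrintToPdfOptions` and its members are hypothetical names introduced here.

```csharp
using System.Collections.Generic;

public static class PrintToPdfOptions
{
    private const double MillimetersPerInch = 25.4;

    public static double MmToInches(double millimeters) => millimeters / MillimetersPerInch;

    // Builds the parameter dictionary for Page.printToPDF from a paper size
    // given in millimetres; scale 1.0 means 100%.
    public static Dictionary<string, object> ForPaper(double widthMm, double heightMm, double scale = 1.0)
    {
        return new Dictionary<string, object>
        {
            { "paperWidth", MmToInches(widthMm) },
            { "paperHeight", MmToInches(heightMm) },
            { "scale", scale }
        };
    }
}
```

With this helper, `PrintToPdfOptions.ForPaper(210, 297, 0.9)` yields the same width, height and scale entries as the dictionary written out by hand above.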

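Advantages of this approach:

This only needs a NuGet installation and works as expected with the latest HTML5 and CSS3 features. The output will be the same as what you see in the browser.

Drawbacks of this approach:

This approach requires the latest Chrome browser to be installed on the server where the application runs.

With this approach, make sure to add <PublishChromeDriver>true</PublishChromeDriver> to the .csproj file, as shown below: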

<PropertyGroup>
  <TargetFramework>net5.0</TargetFramework>
  <LangVersion>latest</LangVersion>
  <Nullable>enable</Nullable>
  <PublishChromeDriver>true</PublishChromeDriver>
</PropertyGroup>

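This publishes the Chrome driver when the project is published.

Here is the link to my working project repo - HtmlToPdf

I arrived at the above answer after spending almost two days trying the available options and finally implementing the Selenium-based solution, which works. I hope this helps you and saves you some time.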

score 5 · Accepted Answer

Quite likely most projects will wrap a C/C++ engine rather than implementing a C# solution from scratch. Try Project Gotenberg.

To test it

docker run --rm -p 3000:3000 thecodingmachine/gotenberg:6

Curl sample

curl --request POST \
    --url http://localhost:3000/convert/url \
    --header 'Content-Type: multipart/form-data' \
    --form remoteURL=https://brave.com \
    --form marginTop=0 \
    --form marginBottom=0 \
    --form marginLeft=0 \
    --form marginRight=0 \
    -o result.pdf

C# sample.cs

using System;
using System.Net.Http;
using System.Threading.Tasks;
using System.IO;
using static System.Console;

namespace Gotenberg
{
    class Program
    {
        public static async Task Main(string[] args)
        {
            try
            {
                var client = new HttpClient();            
                var formContent = new MultipartFormDataContent
                    {
                        {new StringContent("https://brave.com/"), "remoteURL"},
                        {new StringContent("0"), "marginTop" }
                    };
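                var result = await client.PostAsync(new Uri("http://localhost:3000/convert/url"), formContent);
                // Fail loudly instead of writing an error response body into the PDF file
                result.EnsureSuccessStatusCode();
                await File.WriteAllBytesAsync("brave.com.pdf", await result.Content.ReadAsByteArrayAsync());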
            }
            catch (Exception ex)
            {
                WriteLine(ex);
            }
        }
    }
}

To compile

csc sample.cs -langversion:latest -reference:System.Net.Http.dll && mono ./sample.exe

score 4 · Accepted Answer

Below is an example of converting HTML + CSS to PDF using iTextSharp (iTextSharp + itextsharp.xmlworker)

using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;


byte[] pdf; // result will be here

var cssText = File.ReadAllText(MapPath("~/css/test.css"));
var html = File.ReadAllText(MapPath("~/css/test.html"));

using (var memoryStream = new MemoryStream())
{
        var document = new Document(PageSize.A4, 50, 50, 60, 60);
        var writer = PdfWriter.GetInstance(document, memoryStream);
        document.Open();

        using (var cssMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(cssText)))
        {
            using (var htmlMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(html)))
            {
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, htmlMemoryStream, cssMemoryStream);
            }
        }

        document.Close();

        pdf = memoryStream.ToArray();
}

score 3 · Accepted Answer

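It depends on any other requirements you have.

A really simple but not easily deployable solution is to use a WebBrowser control to load the HTML and then use the Print method to print to a locally installed PDF printer. There are several free PDF printers available, and the WebBrowser control is part of the .NET Framework.

EDIT: If your HTML is XHTML, you can use PDFizer to do the job.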

score 3 · Accepted Answer

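So far, the best free .NET solution seems to be the TuesPechkin library, which is a wrapper around the wkhtmltopdf native library.

I have now used the single-threaded version to convert a few thousand HTML strings to PDF files, and it seems to work great. It is supposed to work in multi-threaded environments as well (IIS, for example), but I have not tested that.

Also, since I wanted to use the latest version of wkhtmltopdf (0.12.5 at the time of writing), I downloaded the DLL from the official website, copied it to my project root, set Copy to Output Directory to true, and initialized the library like so: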

var dllDir = AppDomain.CurrentDomain.BaseDirectory;
Converter = new StandardConverter(new PdfToolset(new StaticDeployment(dllDir)));

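The code above looks for exactly "wkhtmltox.dll", so do not rename the file. I used the 64-bit version of the DLL.

Make sure you read the instructions for multi-threaded environments: you have to initialize it only once per application lifetime, so you will need to put it in a singleton or something similar.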
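
One common way to get "only once per application lifetime" is `Lazy<T>` with thread-safe initialization. The sketch below shows just the pattern: `ConverterHost` is a hypothetical name, and `object` is a placeholder for the real converter type, whose construction (for example the `StandardConverter` line above) would go inside the factory delegate.

```csharp
using System;
using System.Threading;

public static class ConverterHost
{
    private static int _initializations;

    // The factory runs at most once, even if many threads ask for the
    // converter at the same time.
    private static readonly Lazy<object> _converter = new Lazy<object>(
        () =>
        {
            Interlocked.Increment(ref _initializations);
            return new object(); // placeholder for the real converter setup
        },
        LazyThreadSafetyMode.ExecutionAndPublication);

    public static object Converter => _converter.Value;

    // Exposed only so the "initialized once" behaviour can be observed
    public static int Initializations => _initializations;
}
```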

score 2 · Accepted Answer

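I was also looking for this a while back. I ran into HTMLDOC http://www.easysw.com/htmldoc/, which is a free open-source command-line app that takes an HTML file as an argument and outputs a PDF from it. It has worked quite well for my side project, but it all depends on what you actually need.

The company that makes it sells the compiled binaries, but you are free to download the source, compile it, and use it for free. I managed to compile a fairly recent revision (for version 1.9) and I intend to release a binary installer for it in a few days, so if you are interested I can provide a link as soon as I post it.

Edit (2/25/2014): It seems the docs and site have moved to http://www.msweet.org/projects.php?Z1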

score 2 · Accepted Answer

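If you need perfect rendering of HTML in the PDF, you need to use a commercial library.

ExpertPdf Html To Pdf Converter is very easy to use and supports the latest HTML5/CSS3. You can either convert an entire URL to PDF: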

using ExpertPdf.HtmlToPdf; 
byte[] pdfBytes = new PdfConverter().GetPdfBytesFromUrl(url);

or an HTML string:

using ExpertPdf.HtmlToPdf; 
byte[] pdfBytes = new PdfConverter().GetPdfBytesFromHtmlString(html, baseUrl);

You can alternatively save the generated PDF document directly to a file stream on disk.

score 2 · Accepted Answer

Here is a free library that is easy to work with: OpenHtmlToPdf

string timeStampForPdfName = DateTime.Now.ToString("yyMMddHHmmssff");

string serverPath = System.Web.Hosting.HostingEnvironment.MapPath("~/FolderName");
string pdfSavePath = Path.Combine(@serverPath, "FileName" + timeStampForPdfName + ".FileExtension");


//OpenHtmlToPdf Library used for Performing PDF Conversion
var pdf = Pdf.From(HTML_String).Content();

//For writing to file from a byte array
File.WriteAllBytes(pdfSavePath, pdf.ToArray()); // Requires System.Linq

score 1 · Accepted Answer

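To convert HTML to PDF in C#, use ABCpdf.

ABCpdf can use the Gecko or Trident rendering engines, so your HTML tables will look the same as they do in Firefox and Internet Explorer.

There is an online demo of ABCpdf at www.abcpdfeditor.com. You can use it to check how your tables will render first, without needing to download and install software.

To render entire web pages you will need the AddImageUrl or AddImageHtml functions. But if all you want to do is add HTML-styled text, you could try the AddHtml function, as below: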

Doc theDoc = new Doc();
theDoc.FontSize = 72;
theDoc.AddHtml("<b>Some HTML styled text</b>");
theDoc.Save(Server.MapPath("docaddhtml.pdf"));
theDoc.Clear();

ABCpdf is a commercial software title; however, the standard edition can often be obtained for free under a special offer.

score 1 · Accepted Answer

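Instead of parsing HTML directly to PDF, you can create a bitmap of your HTML page and then insert the bitmap into your PDF, using for example iTextSharp.

Here is code for getting a bitmap of a URL. I found it somewhere here on SO; if I find the source I will link it.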

public System.Drawing.Bitmap HTMLToImage(String strHTML)
{
    System.Drawing.Bitmap myBitmap = null;

    System.Threading.Thread myThread = new System.Threading.Thread(delegate()
    {
        // create a hidden web browser, which will navigate to the page
        System.Windows.Forms.WebBrowser myWebBrowser = new System.Windows.Forms.WebBrowser();
        // we don't want scrollbars on our image
        myWebBrowser.ScrollBarsEnabled = false;
        // don't let any errors shine through
        myWebBrowser.ScriptErrorsSuppressed = true;
        // let's load up that page!    
        myWebBrowser.Navigate("about:blank");

        // wait until the page is fully loaded
        while (myWebBrowser.ReadyState != System.Windows.Forms.WebBrowserReadyState.Complete)
            System.Windows.Forms.Application.DoEvents();

        myWebBrowser.Document.Body.InnerHtml = strHTML;

        // set the size of our web browser to be the same size as the page
        int intScrollPadding = 20;
        int intDocumentWidth = myWebBrowser.Document.Body.ScrollRectangle.Width + intScrollPadding;
        int intDocumentHeight = myWebBrowser.Document.Body.ScrollRectangle.Height + intScrollPadding;
        myWebBrowser.Width = intDocumentWidth;
        myWebBrowser.Height = intDocumentHeight;
        // a bitmap that we will draw to
        myBitmap = new System.Drawing.Bitmap(intDocumentWidth - intScrollPadding, intDocumentHeight - intScrollPadding);
        // draw the web browser to the bitmap
        myWebBrowser.DrawToBitmap(myBitmap, new System.Drawing.Rectangle(0, 0, intDocumentWidth - intScrollPadding, intDocumentHeight - intScrollPadding));
    });
    myThread.SetApartmentState(System.Threading.ApartmentState.STA);
    myThread.Start();
    myThread.Join();

    return myBitmap;
}

score 1 · Accepted Answer

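The best tool I have found and used for generating PDFs of JavaScript- and style-rendered views or HTML pages is phantomJS.

Download the .exe file, together with the rasterize.js function found in the root of the exe's examples folder, and put them inside your solution.

It even allows you to download the file from any code without opening that file, and it also allows downloading the file when the styles, and especially jQuery, are applied.

The following code generates a PDF file: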

public ActionResult DownloadHighChartHtml()
{
    string serverPath = Server.MapPath("~/phantomjs/");
    string filename = DateTime.Now.ToString("ddMMyyyy_hhmmss") + ".pdf";
    string Url = "http://wwwabc.com";

    new Thread(new ParameterizedThreadStart(x =>
    {
        ExecuteCommand(string.Format("cd {0} & E: & phantomjs rasterize.js {1} {2} \"A4\"", serverPath, Url, filename));
                           //E: is the drive for server.mappath
    })).Start();

    var filePath = Path.Combine(Server.MapPath("~/phantomjs/"), filename);

    var stream = new MemoryStream();
    byte[] bytes = DoWhile(filePath);

    Response.ContentType = "application/pdf";
    Response.AddHeader("content-disposition", "attachment;filename=Image.pdf");
    Response.OutputStream.Write(bytes, 0, bytes.Length);
    Response.End();
    return RedirectToAction("HighChart");
}



private void ExecuteCommand(string Command)
{
    try
    {
        ProcessStartInfo ProcessInfo;
        Process Process;

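        // /C makes cmd.exe exit once the command has run; /K would leave a cmd.exe process behind on every call
        ProcessInfo = new ProcessStartInfo("cmd.exe", "/C " + Command);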

        ProcessInfo.CreateNoWindow = true;
        ProcessInfo.UseShellExecute = false;

        Process = Process.Start(ProcessInfo);
    }
    catch { }
}


private byte[] DoWhile(string filePath)
{
    byte[] bytes = new byte[0];
    bool fail = true;

    while (fail)
    {
        try
        {
            using (FileStream file = new FileStream(filePath, FileMode.Open, FileAccess.Read))
            {
                bytes = new byte[file.Length];
                file.Read(bytes, 0, (int)file.Length);
            }

            fail = false;
        }
        catch
        {
            Thread.Sleep(1000);
        }
    }

    System.IO.File.Delete(filePath);
    return bytes;
}
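
Note that the `DoWhile` loop above never gives up: if phantomjs fails and the file is never written, the request hangs forever. A bounded variant might look like the sketch below. `FileWaiter` and `ReadAllBytesWithRetry` are hypothetical names, and the approach still only polls for the file, so waiting for the phantomjs process to exit would be more reliable where that is possible.

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileWaiter
{
    // Tries to read the file up to maxAttempts times, sleeping between
    // attempts, and throws TimeoutException if it never becomes readable.
    public static byte[] ReadAllBytesWithRetry(string path, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return File.ReadAllBytes(path);
            }
            catch (IOException)
            {
                // File is missing or still locked by the writer
                if (attempt < maxAttempts)
                    Thread.Sleep(delay);
            }
        }
        throw new TimeoutException(
            "File '" + path + "' was not readable after " + maxAttempts + " attempts.");
    }
}
```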

score 1 · Accepted Answer

You can also check out Spire; it lets you convert HTML to PDF with this simple piece of code

 string htmlCode = "<p>This is a p tag</p>";
 
//use single thread to generate the pdf from above html code
Thread thread = new Thread(() =>
{ pdf.LoadFromHTML(htmlCode, false, setting, htmlLayoutFormat); });
thread.SetApartmentState(ApartmentState.STA);
thread.Start();
thread.Join();
 
// Save the file to PDF and preview it.
pdf.SaveToFile("output.pdf");
System.Diagnostics.Process.Start("output.pdf");

score 1 · Accepted Answer

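As a representative of HiQPdf Software, I believe the best solution is HiQPdf HTML to PDF converter for .NET. It contains the most advanced HTML5, CSS3, SVG and JavaScript rendering engine on the market. There is also a free version of the HTML to PDF library, which you can use to produce up to 3 PDF pages for free. The minimal C# code to produce a PDF as a byte[] from an HTML page is: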

HtmlToPdf htmlToPdfConverter = new HtmlToPdf();

// set PDF page size, orientation and margins
htmlToPdfConverter.Document.PageSize = PdfPageSize.A4;
htmlToPdfConverter.Document.PageOrientation = PdfPageOrientation.Portrait;
htmlToPdfConverter.Document.Margins = new PdfMargins(0);

// convert HTML to PDF 
byte[] pdfBuffer = htmlToPdfConverter.ConvertUrlToMemory(url);

You can find more detailed examples for both ASP.NET and MVC in the HiQPdf HTML to PDF Converter examples repository.

score 0 · Accepted Answer

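Try the PDF Duo .Net converting component for converting HTML to PDF from ASP.NET applications without using additional DLLs.

You can pass an HTML string, a file, or a stream to generate the PDF. Use the code below (example C#):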

string file_html = @"K:\hdoc.html";   
string file_pdf = @"K:\new.pdf";   
try   
{   
    DuoDimension.HtmlToPdf conv = new DuoDimension.HtmlToPdf();   
    conv.OpenHTML(file_html);   
    conv.SavePDF(file_pdf);   
    textBox4.Text = "C# Example: Converting succeeded";   
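}
catch (Exception ex)
{
    // A try block needs a catch or finally to compile; report the failure
    textBox4.Text = "C# Example: Converting failed: " + ex.Message;
}

Info and C#/VB examples can be found at: http://www.duodimension.com/html_pdf_asp.net/component_html_pdf.aspx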

score 0 · Accepted Answer

With the Winnovative HTML to PDF converter you can convert an HTML string in a single line

byte[] outPdfBuffer = htmlToPdfConverter.ConvertHtml(htmlString, baseUrl);

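The base URL is used to resolve images referenced by relative URLs in the HTML string. Alternatively, you can use full URLs in the HTML, or embed images in the image tag using src="data:image/png".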
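
For the embedding option, the `src` value is a data URI: the media type followed by the Base64-encoded image bytes. The following minimal sketch builds one; `DataUri` is a hypothetical helper and is not part of the Winnovative API.

```csharp
using System;

public static class DataUri
{
    // Produces e.g. "data:image/png;base64,...." from raw image bytes
    public static string FromBytes(byte[] data, string mediaType = "image/png")
    {
        return "data:" + mediaType + ";base64," + Convert.ToBase64String(data);
    }

    // Wraps the data URI in an <img> tag that needs no base URL to resolve
    public static string ImgTag(byte[] data, string mediaType = "image/png")
    {
        return "<img src=\"" + FromBytes(data, mediaType) + "\" />";
    }
}
```

An HTML string built this way carries its images with it, so the base URL argument no longer matters for those images.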

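In answer to the comment by user 'fubaar' about the Winnovative converter, a correction is necessary. The converter does not use IE as its rendering engine. It does not actually depend on any installed software, and the rendering is compatible with the WebKit engine.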

score 0 · Accepted Answer

If you want the user to download a PDF of the rendered page in the browser, then the easiest solution to the problem is

window.print();

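on the client side; it will prompt the user to save a PDF of the current page. You can also customize the appearance of the PDF by linking a style sheet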

<link rel="stylesheet" type="text/css" href="print.css" media="print">

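print.css is applied to the HTML while printing.

Limitations

You cannot store the file on the server side. The user is prompted to print the page and then has to save it manually. The page has to be rendered in a tab.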

score 0 · Accepted Answer

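If you are already using the itextsharp DLL, there is no need to add third-party DLLs (plugins). I think you are using HTMLWorker; use XMLWorker instead and you can easily convert your HTML to PDF.

Some CSS will not work; only the supported CSS is honored.
A full explanation with an example is available in the linked reference.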

        MemoryStream memStream = new MemoryStream();
        TextReader xmlString = new StringReader(outXml);
        using (Document document = new Document())
        {
            PdfWriter writer = PdfWriter.GetInstance(document, memStream);
            //document.SetPageSize(iTextSharp.text.PageSize.A4);
            document.Open();
            byte[] byteArray = System.Text.Encoding.UTF8.GetBytes(outXml);
            MemoryStream ms = new MemoryStream(byteArray);
            XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, ms, System.Text.Encoding.UTF8);
            document.Close();
        }

        Response.ContentType = "application/pdf";
        Response.AddHeader("content-disposition", "attachment;filename=" + filename + ".pdf");
        Response.Cache.SetCacheability(HttpCacheability.NoCache);
        Response.BinaryWrite(memStream.ToArray());
        // Flush before End: End terminates the request, so code after it does not run
        Response.Flush();
        Response.End();

score 0 · Accepted Answer

PDFmyURL recently released a .NET component for web page / HTML to PDF conversion as well. It has a very user-friendly interface, for example:

PDFmyURL pdf = new PDFmyURL("yourlicensekey");
pdf.ConvertURL("http://www.example.com", Application.StartupPath + @"\example.pdf");

Documentation: PDFmyURL .NET component documentation

Disclaimer: I work for the company that owns PDFmyURL

score 0 · Accepted Answer

Here is another trick using the WebBrowser control; below is my full working code

In my case, the URL is assigned to a text box control

protected void Page_Load(object sender, EventArgs e)
{
    txtweburl.Text = "https://www.google.com/";
}

Below is the code that generates the screenshot using a thread

protected void btnscreenshot_click(object sender, EventArgs e)
{
    //  btnscreenshot.Visible = false;
    allpanels.Visible = true;
    Thread thread = new Thread(GenerateThumbnail);
    thread.SetApartmentState(ApartmentState.STA);
    thread.Start();
    thread.Join();
}

private void GenerateThumbnail()
{
    //  btnscreenshot.Visible = false;
    WebBrowser webrowse = new WebBrowser();
    webrowse.ScrollBarsEnabled = false;
    webrowse.AllowNavigation = true;
    string url = txtweburl.Text.Trim();
    webrowse.Navigate(url);
    webrowse.Width = 1400;
    webrowse.Height = 50000;

    webrowse.DocumentCompleted += webbrowse_DocumentCompleted;
    while (webrowse.ReadyState != WebBrowserReadyState.Complete)
    {
        System.Windows.Forms.Application.DoEvents();
    }
}

In the code below, I save the PDF file after the download

private void webbrowse_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // btnscreenshot.Visible = false;
    string folderPath = Server.MapPath("~/ImageFiles/");

    WebBrowser webrowse = sender as WebBrowser;
    //Bitmap bitmap = new Bitmap(webrowse.Width, webrowse.Height);

    Bitmap bitmap = new Bitmap(webrowse.Width, webrowse.Height, PixelFormat.Format16bppRgb565);

    webrowse.DrawToBitmap(bitmap, webrowse.Bounds);


    string Systemimagedownloadpath = System.Configuration.ConfigurationManager.AppSettings["Systemimagedownloadpath"].ToString();
    string fullOutputPath = Systemimagedownloadpath + Request.QueryString["VisitedId"].ToString() + ".png";
    MemoryStream stream = new MemoryStream();
    // The file name ends in .png, so save it as PNG rather than JPEG
    bitmap.Save(fullOutputPath, System.Drawing.Imaging.ImageFormat.Png);



    //generating pdf code 
     Document pdfDoc = new Document(new iTextSharp.text.Rectangle(1100f, 20000.25f));
     PdfWriter writer = PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
     pdfDoc.Open();
     iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(fullOutputPath);   
     img.ScaleAbsoluteHeight(20000);
     img.ScaleAbsoluteWidth(1024);     
     pdfDoc.Add(img);
     pdfDoc.Close();
     //Download the PDF file.
     Response.ContentType = "application/pdf";
     Response.AddHeader("content-disposition", "attachment;filename=ImageExport.pdf");
     Response.Cache.SetCacheability(HttpCacheability.NoCache);
     Response.Write(pdfDoc);
     Response.End();


}

You can also refer to my oldest post for more information: Navigation to the pages was cancelled getting message in asp.net web form

score 0 · Accepted Answer

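Another suggestion is to try the solution from https://grabz.it.

They provide a nice .NET API to capture screenshots and manipulate them in an easy and flexible way.

To use it in your app you first need to get a key + secret and download the .NET SDK (it's free).

Now a short example of using it.

To use the API you first need to create an instance of the GrabzItClient class, passing the application key and application secret from your GrabzIt account to the constructor, as shown in the example below: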

//Create the GrabzItClient class
//Replace "APPLICATION KEY", "APPLICATION SECRET" with the values from your account!
private GrabzItClient grabzIt = GrabzItClient.Create("Sign in to view your Application Key", "Sign in to view your Application Secret");

Now, to convert HTML to PDF, all you need to do is:

grabzIt.HTMLToPDF("<html><body><h1>Hello World!</h1></body></html>");

You can convert to an image as well:

grabzIt.HTMLToImage("<html><body><h1>Hello World!</h1></body></html>");

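Next, you need to save the result. You can use one of the two save methods available: Save if a publicly accessible callback handle is available, and SaveTo if not. Check the documentation for details.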
