unicode - How to set a font in iTextSharp for HTML to PDF

Question

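I have to create PDFs at runtime from HTML using iTextSharp in a web application developed with VB.Net and MSSQL 2005.

The HTML is stored in the database. It contains Gujarati, Hindi, and English content.

Can anyone tell me how to set the font for the HTML, and which fonts I should use to display English, Gujarati, and Hindi? I tried Arial Unicode MS, but it does not display Hindi accurately.

Thanks in advance.

Here is the method I use to convert the HTML string to a PDF file that the user can save to their local machine.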

Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.html.simpleparser

Private Sub ExporttoPDF(ByVal FullHtml As String, ByVal fileName As String)
    Try
        ' Clear the response and set content type and disposition so that the user gets a save-file dialog.
        Response.Clear()
        Response.ContentType = "application/pdf"
        Response.AddHeader("content-disposition", String.Format("attachment;filename={0}.pdf", fileName))
        Response.Cache.SetCacheability(HttpCacheability.NoCache)

        Dim sr As StringReader = New StringReader(FullHtml)
        Dim pdfDoc As iTextSharp.text.Document = New iTextSharp.text.Document(PageSize.A4.Rotate(), 10, 10, 10, 10)
        Dim htmlparser As HTMLWorker = New HTMLWorker(pdfDoc)
        PdfWriter.GetInstance(pdfDoc, Response.OutputStream)
        pdfDoc.Open()

        ' "ARIALUNI.TTF" was copied from the Windows fonts folder into the application's "fonts" folder.
        Dim fontpath As String = System.Web.HttpContext.Current.Request.PhysicalApplicationPath + "\fonts\ARIALUNI.TTF"
        Dim bf As BaseFont = BaseFont.CreateFont(fontpath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED)

        FontFactory.RegisterDirectory(System.Web.HttpContext.Current.Request.PhysicalApplicationPath, True)
        FontFactory.Register(fontpath, "Arial Unicode MS")
        FontFactory.RegisterFamily("Arial Unicode MS", "Arial Unicode MS", fontpath)

        ' Parse the HTML from the StringReader "sr". The PdfWriter streams the PDF
        ' to Response.OutputStream, so the Document object itself is not written.
        htmlparser.Parse(sr)
        pdfDoc.Close()
        Response.End()

    Catch ex As Exception
        ' Rethrow without resetting the stack trace.
        Throw
    End Try
End Sub

This is how I call the code:

' The method appends ".pdf" itself, so pass the file name without an extension.
Dim htmlstring As String = "<html><body encoding=""" + BaseFont.IDENTITY_H + """ style=""font-family:Arial Unicode MS;font-size:12;""> <h2> set Font in itextSharp for HTML to PDF  </h2> <span> I (aneel/અનિલ/अनिल) am facing problem to create a pdf from html that contains English, ગુજરાતી, हिंदी and other unicode characters.  </span> </body></html>"
ExporttoPDF(htmlstring, "sample")

In the output, the Gujarati text is displayed as અનલિ where અનિલ is expected.

For Hindi, it shows अनलि where it should be अनिल.
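
To confirm that the string itself is stored correctly, it can help to dump its code points. The following is a minimal sketch, not part of the original post: `CodePointDump` and `DumpCodePoints` are names made up for this illustration, and the source file is assumed to be saved as UTF-8 so that the literals survive compilation.

```vbnet
Imports System
Imports System.Collections.Generic

Module CodePointDump
    ' Hypothetical helper: returns the UTF-16 code units of a string as "U+XXXX" tokens.
    Function DumpCodePoints(ByVal text As String) As String
        Dim parts As New List(Of String)
        For Each ch As Char In text
            parts.Add(String.Format("U+{0:X4}", AscW(ch)))
        Next
        Return String.Join(" ", parts.ToArray())
    End Function

    Sub Main()
        ' Expected Gujarati word; prints: U+0A85 U+0AA8 U+0ABF U+0AB2
        Console.WriteLine(DumpCodePoints("અનિલ"))
        ' Expected Hindi word; prints: U+0905 U+0928 U+093F U+0932
        Console.WriteLine(DumpCodePoints("अनिल"))
    End Sub
End Module
```

In both words the vowel sign (U+0ABF, U+093F) is stored after the consonant it belongs to, although it is drawn to the left of that consonant. A renderer therefore has to reorder glyphs, and output such as અનલિ is consistent with the glyphs being placed in stored order.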

score 1 · Accepted Answer

Try

pdfDoc.Add(New Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "yourcssfile.css")) ' or the path to your css file

Then

Dim styles As iTextSharp.text.html.simpleparser.StyleSheet
styles = New iTextSharp.text.html.simpleparser.StyleSheet
styles.LoadTagStyle("ol", "leading", "16")

You can add all of your styles to styles. Then replace the htmlparser.Parse call with this

HTMLWorker.ParseToList(New StreamReader("htmlpath.html", Encoding.Default), styles)
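
Applied to the font question, the same StyleSheet approach might look like the sketch below. It assumes iTextSharp 5.x; `HtmlFontSketch`, `WritePdf`, and its parameters are hypothetical names, and the "body" tag with the "face" and "encoding" keys is one commonly used combination rather than something stated in this answer.

```vbnet
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.html.simpleparser

Module HtmlFontSketch
    ' Hypothetical helper: renders an HTML string to a PDF stream using a registered TTF font.
    Sub WritePdf(ByVal html As String, ByVal fontPath As String, ByVal output As Stream)
        ' Register the font file under an alias that the style sheet can refer to.
        FontFactory.Register(fontPath, "Arial Unicode MS")

        ' Ask HTMLWorker to use that font, with Identity-H encoding, for the body.
        Dim styles As New StyleSheet()
        styles.LoadTagStyle("body", "face", "Arial Unicode MS")
        styles.LoadTagStyle("body", "encoding", BaseFont.IDENTITY_H)

        Dim pdfDoc As New Document(PageSize.A4.Rotate(), 10, 10, 10, 10)
        PdfWriter.GetInstance(pdfDoc, output)
        pdfDoc.Open()

        ' ParseToList only returns the parsed elements; each one still has to be added.
        For Each element As IElement In HTMLWorker.ParseToList(New StringReader(html), styles)
            pdfDoc.Add(element)
        Next
        pdfDoc.Close()
    End Sub
End Module
```

This only selects the font and encoding. As the other answer points out, it would not by itself correct how Indic vowel signs and ligatures are placed.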

score 0 · Accepted Answer

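Unfortunately, you are out of luck. See here. Basically, the iText developers have repeatedly called for code contributions to support the ligatures required to display Indic languages correctly in PDFs, but nobody has volunteered to help.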
