
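I've run into a problem modifying a PDF document with PoDoFo. If you have time, please help me solve it!

I downloaded the PoDoFo source from http://podofo.sourceforge.net/download.html, built it on Windows 7 x86, and found the library very powerful.

However, I hit a problem after changing the "helloworld.cpp" example. Only a small code change is needed to make it modify an existing PDF document and save it under a different file name.

When I pass in a local PDF file (saved from a Word document through the Windows COM interface of Office Word 2007), the new file is written successfully, but the added text is flipped vertically: the glyphs are mirrored and the Y position of the text is mirrored too.

(Someone told me that in this case you have to deal with the fact that the existing content may have changed the graphics state, for example the current transformation matrix. He may be right, but I cannot find a function that changes the graphics state and the current transformation matrix.)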

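Here is a screenshot; I don't know why the output text is flipped vertically:

[screenshot]

Oddly, it works fine when the input is "output.pdf", the document created by the "helloworld" example itself.

If you have time, please help me solve this. Thank you very much!

The code I changed is shown below: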

#define MEMDOCUMENT 1 // macro switch  
void HelloWorld( const char* pszFilename ) 
{
    /*
     * PdfStreamedDocument is the class that can actually write a PDF file.
     * PdfStreamedDocument is much faster than PdfDocument, but it is only
     * suitable for creating/drawing PDF files and cannot modify existing
     * PDF documents.
     *
     * The document is written directly to pszFilename while being created.
     */
#if MEMDOCUMENT
     PdfMemDocument document( pszFilename ); //open local pdf document
#else
     PdfStreamedDocument document( pszFilename ); //create a new pdf document
#endif
    /*
     * PdfPainter is the class which is able to draw text and graphics
     * directly on a PdfPage object.
     */
    PdfPainter painter;

    /*
     * This pointer will hold the page object later. 
     * PdfSimpleWriter can write several PdfPage's to a PDF file.
     */
    PdfPage* pPage;

    /*
     * A PdfFont object is required to draw text on a PdfPage using a PdfPainter.
     * PoDoFo will find the font using fontconfig on your system and embed TrueType
     * fonts automatically in the PDF file.
     */     
    PdfFont* pFont;

    try {
        /*
         * The PdfDocument object can be used to create new PdfPage objects.
         * The PdfPage object is owned by the PdfDocument and will also be deleted automatically
         * by the PdfDocument object.
         *
         * You have to pass only one argument, i.e. the page size of the page to create.
         * There are predefined enums for some common page sizes.
         */
#if MEMDOCUMENT
        pPage = document.GetPage(0); //get the first page and modify it
#else
        pPage = document.CreatePage( PdfPage::CreateStandardPageSize( ePdfPageSize_A4 ) );
#endif
        /*
         * If the page cannot be created because of an error (e.g. ePdfError_OutOfMemory )
         * a NULL pointer is returned.
         * We check for a NULL pointer here and throw an exception using the RAISE_ERROR macro.
         * The raise error macro initializes a PdfError object with a given error code and
         * the location in the file in which the error occurred and throws it as an exception.
         */
        if( !pPage ) 
        {
            PODOFO_RAISE_ERROR( ePdfError_InvalidHandle );
        }

        /*
         * Set the page as drawing target for the PdfPainter.
         * Before the painter can draw, a page has to be set first.
         */
        painter.SetPage( pPage );

        /*
         * Create a PdfFont object using the font "Arial".
         * The font is found on the system using fontconfig and embedded into the
         * PDF file. If Arial is not available, a default font will be used.
         *
         * The created PdfFont will be deleted by the PdfDocument.
         */
        pFont = document.CreateFont( "Arial" );

        /*
         * If the PdfFont object cannot be allocated return an error.
         */
        if( !pFont )
        {
            PODOFO_RAISE_ERROR( ePdfError_InvalidHandle );
        }

        /*
         * Set the font size
         */
        pFont->SetFontSize( 18.0 );

        /*
         * Set the font as default font for drawing.
         * A font has to be set before you can draw text on
         * a PdfPainter.
         */
        painter.SetFont( pFont );

        /*
         * You could set a different color than black to draw
         * the text.
         *
         * SAFE_OP( painter.SetColor( 1.0, 0.0, 0.0 ) );
         */

        /*
         * Actually draw the line "Hello World!" on to the PdfPage at
         * the position 2cm,2cm from the top left corner. 
         * Please remember that PDF files have their origin at the 
         * bottom left corner. Therefore we subtract the y coordinate 
         * from the page height.
         * 
         * The position specifies the start of the baseline of the text.
         *
         * All coordinates in PoDoFo are in PDF units.
         * You can also use PdfPainterMM which takes coordinates in 1/1000th mm.
         *
         */

        painter.SetTransformationMatrix(1,0,0,-1,0,pPage->GetPageSize().GetHeight());

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 56.69, "Hello World!" );

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 96.69, "Hello World!" );

        /*
         * Tell PoDoFo that the page has been drawn completely.
         * This is required to optimize drawing operations inside PoDoFo
         * and has to be done whenever you are done with drawing a page.
         */
        painter.FinishPage();

        /*
         * Set some additional information on the PDF file.
         */
        document.GetInfo()->SetCreator ( PdfString("examplahelloworld - A PoDoFo test application") );
        document.GetInfo()->SetAuthor  ( PdfString("Dominik Seichter") );
        document.GetInfo()->SetTitle   ( PdfString("Hello World") );
        document.GetInfo()->SetSubject ( PdfString("Testing the PoDoFo PDF Library") );
        document.GetInfo()->SetKeywords( PdfString("Test;PDF;Hello World;") );

        /*
         * The last step is to close the document.
         */

#if MEMDOCUMENT
        document.Write("outputex.pdf"); //save page change
#else
        document.Close(); 
#endif


    } catch ( const PdfError & e ) {
        /*
         * All PoDoFo methods may throw exceptions, so
         * make sure that painter.FinishPage() is called,
         * or you will get an assert in its destructor
         */
        try {
            painter.FinishPage();
        } catch( ... ) {
            /*
             * Ignore errors this time
             */
        }

        throw e;
    }
}

2 Answers


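For anyone struggling to understand why this happens: it is caused by the following command at the top of each page (the page is A4 size in this example), which flips the content along the y axis: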

1 0 0 -1 0 841 cm

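From what I have seen, this is quite common and appears in PDFs produced by several programs, while many other PDFs do not contain it at all. I suspect it is entirely due to commit 1e07ce in cairo 1.15.4; see https://cairographics.org/releases/ChangeLog.cairo-1.15.4

The tricky part is that this command comes before any q (save graphics state) or Q (restore graphics state) command, so it cannot be undone with a simple Q. In other words, the only way to get back to a known transform is to parse the page content stream and look at the transforms that precede the first q/Q pair. Once that transform is known, its inverse can be applied before any new content is overlaid on the existing content.

To parse the page and obtain the transform that precedes any q: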

PoDoFo::PdfPage* page = ...;
PoDoFo::PdfContentsTokenizer tokenizer(page);
const char* token = NULL;
PoDoFo::PdfVariant param;
PoDoFo::EPdfContentsType type;
std::vector<PoDoFo::PdfVariant> params;
double tf_a = 1,    tf_c = 0,   tf_e = 0;
double tf_b = 0,    tf_d = 1,   tf_f = 0;
            //0          //0         //1

while(tokenizer.ReadNext(type, token, param)){

    //Command
    if(type == PoDoFo::ePdfContentsType_Keyword){

        //First Save at page, we assume that it will eventually be paired with enough Restores to go back to the current transform
        if(strcmp(token, "q") == 0)
            break;

        //Transform before first q, must apply the inverse when overlaying dots
        else if(strcmp(token, "cm") == 0){
            if(params.size() == 6){
                tf_a = params[0].GetReal();
                tf_b = params[1].GetReal();
                tf_c = params[2].GetReal();
                tf_d = params[3].GetReal();
                tf_e = params[4].GetReal();
                tf_f = params[5].GetReal();
                invertTransform(tf_a, tf_b, tf_c, tf_d, tf_e, tf_f);
            }
            else
                std::cout << "Warning! Found transform before first q at page with wrong number of arguments!" << std::endl;
        }
        else
            std::cout << "Warning! Unrelated command at page before first q: " << token << std::endl;

        params.clear();
    }

    //Parameter for command
    else if(type == PoDoFo::ePdfContentsType_Variant)
        params.push_back(param);
}

where invertTransform() is a small utility function:

void invertTransform(double& a, double& b, double& c, double& d, double& e, double& f){
    double m_11 = a,    m_12 = c,   m_13 = e;
    double m_21 = b,    m_22 = d,   m_23 = f;
         //m_31 = 0.0,  m_32 = 0.0, m_33 = 1.0;
    double det = m_11*(/*m_33**/m_22 /*- m_32*m_23*/) - m_21*(/*m_33**/m_12/* - m_32*m_13*/) /*+ m_31*(m_23*m_12 - m_22*m_13)*/;
    if(std::fabs(det) < 1e-10){ // std::fabs from <cmath>; an unqualified abs may resolve to the int overload
        a = 1;  c = 0;  e = 0;
        b = 0;  d = 1;  f = 0;
          //0     //0     //1
    }
    else{
        double det_1 = 1.0/det;
        a = det_1*( /*m_33**/m_22 /*- m_32*m_23*/); c = det_1*(-/*m_33**/m_12 /*+ m_32*m_13*/); e = det_1*( m_23*m_12 - m_22*m_13);
        b = det_1*(-/*m_33**/m_21 /*+ m_31*m_23*/); d = det_1*( /*m_33**/m_11 /*- m_31*m_13*/); f = det_1*(-m_23*m_11 + m_21*m_13);
          //det_1*( m_32*m_21 - m_31*m_22)              det_1*(-m_32*m_11 + m_31*m_12)              det_1*( m_22*m_11 - m_21*m_12)
    }
}

The inverse transform can then be applied (it is simply the identity if there was no cm before the first q), and things can be drawn on the page:

PoDoFo::PdfPainter painter;
painter.SetPage(page);
painter.Save();
painter.SetTransformationMatrix(tf_a, tf_b, tf_c, tf_d, tf_e, tf_f);

/* painter.Draw...() */

painter.Restore();
painter.FinishPage();

Of course, this whole solution assumes that there is at most one cm transform before the first q, and no other transform.
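If a page did contain several `cm` operators before the first q, one way to relax that assumption would be to concatenate them into a single matrix and invert that once. The following is only a sketch: `Matrix`, `multiply`, `concat` and `invert` are hypothetical helpers, not PoDoFo API, and the singular-matrix fallback to the identity simply mirrors invertTransform() above.

```cpp
#include <cmath>

// A PDF matrix [a b c d e f]; the implied third column is (0, 0, 1).
struct Matrix { double a, b, c, d, e, f; };

const Matrix kIdentity = { 1.0, 0.0, 0.0, 1.0, 0.0, 0.0 };

// Product l * r in the PDF row-vector convention:
// a point is transformed by l first, then by r.
Matrix multiply(const Matrix& l, const Matrix& r) {
    return Matrix{
        l.a * r.a + l.b * r.c,        l.a * r.b + l.b * r.d,
        l.c * r.a + l.d * r.c,        l.c * r.b + l.d * r.d,
        l.e * r.a + l.f * r.c + r.e,  l.e * r.b + l.f * r.d + r.f };
}

// Effect of one "cm" operator on the current transformation matrix:
// the new matrix is pre-multiplied, CTM' = m * CTM.
Matrix concat(const Matrix& ctm, const Matrix& m) {
    return multiply(m, ctm);
}

// Inverse of an affine matrix; returns the identity if it is singular.
Matrix invert(const Matrix& m) {
    const double det = m.a * m.d - m.b * m.c;
    if (std::fabs(det) < 1e-10) {
        return kIdentity;
    }
    return Matrix{  m.d / det, -m.b / det,
                   -m.c / det,  m.a / det,
                   (m.c * m.f - m.d * m.e) / det,
                   (m.b * m.e - m.a * m.f) / det };
}
```

In the tokenizer loop, each `cm` found before the first q would then update a running matrix with `ctm = concat(ctm, m)`, and the six values of `invert(ctm)` would be passed to the painter before drawing.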

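Another, simpler solution would be to put a q before everything in the stream and a Q after it, followed by the desired content, but I am not sure whether that is straightforward with PoDoFo.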
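At the level of raw bytes, that wrapping idea might look like the sketch below. It works on an already decoded content stream held in a `std::string`; how to read that stream from the page and write it back with PoDoFo is deliberately left out, and `wrapAndAppend` is a hypothetical helper name. It also assumes the q/Q operators inside the existing content are balanced.

```cpp
#include <string>

// Wrap the existing (decoded) page content in q ... Q so that any graphics
// state it sets up, such as a leading "cm", is restored before the overlay,
// then append the new content after the closing Q.
std::string wrapAndAppend(const std::string& existing,
                          const std::string& overlay) {
    std::string out;
    out.reserve(existing.size() + overlay.size() + 5);
    out += "q\n";       // save the page's default graphics state
    out += existing;    // original content, including its own cm
    out += "\nQ\n";     // restore the default state
    out += overlay;     // new content is drawn in the default state
    return out;
}
```

With the original content enclosed this way, the overlay no longer needs to know which transform the page applied, so no parsing or matrix inversion is required.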

answered 2018-05-09T08:22:22.273

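Thanks to mkl; with mkl's help the problem is solved.

The problem was caused by a reflection effect. PoDoFo exposes the transformation matrix, and you can change it before adding text or lines to the PDF document.

Add some code like this: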

        painter.SetTransformationMatrix(1,0,0,-1,0,pPage->GetPageSize().GetHeight()); // set Reflection effect
        painter.Save();

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 56.69, "Hello World!" );

        painter.DrawText( 56.69, pPage->GetPageSize().GetHeight() - 96.69, "Hello World!" );
answered 2016-06-02T11:36:45.640