java - How to replace centered text in a PDF with PDFBox

Question

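I am using the PDFTextReplacement example. It replaces text as expected when my text is left-aligned. However, if the text in my input PDF is centered, the replacement comes out left-aligned. So I have to recalculate the correct starting point.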

For that reason, I have two questions:

How do I determine the alignment?
How do I calculate the correct starting point?

Here is my code:

public PDDocument doIt(String inputFile, Map<String, String> text)
        throws IOException, COSVisitorException {
    // the document
    PDDocument doc = null;

    doc = PDDocument.load(inputFile);
    List pages = doc.getDocumentCatalog().getAllPages();
    for (int i = 0; i < pages.size(); i++) {
        PDPage page = (PDPage) pages.get(i);
        PDStream contents = page.getContents();

        PDFStreamParser parser = new PDFStreamParser(contents.getStream());
        parser.parse();
        List tokens = parser.getTokens();
        for (int j = 0; j < tokens.size(); j++) {
            Object next = tokens.get(j);

            if (next instanceof PDFOperator) {

                PDFOperator op = (PDFOperator) next;

                // Tj and TJ are the two operators that display
                // strings in a PDF

                String pstring = "";
                int prej = 0;
                if (op.getOperation().equals("Tj")) {
                    // Tj takes one operand, the string to display,
                    // so update that operand
                    COSString previous = (COSString) tokens.get(j - 1);
                    String string = previous.getString();
                    // System.out.println(j + " " + string);
                    if (j == prej) {
                        pstring += string;
                    } else {
                        prej = j;
                        pstring = string;
                    }

                    previous.reset();
                    previous.append(string.getBytes("ISO-8859-1"));
                } else if (op.getOperation().equals("TJ")) {
                    COSArray previous = (COSArray) tokens.get(j - 1);
                    for (int k = 0; k < previous.size(); k++) {
                        Object arrElement = previous.getObject(k);
                        if (arrElement instanceof COSString) {
                            COSString cosString = (COSString) arrElement;
                            String string = cosString.getString();

                            if (j == prej) {
                                pstring += string;
                            } else {
                                prej = j;
                                pstring = string;
                            }

                            cosString.reset();
                            // cosString.append(string
                            // .getBytes("ISO-8859-1"));
                        }

                    }

                    COSString cosString2 = (COSString) previous
                            .getObject(0);

                    for (int t = 1; t < previous.size(); t++)
                        previous.remove(t);

                    // cosString2.setNeedToBeUpdate(true);

                    if (text.containsKey(pstring.trim())) {

                        String textValue = text.get(pstring.trim());
                        cosString2.append(textValue.getBytes("ISO-8859-1"));

                        for (int k = 1; k < previous.size(); k++) {
                            previous.remove(k);

                        }
                    }

                }
            }
        }
        // now that the tokens are updated we will replace the
        // page content stream.
        PDStream updatedStream = new PDStream(doc);
        OutputStream out = updatedStream.createOutputStream();
        ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
        tokenWriter.writeTokens(tokens);
        page.setContents(updatedStream);
    }
    return doc;
}

score 0 · Accepted Answer

You can use this function:

 public void doIt( String inputFile, String outputFile, String strToFind, String message)
            throws IOException, COSVisitorException
        {
            // the document
            PDDocument doc = null;
            try
            {
                doc = PDDocument.load( inputFile );
                List pages = doc.getDocumentCatalog().getAllPages();
                for( int i=0; i<pages.size(); i++ )
                {
                    PDPage page = (PDPage)pages.get( i );
                    PDStream contents = page.getContents();
                    PDFStreamParser parser = new PDFStreamParser(contents.getStream() );
                    parser.parse();
                    List tokens = parser.getTokens();
                    for( int j=0; j<tokens.size(); j++ )
                    {
                        Object next = tokens.get( j );
                        if( next instanceof PDFOperator )
                        {
                            PDFOperator op = (PDFOperator)next;
                            //Tj and TJ are the two operators that display
                            //strings in a PDF
                            if( op.getOperation().equals( "Tj" ) )
                            {
                                //Tj takes one operand, the string to display,
                                //so update that operand
                                COSString previous = (COSString)tokens.get( j-1 );
                                String string = previous.getString();
                                //note: replaceFirst treats strToFind as a regular expression
                                string = string.replaceFirst( strToFind, message );
                                previous.reset();
                                //use an explicit charset instead of the platform default
                                previous.append( string.getBytes( "ISO-8859-1" ) );
                            }
                            else if( op.getOperation().equals( "TJ" ) )
                            {
                                COSArray previous = (COSArray)tokens.get( j-1 );
                                for( int k=0; k<previous.size(); k++ )
                                {
                                    Object arrElement = previous.getObject( k );
                                    if( arrElement instanceof COSString )
                                    {
                                        COSString cosString = (COSString)arrElement;
                                        String string = cosString.getString();
                                        string = string.replaceFirst( strToFind, message );
                                        cosString.reset();
                                        //use an explicit charset instead of the platform default
                                        cosString.append( string.getBytes( "ISO-8859-1" ) );
                                    }
                                }
                            }
                        }
                    }
                    //now that the tokens are updated we will replace the
                    //page content stream.
                    PDStream updatedStream = new PDStream(doc);
                    OutputStream out = updatedStream.createOutputStream();
                    ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
                    tokenWriter.writeTokens( tokens );
                    page.setContents( updatedStream );
                }
                doc.save( outputFile );
            }
            finally
            {
                if( doc != null )
                {
                    doc.close();
                }
            }
        }
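
Note that the function above only swaps the string; it does not move it. A PDF content stream stores no alignment attribute, only the position at which each string starts, so "centered" has to be inferred from geometry, and the start position has to be shifted by hand after the replacement. A minimal sketch of that arithmetic follows. The class and method names are hypothetical, and it assumes horizontal text, an unscaled text matrix, and that you already know the start x, the font size, and the glyph widths of the old and new strings (in PDFBox, `PDFont.getStringWidth(String)` reports a width in 1/1000 text-space units).

```java
/**
 * Hypothetical helper: pure arithmetic for keeping replaced text centered.
 * Nothing here calls PDFBox; widths and positions are supplied by the caller.
 */
final class CenteredTextMath {

    private CenteredTextMath() {}

    /**
     * Converts a width given in glyph units (1/1000 of text space)
     * into a width at the given font size.
     */
    static double textWidth(double glyphWidth, double fontSize) {
        return glyphWidth / 1000.0 * fontSize;
    }

    /**
     * Heuristic only: treats the text as centered when its midpoint lies
     * within 'tolerance' of the midpoint of the enclosing box
     * (for example the page or a column). This is an assumption, not a PDF feature.
     */
    static boolean looksCentered(double startX, double width,
                                 double boxLeft, double boxRight,
                                 double tolerance) {
        double textMid = startX + width / 2.0;
        double boxMid = (boxLeft + boxRight) / 2.0;
        return Math.abs(textMid - boxMid) <= tolerance;
    }

    /**
     * Returns the start x for the replacement text so that its midpoint
     * stays where the midpoint of the original text was.
     */
    static double recenteredStartX(double oldStartX, double oldWidth, double newWidth) {
        return oldStartX + (oldWidth - newWidth) / 2.0;
    }
}
```

To use it, compute the old and new widths with `textWidth`, check `looksCentered` against the page or column bounds, and, if it returns true, apply the difference between `recenteredStartX(...)` and the old start x to the operands of the positioning operator (`Td` or `Tm`) that precedes the `Tj`/`TJ`. Because `Td` offsets are relative to the start of the current line, any following `Td` on the same page may need the opposite correction.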
