linux - Inserting different text on different pages of a multi-page PDF from the command line

Question

I have a text file containing several names, for example

John Doe
Jane Doe
Mike Miller
...

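and a PDF file with as many pages as there are names in the text file.

How can I insert/paste the first name on the first page, the second name on the second page, and so on? I have to do this from the command line on a Linux server.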

score 2 · Accepted Answer

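This program works for me, but I haven't tested it thoroughly. I used a PDF file with 2 blank pages and a text file with 2 lines of text. You will need to modify the InsertText procedure to move to the correct position. You will also need to modify the font selection and size (search for findfont in the program).

Here is an example usage: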

gs -sDEVICE=pdfwrite    \
   -sOutputFile=out.pdf \
   -sPDF_File=test.pdf  \
   -sText_File=test.txt \
    textinsert.ps

Note that this is entirely Ghostscript-specific and will definitely not work with any other PostScript/PDF consumer.

%!PS
%% Copyright (C) Ken Sharp and Artifex Software Inc All rights reserved.
%% Permission granted to use, copy, modify and redistribute provided this
%% copyright remains intact. No warranty express or implied.
%%
%% Process a PDF file and a text file where one line of text from the text
%% file is drawn onto a page of the PDF file.
%% Redefine showpage so that we can draw the PDF page
%% without emitting it
%%
/TextInsertDict 20 dict dup 3 1 roll def begin

/InsertText {
  %% Read a line of text from the text file
  TFile 256 string readline pop

  %% Move to the desired location and draw the text
  100 100 moveto show
} bind def

/Textpdfshowpage{
   save /PDFSave exch store
   /PDFdictstackcount countdictstack store
   /PDFexecstackcount count 2 sub store
   (before exec) VMDEBUG

   % set up color space substitution (this must be inside the page save)
   pdfshowpage_setcspacesub

  .writepdfmarks {

        % Copy the boxes.
    { /CropBox /BleedBox /TrimBox /ArtBox } {
      2 copy pget {
        % .pdfshowpage_Install transforms the user space; do the same here with the boxes
        oforce_elems
        2 { Page pdf_cached_PDF2PS_matrix transform 4 2 roll } repeat
        normrect_elems 4 index 5 1 roll fix_empty_rect_elems 4 array astore
        mark 3 1 roll /PAGE pdfmark
      } {
        pop
      } ifelse
    } forall

        % Copy annotations and links.
    dup /Annots knownoget {
      0 1 2 index length 1 sub
       { 1 index exch oget
         dup type /dicttype eq {
           dup /Subtype oget annottypes exch .knownget { exec } { pop } ifelse
         } {
           pop
         } ifelse
       }
      for pop
    } if

  } if      % end .writepdfmarks

        % Display the actual page contents.
   8 dict begin
   /BXlevel 0 def
   /BMClevel 0 def
   /OFFlevels 0 dict def
   /BGDefault currentblackgeneration def
   /UCRDefault currentundercolorremoval def
        %****** DOESN'T HANDLE COLOR TRANSFER YET ******
   /TRDefault currenttransfer def
  matrix currentmatrix 2 dict
  2 index /CropBox pget {
    oforce_elems normrect_elems boxrect
    4 array astore 1 index /ClipRect 3 -1 roll put
  } if
  dictbeginpage setmatrix
  /DefaultQstate qstate store

  count 1 sub /pdfemptycount exch store
        % If the page uses any transparency features, show it within
        % a transparency group.
  dup pageusestransparency dup /PDFusingtransparency exch def {
    % Show the page within a PDF 1.4 device filter.
    0 .pushpdf14devicefilter {
      /DefaultQstate qstate store       % device has changed -- reset DefaultQstate
      % If the page has a Group, enclose contents in transparency group.
      % (Adobe Tech Note 5407, sec 9.2)
      dup /Group knownoget {
        1 index /CropBox pget {
          /CropBox exch
        } {
          1 index get_media_box pop /MediaBox exch
        } ifelse
        oforce_elems normrect_elems fix_empty_rect_elems 4 array astore .beginformgroup 
        showpagecontents
        .endtransparencygroup
      } {
        showpagecontents
      } ifelse
    } stopped {
      % todo: discard
      .poppdf14devicefilter
      /DefaultQstate qstate store   % device has changed -- reset DefaultQstate
      stop
    } if .poppdf14devicefilter
    /DefaultQstate qstate store % device has changed -- reset DefaultQstate
  } {
    showpagecontents
  } ifelse
  .free_page_resources

  InsertText

  % todo: mixing drawing ops outside the device filter could cause
  % problems, for example with the pnga device.
  endpage
  end           % scratch dict
  % Some PDF files don't have matching q/Q (gsave/grestore) so we need
  % to clean up any left over dicts from the dictstack

  PDFdictstackcount //false
  { countdictstack 2 index le { exit } if
    currentdict /n known not or
    end
  } loop {
    StreamRunAborted not {
      (   **** Warning: File has unbalanced q/Q operators \(too many q's\)\n)
      pdfformaterror
    } if
  } if
  pop
  count PDFexecstackcount sub { pop } repeat
  (after exec) VMDEBUG
  Repaired      % pass Repaired state around the restore
  PDFSave restore
  /Repaired exch def
} bind def

%% Check that both of our arguments are defined. Leaves true on the
%% stack if so, false otherwise
%%
{PDF_File} stopped
{
  (No PDF_File defined\n) print
  false
}
{
  {Text_File} stopped
  {
    pop (No Text_File defined\n) print
    false
  }
  {
    pop pop true
  }ifelse
}ifelse

{

  %% First find the number of lines of text in the text file
  %%
  /TextLineCount 0 def
  Text_File (r) file dup
  {
    dup 256 string readline
    {
      pop /TextLineCount TextLineCount 1 add def
    }
    {
      pop exit
    } 
    ifelse
  } loop
  closefile

  %% Next find the number of pages in the PDF file
  %%
  PDF_File (r) file
  runpdfbegin pdfpagecount TextLineCount eq {
    runpdfend true
  }
  {
    runpdfend false
  } ifelse
  exch
  closefile

  %% If the number of pages is the same as the number of lines
  %% then we process the files, otherwise warn and exit
  %%
  {
    %% Select font and size
    %%
    /Times-Roman findfont 20 scalefont setfont

    %% Open the text file again
    %%
    /TFile Text_File (r) file def

    %% Open the PDF file and begin PDF processing
    PDF_File (r) file
    runpdfbegin

    %% For each page....
    %%
    1 1 TextLineCount {
      %% draw the content of this page
      %%
      pdfgetpage dup /Page exch store
      pdfshowpage_init
      pdfshowpage_setpage
      Textpdfshowpage
    } for

    %% Terminate PDF processing
    %%
    runpdfend

    %% Close the text file
    %%
    TFile closefile
  } 
  {
    (Warning, Number of pages not equal to the number of text lines, aborting!\n) print flush
  }ifelse
}
{
  (Incorrect usage\n) print
  (Usage: \n) print
  (gs -sDEVICE=pdfwrite -o <outputfile> -sPDF_File=<PDF file> -sText_File=<Text file> textinsert.ps\n) print
  (NB all switches are case-sensitive\n) print
} ifelse
end
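
Since the program aborts when the page count and the line count differ, it can help to confirm from the shell that the two inputs line up before running it. The sketch below is only an illustration and not part of the original answer. The function names count_text_lines, count_pdf_pages and counts_match are hypothetical. -dNOSAFER is assumed to be needed on newer Ghostscript releases, where file access is restricted by default. The page-count one-liner relies on the same Ghostscript-internal runpdfbegin/pdfpagecount procedures the program uses, so it may not work on every version. wc -l counts newline characters, which appears to match how the readline loop counts lines: a final name without a trailing newline would not be counted in either case.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the original answer.

# Count newline-terminated lines (a last line without a trailing
# newline is not counted by wc -l).
count_text_lines() {
  wc -l < "$1" | tr -d '[:space:]'
}

# Ask Ghostscript for the page count. Assumes a gs build that still provides
# runpdfbegin/pdfpagecount, and a file name without parentheses or backslashes.
count_pdf_pages() {
  gs -q -dNODISPLAY -dNOSAFER \
     -c "($1) (r) file runpdfbegin pdfpagecount = quit"
}

# Succeeds (exit status 0) only when both counts are equal.
# Arguments: $1 = PDF file, $2 = text file.
counts_match() {
  [ "$(count_text_lines "$2")" -eq "$(count_pdf_pages "$1")" ]
}
```

For example, running counts_match test.pdf test.txt before the gs call would let a wrapper script stop early on a mismatch.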

score 1 · Accepted Answer

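You can do this by programming in PostScript with Ghostscript. You first need to find the number of pages in the PDF file or the number of lines in the text file, and you probably want to check that they are the same.

Using the Ghostscript pdfwrite device, execute the page description from the PDF file, then read the text from the text file. Position the current point correctly over the content, select a suitable font and size, and show the text. Then execute showpage to render the page.

You can produce one large PDF file containing all the pages, or one PDF file per page.

Note that this is not a task for someone unfamiliar with PostScript programming.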
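
As a hedged illustration of the two output choices mentioned in this answer: a Ghostscript output file name can contain a %d-style page-number template, in which case pdfwrite is expected to write one file per page instead of a single combined file. The helper textinsert_cmd below is hypothetical. It only prints a command line, reusing the textinsert.ps program and switches shown earlier on this page (in the -o form from the program's own usage message), so the arguments can be checked before anything is run. File names containing spaces would need extra quoting.

```shell
#!/bin/sh
# Sketch only: textinsert_cmd is a hypothetical helper that prints the
# Ghostscript command line instead of running it.
# Arguments: $1 = PDF file, $2 = text file, $3 = output file or template.
textinsert_cmd() {
  pdf=$1 names=$2 out=$3
  printf '%s\n' "gs -sDEVICE=pdfwrite -o $out -sPDF_File=$pdf -sText_File=$names textinsert.ps"
}

# One combined PDF containing every page:
textinsert_cmd test.pdf test.txt out.pdf

# One PDF per page; %03d is replaced by the page number
# (out-001.pdf, out-002.pdf, and so on):
textinsert_cmd test.pdf test.txt 'out-%03d.pdf'
```

Once the printed line looks right, it can be pasted into the shell or run with eval.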
