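I am unable to detect blank pages in a PDF file. I have searched the internet but could not find a good solution.
Using iTextSharp, I tried checking the page size and the XObjects, but neither gives a reliable result.
This is the check I tried: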
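// Heuristic: treat the page as blank if it has no XObjects, no text, or a tiny content stream
if (xobjects == null || extractedText == null || contentBytes.Length < 20)
    Console.WriteLine("blank");
else
    Console.WriteLine("not blank");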
But it returns the wrong answer most of the time.
The code for each check is below. I am using the iTextSharp library.
For the XObjects:
PdfDictionary xobjects = resourceDic.GetAsDict(PdfName.XOBJECT);
// Here resourceDic is of type PdfDictionary.
// I assumed that a null XObject dictionary means the page is blank, but some blank pages have a non-null one.
For the content stream:
RandomAccessFileOrArray f = reader.SafeFile;
// Here reader = new PdfReader(filename);
byte[] contentBytes = reader.GetPageContent(pageNum, f);
// I measured the length of contentBytes, but some blank pages have a content stream longer than 20 bytes.
For the text content:
String extractedText = PdfTextExtractor.GetTextFromPage(reader, pageNum, new LocationTextExtractionStrategy());
// Some blank pages yield extracted text longer than 20 characters.