
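tess4j is an OCR library that bundles its own native libraries. I created a Maven project to test it. I have set the Maven installation path in Eclipse and defined the M2_HOME, MAVEN_HOME and JAVA_HOME environment variables.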

Here is my parent POM:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>fr.mssb.ongoing</groupId>
    <artifactId>ongoing-parent</artifactId>
    <packaging>pom</packaging>
    <version>1.0</version>
    <name>ongoing</name>

    <modules>
        <module>capcha-solver</module>
    </modules>

    <build>
        <pluginManagement>
            <plugins>
                <!-- All project will be interpreted (source) and compiled (target) in java 7 -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <configuration>
                        <source>1.7</source>
                        <target>1.7</target>
                    </configuration>
                </plugin>
                <!-- this will make eclipse:eclipse goal work and make the project Eclipse compatible -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-eclipse-plugin</artifactId>
                    <version>2.5.1</version>
                    <configuration>
                        <downloadSources>true</downloadSources>
                        <downloadJavadocs>true</downloadJavadocs>
                        <classpathContainers>
                            <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-1.7</classpathContainer>
                        </classpathContainers>
                        <additionalBuildcommands>
                            <buildcommand>net.sf.eclipsecs.core.CheckstyleBuilder</buildcommand>
                        </additionalBuildcommands>
                        <additionalProjectnatures>
                            <projectnature>net.sf.eclipsecs.core.CheckstyleNature</projectnature>
                        </additionalProjectnatures>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>

    <!-- All child pom will inherit those dependancies -->
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

Here is my child POM:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>fr.mssb.ongoing</groupId>
        <artifactId>ongoing-parent</artifactId>
        <version>1.0</version>
    </parent>

    <groupId>fr.mssb.ongoing</groupId>
    <artifactId>capcha-solver</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging> <!-- I think this is useless -->

    <name>A capcha solver based on terassec ocr</name>

    <build>
        <plugins>
            <!-- autorun unit tests during maven compilation -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <configuration>
                    <argLine>-Xmx1024m -XX:MaxPermSize=256m -XX:-UseSplitVerifier</argLine>
                    <skipTests>-DskipTests</skipTests>
                </configuration>
            </plugin>

            <!--  this should make the tesseract ocr native dll work without doing anything -->
            <plugin>
                <groupId>com.googlecode.mavennatives</groupId>
                <artifactId>maven-nativedependencies-plugin</artifactId>
                <version>0.0.7</version>
                <executions>
                    <execution>
                        <id>unpacknatives</id>
                        <goals>
                            <goal>copy</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <!-- 
        Log4j 2 is broken up in an API and an implementation (core), where the API 
        provides the interface that applications should code to. Strictly speaking 
        Log4j core is only needed at runtime and not at compile time.
        However, below we list Log4j core as a compile time dependency to improve 
        the startup time for custom plugins. 
        -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.1</version>
        </dependency>
        <!--
        Integration of tesseract OCR
        -->
        <dependency>
            <groupId>net.sourceforge.tess4j</groupId>
            <artifactId>tess4j</artifactId>
            <version>1.4.1</version>
        </dependency>
    </dependencies>

</project>

And of course the code (taken from the tess4j examples):

package test;

import java.io.File;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

/**
 * Example class.
 */
public class TesseractExample {

    public static void main(String[] args) {
        File imageFile = new File("C:\\DEV\\repo\\ongoing\\capcha-solver\\src\\test\\resources\\random.jpg");
        Tesseract instance = Tesseract.getInstance();  // JNA Interface Mapping
        // Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping

        try {
            String result = instance.doOCR(imageFile);
            System.out.println(result);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
        }
    }
}

When I run it, I get this exception:

Exception in thread "main" java.lang.NoSuchFieldError: RESOURCE_PREFIX
    at net.sourceforge.tess4j.util.LoadLibs.<clinit>(LoadLibs.java:60)
    at net.sourceforge.tess4j.TessAPI.<clinit>(TessAPI.java:40)
    at net.sourceforge.tess4j.Tesseract.init(Tesseract.java:303)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:239)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:188)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:172)
    at test.TesseractExample.main(TesseractExample.java:19)

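I don't know whether this is a tess4j problem or a JNA/JNI problem. As you can see, I have a plugin that "should" make the DLLs work (I have never used DLLs before).

Also, in the parent POM my plugins sit inside the pluginManagement tag. I suppose I should put them directly under the build tag instead, shouldn't I?

Any ideas?

Thanks.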

4 Answers

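There are two problems:

1/ Some DLLs and files from tess4j must be copied to the project root.

2/ tess4j has a transitive dependency on com.sun.jna:jna:jar:3.0.9, which conflicts with net.java.dev.jna:jna:jar:4.1.0 (also pulled in by tess4j). Once version 3.0.9 is excluded, everything works; the RESOURCE_PREFIX error comes from that conflict.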
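To check whether the JNA copy that wins on the classpath actually has the field named in the stack trace, a small reflection probe can help. This is only a sketch: FieldProbe and hasStaticField are hypothetical names, and it assumes the field is looked up on com.sun.jna.Platform, which is not stated in the stack trace itself.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical helper: checks whether a loadable class exposes a public static field. */
final class FieldProbe {

    /**
     * Returns true if the named class can be loaded and has a public static
     * field with the given name; returns false if the class or field is missing.
     */
    static boolean hasStaticField(String className, String fieldName) {
        try {
            // initialize=false: inspect the class without running its static initializer
            Class<?> type = Class.forName(className, false, FieldProbe.class.getClassLoader());
            Field field = type.getField(fieldName);
            return Modifier.isStatic(field.getModifiers());
        } catch (ClassNotFoundException | NoSuchFieldException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: RESOURCE_PREFIX is expected on com.sun.jna.Platform
        System.out.println("Platform.RESOURCE_PREFIX present: "
                + hasStaticField("com.sun.jna.Platform", "RESOURCE_PREFIX"));
    }
}
```

If this prints false while a JNA JAR is on the classpath, the older JAR is probably the one being loaded first.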

The pom.xml below is for the 32-bit version (you need a 32-bit JVM installed) and takes care of both problems. If you want to use it on 64-bit, change win32-x86 to win32-x86-64.

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>fr.mssb.ocr</groupId>
    <artifactId>tesseractOcr</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging>

    <name>tesseract ocr project</name>

    <build>
        <plugins>
            <!--
            This extracts the 32-bit DLLs and the tessdata folder
            from tess4j.jar to the project root.
            -->
            <plugin>
                <groupId>org.apache.portals.jetspeed-2</groupId>
                <artifactId>jetspeed-unpack-maven-plugin</artifactId>
                <version>2.2.2</version>
                <dependencies>
                    <dependency>
                      <groupId>net.sourceforge.tess4j</groupId>
                      <artifactId>tess4j</artifactId>
                      <version>1.4.1</version>
                    </dependency>
                </dependencies>
                <executions>
                    <execution>
                        <id>unpack-step</id>
                        <phase>compile</phase>
                        <goals>
                            <goal>unpack</goal>
                        </goals>
                        <configuration>
                            <unpack>
                                <artifact>net.sourceforge.tess4j:tess4j:jar</artifact>
                                <overwrite>true</overwrite>
                                <resources combine.children="append">
                                    <resource>
                                        <path>win32-x86</path>
                                        <destination>../</destination>
                                        <overwrite>true</overwrite>
                                        <flat>true</flat>
                                        <include>*</include>
                                    </resource>
                                    <resource>
                                        <path>tessdata</path>
                                        <destination>../tessdata</destination>
                                        <overwrite>true</overwrite>
                                        <flat>true</flat>
                                        <include>*</include>
                                    </resource>
                                    <resource>
                                        <path>tessdata/configs</path>
                                        <destination>../tessdata/configs</destination>
                                        <overwrite>true</overwrite>
                                        <flat>true</flat>
                                        <include>*</include>
                                    </resource>
                                </resources>
                            </unpack>
                            <verbose>true</verbose>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>net.sourceforge.tess4j</groupId>
            <artifactId>tess4j</artifactId>
            <version>1.4.1</version>
            <exclusions>
                <exclusion>
                    <groupId>com.sun.jna</groupId>
                    <artifactId>jna</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
    </dependencies>

</project>
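If you are unsure which of the two folder names applies to your JVM, the os.arch system property reports the architecture of the running JVM rather than that of the operating system. A minimal sketch, assuming a Windows x86 machine; NativeFolder and forArch are hypothetical names, and the list of os.arch values is not exhaustive:

```java
import java.util.Locale;

/** Hypothetical helper: maps the JVM architecture to the tess4j resource folder name. */
final class NativeFolder {

    /** Returns "win32-x86-64" for a 64-bit x86 JVM and "win32-x86" for a 32-bit one. */
    static String forArch(String osArch) {
        switch (osArch.toLowerCase(Locale.ROOT)) {
            case "amd64":
            case "x86_64":
                return "win32-x86-64";
            case "x86":
            case "i386":
                return "win32-x86";
            default:
                // Anything else is outside the two folders discussed in this answer
                throw new IllegalArgumentException("Unsupported architecture: " + osArch);
        }
    }

    public static void main(String[] args) {
        String osArch = System.getProperty("os.arch");
        try {
            System.out.println(osArch + " -> " + forArch(osArch));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Run it with the same JVM that Eclipse uses to launch the project, since a 32-bit JVM on 64-bit Windows still needs the win32-x86 folder.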
answered 2015-02-04T18:02:58.410

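The child POM builds without any problem and the libraries can be copied manually; that has nothing to do with tess4j. In any case, JNA 3.0.9 can be removed if it is no longer needed: https://github.com/nguyenq/tess4j/issues/8

That said, all you need to run tess4j is the Maven dependency: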

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>1.4.1</version>
</dependency>

and correct use of the tess4j API, for example:

File imageFile = new File("C:\\random.png");
Tesseract instance = Tesseract.getInstance();

//In case you don't have your own tessdata, let it also be extracted for you
File tessDataFolder = LoadLibs.extractTessResources("tessdata");

//Set the tessdata path
instance.setDatapath(tessDataFolder.getAbsolutePath());

try {
    String result = instance.doOCR(imageFile);
    System.out.println(result);
} catch (TesseractException e) {
    System.err.println(e.getMessage());
}

That's it!

answered 2015-02-06T11:34:45.783

The problem is caused by a conflict between net.java.dev.jna:jna and com.sun.jna:jna. Both JARs contain a class named com.sun.jna.Platform, and both are declared as tess4j dependencies. To fix it, exclude the second dependency in your POM:

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>1.4.1</version>
    <exclusions>
        <exclusion>
            <groupId>com.sun.jna</groupId>
            <artifactId>jna</artifactId>
        </exclusion>
    </exclusions>
</dependency>    
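To confirm that the exclusion worked, you can ask the class loader for every copy of the class it can see. The following is a rough sketch rather than anything tess4j provides: DuplicateClassFinder, resourceName and locations are hypothetical names, and it only inspects the class loader that loaded the helper itself.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: lists every classpath location that provides a given class. */
final class DuplicateClassFinder {

    /** Converts a class name such as a.b.C into the resource path a/b/C.class. */
    static String resourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns one URL per classpath entry that contains the class file. */
    static List<URL> locations(String className) {
        ClassLoader loader = DuplicateClassFinder.class.getClassLoader();
        try {
            return Collections.list(loader.getResources(resourceName(className)));
        } catch (IOException e) {
            throw new IllegalStateException("Could not scan the classpath", e);
        }
    }

    public static void main(String[] args) {
        // More than one line of output means the class is provided twice
        for (URL url : locations("com.sun.jna.Platform")) {
            System.out.println(url);
        }
    }
}
```

Before the exclusion this should list two JARs for com.sun.jna.Platform; afterwards only the net.java.dev.jna one should remain.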
answered 2015-08-04T06:10:04.323

This is caused by a JNA version mismatch: you have more than one version on your classpath. Use only a single version of JNA.
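One quick way to spot the extra version is to scan the runtime classpath for JNA JAR files. This is a sketch under assumptions: JnaJarScanner and jnaJars are hypothetical names, and the file-name pattern (jna.jar or jna-&lt;version&gt;.jar) is a guess at how the JARs are named in a Maven repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: finds classpath entries that look like JNA core JARs. */
final class JnaJarScanner {

    // Matches file names such as jna.jar, jna-3.0.9.jar or jna-4.1.0.jar,
    // but not jna-platform-4.1.0.jar or tess4j-1.4.1.jar
    private static final Pattern JNA_JAR = Pattern.compile("jna(-[0-9][\\w.]*)?\\.jar");

    /** Returns the entries of the given classpath string whose file name matches JNA_JAR. */
    static List<String> jnaJars(String classPath, String separator) {
        List<String> hits = new ArrayList<>();
        for (String entry : classPath.split(Pattern.quote(separator))) {
            // Strip the directory part, accepting both / and \ as separators
            int cut = Math.max(entry.lastIndexOf('/'), entry.lastIndexOf('\\'));
            String fileName = entry.substring(cut + 1);
            if (JNA_JAR.matcher(fileName).matches()) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        String separator = System.getProperty("path.separator");
        // More than one entry suggests two JNA versions are competing
        for (String jar : jnaJars(classPath, separator)) {
            System.out.println(jar);
        }
    }
}
```

If two entries are printed, remove or exclude one of them so that only a single JNA version is left.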

answered 2015-07-24T18:35:46.483