nlp - How to run a Java API in Python

Question

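I have a Java API for stemming, but I cannot get it to run. I am working on an NLP project in Python 3.x in which I read all the text from documents and split it into words. I want to use this Java API to stem my words before processing them further. I have been looking at libraries for running a Java API directly from a Python program and have read a bit about Py4J, but I could not get it to work. Can anyone guide me on how to use this API from Python or, if that is not possible, how to use it in Eclipse?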

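Stemmer API description:

Description: The Word Stemmer API is a Java application that provides an interface for extracting the stem, prefix, and postfix of a word.

Setup: Copy the Data folder into your project directory and add the provided JAR file to your project.

Usage: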

    1. loadRules()
        - Purpose:      This function loads the stemming rules from the ./Data/Rules.txt into the program.
        - Syntax:       void loadRules();
        - Parameters:   None
        - Return type:  Void


    2. stemWord()
        - Purpose:      This function accepts as input a single word and returns a HashMap containing its stem, prefix, and postfix.
        - Syntax:       HashMap<String, String> stemWord(String word);
        - Parameters:   String word to be stemmed
        - Return type:  HashMap with the following keys: "stem", "prefix", "postfix"

    3. stemFile()
        - Purpose:      This function accepts as input the path to a UTF-8 text file and writes a new file to the same directory with the suffix "_stemmed".
        - Syntax:       void stemFile(String path);
        - Parameters:   String path to text file
        - Return type:  Void

Example:

    // path and word are String variables supplied by the caller
    UStemmer stmr = new UStemmer();

    stmr.loadRules();

    stmr.stemFile(path);

    HashMap<String, String> stemmed = stmr.stemWord(word);

    String stem = stemmed.get("stem");
    String prefix = stemmed.get("prefix");
    String postfix = stemmed.get("postfix");
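
For reference, the Java calls above could be mirrored from Python roughly as follows. This is only a sketch, not something I have working: it assumes the JPype library (`pip install JPype1`) rather than Py4J, assumes `UStemmer` sits in the default package of `UStemmer.jar` (the class name may need a package prefix), and assumes the script is run from the directory containing the Data folder. The names `load_stemmer` and `stem_words` are made up for illustration.

```python
from typing import Dict, Iterable, List

# Keys documented for the HashMap returned by stemWord()
KEYS = ("stem", "prefix", "postfix")


def load_stemmer(jar_path: str = "UStemmer.jar", class_name: str = "UStemmer"):
    """Start a JVM with the JAR on the classpath and return a ready stemmer.

    Assumptions: JPype and a Java runtime are installed, and class_name is
    the fully qualified name of the stemmer class inside the JAR.
    """
    import jpype  # imported lazily so stem_words() can be used without a JVM

    if not jpype.isJVMStarted():
        jpype.startJVM(classpath=[jar_path])
    stemmer = jpype.JClass(class_name)()
    stemmer.loadRules()  # reads ./Data/Rules.txt according to the API notes
    return stemmer


def stem_words(stemmer, words: Iterable[str]) -> List[Dict[str, str]]:
    """Call stemWord() on each word and copy the results into plain dicts."""
    results = []
    for word in words:
        java_map = stemmer.stemWord(word)
        entry = {}
        for key in KEYS:
            value = java_map.get(key)
            # Missing keys become empty strings; Java strings become str
            entry[key] = "" if value is None else str(value)
        results.append(entry)
    return results
```

With the JAR and Data folder in place, the intended use would be `stemmer = load_stemmer()` followed by `stem_words(stemmer, tokens)`, where `tokens` is the word list already extracted from the documents. Copying each result into a plain dict keeps the rest of the Python pipeline independent of Java objects.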

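PS: The API folder I have contains a file UStemmer.JAR and two folders. The first, Data, holds the Rules.txt file; the second, UStemmer, holds two files: UStemmer.class (which cannot be opened or read) and MANIFEST.MF.

PPS: I cannot use any of the available stemmers because they do not support the language I am working with (Urdu, Pakistan).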
