java - Analyzing a sentence with NLP and extracting person names, organizations, and locations

Question

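I need to solve the following problems with NLP. Can you give me some pointers on how to do this with the OpenNLP API?

a. How to determine whether a sentence implies an action in the past, present, or future.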

(e.g.) I was very sad last week - past
       I feel like hitting my neighbor - present
       I am planning to go to New York next week - future

b. How to find the words that correspond to a person, a company, or a country.

(e.g.) John is planning to specialize in Electrical Engineering in UC Berkley and pursue a career with IBM.

Person = John

Company = IBM

Location = Berkley

Thanks

score 8 · Accepted Answer

I can offer a solution

for part b.

Here is the code:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import opennlp.tools.namefind.NameFinderME;
    import opennlp.tools.namefind.TokenNameFinderModel;
    import opennlp.tools.tokenize.Tokenizer;
    import opennlp.tools.tokenize.TokenizerME;
    import opennlp.tools.tokenize.TokenizerModel;
    import opennlp.tools.util.Span;

    public class tikaOpenIntro {

        // Tokens of the most recently tokenized text; filled in by tokenization().
        public String Tokens[];

        public static void main(String[] args) {
            tikaOpenIntro toi = new tikaOpenIntro();

            String cnt = "John is planning to specialize in Electrical Engineering in UC Berkley and pursue a career with IBM.";

            // Tokenize first: the name finders work on token arrays, not raw text.
            toi.tokenization(cnt);

            String names = toi.namefind(toi.Tokens);
            String org = toi.orgfind(toi.Tokens);

            System.out.println("person name is : " + names);
            System.out.println("organization name is: " + org);
        }

        // Returns the person names found in the tokens, one per line.
        public String namefind(String cnt[]) {
            return find("/home/rahul/opennlp/model/en-ner-person.bin", cnt);
        }

        // Returns the organization names found in the tokens, one per line.
        public String orgfind(String cnt[]) {
            return find("/home/rahul/opennlp/model/en-ner-organization.bin", cnt);
        }

        // Loads the NER model at modelPath and runs it over the tokens.
        // Returns an empty string if the model cannot be read.
        private String find(String modelPath, String cnt[]) {
            String sd = "";
            // try-with-resources closes the model stream once the model is loaded.
            try (InputStream is = new FileInputStream(modelPath)) {
                TokenNameFinderModel tnf = new TokenNameFinderModel(is);
                NameFinderME nf = new NameFinderME(tnf);

                // Each Span marks the token range of one detected name.
                Span sp[] = nf.find(cnt);
                String a[] = Span.spansToStrings(sp, cnt);

                StringBuilder fd = new StringBuilder();
                for (String name : a) {
                    fd.append(name).append("\n");
                }
                sd = fd.toString();
            } catch (IOException e) {
                // Also covers FileNotFoundException and InvalidFormatException.
                e.printStackTrace();
            }
            return sd;
        }

        // Splits the text into tokens with the English tokenizer model
        // and stores them in the Tokens field.
        public void tokenization(String tokens) {
            try (InputStream is = new FileInputStream("/home/rahul/opennlp/model/en-token.bin")) {
                TokenizerModel tm = new TokenizerModel(is);
                Tokenizer tz = new TokenizerME(tm);
                Tokens = tz.tokenize(tokens);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

    }

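If you also want the location, load the location model in the same way. It is available on the OpenNLP SourceForge page along with the other models; you can download them and use them.

I'm not sure how accurate the name, location, and organization extraction is, but it recognizes nearly all names, locations, and organizations.

If you find OpenNLP is not sufficient, use the Stanford Parser for named entity recognition.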

score 0 · Accepted Answer

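Finding the literal tense of a sentence is non-trivial, but doable in some cases. The OpenNLP parser will create a sentence structure from which you can try to extract the head verb. Some morphological analysis will then tell you whether the verb is present or past (in English), and some further processing of the modal "will" will in some cases give you the future tense. But it isn't always that simple. For example, in "Going to Paris depleted my bank account" you have an embedded event in the past (going to Paris), but figuring that out is tricky. And your future example ("I am planning to ...") requires some real understanding of the meaning of the word "planning", which is very complicated.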
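To make the simple cases concrete, here is a minimal sketch, not a full solution. It swaps the parser-based head-verb extraction for a cruder rule: take the first finite verb or modal according to its Penn Treebank POS tag. The class name `TenseGuess` and the method `guessTense` are made up for illustration, and the tags in `main` are assigned by hand; in practice you would presumably get them from a POS tagger such as OpenNLP's.

```java
import java.util.Locale;

final class TenseGuess {

    /**
     * Guesses the tense from tokens and their Penn Treebank POS tags.
     * Returns "future", "past", "present", or "unknown".
     * Heuristic only: the first finite verb or future modal decides.
     */
    static String guessTense(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        for (int i = 0; i < tokens.length; i++) {
            String word = tokens[i].toLowerCase(Locale.ROOT);
            String tag = tags[i];
            // MD = modal; only "will"/"shall"/"'ll" are treated as future markers.
            if (tag.equals("MD")
                    && (word.equals("will") || word.equals("shall") || word.equals("'ll"))) {
                return "future";
            }
            // VBD = past-tense verb.
            if (tag.equals("VBD")) {
                return "past";
            }
            // VBP / VBZ = present-tense verb (non-3rd / 3rd person singular).
            if (tag.equals("VBP") || tag.equals("VBZ")) {
                return "present";
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // The three sentences from the question, with hand-assigned tags (an assumption).
        System.out.println(guessTense(
                new String[] {"I", "was", "very", "sad", "last", "week"},
                new String[] {"PRP", "VBD", "RB", "JJ", "JJ", "NN"}));   // past
        System.out.println(guessTense(
                new String[] {"I", "feel", "like", "hitting", "my", "neighbor"},
                new String[] {"PRP", "VBP", "IN", "VBG", "PRP$", "NN"})); // present
        System.out.println(guessTense(
                new String[] {"I", "am", "planning", "to", "go", "to", "New", "York", "next", "week"},
                new String[] {"PRP", "VBP", "VBG", "TO", "VB", "TO", "NNP", "NNP", "JJ", "NN"})); // present
    }
}
```

Note that the third sentence comes out as "present", not "future": the rule only sees "am", which is exactly the limitation described above. Getting "future" there would need some knowledge of what "planning" means.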
