python - Information extraction and relation extraction using Stanford NLP for Python

Question

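How can I use Stanford CoreNLP from Python to extract the names of companies from a set of documents?

Here is a sample of my data:

'3Trucks Inc ('3Trucks' or the Company) is a technology-enabled long-haul B2B digital platform that, through its in-house developed digital platform, matches shippers who have long-haul freight needs with truck owners who can serve them. Founded in 2016, 3Trucks is headquartered in California and has leased offices in Boston and Florida. Some of their top clients are Google, IBM and Nokia

3Trucks was founded in 2010 with Mr. Mark Robert as CEO and John Mclean as Partner and CTO.

The information extraction output I want is: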

3Truck

The relation extraction output I want is:

('3truck', founded '2010'),
('John Mclean', 'Partner')
('3truck',client 'Google')

score 1 · Accepted Answer

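This is quite simple: you can use spaCy's NER (named entity recognition) for this task. It ships with a set of pretrained models that recognize different entity types.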
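
As a rough illustration of the spaCy approach, here is a minimal sketch. It assumes spaCy is installed and that the small English model `en_core_web_sm` has been downloaded; the function names `organization_names` and `extract_companies` are made up for this example and are not part of spaCy. Which spans the pretrained model tags as `ORG` depends on the model, so a name like "3Trucks" may or may not be picked up.

```python
def organization_names(entities):
    """Return unique ORG entity texts in first-seen order.

    `entities` is an iterable of (text, label) pairs, e.g. built from doc.ents.
    """
    names = []
    for text, label in entities:
        if label == "ORG" and text not in names:
            names.append(text)
    return names


def extract_companies(text, model="en_core_web_sm"):
    """Run a pretrained spaCy pipeline and keep only organization entities."""
    # Imported lazily so the helper above can be used without spaCy installed.
    import spacy

    # Assumption: the model was installed with
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load(model)
    doc = nlp(text)
    return organization_names((ent.text, ent.label_) for ent in doc.ents)
```

Calling `extract_companies(document_text)` on each document would give a list of candidate company names. Relations such as founder or client are not produced by NER alone and would need an extra step on top of these entities.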

score 1 · Accepted Answer

Named entity recognition (NER) is normally used for applications like this, but an off-the-shelf NER model can only classify entities into a limited set of predefined categories.

from nltk import word_tokenize, pos_tag, ne_chunk
from nltk.chunk import tree2conlltags

# Requires the NLTK tokenizer, POS tagger, and NE chunker data (install them with nltk.download).
sentence = "Mark and John are working at Google."
# Tokenize, POS-tag, chunk named entities, then flatten the tree into IOB triples.
print(tree2conlltags(ne_chunk(pos_tag(word_tokenize(sentence)))))
# Output:
# [('Mark', 'NNP', 'B-PERSON'),
#  ('and', 'CC', 'O'), ('John', 'NNP', 'B-PERSON'),
#  ('are', 'VBP', 'O'), ('working', 'VBG', 'O'),
#  ('at', 'IN', 'O'), ('Google', 'NNP', 'B-ORGANIZATION'),
#  ('.', '.', 'O')]

For your application, you will have to train the named entity recognizer on data like the data you are going to query. See: Training NER
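
Training a custom recognizer starts with annotated examples. The sketch below shows one way to prepare them, assuming the character-offset format `(text, {"entities": [(start, end, label)]})` that spaCy's training utilities accept. The helper `to_training_example`, the sample sentence, and the label names are illustrative assumptions, not part of any library.

```python
def to_training_example(text, annotations):
    """Convert (phrase, label) pairs into character-offset entity spans.

    Only the first occurrence of each phrase is annotated; a real data set
    would need every mention labeled, usually with an annotation tool.
    """
    entities = []
    for phrase, label in annotations:
        start = text.find(phrase)
        if start == -1:
            raise ValueError(f"phrase not found in text: {phrase!r}")
        entities.append((start, start + len(phrase), label))
    return text, {"entities": sorted(entities)}


# Hypothetical sentence modeled on the sample data in the question.
sample = "3Trucks was founded in 2010 with Mark Robert as CEO."
example = to_training_example(
    sample, [("3Trucks", "ORG"), ("Mark Robert", "PERSON")]
)
print(example)
```

A few hundred examples of this kind, drawn from the actual documents, are typically what a trainable NER component is fitted on.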
