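Is there a way to extract everything that follows a given word as an entity? For example:

I want to extract whatever comes after "about", "go to", or "learn" as an entity.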

##intent:navigate
-I want to learn about linear regression
-I want to read about SVM
-I want to go to Python 2.6
-Take me to logistic regression: eval

##regex:topic
-^[A-Za-z0-9 :_ -][A-Za-z0-9 :_ -][A-Za-z0-9 :_ -]$

2 Answers

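Yes, you can. You have to annotate the entities in your training data, and the model will then learn to extract them. For your example, the training data should look like this: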

##intent:navigate
- I want to learn about [linear regression](topic)
- I want to talk about [RasaNLU](topic) for the rest of the day.
- I want to go to [Berlin](topic) for a specific work.
- I want to read about [SVM](topic)
- I want to go to [Python 2.6](topic)
- Take me to logistic regression: eval

After training the model, I tried one example:

Enter a message: I want to talk about SVM     
{
  "intent": {
    "name": "navigate",
    "confidence": 0.9576369524002075
  },
  "entities": [
    {
      "start": 21,
      "end": 24,
      "value": "SVM",
      "entity": "topic",
      "confidence": 0.8241770362411013,
      "extractor": "CRFEntityExtractor"
    }
  ]
}
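
To read the output above, it helps to check what "start" and "end" refer to. The sketch below assumes they are character offsets into the message, with "end" exclusive, which is what the values 21 and 24 suggest for this message. The variable names are only illustrative.

```python
# The message and entity fields are copied from the output above.
message = "I want to talk about SVM"
entity = {"start": 21, "end": 24, "value": "SVM", "entity": "topic"}

# Assumption: start/end are character offsets, end exclusive.
extracted = message[entity["start"]:entity["end"]]
print(extracted == entity["value"])
```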

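For this approach to work, however, you will have to define more examples that cover all the possible patterns. For instance, the example "I want to talk about [RasaNLU](topic) for the rest of the day." tells the model that the entity to extract does not have to be the last word of the sentence, which is the case in all the other examples.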
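
Writing many annotated examples by hand gets tedious, so one option is to generate them from a few sentence templates. The following is only a sketch in plain Python: TEMPLATES, TOPICS and make_examples are hypothetical names, not part of Rasa, and the output simply follows the markdown layout of the training data shown above.

```python
from itertools import product

# Hypothetical templates; the second one places the entity mid-sentence.
TEMPLATES = [
    "I want to learn about {topic}",
    "I want to talk about {topic} for the rest of the day.",
    "Take me to {topic}",
]
TOPICS = ["linear regression", "SVM", "Python 2.6"]


def make_examples(templates, topics, entity="topic"):
    """Return one annotated training line per (template, topic) pair."""
    lines = []
    for template, topic in product(templates, topics):
        annotated = "[{}]({})".format(topic, entity)
        lines.append("- " + template.format(topic=annotated))
    return lines


print("##intent:navigate")
print("\n".join(make_examples(TEMPLATES, TOPICS)))
```

Generated lines should still be reviewed by hand, since templates that are too uniform may teach the model the template rather than the entity.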

answered 2019-05-13T01:07:10.990

A naive approach can be very simple: use the string split method, for example

sentences = ["I want to learn about linear regression", "I want to read about SVM", "I want to go to Python 2.6",
 "Take me to logistic regression: eval"]

split_terms = ["about", "go to", "learn"]

for sentence in sentences:
    for split_term in split_terms:
        try:
            print(sentence.split(split_term)[1])
        except IndexError:
            pass # split_term was not found in a sentence

Result:

 linear regression
 about linear regression
 SVM
 Python 2.6

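A smarter approach might be to first find the last "split term" in the sentence, which solves the problem of overlapping terms such as "learn" and "about" in "learn about":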

for sentence in sentences:
    # str.find returns -1 when the term is missing, so start below index 0
    last_split_term_index = -1
    last_split_term = ""
    for split_term in split_terms:
        candidate_index = sentence.find(split_term)
        if candidate_index > last_split_term_index:
            last_split_term_index = candidate_index
            last_split_term = split_term
    # Skip sentences that contain none of the split terms
    if last_split_term:
        # Split once and keep everything after the chosen term
        print(sentence.split(last_split_term, 1)[1])

Result:

 linear regression
 SVM
 Python 2.6
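
Since the question started from a regex, the same "last split term" idea can also be sketched with the standard re module. This is one possible formulation, not the only one: TRIGGERS, PATTERN and extract_topic are hypothetical names, and the word boundaries are an added assumption so that "learning" does not count as "learn".

```python
import re

TRIGGERS = ["about", "go to", "learn"]

# The leading greedy ".*" makes the match settle on the LAST trigger in the
# sentence; \b keeps triggers from matching inside longer words.
PATTERN = re.compile(
    r".*\b(?:" + "|".join(re.escape(t) for t in TRIGGERS) + r")\b\s*(?P<topic>.+)$"
)


def extract_topic(sentence):
    """Return the text after the last trigger, or None if no trigger is found."""
    match = PATTERN.match(sentence)
    return match.group("topic").strip() if match else None


for sentence in [
    "I want to learn about linear regression",
    "I want to go to Python 2.6",
    "Take me to logistic regression: eval",
]:
    print(extract_topic(sentence))
```

Unlike the split-based version, this returns the topic without the leading space and returns None for sentences that contain no trigger.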
answered 2019-05-10T11:13:56.113