rasa-nlu - Rasa NLU parse request gives the wrong intent result

Question

Rasa NLU version (e.g. 0.7.3): 0.10.0a6

Backend / pipeline used (mitie, spacy_sklearn, ...): ["nlp_spacy", "tokenizer_spacy", "intent_featurizer_spacy","ner_crf", "ner_synonyms", "intent_classifier_sklearn","ner_spacy"]

Operating system (windows, osx, ...): Windows Server 2012 R2

Issue: I have installed Rasa NLU version 0.10.0a6. My config_spacy.json file looks like this:

{

"project":"Project",
"pipeline" : ["nlp_spacy", "tokenizer_spacy", "intent_featurizer_spacy","ner_crf", "ner_synonyms", "intent_classifier_sklearn","ner_spacy"],
"path" : "./projects",

"cors_origins":["*"],
"data" : "./data/examples/rasa/People.json"
}

My data file looks like this:

{
  "rasa_nlu_data": {
    "regex_features": [
      {
        "name": "zipcode",
        "pattern": "[0-9]{5}"
      }
    ],
    "entity_synonyms": [
      {
        "value": "chinese",
        "synonyms": ["Chinese", "Chines", "chines"]
      },
      {
        "value": "vegetarian",
        "synonyms": ["veggie", "vegg"]
      }
    ],
    "common_examples": [
      {
        "text": "hey", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "howdy", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hey there",
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hello", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hi", 
        "intent": "greet", 
        "entities": []
      },
      {
        "text": "good morning",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "good evening",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "dear sir",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "yes", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yep", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yeah", 
        "intent": "affirm", 
        "entities": []
      },
      {
        "text": "indeed",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "that's right",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "ok",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "right, thank you",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "correct",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great choice",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "sounds really good",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "i'm looking for a place to eat",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "I want to grab lunch",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "I am searching for a dinner spot",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "i'm looking for a place in the north of town",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 31,
            "end": 36,
            "value": "north",
            "entity": "location"
          }
        ]
      },
      {
        "text": "show me chinese restaurants",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 8,
            "end": 15,
            "value": "chinese",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "show me chines restaurants",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 8,
            "end": 14,
            "value": "chinese",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "show me a mexican place in the centre", 
        "intent": "restaurant_search", 
        "entities": [
          {
            "start": 31, 
            "end": 37, 
            "value": "centre", 
            "entity": "location"
          }, 
          {
            "start": 10, 
            "end": 17, 
            "value": "mexican", 
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "i am looking for an indian spot called olaolaolaolaolaola",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 20,
            "end": 26,
            "value": "indian",
            "entity": "cuisine"
          }
        ]
      },     {
        "text": "search for restaurants",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "anywhere in the west",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 16,
            "end": 20,
            "value": "west",
            "entity": "location"
          }
        ]
      },
      {
        "text": "anywhere near 18328",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 14,
            "end": 19,
            "value": "18328",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am looking for asian fusion food",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 17,
            "end": 29,
            "value": "asian fusion",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "I am looking a restaurant in 29432",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 29,
            "end": 34,
            "value": "29432",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am looking for mexican indian fusion",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 17,
            "end": 38,
            "value": "mexican indian fusion",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "central indian restaurant",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 0,
            "end": 7,
            "value": "central",
            "entity": "location"
          },
          {
            "start": 8,
            "end": 14,
            "value": "indian",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "goodbye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "good bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "stop", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "end", 
        "intent": "goodbye", 
        "entities": []
      },
      {
        "text": "farewell",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "Bye bye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "have a good one",
        "intent": "goodbye",
        "entities": []
      }
    ]
  }
}

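With the above configuration and JSON data, I trained Rasa using the HTTP endpoint below:

/train?project=Project

After training, a Project folder containing the trained model was created.

I started the Rasa server with the command below.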

python -m rasa_nlu.server -c config_spacy.json

The server now starts on port 5000.

When I call the parse endpoint '/parse?q=hello&project=Project', I get the response below.

{
  "intent": {
    "name": "greet",
    "confidence": 0.6409561289105246
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "greet",
      "confidence": 0.6409561289105246
    },
    {
      "name": "goodbye",
      "confidence": 0.16788352870824252
    },
    {
      "name": "restaurant_search",
      "confidence": 0.10908268742176423
    },
    {
      "name": "affirm",
      "confidence": 0.08207765495946878
    }
  ],
  "text": "hello"
}

When I call '/parse?q=Great choice&project=Project', the parse endpoint returns the response below.

{
  "intent": {
    "name": "affirm",
    "confidence": 0.7718580601897227
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "affirm",
      "confidence": 0.7718580601897227
    },
    {
      "name": "goodbye",
      "confidence": 0.11611828257295627
    },
    {
      "name": "greet",
      "confidence": 0.07060567364272623
    },
    {
      "name": "restaurant_search",
      "confidence": 0.04141798359459499
    }
  ],
  "text": "Great choice"
}

When I call '/parse?q=Great choice&Project=Project', the parse endpoint returns the response below.

{
  "intent": {
    "name": "None",
    "confidence": 1
  },
  "entities": [],
  "text": "Great choice"
}

When I call '/parse?q=Book a cab&project=Project', the parse endpoint returns the response below.

{
  "intent": {
    "name": "goodbye",
    "confidence": 0.40930529216955336
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "goodbye",
      "confidence": 0.40930529216955336
    },
    {
      "name": "restaurant_search",
      "confidence": 0.31818118919270977
    },
    {
      "name": "greet",
      "confidence": 0.20524111006007764
    },
    {
      "name": "affirm",
      "confidence": 0.06727240857765926
    }
  ],
  "text": "Book a cab"
}

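In this way, for each request it sometimes responds with the correct result and sometimes does not. As you can see in the two responses Parse_reponce2.txt and Parse_reponce3.txt, I only changed the lowercase "p" in "project" to an uppercase "P", and I got a different result for the same request.

The training data contains no "Book a cab" text or any related intent. Yet when I send a parse request with this text, I do not get a None intent; an intent result is returned. I never get a None intent, no matter what random text I send.

Is this a problem with my training, or is something else wrong? Please let me know how to get correct intent results as well as entity results.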

Content of the configuration file (if used and relevant):

{

    "project":"Project",
    "pipeline" : ["nlp_spacy", "tokenizer_spacy", "intent_featurizer_spacy","ner_crf", "ner_synonyms", "intent_classifier_sklearn","ner_spacy"],
    "path" : "./projects",
    "cors_origins":["*"],
    "data" : "./data/examples/rasa/People.json"

}

score 0 · Accepted Answer

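I ran into the same problem with Rasa NLU, with about 120 examples for 5 different intents and 5-7 entities. You appear to be using the spacy-sklearn pipeline. spaCy usually needs more data to train and detect intents and (more) entities. The documentation says that 500-1000 examples would be considered "decent and good" for the library.

In my case, I changed the pipeline to MITIE-sklearn and got a decent model trained with only 80 examples and the same number of intents as before. As you may have noticed, spaCy tends to be faster, whereas MITIE does take about 6 minutes for an 80-example model.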
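Since the advice above hinges on how many examples each intent has, a quick count can help before retraining. The following is only a minimal sketch: the helper name `examples_per_intent` is hypothetical, and the inline `sample` is a tiny stand-in with the same shape as the data file in the question.

```python
from collections import Counter


def examples_per_intent(training_data):
    """Count common_examples per intent in a Rasa NLU JSON training dict."""
    examples = training_data["rasa_nlu_data"]["common_examples"]
    return Counter(example["intent"] for example in examples)


# Tiny stand-in with the same structure as the data file in the question.
sample = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": "hey", "intent": "greet", "entities": []},
            {"text": "hello", "intent": "greet", "entities": []},
            {"text": "yes", "intent": "affirm", "entities": []},
            {"text": "bye", "intent": "goodbye", "entities": []},
        ]
    }
}

counts = examples_per_intent(sample)
for intent, count in counts.most_common():
    print(intent, count)
```

To check a real file, load it with `json.load` from the path given under "data" in the config and pass the resulting dict to the same function.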

score 0 · Accepted Answer

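URL parameters are case-sensitive, which is why the two "Great choice" examples produce different output. In the second case, Rasa did not find the project/model to parse with.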
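One way to avoid this kind of typo is to build the query string in code rather than by hand. This is a sketch under assumptions: `build_parse_url` is a hypothetical helper, and the host `localhost` is assumed (the question only states port 5000).

```python
from urllib.parse import urlencode


def build_parse_url(base_url, text, project):
    """Build a /parse URL with the lowercase parameter names q and project."""
    # The keys are lowercase on purpose: the question shows that
    # "Project=..." is not picked up as the "project" parameter.
    query = urlencode({"q": text, "project": project})
    return f"{base_url}/parse?{query}"


# Host is an assumption; the port comes from the question.
url = build_parse_url("http://localhost:5000", "Great choice", "Project")
print(url)
```

`urlencode` also escapes the space in the text, so the value can be passed as plain text.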

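Rasa NLU will always return an intent, namely the closest match among the trained intents. So in the last example you can see that it returned an intent, but with low confidence. Handling this is what is called fallback or out-of-scope handling. The two main approaches discussed are implementing logic that takes over when the confidence is below some threshold, or training an actual fallback intent with examples of all the non-intents you want to catch.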
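The threshold approach can be sketched on the client side as follows. This is an illustration, not part of Rasa NLU: the name `resolve_intent`, the `"fallback"` label, and the 0.5 cut-off are all assumptions you would tune for your own data. The sample dicts reuse values from the responses in the question.

```python
FALLBACK_INTENT = "fallback"   # hypothetical label, not a Rasa NLU built-in
CONFIDENCE_THRESHOLD = 0.5     # assumed cut-off; tune it on your own data


def resolve_intent(parse_result, threshold=CONFIDENCE_THRESHOLD):
    """Return the parsed intent name, or the fallback label when unsure."""
    intent = parse_result.get("intent") or {}
    name = intent.get("name")
    confidence = intent.get("confidence", 0.0)
    # Treat a missing intent, the literal "None" name, or a low score
    # as "not understood".
    if name is None or name == "None" or confidence < threshold:
        return FALLBACK_INTENT
    return name


# Values taken from the responses shown in the question.
hello = {"intent": {"name": "greet", "confidence": 0.6409561289105246}}
book_a_cab = {"intent": {"name": "goodbye", "confidence": 0.40930529216955336}}
no_project = {"intent": {"name": "None", "confidence": 1}}

print(resolve_intent(hello))
print(resolve_intent(book_a_cab))
print(resolve_intent(no_project))
```

With this cut-off, "Book a cab" is routed to the fallback instead of being reported as goodbye. A fixed threshold is a blunt tool, so it is worth checking it against real traffic before relying on it.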
