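cTAKES recognizes "lung cancer", "basal cell carcinoma", and so on, i.e. it returns the correct SNOMED / UMLS identifiers. But if a sentence contains "colorectal cancer", it only returns "Malignant Neoplasms".
I have tried NeContextsSubPipe with different window sizes, and I have also added ContextDependentTokenizerAnnotator, but cTAKES never recognizes "colorectal cancer".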
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline.piper
// Add non-core annotators
add ContextDependentTokenizerAnnotator
// The POSTagger has a -complex- startup, but it can create its own description to handle it
addDescription POSTagger
//addDescription LvgAnnotator
addDescription ThreadSafeLvg
add DefaultJCasTermAnnotator
// Add Named Entity Context Entity Attribute annotators
load NeContextsSubPipe.piper
// Collect discovered Entity information for post-run access
collectEntities
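{'DiseaseDisorderMention': {'Malignant Neoplasms'}} <- when I run cTAKES on "colorectal cancer"
{'DiseaseDisorderMention': {'Basal cell carcinoma', 'Malignant Neoplasms', 'Malignant neoplasm of prostate'}} <- when I run it on "basal cell carcinoma or prostate cancer"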
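To make the gap concrete, here is a minimal Python sketch that compares the two outputs above against what I was hoping to get. It does not call cTAKES; the `observed` sets are copied by hand from the runs above, and the `expected` labels (in particular "Colorectal Carcinoma") are hypothetical names I chose for illustration, not values taken from the cTAKES dictionary. `missing_mentions` is likewise just a helper name for this sketch.

```python
# Observed DiseaseDisorderMention labels, copied by hand from the two runs above.
observed = {
    "colorectal cancer": {"Malignant Neoplasms"},
    "basal cell carcinoma or prostate cancer": {
        "Basal cell carcinoma",
        "Malignant Neoplasms",
        "Malignant neoplasm of prostate",
    },
}

# ASSUMPTION: hypothetical labels for the specific concepts I hoped to see.
# "Colorectal Carcinoma" is illustrative, not a confirmed dictionary entry.
expected = {
    "colorectal cancer": {"Colorectal Carcinoma"},
    "basal cell carcinoma or prostate cancer": {
        "Basal cell carcinoma",
        "Malignant neoplasm of prostate",
    },
}


def missing_mentions(expected, observed):
    """Return, per input text, the expected labels that were not found."""
    return {
        text: sorted(labels - observed.get(text, set()))
        for text, labels in expected.items()
    }


if __name__ == "__main__":
    for text, missing in missing_mentions(expected, observed).items():
        print(f"{text!r}: missing {missing}")
```

For the second sentence nothing is missing, while for "colorectal cancer" only the generic parent concept comes back, which is exactly the behaviour I am asking about.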