Does anyone know of a library like chronic, but for Python?
Thanks!
Have you tried parsedatetime?
You could try SUTime from Stanford NLP. Python bindings are available here: https://github.com/FraBle/python-sutime
Make sure all the Java dependencies are installed.
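I was talking to Stephen Russett of chronic. After he suggested tokenization, I came up with a Python example.
Here is the Python example. You feed its output into chronic.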
import re
import nltk

# Tokenize the sentence and tag each token with its part of speech.
# Requires the NLTK tokenizer and tagger data to be downloaded.
sentence = 'Available June 9 -- August first week'
tokens = nltk.word_tokenize(sentence)
parts_of_speech = nltk.pos_tag(tokens)
print(parts_of_speech)

# Words to keep regardless of their tag.
white_list = ['first']

# Tags to keep: NNP (proper noun) and CD (cardinal number).
approved_tags = ['NNP', 'CD']

filtered = []
for word, tag in parts_of_speech:
    if any(x in tag for x in approved_tags):
        filtered.append(word)
    elif any(x in word for x in white_list):
        # Word is in the white list, so keep it.
        filtered.append(word)
print(filtered)

# Normalize: replace whitespace followed by non-word characters with a single space.
normalized = re.sub(r'\s\W+', ' ', ' '.join(filtered))
print(normalized)
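
If you want to try the filtering step without installing NLTK, here is a rough sketch of the same idea as a reusable function. The names `filter_date_tokens` and `normalize` are my own, and the (word, tag) pairs are hand-written stand-ins for what `nltk.pos_tag` might return, so the real tagger may label some words differently. Unlike the snippet above, this version matches tags and white-list words exactly rather than by substring, and the cleanup regex is a stricter one that keeps only letters, digits, and spaces.

```python
import re


def filter_date_tokens(tagged, allowed_tags=("NNP", "CD"), whitelist=("first",)):
    """Keep words whose POS tag is allowed or whose text is whitelisted."""
    kept = []
    for word, tag in tagged:
        if tag in allowed_tags or word.lower() in whitelist:
            kept.append(word)
    return kept


def normalize(tokens):
    """Join tokens, replace non-alphanumeric runs with a space, squeeze spaces."""
    joined = " ".join(tokens)
    cleaned = re.sub(r"[^A-Za-z0-9 ]+", " ", joined)
    return " ".join(cleaned.split())


# Hypothetical tagger output, written by hand for illustration only.
tagged = [
    ("Available", "JJ"), ("June", "NNP"), ("9", "CD"), ("--", ":"),
    ("August", "NNP"), ("first", "JJ"), ("week", "NN"),
]

result = normalize(filter_date_tokens(tagged))
print(result)  # June 9 August first
```

The resulting string is what you would then hand to chronic (or another date parser) in place of the original sentence.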