I need to clean up some text, as shown in the code below:
import re

def clean_text(text):
    text = text.lower()
    # replacement function
    text = re.sub(r"i'm", "i am", text)
    text = re.sub(r"she's", "she is", text)
    text = re.sub(r"can't", "cannot", text)
    text = re.sub(r"[-()\"#/@;:<>{}-=~|.?,]", "", text)
    return text

clean_questions = []
for question in questions:
    clean_questions.append(clean_text(question))
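This code should give me a cleaned version of the questions list, but the cleaned list comes out empty. I reopened Spyder and the list was full but had not been cleaned; I reopened it once more and the list was empty again. The console error says: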
In [10] :clean_questions= []
...: for question in questions:
...: clean_questions.append(clean_text(question))
Traceback (most recent call last):
File "<ipython-input-6-d1c7ac95a43f>", line 3, in <module>
clean_questions.append(clean_text(question))
File "<ipython-input-5-8f5da8f003ac>", line 16, in clean_text
text = re.sub(r"[-()\"#/@;:<>{}-=~|.?,]","",text)
File "C:\Users\hp\Anaconda3\lib\re.py", line 192, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "C:\Users\hp\Anaconda3\lib\re.py", line 286, in _compile
p = sre_compile.compile(pattern, flags)
File "C:\Users\hp\Anaconda3\lib\sre_compile.py", line 764, in compile
p = sre_parse.parse(p, flags)
File "C:\Users\hp\Anaconda3\lib\sre_parse.py", line 930, in parse
p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
File "C:\Users\hp\Anaconda3\lib\sre_parse.py", line 426, in _parse_sub
not nested and not items))
File "C:\Users\hp\Anaconda3\lib\sre_parse.py", line 580, in _parse
raise source.error(msg, len(this) + 1 + len(that))
error: bad character range }-=
I am using Python 3.6, specifically the Anaconda build Anaconda3-2018.12-Windows-x86_64.
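
For reference, the traceback points at `}-=` inside the character class. Within `[...]`, an unescaped `-` between two characters is read as a range, and `}` (code point 125) sorts after `=` (code point 61), so the range is reversed and `re` rejects the pattern. The leading `-` is not the problem, since a hyphen at the very start of a class is taken literally. Below is a minimal sketch of one possible fix, assuming the intent is simply to strip those punctuation characters: the second hyphen is dropped because the first one already covers it. The sample `questions` list is made up for illustration and is not the data from the post.

```python
import re

def clean_text(text):
    text = text.lower()
    # Expand a few contractions before stripping punctuation.
    text = re.sub(r"i'm", "i am", text)
    text = re.sub(r"she's", "she is", text)
    text = re.sub(r"can't", "cannot", text)
    # The hyphen appears only once, at the start of the class, where it is
    # literal. The original "}-=" was parsed as an (invalid) character range.
    text = re.sub(r"[-()\"#/@;:<>{}=~|.?,]", "", text)
    return text

# Hypothetical sample data, only for demonstration.
questions = ["I'm here - are you?", "She's late; can't wait."]
clean_questions = [clean_text(question) for question in questions]
# clean_questions[1] is "she is late cannot wait"
```

Escaping the hyphen as `\-` in its original position would presumably work just as well. Note that removing punctuation leaves the surrounding spaces untouched, so "here - are" becomes "here  are" with two spaces.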