4

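Since Django doesn't handle profanity filtering, does anyone have suggestions for a simple way to implement some kind of natural language processing or profanity filtering in Django?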


2 Answers

7

Django does handle profanity filtering.

From https://docs.djangoproject.com/en/1.4/ref/settings/#profanities-list:

PROFANITIES_LIST

Default: () (Empty tuple)

A tuple of profanities, as strings, that will be forbidden in comments when COMMENTS_ALLOW_PROFANITIES is False.
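
As a rough sketch of how those two settings might look together, assuming Django 1.4 with the bundled comments app (the words below are placeholders, not a recommended list):

```python
# settings.py (fragment): a sketch, not a complete settings file.

# Placeholder entries; substitute whatever words you want rejected.
PROFANITIES_LIST = ('spam', 'eggs')

# When this is False, the list above is applied to submitted comments.
COMMENTS_ALLOW_PROFANITIES = False
```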

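That said, you would still need to populate that list yourself. Some links to get started.

I would also familiarize yourself with the Scunthorpe problem.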
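
To make the Scunthorpe problem concrete, here is a small sketch comparing a plain substring check with a whole-word check. The helper names `naive_match` and `boundary_match` are made up for illustration, and the word list is a placeholder:

```python
import re

bad_words = ['spam']  # placeholder word list

def naive_match(text):
    """Substring check: flags a listed word found anywhere in the text."""
    lowered = text.lower()
    return [word for word in bad_words if word in lowered]

# \b anchors each alternative at word boundaries; re.escape guards against
# regex metacharacters inside a listed word.
boundary_pattern = re.compile(
    r'\b(?:%s)\b' % '|'.join(re.escape(word) for word in bad_words),
    re.IGNORECASE,
)

def boundary_match(text):
    """Whole-word check: flags only listed words standing on their own."""
    return boundary_pattern.findall(text)

print(naive_match('A spammy inbox'))     # ['spam'], a false positive
print(boundary_match('A spammy inbox'))  # []
print(boundary_match('No spam please'))  # ['spam']
```

The substring version rejects harmless text that merely contains a listed word, which is exactly the failure the Scunthorpe problem describes.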

Answered 2012-09-15T17:12:02.920
2

Personally, I'd say... don't bother. If you build a better filter, people will just type the words a different way...

However, here is a simple example:

import re

bad_words = ['spam', 'eggs']
# The \b gives a word boundary so you don't have the Scunthorpe problem:
# http://en.wikipedia.org/wiki/Scunthorpe_problem
# re.escape keeps regex metacharacters in a word from being interpreted.
pattern = re.compile(
    r'\b(%s)\b' % '|'.join(re.escape(word) for word in bad_words),
    re.IGNORECASE,
)

some_text = "This text contains some profane words like spam and eggs. But it won't match spammy stuff."
print(some_text)
# This text contains some profane words like spam and eggs. But it won't match spammy stuff.

clean_text = pattern.sub('XXX', some_text)
print(clean_text)
# This text contains some profane words like XXX and XXX. But it won't match spammy stuff.
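
The point about people typing words differently can be seen with the same approach. Below is a sketch that wraps the pattern in two helper functions (`build_filter` and `censor` are hypothetical names, not part of Django) and feeds them obfuscated spellings:

```python
import re

def build_filter(bad_words):
    """Compile a whole-word, case-insensitive pattern from a list of words."""
    escaped = (re.escape(word) for word in bad_words)
    return re.compile(r'\b(?:%s)\b' % '|'.join(escaped), re.IGNORECASE)

def censor(text, pattern, replacement='XXX'):
    """Replace every match of the pattern with the replacement string."""
    return pattern.sub(replacement, text)

pattern = build_filter(['spam', 'eggs'])

print(censor('Plain spam is caught.', pattern))
# Plain XXX is caught.
print(censor('But sp4m and s.p.a.m are not.', pattern))
# But sp4m and s.p.a.m are not.
```

Any fixed word list has this limitation, so a filter like this is best treated as a courtesy measure rather than real moderation.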
Answered 2012-09-15T17:12:14.013