
What I want to do (in Clojure):

For example, I have a vector of words that need to be removed:

(def forbidden-words [":)" "the" "." "," " " ...many more...])

...and a vector of strings:

(def strings ["the movie list" "this.is.a.string" "haha :)" ...many more...])

So every forbidden word should be removed from each string; in this case the result would be: ["movie list" "thisisastring" "haha"].

How can I do this?


3 Answers

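(def forbidden-words [":)" "the" "." ","])
(def strings ["the movie list" "this.is.a.string" "haha :)"])
;; Quote each word so regex metacharacters such as "." and ")" match literally,
;; then join the quoted words with | into a single alternation pattern.
(let [pattern (->> forbidden-words
                   (map #(java.util.regex.Pattern/quote %))
                   (interpose \|)
                   (apply str))]
  (map #(.replaceAll % pattern "") strings))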
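
The same single-pattern idea can also be sketched without Java interop on the string, assuming Clojure 1.2 or later, where clojure.string is available. The name forbidden-pattern is only illustrative and does not come from the answer above.

```clojure
(require '[clojure.string :as str])
(import 'java.util.regex.Pattern)

(def forbidden-words [":)" "the" "." ","])
(def strings ["the movie list" "this.is.a.string" "haha :)"])

;; Hypothetical helper name: one compiled pattern that matches any forbidden word.
;; Pattern/quote makes each word match literally; "|" joins them as alternatives.
(def forbidden-pattern
  (->> forbidden-words
       (map #(Pattern/quote %))
       (str/join "|")
       re-pattern))

;; Replace every match with the empty string in a single pass per string.
(map #(str/replace % forbidden-pattern "") strings)
;; => (" movie list" "thisisastring" "haha ")
```

Note that the surrounding spaces stay in place, because " " is not in this shortened forbidden-words vector.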
answered 2010-04-01T07:17:41.567
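;; re-gsub comes from clojure.contrib.str-utils, part of the old clojure.contrib library.
(use 'clojure.contrib.str-utils)
(import 'java.util.regex.Pattern)
(def forbidden-words [":)" "the" "." "," " "])
(def strings ["the movie list" "this.is.a.string" "haha :)"])
;; Compile each word as a literal pattern so "." and ")" are not read as regex syntax.
(def regexes (map #(Pattern/compile % Pattern/LITERAL) forbidden-words))
;; For each string, remove the matches of every pattern in turn.
(for [s strings] (reduce #(re-gsub %2 "" %1) s regexes))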
answered 2010-03-31T17:48:47.887

This can be done quite simply with function composition and the -> macro:

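;; Build one replacing function per forbidden word and compose them.
;; Note that comp applies them right to left, so the last word is removed first.
(for [s strings]
  (-> s ((apply comp
           (for [w forbidden-words] #(.replace %1 w ""))))))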

If you want to be more "idiomatic", you can use replace-str from clojure.contrib.string instead of #(.replace %1 s "").

There is no need for regular expressions here.
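
As a rough illustration of the regex-free approach, here is a minimal sketch that assumes Clojure 1.2 or later and uses clojure.string/replace, which treats a string match literally. The function name strip-words and the final trim step are assumptions added for the example, not part of the answer.

```clojure
(require '[clojure.string :as str])

(def forbidden-words [":)" "the" "." ","])
(def strings ["the movie list" "this.is.a.string" "haha :)"])

;; Hypothetical helper: removes every literal occurrence of each word, one word at a time.
(defn strip-words [words s]
  (reduce (fn [acc w] (str/replace acc w "")) s words))

(map #(strip-words forbidden-words %) strings)
;; => (" movie list" "thisisastring" "haha ")

;; Assumption: trimming leftover outer whitespace gives the result the question asks for.
(map #(str/trim (strip-words forbidden-words %)) strings)
;; => ("movie list" "thisisastring" "haha")
```

Unlike the comp version, reduce removes the words in the order they appear in forbidden-words, which matters when one forbidden word contains another.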

answered 2010-03-31T21:47:21.310