
If I have two strings like these:

s1 = "This is a foo bar sentence ."
s2 = "This sentence is similar to a foo bar sentence ."

I want to split each string into this format:

x1 = {"This":1,"is":1,"a":1,"bar":1,"sentence":1,"foo":1}
x2 = {"This":1,"is":1,"a":1,"bar":1,"sentence":2,"similar":1,"to":1,"foo":1}

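That is, each string is split into words and the words are counted, producing pairs in which the string is a word and the number is how many times that word occurs in the original string.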


1 Answer


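Remove punctuation, normalize whitespace, convert to lowercase, split on spaces, then loop over the words and tally each occurrence in an index object.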

function countWords(sentence) {
  var index = {},
      words = sentence
              .replace(/[.,?!;()"'-]/g, " ")  // turn punctuation into spaces
              .replace(/\s+/g, " ")           // collapse runs of whitespace
              .toLowerCase()
              .split(" ")
              .filter(Boolean);               // drop empty strings left by leading/trailing spaces

  words.forEach(function (word) {
    if (!index.hasOwnProperty(word)) {
      index[word] = 0;
    }
    index[word]++;
  });

  return index;
}

Or, in ES6 arrow function style:

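const countWords = sentence => sentence
  .replace(/[.,?!;()"'-]/g, " ")  // turn punctuation into spaces
  .replace(/\s+/g, " ")           // collapse runs of whitespace
  .toLowerCase()
  .split(" ")
  .filter(Boolean)                // drop empty strings left by leading/trailing spaces
  .reduce((index, word) => {
    if (!index.hasOwnProperty(word)) index[word] = 0;
    index[word]++;
    return index;
  }, {});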
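
As a rough sketch of the same steps applied to the two strings from the question, the variant below tallies into a `Map` instead of a plain object, so that words such as `__proto__` cannot collide with anything inherited from `Object.prototype`. The name `countWordsMap` is hypothetical and not part of the code above. The regular expression is an assumption chosen to match the same punctuation set: a word is any run of characters that is neither whitespace nor one of those punctuation marks.

```javascript
// Hypothetical helper (illustration only): same tokenization rules as above,
// but the counts are stored in a Map rather than a plain object.
function countWordsMap(sentence) {
  // A word is a run of characters that are neither whitespace nor punctuation.
  // match() returns null when nothing matches, hence the "|| []" fallback.
  const words = sentence.toLowerCase().match(/[^\s.,?!;()"'-]+/g) || [];
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

const s1 = "This is a foo bar sentence .";
const s2 = "This sentence is similar to a foo bar sentence .";

// Object.fromEntries turns the Map into a plain object for printing.
console.log(Object.fromEntries(countWordsMap(s1)));
// { this: 1, is: 1, a: 1, foo: 1, bar: 1, sentence: 1 }
console.log(Object.fromEntries(countWordsMap(s2)));
// { this: 1, sentence: 2, is: 1, similar: 1, to: 1, a: 1, foo: 1, bar: 1 }
```

Because the text is lowercased first, the key is `this` rather than `This`. Drop the `toLowerCase()` call if case should be preserved.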
answered 2013-08-17T10:31:31.673