To do this efficiently, you have to index your text by word.
In other words, the object `foo` titled `MapReduce: Simplified Data Processing on Large Clusters` will be mapped to the following keys: `MapReduce: Simplified Data Processing on Large Clusters`, `Simplified Data Processing on Large Clusters`, `Data Processing on Large Clusters`, `Processing on Large Clusters`, `on Large Clusters`, `Large Clusters`, `Clusters`.
If the text is too long, you can truncate each key to a given number of characters (for example, `24`).
Here is a code sample for CouchDB:
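    function map(o) {
      // Skip documents that have no string title.
      if (typeof o.title !== 'string') return;
      // Maximum key length, in characters.
      const SIZE = 24;
      // Lowercased key: up to SIZE characters of text, starting at begin.
      function format(text, begin) {
        return text.substr(begin, SIZE).toLowerCase();
      }
      // Matches each run of non-whitespace characters, i.e. each word.
      const WORD_MATCHER = /\S+/g;
      var match;
      // Emit one key per word, starting at that word's position.
      while ((match = WORD_MATCHER.exec(o.title)) !== null) {
        var begin = match.index;
        emit(format(o.title, begin), {position: begin});
      }
    }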
Then, if you ask for the keys between `data process` and `data processZ`, you will get:
    {"key": "data processing on large", "id": "foo", "value": {"position": 22}}

The key is cut to 24 characters, as set by `SIZE`.
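The lookup can be tried outside CouchDB with a small simulation. This is only a sketch under stated assumptions: `suffixKeys` and `rangeQuery` are hypothetical helper names, the index is a plain in-memory array, and strings are compared by code unit rather than with CouchDB's collation. Because an uppercase `Z` sorts before lowercase letters under that plain comparison, the sketch uses `\uffff` as the upper bound instead of `Z`.

```javascript
// Local simulation of the view: suffix keys plus a prefix range scan.
// SIZE, the sample document, and the helper names are illustrative assumptions.
const SIZE = 24;

// Build one truncated, lowercased key per word start, like the map function.
function suffixKeys(doc) {
  const rows = [];
  const wordMatcher = /\S+/g;
  let match;
  while ((match = wordMatcher.exec(doc.title)) !== null) {
    const begin = match.index;
    rows.push({
      key: doc.title.slice(begin, begin + SIZE).toLowerCase(),
      id: doc._id,
      value: { position: begin },
    });
  }
  return rows;
}

// Return the rows with startKey <= key <= endKey, sorted by key.
// Plain code-unit comparison stands in for CouchDB's collation here.
function rangeQuery(rows, startKey, endKey) {
  return rows
    .filter((row) => row.key >= startKey && row.key <= endKey)
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const docs = [
  { _id: "foo", title: "MapReduce: Simplified Data Processing on Large Clusters" },
];
const index = docs.flatMap(suffixKeys);

// Everything that starts with "data process".
const hits = rangeQuery(index, "data process", "data process\uffff");
console.log(JSON.stringify(hits));
```

Each title contributes one row per word, so the index grows with the number of words; truncating keys to `SIZE` keeps each row small at the cost of not matching search prefixes longer than `SIZE` characters.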