Is it possible to filter a crossfilter dataset whose values are arrays?

For example, suppose I have the following dataset:

var data = [
  {
    bookname: "the joy of clojure",
    authors: ["Michael Fogus", "Chris Houser"],
    tags: ["clojure", "lisp"]
  },
  {
    bookname: "Eloquent Ruby",
    authors: ["Russ Olsen"],
    tags: ["ruby"]
  },
  {
    bookname: "Design Patterns in Ruby",
    authors: ["Russ Olsen"],
    tags: ["design patterns", "ruby"]
  }
];

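Is there a simple way to access the books tagged with a particular tag? And the books that have a particular author? So far, my understanding of how to use crossfilter has led me to do something like this: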

var filtered_data = crossfilter(data);
var tags = filtered_data.dimension(function(d) {return d.tags});
var tag = tags.group();

Then when I access the group (like this):

tag.all()

I get this:

[{key: ["clojure", "lisp"], value: 1}, 
 {key: ["design patterns", "ruby"], value: 1}, 
 {key: ["ruby"], value: 1}]

when I would rather have this:

[{key: "ruby", value: 2}, 
 {key: "clojure", value: 1}, 
 {key: "lisp", value: 1},
 {key: "design patterns", value: 1}]

2 Answers

I've added comments to the code below. Big picture: use a reduce function.

var data = ...
var filtered_data = crossfilter(data);
var tags = filtered_data.dimension(function(d) {return d.tags});

tags.groupAll().reduce(reduceAdd, reduceRemove, reduceInitial).value()

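Note that I use groupAll() rather than group(), because we want our reduce functions (defined below) to operate on a single group rather than on 3 groups.

The reduce functions should look like this: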

/*
 v is the row in the dataset

 p is {} for the first execution (passed from reduceInitial). 
 For every subsequent execution it is the value returned from reduceAdd of the prev row
*/
function reduceAdd(p, v) {
  v.tags.forEach (function(val, idx) {
     p[val] = (p[val] || 0) + 1; //increment counts
  });
  return p;
}

function reduceRemove(p, v) {
   //omitted. not useful for demonstration
}

function reduceInitial() {
  /* this is how our reduce function is seeded. similar to how inject or fold 
   works in functional languages. this map will contain the final counts 
   by the time we are done reducing our entire data set.*/
  return {};  
}
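
To see what these reducers produce without loading crossfilter, here is a minimal sketch that folds the sample rows by hand with Array.prototype.reduce. The reduceRemove body is my assumption of the natural counterpart to reduceAdd (it is omitted above), and the by-hand fold only imitates what the group does when no filters are active.

```javascript
// Sample rows from the question (only the fields the reducers read).
var data = [
  { bookname: "the joy of clojure", tags: ["clojure", "lisp"] },
  { bookname: "Eloquent Ruby", tags: ["ruby"] },
  { bookname: "Design Patterns in Ruby", tags: ["design patterns", "ruby"] }
];

function reduceInitial() {
  return {};
}

function reduceAdd(p, v) {
  v.tags.forEach(function(val) {
    p[val] = (p[val] || 0) + 1; // increment counts
  });
  return p;
}

// Assumed counterpart to reduceAdd: undo one row's contribution.
function reduceRemove(p, v) {
  v.tags.forEach(function(val) {
    p[val] -= 1;
    if (p[val] === 0) delete p[val]; // drop tags no remaining row uses
  });
  return p; // the accumulator has to be handed back, just as in reduceAdd
}

// Fold every row in by hand, starting from the seed value.
var counts = data.reduce(reduceAdd, reduceInitial());

// Imitate one row ("Eloquent Ruby") being filtered out, working on a copy.
var afterRemoval = reduceRemove(Object.assign({}, counts), data[1]);
```

With the three sample rows, counts.ruby is 2 and afterRemoval.ruby is 1. A reduceRemove along these lines only matters once you filter on another dimension, which is when crossfilter calls it.
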
answered 2012-08-30T07:15:14.767

I have never used "crossfilter" (I assume it is a JS library). Here are some plain JS approaches, though.

This...

data.filter(function(d) {
  return d.authors.indexOf("Michael Fogus") !== -1;
})

returns this:

[{bookname:"the joy of clojure", authors:["Michael Fogus", "Chris Houser"], tags:["clojure", "lisp"]}]
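
The same membership test covers tags as well as authors, which the question asks about too. A small sketch, where booksWith is a hypothetical helper name of mine and not part of any library:

```javascript
var data = [
  {
    bookname: "the joy of clojure",
    authors: ["Michael Fogus", "Chris Houser"],
    tags: ["clojure", "lisp"]
  },
  {
    bookname: "Eloquent Ruby",
    authors: ["Russ Olsen"],
    tags: ["ruby"]
  },
  {
    bookname: "Design Patterns in Ruby",
    authors: ["Russ Olsen"],
    tags: ["design patterns", "ruby"]
  }
];

// Hypothetical helper: keep the rows whose array-valued `field` contains `value`.
function booksWith(rows, field, value) {
  return rows.filter(function(d) {
    return d[field].indexOf(value) !== -1;
  });
}

var rubyBooks = booksWith(data, "tags", "ruby");
var olsenBooks = booksWith(data, "authors", "Russ Olsen");

// Pull out just the titles for a quick look.
var rubyTitles = rubyBooks.map(function(d) { return d.bookname; });
// rubyTitles → ["Eloquent Ruby", "Design Patterns in Ruby"]
```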

This...

var res = {};
data.forEach(function(d) {
  d.tags.forEach(function(tag) {
    res.hasOwnProperty(tag) ? res[tag]++ : res[tag] = 1
  });
})

leaves this in res:

({clojure:1, lisp:1, ruby:2, 'design patterns':1})

For either of these, you can apply d3.entries to get your {key: "ruby", value: 2} format.
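
If D3 isn't on the page, the conversion is short enough to write by hand. A rough sketch, assuming the counts object from the loop above; toEntries is a hypothetical name, and the sort by descending count is my addition to match the ordering shown in the question:

```javascript
// Tag counts in the shape built by the forEach loop above.
var res = { clojure: 1, lisp: 1, ruby: 2, "design patterns": 1 };

// Hypothetical helper: turn a plain object into [{key, value}, ...],
// largest value first.
function toEntries(obj) {
  return Object.keys(obj)
    .map(function(k) { return { key: k, value: obj[k] }; })
    .sort(function(a, b) { return b.value - a.value; });
}

var entries = toEntries(res);
// entries[0] → {key: "ruby", value: 2}
```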

answered 2012-08-20T22:06:55.657