4

I have a large multi-column data file, but for this question it can be simplified as follows:

data = {{"a", 2000}, {"a", 2010}, {"b", 1999}, {"b", 2004},
   {"b", 2006}, {"c", 2012}, {"c", 2014}};

I then have a list of items for which I want to extract the year values from data, e.g.:

selectedList = {"b", "c"};

I can do it by using Select[] and then iterating through the selectedList:

Table[
  Select[data, #[[1]] == selectedList[[i]] &][[All, 2]],
  {i, 1, Length[selectedList]}]

However, I want to use Map, which should be faster than Table. I can do this:

func[dat_, x_] := Select[dat, #[[1]] == x &][[All, 2]]

and then:

func[data, #] & /@ selectedList

I am looking for a more elegant way to do this in one step, preferably by mapping Select directly onto selectedList.

5 Answers

6
Cases[data, {#, x_} :> x] & /@ selectedList
answered 2012-04-30T11:17:43.087
3

Another approach is to use Position:

Map[Function[x, data[[Position[data, x][[All, 1]], 2]]], selectedList]

(* {{1999, 2004, 2006}, {2012, 2014}} *)
answered 2012-04-30T09:23:35.957
3

I would use Map and Cases:

data = {{"a", 2000}, {"a", 2010}, {"b", 1999}, {"b", 2004},
   {"b", 2006}, {"c", 2012}, {"c", 2014}};
selectedList = {"b", "c"};

Map[Part[Cases[data, {#, _}], All, 2] &, selectedList]

{{1999, 2004, 2006}, {2012, 2014}}

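However, if you really want to use Select, you can do it as follows. Function is used to avoid confusing the anonymous slots. I build up the function step by step to illustrate: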

Select[data, First[#] == "b" &] (* Basic start *)

{{"b", 1999}, {"b", 2004}, {"b", 2006}}

Select[data, Function[x, First[x] == "b"]] (* Implement with Function *)

{{"b", 1999}, {"b", 2004}, {"b", 2006}}

Part[Select[data, Function[x, First[x] == "b"]], All, 2]

{1999, 2004, 2006}

Map[Part[Select[data,
    Function[x, First[x] == #]], All, 2] &, selectedList]

{{1999, 2004, 2006}, {2012, 2014}}
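
For contrast, here is a minimal sketch of the slot confusion that the named Function avoids. The variable names naive and fixed are illustrative only and are not part of the answer above. When both the inner and outer pure functions use #, every # inside the inner `&` belongs to the inner function, so the element of selectedList never reaches the test:

```mathematica
data = {{"a", 2000}, {"a", 2010}, {"b", 1999}, {"b", 2004},
   {"b", 2006}, {"c", 2012}, {"c", 2014}};
selectedList = {"b", "c"};

(* Both # belong to the inner pure function, so each row is compared
   with its own first element and nothing is ever selected *)
naive = Map[Select[data, First[#] == # &] &, selectedList]
(* {{}, {}} *)

(* Naming the inner argument leaves # free for the outer function *)
fixed = Map[Select[data, Function[x, First[x] == #]][[All, 2]] &, selectedList]
(* {{1999, 2004, 2006}, {2012, 2014}} *)
```

The naive version raises no error; it silently returns empty lists, which is why the mistake is easy to miss.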

answered 2012-04-30T08:52:13.293
2

For variety, here is another one based on Select:

Last[#\[Transpose]] & /@ (Select[data, Function[x, First[x] == #1]] &) /@
  selectedList
answered 2012-04-30T11:08:55.307
1

I would use:

Reap[Sow[#2, #] & @@@ data, selectedList][[2, All, 1]]
{{1999, 2004, 2006}, {2012, 2014}}

This is easily adapted to other structures, e.g. for column 10: Sow[#10, #]

On a large data set and a long selectedList, this will be faster than Cases because the data is not rescanned for each selected element.

Example:

data = {RandomChoice[CharacterRange["a", "z"], 50000], 
        RandomInteger[100000, 50000]}\[Transpose];

selectedList = RandomSample @ CharacterRange["a", "z"];

Reap[Sow[#2, #] & @@@ data, selectedList][[2, All, 1]]; //AbsoluteTiming

Cases[data, {#, x_} :> x] & /@ selectedList;            //AbsoluteTiming
{0.0210012, Null}

{0.1010057, Null}
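
The same single-pass idea can also be sketched with an association. This is a different technique from the Reap/Sow code above and it assumes Mathematica 10 or later, where GroupBy and Lookup are available; the names grouped and result are illustrative. The data is scanned once to build the lookup, which is then indexed for each selected key:

```mathematica
data = {{"a", 2000}, {"a", 2010}, {"b", 1999}, {"b", 2004},
   {"b", 2006}, {"c", 2012}, {"c", 2014}};
selectedList = {"b", "c"};

(* One pass over data: key on the first column, collect the second *)
grouped = GroupBy[data, First -> Last];

(* Index the association for each selected key; {} for keys with no rows *)
result = Lookup[grouped, selectedList, {}]
(* {{1999, 2004, 2006}, {2012, 2014}} *)
```

No timing is claimed for this variant; it only illustrates grouping once and then looking up.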
answered 2012-05-01T23:08:30.630