python - Dictionaries containing the biggest values

Question

I have this scipy csr_matrix:

  (0, 12114) 0.272571581001
  (0, 12001) 0.0598986479579
  (0, 11998) 0.137415042369
  (0, 11132) 0.0681428952502
  (0, 10412) 0.0681428952502
  (1, 10096) 0.0990242494495
  (1, 10085) 0.216197045661
  (1, 9105) 0.1362857905
  (1, 8925) 0.042670696769
  (1, 8660) 0.0598986479579
  (2, 6577) 0.119797295916
  (2, 6491) 0.0985172979468
  (3, 6178) 0.1362857905
  (3, 5286) 0.119797295916
  (3, 5147) 0.270246307076
  (3, 4466) 0.0540492614153
  (4, 3810) 0.0540492614153
  (4, 3773) 0.0495121247248

and I would like to find a way to create (in this case 5) dictionaries, where each dictionary contains the 2 biggest values of one row.

So for example, for row 0 my dictionary would be:

dict0 = {12114: '0.272571581001', 11998: '0.137415042369'}

and for row 1:

dict1 = {10085: '0.216197045661', 9105: '0.1362857905'}

score 1 · Accepted Answer

Since csr_matrix has no sort() method, it is convenient to first convert the row you need into a dense array:

a = m[i,:].toarray().flatten()

To get the column positions in ascending order of value:

argsa = a.argsort()

The largest values sit at the end of argsa, so the columns of the two largest values are:

argsa[-2:]

To get the (column, value) pairs:

argsa[-2:], a[ argsa[-2:] ]

These can be turned into a dictionary:

dict( zip( argsa[-2:], a[ argsa[-2:] ] ) )
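The steps so far can be traced on a small dense array. This is only an illustrative sketch: the values in a are made up and stand in for one row after .toarray().flatten().

```python
import numpy as np

# Made-up dense row standing in for m[i, :].toarray().flatten()
a = np.array([0.0, 0.27, 0.0, 0.06, 0.14])

argsa = a.argsort()      # column indices ordered by ascending value
top_cols = argsa[-2:]    # columns holding the two largest values

# .tolist() converts NumPy scalars to plain Python int/float keys and values
top = dict(zip(top_cols.tolist(), a[top_cols].tolist()))
print(top)
```

The dictionary maps each selected column index to its value; the keys come out in ascending order of value because argsort sorts ascending.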

您的最终功能可以是：

def get_from_m(m, i, numc=2):
    a = m[i,:].toarray().flatten()
    argsa = a.argsort()
    return dict( zip( argsa[-numc:], a[ argsa[-numc:] ] ) )
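If the matrix has many columns, densifying each row can be wasteful. A possible variant, sketched here under the assumption that all stored values are non-negative (as in the question), works directly on the CSR arrays indptr, indices and data. The name top_per_row is hypothetical, and the small matrix below reuses only a few entries from the question.

```python
import numpy as np
from scipy.sparse import csr_matrix

def top_per_row(m, numc=2):
    """Return one {column: value} dict per row with the numc largest stored values."""
    m = csr_matrix(m)  # make sure we have CSR so indptr delimits each row
    result = []
    for i in range(m.shape[0]):
        start, end = m.indptr[i], m.indptr[i + 1]
        cols = m.indices[start:end]   # column indices stored for row i
        vals = m.data[start:end]      # matching values for row i
        top = np.argsort(vals)[-numc:]  # positions of the largest stored values
        result.append({int(c): float(v) for c, v in zip(cols[top], vals[top])})
    return result

# A few entries taken from rows 0 and 1 of the question's matrix
rows = [0, 0, 0, 1, 1, 1]
cols = [12114, 12001, 11998, 10096, 10085, 9105]
vals = [0.272571581001, 0.0598986479579, 0.137415042369,
        0.0990242494495, 0.216197045661, 0.1362857905]
m = csr_matrix((vals, (rows, cols)), shape=(2, 12115))

dicts = top_per_row(m)
print(dicts[0])
print(dicts[1])
```

Unlike the dense version, a row with fewer than numc stored entries yields a shorter dictionary instead of being padded with zero-valued columns.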
