python - MapReduce: passing the mapper's output to the reducer as a dictionary within a dictionary

Translated from: https://stackoverflow.com/questions/19606545 2013-10-26T12:11:13.503

Viewed 1629 times

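My mapper Python script produces output like [2323223,[{'word':'hi'},{'charcount':'2'}]]. In this output, 2323223 is the key and everything else is the value for that key, but the value holds dictionaries with multiple key-value pairs.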

Part of my mapper script:

    import re
    import sys

    analysis = {}
    found = True

    for line in sys.stdin:
        # n1 is the id, n6 holds the words
        (n1, n6) = re.split(r"\t+", line.strip())
        words = re.split(r"\s+", n6.strip().lower())

        for i in range(0, len(words)):
            # build a fresh dictionary per word and store it under the id n1
            wordanalysis = {}
            analysis[n1] = wordanalysis
            wordanalysis["word"] = words[i]
            wordanalysis["charcount"] = len(words[i])

            if len(words[i]) > 7:
                wordanalysis["longword"] = found
            else:
                wordanalysis["longword"] = not found

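Something along those lines. My reducer should do things like count the number of words, but how would it interpret the dictionary that already exists? For example, in the reducer: for line in sys.stdin: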
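Since both scripts read from sys.stdin, the reducer only ever sees lines of text, so the dictionary has to be serialized on the mapper side and parsed back on the reducer side. A minimal sketch of one possible approach, assuming the mapper is free to print each record as the key, a tab, and the dictionary encoded as JSON. The helper names `emit` and `parse` are hypothetical and not part of the original script:

```python
import json


def emit(key, record):
    """Format one mapper output line: key, a tab, then the dict as JSON."""
    return "%s\t%s" % (key, json.dumps(record, sort_keys=True))


def parse(line):
    """Recover (key, dict) from a line produced by emit()."""
    key, payload = line.rstrip("\n").split("\t", 1)
    return key, json.loads(payload)


# Round trip: what the mapper would print, and what the reducer would read.
line = emit("34324242", {"word": "hi", "charcount": 2, "longword": False})
key, record = parse(line)
print(key, record["charcount"])
```

In the mapper, `print(emit(n1, wordanalysis))` would replace storing the dictionary in `analysis`; in the reducer, each line from sys.stdin would go through `parse`.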

Mapper output:

['34324242'] [{'word': 'hi','charcount': 2}]
['897924242'] [{'word': 'hello','charcount': 5}]

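This is my output. I pass this value from the mapper script to the reducer script. The reducer takes the output above as its input and performs data analysis on it, such as computing the total of charcount. Any idea how to do this?
The main challenge is getting the dictionary values out of the mapper output and retrieving them by their keys.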
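If the mapper output has to stay exactly in the printed form shown above, one option is to parse each line with `ast.literal_eval`, which safely evaluates Python literals such as lists and dictionaries. The following is only a sketch under that assumption. The names `LINE_RE`, `parse_line`, and `total_charcount` are illustrative, and the regular expression assumes each line is a bracketed id list, whitespace, then a bracketed list of dictionaries:

```python
import ast
import re

# Assumed line shape: ['<id>'] [{...}, ...]
LINE_RE = re.compile(r"\s*(\[.*?\])\s+(\[.*\])\s*$")


def parse_line(line):
    """Return (id, list of dicts) for one mapper line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    keys = ast.literal_eval(match.group(1))
    records = ast.literal_eval(match.group(2))
    return keys[0], records


def total_charcount(lines):
    """Sum the 'charcount' values over all parsed lines, skipping malformed ones."""
    total = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, records = parsed
        for record in records:
            total += int(record["charcount"])
    return total


sample = [
    "['34324242'] [{'word': 'hi','charcount': 2}]",
    "['897924242'] [{'word': 'hello','charcount': 5}]",
]
print(total_charcount(sample))  # prints 7
```

In an actual reducer script, `total_charcount(sys.stdin)` (after `import sys`) would take the place of the `sample` list.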

Thanks.
