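How can a reducer (with Text keys and Iterable<MapWritable> values) write all of its maps to a sequence file so that they stay grouped by key? For example, suppose the mappers send records to the reducer like this: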
<"dog", {<"name", "Fido">, <"pure bred?", "false">, <"type", "mutt">}>
<"cat", {<"name", "Felix">, <"color", "black">, <"origin", "film">, <"date", "1919">}>
<"dog", {<"name", "Lassie">, <"type", "collie">, <"origin", "short story">}>
I would like the sequence file to be written as:
key = "dog"
value = {
{<"name", "Fido">, <"pure bred?", "false">, <"type", "mutt">},
{<"name", "Lassie">, <"type", "collie">, <"origin", "short story">}
}
key = "cat"
value = {
{<"name", "Felix">, <"color", "black">, <"origin", "film">, <"date", "1919">}
}
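I'm guessing I need to create a custom output value class that implements Writable, but I'm not sure how to go about it, since as far as I can tell Collections don't really work with sequence files. I want to do this so that the next map/reduce stage reads all the maps associated with each key as a single unit.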
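Here is a rough sketch of the kind of value class I have in mind. It is not tested against Hadoop. To keep it runnable on its own it uses plain java.io and Map<String, String> instead of MapWritable, and the class name MapListValue and the roundTrip helper are made up for illustration. The write and readFields methods have the same signatures as the ones in org.apache.hadoop.io.Writable, so in a real job the class would presumably just declare that interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value class: holds every map the reducer saw for one key.
// Assumption: in a real job this would add
// "implements org.apache.hadoop.io.Writable" and hold MapWritable instances.
class MapListValue {
    private final List<Map<String, String>> maps = new ArrayList<>();

    // Store a copy, because Hadoop reuses the value object while iterating.
    void add(Map<String, String> map) {
        maps.add(new LinkedHashMap<>(map));
    }

    List<Map<String, String>> getMaps() {
        return maps;
    }

    // Layout: map count, then for each map its entry count and key/value pairs.
    public void write(DataOutput out) throws IOException {
        out.writeInt(maps.size());
        for (Map<String, String> map : maps) {
            out.writeInt(map.size());
            for (Map.Entry<String, String> entry : map.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        }
    }

    // Reads back exactly what write() produced.
    public void readFields(DataInput in) throws IOException {
        maps.clear(); // the same instance may be reused for several records
        int mapCount = in.readInt();
        for (int i = 0; i < mapCount; i++) {
            int entryCount = in.readInt();
            Map<String, String> map = new LinkedHashMap<>();
            for (int j = 0; j < entryCount; j++) {
                String key = in.readUTF();
                map.put(key, in.readUTF());
            }
            maps.add(map);
        }
    }

    // Local check only: serialize to bytes and read back into a new instance.
    static MapListValue roundTrip(MapListValue value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.write(new DataOutputStream(bytes));
            MapListValue copy = new MapListValue();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The idea would be for the reducer to call add() for each incoming map and then emit a single MapListValue per key, so that "dog" ends up with one value holding both the Fido and Lassie maps.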
TIA,