I map a whole text file as a single record (using WholeFileInputFormat), do some processing, and then write the output line by line with context.write. This turns out to be very inefficient: a single map task produces several million output records, and I get a heap memory error. Is there another way to do this?

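    // map input: key = file_name, value = file_contents (the whole file as one Text)
    map(Text file_name, Text file_contents, Context context):
        String output = process(file_contents.toString())   // entire result held in memory
        for (String line : output.split("\n"))
            context.write(new Text(line), some_value)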
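
For comparison, here is a minimal sketch of a streaming variant, assuming the processing can be done one line at a time (which may not hold for my real process()). It uses plain Java so it runs without Hadoop: `StreamingEmit`, `processLine`, and the `BiConsumer` sink are hypothetical names, and the sink only stands in for context.write. The idea is that each line is handed off as soon as it is produced, so the full output String is never built.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import java.util.function.BiConsumer;

class StreamingEmit {
    // Hypothetical per-line transformation; a stand-in for the real process() logic.
    static String processLine(String line) {
        return line.trim().toLowerCase(Locale.ROOT);
    }

    // Reads the input one line at a time and passes each processed line to the
    // sink immediately. Only the current line is held in memory, not the whole output.
    // In a real Mapper the sink would wrap context.write (an assumption of this sketch).
    static long emitLines(BufferedReader in, BiConsumer<String, Integer> sink) throws IOException {
        long count = 0;
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            sink.accept(processLine(line), 1);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new StringReader("First Line\nSecond Line\n"));
        long n = emitLines(in, (key, value) -> System.out.println(key + "\t" + value));
        System.out.println("records emitted: " + n);
    }
}
```

If the whole file still has to arrive as one record, the same loop could read from a StringReader over the file contents, but the input itself would then remain in memory.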