I am new to Hadoop. I have managed to develop a simple Map/Reduce application that works fine in 'pseudo-distributed mode'. I now want to test it in 'fully distributed mode', and I have a few questions about that:
- How many machines (nodes) do I need, both minimum and recommended, to process a file of 1-10 GB?
- What are the hardware requirements per node? Mainly, I want to know the number of cores, the amount of memory, and the disk space.