I am trying to run Kmeans using Hadoop. I want to save the centroids of the clusters calculated in the cleanup method of the Reducer to some file say centroids.txt
. Now, I would like to know what will happen if multiple reducers' cleanup method starts at the same time and all of them try to write to this file simultaneously. Will it be handled internally? If not is there a way to synchronize this task?
Note that this is not my output file of reducer. It is an additional file that I am maintaining to keep track of the centroids. I am using BufferedWriter from the reducer's cleanup method to do this.