I am using seq2sparse to convert sequence files to sparse vectors. Is it normal for this to take so long? Mine has been stuck at monitorAndPrintJob for thirty minutes now. It seems to be doing something, because anything else I try to do on the machine is very sluggish. But it'd be nice to know whether I should stop it and try again or just wait.
Here's the command I used:
mahout home/bin/mahout seq2sparse -o outputDirectory -i inputDirectory -ml 10 -ng 2 -seq
Should I tweak some of the switches to help it run more efficiently?
This is on one local machine. The sequence file is 151MB.
Edit: other things on the machine are no longer sluggish, but htop shows java processes related to hadoop and mahout still running, so I guess I should leave it? It's been at this one part of the process for forty minutes now.
Edit 2: ah, not to worry. I went out, came back, and it had finished without crashing or anything. Cheers anyway if you read all this!