I want to use Apache Storm's TridentTopology in a project. I am finding it difficult to understand the .each() function from the storm.trident.Stream class. Below is the example code given in their tutorial for reference:
TridentTopology topology = new TridentTopology();
TridentState wordCounts =
topology.newStream("spout1", spout)
.each(new Fields("sentence"), new Split(), new Fields("word"))
.groupBy(new Fields("word"))
.persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
.parallelismHint(6);
I didn't understand the signature of the method .each(). Below is what I understood. Please correct me if I am wrong and also give some more information for my knowledge.
.each()
- The first parameter takes the fields which are correlated keys to the emitted values from spout and returned from the getOutputFields() method in the spout. I still don't know why is that parameter used for.
- The second parameter is the class extending the BaseFunction. It processes the tuple.
- The third parameter understanding is similar to the first parameter.