I am trying to do basic Twitter sentiment analysis using Apache Spark.
The page below describes the Naive Bayes implementation in Apache Spark's MLlib, which looks like a good candidate for this problem: http://spark.apache.org/docs/1.0.0/mllib-naive-bayes.html
In the Java example on that page, the training and test sets are given as:
JavaRDD<LabeledPoint> training = ... // training set
JavaRDD<LabeledPoint> test = ... // test set
I don't know what these data types contain, but I gather that they hold something other than plain English text.
I have a list of tweets, for example:
"I love my country."
"Great day at office."
"Google Chrome sucks!"
How do I use the Naive Bayes function to process this text?
Any insights on this would be helpful.
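
To make the question concrete, here is a minimal sketch of the conversion I assume is needed. As far as I can tell from the docs, a LabeledPoint pairs a numeric label with a numeric feature vector, so each tweet would first have to become an array of non-negative numbers. The sketch uses plain Java with no Spark dependency, and the class name TweetFeatures, the constant NUM_FEATURES, the tokenization rule, and the hand-assigned labels are all my own assumptions, not anything from the Spark page.

```java
import java.util.Arrays;
import java.util.Locale;

public class TweetFeatures {
    // Hypothetical choice: size of the hashed feature space.
    static final int NUM_FEATURES = 1000;

    // Turn a tweet into a fixed-length term-frequency vector (the "hashing trick"):
    // each token is mapped to an index by its hash code, and that slot is incremented.
    static double[] featurize(String tweet) {
        double[] features = new double[NUM_FEATURES];
        // Assumed tokenization: lowercase, then split on anything that is not
        // a letter, digit, or apostrophe.
        String[] tokens = tweet.toLowerCase(Locale.ROOT).split("[^a-z0-9']+");
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // a leading delimiter produces an empty first token
            }
            int index = Math.floorMod(token.hashCode(), NUM_FEATURES);
            features[index] += 1.0;
        }
        return features;
    }

    public static void main(String[] args) {
        String[] tweets = {
            "I love my country.",
            "Great day at office.",
            "Google Chrome sucks!"
        };
        // Hand-assigned labels (assumption): 1.0 = positive, 0.0 = negative.
        double[] labels = {1.0, 1.0, 0.0};

        for (int i = 0; i < tweets.length; i++) {
            double[] features = featurize(tweets[i]);
            // With Spark on the classpath, each pair would presumably be wrapped as
            // new LabeledPoint(labels[i], Vectors.dense(features)).
            int tokenCount = (int) Arrays.stream(features).sum();
            System.out.println("label=" + labels[i] + ", tokens hashed=" + tokenCount);
        }
    }
}
```

If this is roughly the right idea, is wrapping each array in a LabeledPoint and parallelizing the list into a JavaRDD the intended way to build the training set?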