I have a Map[String,String] whose last key/value pair is "text" -> the text of the document. I want to count how often each word occurs in the document and store the result in a second map, with a separate map of counts for each document. For example, given Map("id"->12,"text"->"The dog likes the cat"), I am trying to get Map("The"->2,"dog"->1,"likes"->1,"cat"->1). I have the following code:

val counts = mutable.Map[String, Int]().withDefault(x=>0)
var tfCounts = Map[String,Int]()
for(i<-1 to newsMap.size){
    val tfMap = newsMap.get("newsText").slice(i-1,i).map(x => x.split("\\s+")).toList
    for(token<-tfMap)
        counts(token) +=1 
    tfCounts = tfCounts++ counts
}

I don't know how to reset the counts map between documents, and I need to because I want separate word counts for each document.

1 Answer

scala> val document = Map("id"->12,"text"->"The dog likes the cat")
document: scala.collection.immutable.Map[String,Any] = Map(id -> 12, text -> The dog likes the cat)

scala> document("text").asInstanceOf[String].split(" ").groupBy(_.toLowerCase).mapValues(_.size)
res3: scala.collection.immutable.Map[String,Int] = Map(cat -> 1, dog -> 1, likes -> 1, the -> 2)
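
For the per-document part of the question, one option is to avoid the shared mutable map altogether and build a fresh immutable map for each document, so there is nothing to reset. The following is a minimal sketch meant to be pasted into the REPL or run as a script. The helper name wordCounts, the documents sample data, and the countsPerDocument value are hypothetical and not taken from the question; it also assumes each document is a Map with a "text" entry and that counting should ignore case, as in the session above.

```scala
// Hypothetical helper: counts the words in one document's text, ignoring case.
def wordCounts(text: String): Map[String, Int] =
  text.split("\\s+")
    .filter(_.nonEmpty)      // drop the empty token produced by leading whitespace
    .groupBy(_.toLowerCase)  // word -> all of its occurrences
    .map { case (word, occurrences) => word -> occurrences.length }

// Assumed sample data: each document is a Map with a "text" entry.
val documents: Seq[Map[String, Any]] = Seq(
  Map("id" -> 12, "text" -> "The dog likes the cat"),
  Map("id" -> 13, "text" -> "The cat sleeps")
)

// One independent count map per document, so no counter has to be reset.
val countsPerDocument: Seq[Map[String, Int]] =
  documents.map(doc => wordCounts(doc("text").toString))
```

Here countsPerDocument(0) holds the counts for the first document only and countsPerDocument(1) those for the second. The sketch uses map with a pattern match rather than mapValues so that the result is a plain strict Map on newer Scala versions as well.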
Answered 2013-03-24T08:10:45.860