
I am new to Scala Saddle. I have a frame with three columns (customer name, age, and Status), and I need to apply a filter on the age column: if a customer's age is more than 18, Status should be set to "eligible"; otherwise it should be set to "noteligible".

Code:

f.col("age").filterAt(x => x > 18)  //but how to update Status column

1 Answer


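Frames are immutable containers, so it is best to build a frame from fully initialized values rather than starting with a partially initialized frame.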

import org.saddle._

object Test {
  def main(args: Array[String]): Unit = {
    val names: Vec[Any] = Vec("andy", "bruce", "cheryl", "dino", "edgar", "frank", "gollum", "harvey")
    val ages:  Vec[Any] = Vec(4, 89, 7, 21, 14, 18, 23004, 65)

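    // Note: uses >= 18, so an age of exactly 18 counts as eligible.
    def status(age: Any): Any = if (age.asInstanceOf[Int] >= 18) "eligible" else "noteligible"

    // Maps each (row index, age) pair to (row index, status).
    def mapper(indexAge: (Int, Any)): (Int, _) = indexAge match {
      case (index, age) => (index, status(age))
    }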

    val nameAge:  Frame[Int, String, Any] = Frame("name" -> names, "age" -> ages)
    val ageCol:   Series[Int, Any]        = nameAge.colAt(1)
    val eligible: Series[Int, Any]        = ageCol.map( mapper )

    println("" + nameAge)
    println("" + eligible)

    val nameAgeStatus: Frame[Int, String, _] = nameAge.joinSPreserveColIx(eligible, how=index.LeftJoin, "status")

    println("" + nameAgeStatus)
  }
}

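If you really do need to start from a partially initialized Frame, you can always drop the uninitialized column and add it back with the correctly computed values.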

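Although I prefer strongly typed columns of data, I believe that a Frame contains only one type of data, and the common type of "Int" and "String" is "Any". This also affects the type signatures of the methods, although you might want to inline them without the type information.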
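
For comparison, here is a minimal sketch of the same eligibility rule using typed values in plain Scala collections, with no Saddle dependency. The names StatusSketch, Customer, statusFor and withStatus are hypothetical and exist only for this illustration. The threshold follows the code above (>= 18); if "more than 18" is meant literally, the comparison would be > 18 instead.

```scala
// Hypothetical illustration only: plain Scala, not Saddle's API.
object StatusSketch {
  // A typed record avoids the casts needed with Vec[Any].
  final case class Customer(name: String, age: Int)

  // Same rule as the answer's status function: 18 and over is eligible.
  def statusFor(age: Int): String =
    if (age >= 18) "eligible" else "noteligible"

  // Computes every status up front, so the result is fully initialized.
  def withStatus(customers: Seq[Customer]): Seq[(String, Int, String)] =
    customers.map(c => (c.name, c.age, statusFor(c.age)))
}
```

The resulting tuples could then be split into columns and passed to a Frame constructor, in the same way as names and ages in the example above.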

I found that looking at the scaladoc helped a lot.

Here is the output of the final println call:

[8 x 3]
       name   age      status 
     ------ ----- ----------- 
0 ->   andy     4 noteligible 
1 ->  bruce    89    eligible 
2 -> cheryl     7 noteligible 
3 ->   dino    21    eligible 
4 ->  edgar    14 noteligible 
5 ->  frank    18    eligible 
6 -> gollum 23004    eligible 
7 -> harvey    65    eligible 
Answered 2015-04-20T11:43:18.073