
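Does anyone know how to compare consecutive records in Scalding when creating a schema? I'm looking at Tutorial 6, and suppose I want to print a person's age whenever the data in record #2 is greater than in record #1 (and likewise for all records).

For example: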

R1: John 30
R2: Kim 55
R3: Mark 20 

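If Rn.age > R(n-1).age, output the record, which would give R2: Kim 55

Edit: Looking at the code, I just realized the schema is a Scala Enumeration, so my question is: how do I compare records when the fields are defined by a Scala Enumeration?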

class Tutorial6(args : Args) extends Job(args) {
  /** When a data set has a large number of fields, and we want to specify those fields conveniently
    in code, we can use, for example, a Tuple of Symbols (as most of the other tutorials show), or a List of Symbols.
    Note that Tuples can only be used if the number of fields is at most 22, since Scala Tuples cannot have more
    than 22 elements. Another alternative is to use Enumerations, which we show here **/

  object Schema extends Enumeration {
    val first, last, phone, age, country = Value // arbitrary number of fields
  }

  import Schema._

  Csv("tutorial/data/phones.txt", separator = " ", fields = Schema)
    .read
    .project(first,age)
    .write(Tsv("tutorial/data/output6.tsv"))
}

1 Answer


The implicit conversion from Enumeration#Value seems to be missing, so you can define it yourself:

import cascading.tuple.Fields
implicit def valueToFields(v: Enumeration#Value): Fields = v.toString

object Schema extends Enumeration {
  val first, last, phone, age, country = Value // arbitrary number of fields
}

import Schema._

var current = Int.MaxValue

Csv("tutorial/data/phones.txt", separator = " ", fields = Schema)
  .read
  .map(age -> ('current, 'previous)) { a: String =>
    val previous = current
    current = a.toInt
    current -> previous
  }
  .filter('current, 'previous) { age: (Int, Int) => age._1 > age._2 }
  .project(first, age)
  .write(Tsv("tutorial/data/output6.tsv"))

In the end, we expect this to be equivalent to the following:

Csv("tutorial/data/phones.txt", separator = " ", fields = Schema)
  .read
  .map((new Fields("age"), new Fields("current", "previous"))) { a: String =>
    val previous = current
    current = a.toInt
    current -> previous
  }
  .filter(new Fields("current", "previous")) { age: (Int, Int) =>
    age._1 > age._2
  }
  .project(new Fields("first", "age"))
  .write(Tsv("tutorial/data/output6.tsv"))

The implicit conversions provided by Scalding let you write shorter versions of these new Fields(...) expressions.

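An implicit conversion is just a view, which the compiler uses when you pass an argument that is not of the expected type but can be converted to the appropriate type through that view. For example, because map() expects a pair of Fields while you are passing a pair of Symbols, Scala will search for an implicit conversion from Symbol -> Symbol to Fields -> Fields. A short explanation of views can be found here.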
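To make the view mechanism concrete, here is a minimal sketch in plain Scala 2 that runs without Scalding on the classpath. FieldNames is a hypothetical stand-in for cascading.tuple.Fields, and describe is a hypothetical method playing the role of map(); the pair conversion is modeled on the idea described above and is not claimed to be Scalding's actual code.

```scala
import scala.language.implicitConversions

object ViewSketch {
  // Hypothetical stand-in for cascading.tuple.Fields (an assumption for this sketch).
  final case class FieldNames(names: List[String])

  // View from a single Symbol to FieldNames.
  implicit def symbolToFieldNames(s: Symbol): FieldNames = FieldNames(List(s.name))

  // View from a pair to a pair of FieldNames, assembled from a view for each side.
  implicit def pairToFieldNamesPair[A, B](pair: (A, B))(
      implicit fa: A => FieldNames, fb: B => FieldNames): (FieldNames, FieldNames) =
    (fa(pair._1), fb(pair._2))

  // Like map(), this hypothetical method only accepts a pair of FieldNames.
  def describe(fs: (FieldNames, FieldNames)): String =
    fs._1.names.mkString(",") + " -> " + fs._2.names.mkString(",")

  def main(args: Array[String]): Unit = {
    val pair: (Symbol, Symbol) = Symbol("age") -> Symbol("current")
    // The argument is a (Symbol, Symbol), so the compiler inserts pairToFieldNamesPair here.
    println(describe(pair))
  }
}
```

If symbolToFieldNames is removed, the call to describe no longer compiles, which is the same kind of failure you see when a conversion for Enumeration#Value is missing.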

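Scalding 0.8.5 introduced a conversion from an Enumeration#Value to a Fields, but the conversion for a pair of values was missing. The develop branch now provides the latter as well.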
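Separately from the field conversions, the comparison the question asks for can be checked on a small in-memory list. The following is only a sketch in plain Scala collections, not Scalding code; Person and olderThanPrevious are hypothetical names, and the data is the three-record example from the question.

```scala
object ConsecutiveSketch {
  // Hypothetical record type mirroring the first and age columns from the question.
  final case class Person(first: String, age: Int)

  // Keep every record whose age is greater than that of the record just before it.
  // The first record has no predecessor, so it is never emitted.
  def olderThanPrevious(records: List[Person]): List[Person] =
    records.zip(records.drop(1)).collect {
      case (previous, current) if current.age > previous.age => current
    }

  def main(args: Array[String]): Unit = {
    val people = List(Person("John", 30), Person("Kim", 55), Person("Mark", 20))
    olderThanPrevious(people).foreach(p => println(s"${p.first} ${p.age}"))
  }
}
```

On the sample data this keeps only Kim 55, which is the result the question expects. Unlike the pipeline above, this version carries no mutable state, but it assumes all records fit in memory in their original order.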

Answered 2013-06-19T07:42:27.403