
I want to filter a SchemaRDD using language-integrated SQL together with SQL functions. For example, I want to run

SELECT name FROM people WHERE name LIKE '%AHSAN%' AND name regexp '^[A-Z]{20}$'

How can I use SQL functions like these in people.where()?

Reference:

For language-integrated SQL, I am following the example given here.

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._
val people: RDD[Person] = ... // An RDD of case class objects, from the first example.
// The following is the same as 'SELECT name FROM people WHERE age >= 10 AND age <= 19'
val teenagers = people.where('age >= 10).where('age <= 19).select('name)
teenagers.map(t => "Name: " + t(0)).collect().foreach(println)

Thanks in advance!


1 Answer


You can use SQL functions in the same way as the numeric operators. For example,

people.where('name like "%AHSAN%").where('name rlike "^[A-Z]{20}$").select('name)

There is no regexp in Spark SQL, but rlike does the same thing.
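To make it concrete which names the two predicates let through, here is a minimal sketch in plain Scala with no Spark involved. It assumes that like "%AHSAN%" behaves as a substring test and that rlike follows Java regular-expression search semantics. The Person fields, the FilterSketch object, and the sample names are hypothetical and only serve the illustration.

```scala
import java.util.regex.Pattern

// Hypothetical stand-in for the Person case class used in the question.
case class Person(name: String, age: Int)

object FilterSketch {
  // Assumption: LIKE '%AHSAN%' amounts to "contains the substring AHSAN".
  def likeAhsan(name: String): Boolean = name.contains("AHSAN")

  // Assumption: rlike uses Java regex search; with ^ and $ the pattern
  // has to cover the whole name with 20 uppercase ASCII letters.
  private val twentyUpper = Pattern.compile("^[A-Z]{20}$")
  def rlikeTwentyUpper(name: String): Boolean =
    twentyUpper.matcher(name).find()

  // Plain-collection counterpart of where(...).where(...).select('name).
  def selectNames(people: Seq[Person]): Seq[String] =
    people
      .filter(p => likeAhsan(p.name))
      .filter(p => rlikeTwentyUpper(p.name))
      .map(_.name)

  def main(args: Array[String]): Unit = {
    val people = Seq(
      Person("AHSANABCDEFGHIJKLMNO", 30), // 20 uppercase letters, contains AHSAN
      Person("AHSAN", 25),                // contains AHSAN, but only 5 letters
      Person("ABCDEFGHIJKLMNOPQRST", 40)  // 20 uppercase letters, no AHSAN
    )
    selectNames(people).foreach(n => println("Name: " + n))
  }
}
```

With this sample data only the first name passes both filters. Chaining two where() calls, as in the answer, corresponds to joining the conditions with AND.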

answered 2014-12-03T04:11:23.940