I'm loading some log files into a SQL table through Spark, and my schema looks like this:
|-- timestamp: timestamp (nullable = true)
|-- c_ip: string (nullable = true)
|-- cs_username: string (nullable = true)
|-- s_ip: string (nullable = true)
|-- s_port: string (nullable = true)
|-- cs_method: string (nullable = true)
|-- cs_uri_stem: string (nullable = true)
|-- cs_query: string (nullable = true)
|-- sc_status: integer (nullable = false)
|-- sc_bytes: integer (nullable = false)
|-- cs_bytes: integer (nullable = false)
|-- time_taken: integer (nullable = false)
|-- User_Agent: string (nullable = true)
|-- Referrer: string (nullable = true)
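As you can see, I created a timestamp field, which I read is supported by Spark (as far as I understand, Date wouldn't work). I'd love to use queries like "where timestamp>(2012-10-08 16:10:36.0)", but whenever I run one I keep getting errors. I tried the 2 syntax forms below. For the second form I parse a string, so I'm sure I'm actually passing the value in a timestamp format. I use 2 functions: parse and date2timestamp.
Any hints on how I should handle timestamp values?
Thanks!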
1) scala> sqlContext.sql("SELECT * FROM Logs as l where l.timestamp=(2012-10-08 16:10:36.0)").collect
java.lang.RuntimeException: [1.55] failure: ``)'' expected but 16 found
SELECT * FROM Logs as l where l.timestamp=(2012-10-08 16:10:36.0)
                                                      ^
2) sqlContext.sql("SELECT * FROM Logs as l where l.timestamp="+date2timestamp(formatTime3.parse("2012-10-08 16:10:36.0"))).collect
java.lang.RuntimeException: [1.54] failure: ``UNION'' expected but 16 found
SELECT * FROM Logs as l where l.timestamp=2012-10-08 16:10:36.0
                                                     ^
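Both parser errors point at the position where the unquoted literal's time part ("16") begins, so one thing that may be worth trying is passing the value as a quoted string and casting it. The following is only a minimal sketch of building such a query string. The object name, the `buildQuery` helper, and the bodies of `formatTime3` and `date2timestamp` are hypothetical stand-ins (the originals are not shown above), and whether `CAST('...' AS TIMESTAMP)` is accepted depends on the Spark SQL version in use, which is an assumption here.

```scala
import java.sql.Timestamp
import java.text.SimpleDateFormat

object TimestampQuerySketch {
  // Hypothetical stand-in for the formatTime3 formatter used above.
  val formatTime3 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S")

  // Hypothetical stand-in for date2timestamp: wraps a java.util.Date.
  def date2timestamp(d: java.util.Date): Timestamp = new Timestamp(d.getTime)

  // Builds the SQL text with the timestamp wrapped in single quotes,
  // so the parser sees one string literal instead of bare tokens.
  // Assumption: the SQL dialect accepts CAST('<string>' AS TIMESTAMP).
  def buildQuery(raw: String): String = {
    val ts = date2timestamp(formatTime3.parse(raw))
    s"SELECT * FROM Logs AS l WHERE l.timestamp > CAST('$ts' AS TIMESTAMP)"
  }

  def main(args: Array[String]): Unit = {
    // Print the generated statement before handing it to sqlContext.sql(...).
    println(buildQuery("2012-10-08 16:10:36.0"))
  }
}
```

The resulting string could then be passed to sqlContext.sql(...) in place of the concatenation in form 2); the only change is that the value arrives quoted.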