In Spark 2.2, I am unable to extract the date from the following timestamp strings using unix_timestamp:
+-------------------------+
|UPDATE_TS |
+-------------------------+
|26NOV2009:03:27:01.154410|
|24DEC2012:00:47:46.805710|
|02MAY2013:00:45:33.233844|
|21NOV2014:00:33:39.350140|
|10DEC2013:00:30:30.532446|
+-------------------------+
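For reference, a minimal self-contained PySpark sketch of the setup (the names df_id and df_vendor_tab are stand-ins for my actual tables):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Sample data matching the UPDATE_TS values shown above
df_id = spark.createDataFrame(
    [('26NOV2009:03:27:01.154410',),
     ('24DEC2012:00:47:46.805710',),
     ('02MAY2013:00:45:33.233844',),
     ('21NOV2014:00:33:39.350140',),
     ('10DEC2013:00:30:30.532446',)],
    ['UPDATE_TS'])

# Register as a temp view so it can also be queried through SQL
df_id.createOrReplaceTempView('df_vendor_tab')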
I tried the following approaches, but the output I am getting is null.
Queries tried:
Spark SQL:
sqlContext.sql("select from_unixtime(unix_timestamp(UPDATE_TS,'ddMMMyyyy:HH:MM:SS.ssssss'), 'yyyy') as new_date from df_vendor_tab").show()
DSL:
df_id.withColumn('part_date', from_unixtime(unix_timestamp(df_id.UPDATE_TS, "ddMMMyyyy:HH:MM:SS.sss"), "yyyy"))
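As far as I understand, unix_timestamp in Spark 2.2 interprets the pattern with java.text.SimpleDateFormat, where the pattern letters are case-sensitive (mm = minutes, ss = seconds, S = fractional seconds), so a pattern following those conventions would look like the sketch below. I have not confirmed that this is what causes the nulls, so treat it as an assumption:

from pyspark.sql.functions import from_unixtime, unix_timestamp

# Sketch: lower-case mm/ss for minutes/seconds and upper-case S for the
# fractional part, per SimpleDateFormat conventions (unverified guess)
df_id.withColumn(
    'part_date',
    from_unixtime(unix_timestamp(df_id.UPDATE_TS, 'ddMMMyyyy:HH:mm:ss.SSSSSS'),
                  'yyyy')).show()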
Expected output:
2009
2012
2013
2014
2013