
I have a table in Databricks delta which is partitioned by transaction_date. I want to change the partition column to view_date. I tried to drop the table and then create it with a new partition column using PARTITIONED BY (view_date).

However, my attempt failed because the actual files reside in S3, and even if I drop a Hive table the partitions remain the same. Is there any way to change the partitioning of an existing Delta table? Or is the only solution to drop the actual data and reload it with the newly chosen partition column?


1 Answer


There is actually no need to drop the table or delete the files. All you need to do is read the current table, overwrite it (contents and schema), and change the partition column:

val input = spark.read.table("mytable")

input.write.format("delta")
  .mode("overwrite")
  .option("overwriteSchema", "true")
  .partitionBy("view_date") // the new partition column
  .saveAsTable("mytable")
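For SQL users, a similar rewrite can be sketched with Delta's `CREATE OR REPLACE TABLE ... AS SELECT`. This is a hedged sketch, not a verified recipe: whether you can safely select from the table being replaced in one statement can depend on the Databricks runtime version, so the cautious variant below stages the data in a temporary view first (`mytable` and `view_date` are the names from the question):

```sql
-- Stage the current contents so the replace does not read from itself
CREATE OR REPLACE TEMPORARY VIEW mytable_staged AS
SELECT * FROM mytable;

-- Recreate the Delta table with the new partition column
CREATE OR REPLACE TABLE mytable
USING delta
PARTITIONED BY (view_date)
AS SELECT * FROM mytable_staged;
```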

Update: there was previously a bug involving time travel combined with partition changes; it has since been fixed.

answered 2019-05-04T00:58:21.293