15

Hive version 1.1

I have a Hive external table defined as follows:

 CREATE EXTERNAL TABLE `schedule_events`(
  `schedule_id` string COMMENT 'from deserializer',
  `service_key` string COMMENT 'from deserializer',
  `event_start_date_time` string COMMENT 'from deserializer',
  `event_id` string COMMENT 'from deserializer',
  `event_type` string COMMENT 'from deserializer',
  `transitional_key` string COMMENT 'from deserializer',
  `created_date_time` string COMMENT 'from deserializer',
  `bus_date` string COMMENT 'from deserializer')
    PARTITIONED BY (
                    `year` string,
                    `month` string,
                    `day` string)
   ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
   STORED AS INPUTFORMAT
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
   OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
   LOCATION
   'hdfs://nameservice1/hadoop/raw/omega/scheduled_events'
  TBLPROPERTIES (
    'avro.schema.url'='hdfs:////hadoop/raw/omega/schema/schedule_events.avsc',
   'transient_lastDdlTime'='1505742141')

To drop a specific partition, I can run an ALTER command like this:

 ALTER TABLE schedule_events DROP IF EXISTS PARTITION  (year='2016',month='06',day='01')
 Dropped the partition year=2016/month=06/day=01

 hive> show partitions schedule_events;
 OK
 year=2017/month=09/day=01
 year=2017/month=09/day=02
 year=2017/month=09/day=03
 year=2017/month=09/day=04
 year=2017/month=09/day=05

But this table has a lot of partitions.

How can I drop all existing partitions at once? Is that possible?

4 Answers

28

There are several options; here is one:

alter table schedule_events drop if exists partition (year<>'');

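Hive: Extend ALTER TABLE DROP PARTITION syntax to use all comparators

"...To drop a partition from a Hive table, this works:
ALTER TABLE foo DROP PARTITION(ds = 'date')
...but it should also work to drop all partitions prior to date.
ALTER TABLE foo DROP PARTITION(ds < 'date')
This task is to implement ALTER TABLE DROP PARTITION for all of the comparators, < > <= >= <> = != instead of just for ="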

https://issues.apache.org/jira/browse/HIVE-2908
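
As a rough sketch of how the comparator syntax described in that ticket might be applied to the `schedule_events` table from the question: the statements below assume the Hive build in use includes the HIVE-2908 change, which is worth verifying on a test table first. The cut-off date in the second statement is only an illustrative value.

```sql
-- List the current partitions before dropping anything.
SHOW PARTITIONS schedule_events;

-- Drop only the September 2017 partitions whose day is before '03'.
-- The partition columns are declared as string, so this is a string comparison.
ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year='2017', month='09', day<'03');

-- Drop every partition: any non-empty year value satisfies year<>''.
ALTER TABLE schedule_events DROP IF EXISTS PARTITION (year<>'');

-- Confirm that no partitions remain.
SHOW PARTITIONS schedule_events;
```

Since the table is declared EXTERNAL, dropping partitions this way should remove only the metastore entries; the files under the table's LOCATION would normally stay in place and need to be removed separately if that is the goal.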

answered 2017-09-20T04:46:55.287
12

You may use something similar to this:

ALTER TABLE schedule_events drop if exists partition (year>'0');
answered 2018-01-24T20:04:19.250
1

alter table schema_name.table_name drop partition (partition_column != '');

answered 2018-04-03T08:08:41.853
0

Using Spark SQL:

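// Turn each SHOW PARTITIONS row, e.g. "year=2017/month=09/day=01",
// into a spec such as partition(year='2017', month='09', day='01').
val partitionSpecs = spark.sql(s"show partitions $databasename.$tablename")
  .collect()
  .map(row => row.getString(0).split("/").map { kv =>
    val Array(k, v) = kv.split("=", 2)
    s"$k='$v'"
  }.mkString("partition(", ", ", ")"))
  .mkString(", ")

// Drop all listed partitions in a single ALTER TABLE statement.
spark.sql(s"alter table $databasename.$tablename drop $partitionSpecs")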
answered 2020-03-02T22:02:34.617