4

I have external tables created in AWS spectrum to query the s3 data however i am not able to identify the filenames which the record belongs to(i have thousands of files under a bucket)

In AWS Athena we have a pseudo column "$PATH" which will display the s3 filenames is there any similar ways available while using spectrum?

4

1 回答 1

8

从最近开始,您可以使用特定的伪列来访问 S3 中对象的路径和大小,以获取沿袭信息。

http://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html#c-spectrum-external-tables-pseudocolumns

这种查询的一个例子是:

>> select distinct "$path", "$size" from spectrum.sales_part;

 $path                                 | $size
---------------------------------------+-------
s3://awssampledbuswest2/tickit/spectrum/sales_partition/saledate=2008-01/ |  1616
s3://awssampledbuswest2/tickit/spectrum/sales_partition/saledate=2008-02/ |  1444
s3://awssampledbuswest2/tickit/spectrum/sales_partition/saledate=2008-02/ |  1444
于 2017-06-21T03:57:57.923 回答