This is my code for reading parquet files stored under an S3 bucket path. It works when there are parquet files at the path, but it raises exceptions.NoFilesFound when it can't find any.
import boto3
import awswrangler as wr
boto3.setup_default_session(profile_name="myAwsProfile", region_name="us-east-1")
path_prefix = 's3://example_bucket/data/parquet_files'
path_suffix = '/y=2021/m=4/d=13/h=17/'
table_path = path_prefix + path_suffix
df = wr.s3.read_parquet(path=table_path)
print(len(df))
Output:
22646
If there are no files at the S3 path, for example if I change path_suffix from '/y=2021/m=4/d=13/h=17/' to '/y=2021/m=4/d=13/h=170/', I get the following error:
---------------------------------------------------------------------------
NoFilesFound Traceback (most recent call last)
<ipython-input-9-17df460412d8> in <module>
11
12 file_prefix = table_path + date_prefix
---> 13 df = wr.s3.read_parquet(path=file_prefix)
/usr/local/lib/python3.9/site-packages/awswrangler/s3/_read_parquet.py in read_parquet(path, path_suffix, path_ignore_suffix, ignore_empty, ignore_index, partition_filter, columns, validate_schema, chunked, dataset, categories, safe, map_types, use_threads, last_modified_begin, last_modified_end, boto3_session, s3_additional_kwargs)
602 paths = _apply_partition_filter(path_root=path_root, paths=paths, filter_func=partition_filter)
603 if len(paths) < 1:
--> 604 raise exceptions.NoFilesFound(f"No files Found on: {path}.")
605 _logger.debug("paths:\n%s", paths)
606 args: Dict[str, Any] = {
NoFilesFound: No files Found on: s3://example_bucket/data/parquet_files/y=2021/m=4/d=13/h=170/.
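It looks like the exception comes from the awswrangler Python library, so botocore.exceptions can't catch it. I could just wrap the call in a bare Python try: / except: to get around it, but I need to catch this specific exception so I can handle it properly. How can I do that?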
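
For reference, here is a minimal sketch of the kind of handling I'm after. It assumes the exception class is reachable as wr.exceptions.NoFilesFound, which is what the `raise exceptions.NoFilesFound(...)` line in the traceback suggests. The helper name read_parquet_or_none is made up for illustration, and returning None is just one possible way to signal "nothing there".

```python
def read_parquet_or_none(path):
    """Return a DataFrame for `path`, or None if no parquet files exist there.

    Hypothetical helper; assumes awswrangler exposes `exceptions.NoFilesFound`.
    """
    # Imported inside the function so the sketch can be loaded
    # even where awswrangler is not installed.
    import awswrangler as wr

    try:
        return wr.s3.read_parquet(path=path)
    except wr.exceptions.NoFilesFound:
        # Only the "no files at this path" case is handled here;
        # any other error (credentials, permissions, ...) still propagates.
        return None
```

With something like this, the caller could check `if df is None:` and skip that hour's partition instead of catching every exception.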