On Linux, hadoop fs -ls accepts a wildcard (/sandbox/*), but the Python hdfs client's list method fails on it, treating it as an unknown path. Is there a different way to use wildcards in python-hdfs?
1 Answer
I found this, which uses os.walk with fnmatch, and adapted it to hadoop_client.
Here is an example that finds CSV files:
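import fnmatch
import posixpath
# hc is an already-connected hdfs client; Path is the HDFS directory to search
for root, dirs, files in hc.walk(Path):
    for filename in fnmatch.filter(files, '*.csv'):
        print(posixpath.join(root, filename))  # HDFS paths always use '/'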
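If you only need the equivalent of a single-level wildcard such as /sandbox/*, rather than a recursive walk, a small helper around the client's list method may be enough. This is a minimal sketch, not part of the hdfs library: hdfs_glob is a hypothetical name, and it assumes the client's list(path) returns the entry names of a directory and that only the last path component contains a wildcard.

```python
import fnmatch
import posixpath


def hdfs_glob(client, pattern):
    """Return full paths in the pattern's parent directory whose names match.

    Hypothetical helper: only the last path component may contain wildcards,
    e.g. '/sandbox/*.csv'. `client` is assumed to expose list(hdfs_path)
    returning a list of entry names.
    """
    # Split '/sandbox/*.csv' into the directory to list and the name pattern.
    parent, name_pattern = posixpath.split(pattern)
    names = client.list(parent)
    # fnmatchcase keeps matching case-sensitive on every platform.
    return [
        posixpath.join(parent, name)
        for name in names
        if fnmatch.fnmatchcase(name, name_pattern)
    ]
```

With a connected client hc, hdfs_glob(hc, '/sandbox/*') would then return the matching paths as a list.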
Answered 2019-11-07T11:18:21.960