我正在尝试从多个 HDFS .gz 文件中读取,但我只希望那些以昨天的日期作为文件名开头的文件。我的文件如下所示:
/notmy-data/openSourceDatasets/Temperatures/2013-06-10T133006.gz
/notmy-data/openSourceDatasets/Temperatures/2013-06-11T153006.gz
/notmy-data/openSourceDatasets/Temperatures/2013-06-11T173006.gz
/notmy-data/openSourceDatasets/Temperatures/2013-06-11T193006.gz
这就是我所拥有的...
DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
Calendar cal = Calendar.getInstance();
cal.add(Calendar.DATE, -1);
String yesterdate = dateFormat.format(cal.getTime());
Path tpath = new Path("/notmy-data/openSourceDatasets/Temperatures/" + yesterdate + "*");
FileStatus[] status = fileSystem.listStatus(tpath);
System.out.println("["+new Date()+"] Starting tempertaure ingest...");
for (int i=0;i<status.length;i++){
BufferedReader reader =new BufferedReader(new InputStreamReader(new GZIPInputStream(fileSystem.open(status[i].getPath()))));
String line;
while (null != (line = reader.readLine())){
System.out.println(line);
}
我试过这个有没有星星。我总是得到一个java.io.FileNotFoundException
. 我究竟做错了什么??