
When using the BigQuery Data Transfer Service to move data from S3 to BigQuery, I'm having intermittent success (in fact, I've only seen it work correctly once).

The successful run:

6:00:48 PM  Summary: succeeded 1 jobs, failed 0 jobs.   
6:00:14 PM  Job bqts_5f*** (table test_json_data) completed successfully. Number of records: 516356, with errors: 0.    
5:59:13 PM  Job bqts_5f*** (table test_json_data) started.  
5:59:12 PM  Processing files from Amazon S3 matching: "s3://bucket-name/*.json" 
5:59:12 PM  Moving data from Amazon S3 to Google Cloud complete: Moved 2661 object(s).  
5:58:50 PM  Starting transfer from Amazon S3 for files with prefix: "s3://bucket-name/" 
5:58:49 PM  Starting transfer from Amazon S3 for files modified before 2020-07-27T16:48:49-07:00 (exclusive).   
5:58:49 PM  Transfer load date: 20200727    
5:58:48 PM  Dispatched run to data source with id 138***3616

The usual case is 0 jobs succeeded and 0 jobs failed, as below:

8:33:13 PM  Summary: succeeded 0 jobs, failed 0 jobs.   
8:32:38 PM  Processing files from Amazon S3 matching: "s3://bucket-name/*.json" 
8:32:38 PM  Moving data from Amazon S3 to Google Cloud complete: Moved 3468 object(s).  
8:32:14 PM  Starting transfer from Amazon S3 for files with prefix: "s3://bucket-name/" 
8:32:14 PM  Starting transfer from Amazon S3 for files modified between 2020-07-27T16:48:49-07:00 and 2020-07-27T19:22:14-07:00 (exclusive).    
8:32:13 PM  Transfer load date: 20200728    
8:32:13 PM  Dispatched run to data source with id 13***0415

What might be happening such that the second log above has no Job bqts... run? Is there somewhere I can get more detail on these data transfer jobs? I had another job that ran into a JSON error, so I don't believe that's what's happening here.
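
For pulling more detail on individual runs, one option is the Data Transfer API itself; below is a minimal sketch using the google-cloud-bigquery-datatransfer Python client, where the project, location, and transfer-config IDs are placeholders rather than the actual values from the logs above.

    # Minimal sketch: list recent runs of a transfer config and print their
    # per-run log messages via the Data Transfer API. The project, location,
    # and config IDs are placeholders.
    from google.cloud import bigquery_datatransfer_v1

    client = bigquery_datatransfer_v1.DataTransferServiceClient()

    # Placeholder resource name; yours comes from the console or from
    # client.list_transfer_configs(parent="projects/my-project").
    config_name = "projects/my-project/locations/us/transferConfigs/1234567890"

    for run in client.list_transfer_runs(parent=config_name):
        print(run.name, run.state.name, run.run_time)
        for msg in client.list_transfer_logs(parent=run.name):
            print(f"  [{msg.severity.name}] {msg.message_text}")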

Thanks!


1 Answer


I was somewhat confused by the logging, since it does find and move the objects, e.g. the "Moving data from Amazon S3 to Google Cloud complete: Moved 3468 object(s)." line above.

I believe I misread the documentation. I had assumed the Amazon S3 URI s3://bucket-name/*.json would pick up directories of JSON files, but even though the messages above seem to suggest that, it only loads files sitting at the top level of the bucket into BigQuery (for the s3://bucket-name/*.json URI).
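
As a sanity check, something along these lines (a sketch assuming boto3 and a placeholder bucket name) counts how many .json objects actually sit at the top level of the bucket versus under sub-prefixes, which should line up with what the transfer ends up loading:

    # Sketch: count .json objects at the top level of the bucket (no "/" in the
    # key) versus nested under sub-prefixes. "bucket-name" is a placeholder.
    import boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")

    top_level, nested = [], []
    for page in paginator.paginate(Bucket="bucket-name"):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith(".json"):
                (nested if "/" in key else top_level).append(key)

    print(f"top-level .json objects: {len(top_level)}")
    print(f"nested .json objects:    {len(nested)}")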

answered 2020-07-28T21:33:09.267