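I'm having trouble loading JSON into BigQuery. We have created a service account and use a custom .NET 4 library for all communication between our servers and BQ. Queries work, listing jobs works, basically every GET request works, but uploading data in JSON format does not.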

Here is what the started job returns:

{
 "kind": "bigquery#job",
 "etag": "\"WgwoVdnmFVq0E0riaWM5H0QXabs/R_b3J5b4GjwliMH_X8kjPNLVYsI\"",
 "id": "dot-metrics:job_f7eea1449bb24dffb0a0de1637f31abb",
 "selfLink": "https://www.googleapis.com/bigquery/v2/projects/dot-metrics/jobs/job_f7eea1449bb24dffb0a0de1637f31abb",
 "jobReference": {
  "projectId": "dot-metrics",
  "jobId": "job_f7eea1449bb24dffb0a0de1637f31abb"
 },
 "configuration": {
  "load": {
   "schema": {
    "fields": [
     {
      "name": "word",
      "type": "STRING",
      "mode": "REQUIRED"
     },
     {
      "name": "word_count",
      "type": "INTEGER",
      "mode": "REQUIRED"
     },
     {
      "name": "corpus",
      "type": "STRING",
      "mode": "REQUIRED"
     },
     {
      "name": "corpus_date",
      "type": "INTEGER",
      "mode": "REQUIRED"
     }
    ]
   },
   "destinationTable": {
    "projectId": "dot-metrics",
    "datasetId": "DotMetric_TEST",
    "tableId": "TestTable"
   },
   "writeDisposition": "WRITE_APPEND",
   "allowQuotedNewlines": true,
   "sourceFormat": "NEWLINE_DELIMITED_JSON"
  }
 },
 "status": {
  "state": "DONE",
  "errorResult": {
   "reason": "internalError",
   "message": "Backend error. Job aborted."
  }
 },
 "statistics": {
  "startTime": "1350998303355",
  "endTime": "1350998337446",
  "load": {
   "inputFiles": "1",
   "inputFileBytes": "7359"
  }
 }
}

The data is a newline-delimited JSON string that looks like this:

{"Word":"blah_139","WordCount":6615,"Corpus":"Corpus_678","CorpusDate": 6088201915056}
{"Word":"blah_602","WordCount":2978,"Corpus":"Corpus_493","CorpusDate": 6088201915056}
{"Word":"blah_50","WordCount":8315,"Corpus":"Corpus_360","CorpusDate": 6088201915056}
{"Word":"blah_736","WordCount":8971,"Corpus":"Corpus_751","CorpusDate": 6088201915056}
{"Word":"blah_243","WordCount":2362,"Corpus":"Corpus_174","CorpusDate": 6088201915056}
{"Word":"blah_643","WordCount":765,"Corpus":"Corpus_315","CorpusDate": 6088201915056}

The job runs for a while (about 10 seconds) and then dies. Please help!

1 Answer

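OK, it looks like you copied the Shakespeare sample table and are appending to it. The Shakespeare sample was imported from source data internal to Google using a much older version of BigQuery, so its schema has some defects. Those defects cause your problem when we import into it (specifically, we think the corpus_date field should be an int32 field rather than int64, even though BigQuery only supports int64 for new data).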

If you use write_truncate instead of append and pass a new schema, or import into a new table, you should not hit this problem.
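
As a rough illustration of the workaround, here is a minimal sketch in Python that only builds the load-job request body, with the writeDisposition from the question switched to WRITE_TRUNCATE. The helper name build_load_config is hypothetical and not part of any BigQuery client library; the schema and table identifiers are copied from the job in the question, and actually submitting the body is left to whatever client you already use.

```python
import json

# Schema copied from the job shown in the question (names, types, modes unchanged).
SCHEMA_FIELDS = [
    {"name": "word", "type": "STRING", "mode": "REQUIRED"},
    {"name": "word_count", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "corpus", "type": "STRING", "mode": "REQUIRED"},
    {"name": "corpus_date", "type": "INTEGER", "mode": "REQUIRED"},
]


def build_load_config(project_id, dataset_id, table_id,
                      write_disposition="WRITE_TRUNCATE"):
    """Hypothetical helper: return a load-job body as a plain dict.

    WRITE_TRUNCATE replaces the table contents, and the schema passed here
    is sent along with the job instead of relying on the copied table's.
    """
    return {
        "configuration": {
            "load": {
                "schema": {"fields": SCHEMA_FIELDS},
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                "writeDisposition": write_disposition,
                "sourceFormat": "NEWLINE_DELIMITED_JSON",
            }
        }
    }


if __name__ == "__main__":
    # Identifiers taken from the question; printing only, nothing is sent.
    body = build_load_config("dot-metrics", "DotMetric_TEST", "TestTable")
    print(json.dumps(body, indent=1))
```

For the new-table route, pass a table ID that does not exist yet in the dataset; the rest of the body stays the same.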

answered 2012-10-23T15:30:52.373