I'm working through the Datalab Jupyter notebook tutorial ~/datalab/tutorials/BigQuery/'Importing and Exporting Data.ipynb', and I can't make sense of the following behavior:
table.extract(destination = sample_bucket_object)
The CSV produced by this extract contains:
Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000
1999,Chevy,Venture Extended Edition,,4900
1999,Chevy,Venture Extended Edition,Very Large,5000
1996,Jeep,Grand Cherokee,"MUST SELL! air, moon roof, loaded",4799
This looks incomplete. It only extracted the 4 rows that were loaded from cars.csv when the table was first populated:
sample_table.load('gs://cloud-datalab-samples/cars.csv', mode='append',
source_format = 'csv', csv_options=bq.CSVOptions(skip_leading_rows = 1))
It ignored the other 2 rows from cars2.csv, which were added by these commands:
cars2 = storage.Item('cloud-datalab-samples', 'cars2.csv').read_from()
df2 = pd.read_csv(StringIO(cars2))
df2.fillna(value='', inplace=True)
sample_table.insert_data(df2)
Those rows did make it into the table:
%%sql
SELECT * FROM sample.cars
which gives:
Year  Make   Model                     Description                        Price
1997  Ford   E350                      ac, abs, moon                      3000
1999  Chevy  Venture Extended Edition                                     4900
1999  Chevy  Venture Extended Edition  Very Large                         5000
1996  Jeep   Grand Cherokee            MUST SELL! air, moon roof, loaded  4799
2015  Tesla  Model S                                                      64900
2010  Honda  Civic                                                        15000
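To make the discrepancy concrete, here is a rough sketch that compares the exported CSV with the query result and lists the rows that did not make it into the export. It is not part of the tutorial: the names `extracted_csv`, `table_rows`, and `missing_from_extract` are my own, and the data is simply pasted by hand from the two outputs above.

```python
import csv
from io import StringIO

# Text of the exported CSV, pasted from the extract output above.
extracted_csv = """Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000
1999,Chevy,Venture Extended Edition,,4900
1999,Chevy,Venture Extended Edition,Very Large,5000
1996,Jeep,Grand Cherokee,"MUST SELL! air, moon roof, loaded",4799
"""

# Rows returned by SELECT * FROM sample.cars, typed in by hand as strings.
table_rows = [
    ("1997", "Ford", "E350", "ac, abs, moon", "3000"),
    ("1999", "Chevy", "Venture Extended Edition", "", "4900"),
    ("1999", "Chevy", "Venture Extended Edition", "Very Large", "5000"),
    ("1996", "Jeep", "Grand Cherokee", "MUST SELL! air, moon roof, loaded", "4799"),
    ("2015", "Tesla", "Model S", "", "64900"),
    ("2010", "Honda", "Civic", "", "15000"),
]


def missing_from_extract(csv_text, rows):
    """Return the rows that are in the table but absent from the exported CSV."""
    reader = csv.reader(StringIO(csv_text))
    next(reader)  # skip the header line
    exported = {tuple(record) for record in reader}
    return [row for row in rows if row not in exported]


missing = missing_from_extract(extracted_csv, table_rows)
for row in missing:
    print(row)
```

With the data above, the rows reported as missing are exactly the two that came from cars2.csv via insert_data.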
As a test, I swapped cars.csv and cars2.csv in the notebook and re-ran all the commands. This time table.extract() exported only the cars2.csv rows:
Year,Make,Model,Description,Price
2010,Honda,Civic,,15000
2015,Tesla,Model S,,64900
What am I missing here?