
I am uploading a pandas DataFrame to BigQuery with the following command:

pandas_gbq.to_gbq(df, table_id, table_schema=schema_, if_exists='append', chunksize=100)

As you can see, I am uploading the data in chunks.

At first the process runs and uploads chunks to the target table, but at some point (after 255 iterations) I get the following error:

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\AMIT~1.SHR\\AppData\\Local\\Temp\\tmp1e4wr6vl_job_88b5419e.csv'

I cannot find this CSV file on my computer, and I do not know what it is.

Full traceback:

Traceback (most recent call last):
  File "C:/Users/amit.shreiber/PycharmProjects/churn/churn_metric.py", line 498, in <module>
    load_to_bq_gbq( months_retention_df)
  File "C:\Users\amit.shreiber\PycharmProjects\churn\BQ_functions.py", line 48, in load_to_bq_gbq
    pandas_gbq.to_gbq(df, table_id, table_schema=schema_,if_exists= 'append', chunksize = 100)
  File "C:\Users\amit.shreiber\Anaconda3\envs\churn\lib\site-packages\pandas_gbq\gbq.py", line 1093, in to_gbq
    connector.load_data(
  File "C:\Users\amit.shreiber\Anaconda3\envs\churn\lib\site-packages\pandas_gbq\gbq.py", line 573, in load_data
    for remaining_rows in chunks:
  File "C:\Users\amit.shreiber\Anaconda3\envs\churn\lib\site-packages\tqdm\std.py", line 1178, in __iter__
    for obj in iterable:
  File "C:\Users\amit.shreiber\Anaconda3\envs\churn\lib\site-packages\pandas_gbq\load.py", line 79, in load_chunks
    client.load_table_from_dataframe(
  File "C:\Users\amit.shreiber\Anaconda3\envs\churn\lib\site-packages\google\cloud\bigquery\client.py", line 2579, in load_table_from_dataframe
    os.remove(tmppath)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\AMIT~1.SHR\\AppData\\Local\\Temp\\tmp1e4wr6vl_job_88b5419e.csv'

Process finished with exit code 1

1 Answer


I don't know why, but it worked when I chose a larger chunk size (1000).
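
For reference, here is a minimal sketch of that change. It is the same call as in the question with only chunksize altered. The traceback shows that each chunk goes through client.load_table_from_dataframe, which writes a temporary CSV and then removes it, so a larger chunk size means fewer temporary files are created and deleted. That may only make the error less likely rather than fix its cause. The helper names count_chunks and upload_in_larger_chunks and the row count of 50,000 are my own, for illustration only.

```python
import math


def count_chunks(n_rows: int, chunksize: int) -> int:
    """Return how many chunks (one load call and one temp CSV each) an upload needs."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    return math.ceil(n_rows / chunksize)


def upload_in_larger_chunks(df, table_id, schema_, chunksize=1000):
    """Same call as in the question; only chunksize differs (hypothetical wrapper)."""
    # Imported lazily so count_chunks can be used without pandas_gbq installed.
    import pandas_gbq

    pandas_gbq.to_gbq(
        df,
        table_id,
        table_schema=schema_,
        if_exists="append",
        chunksize=chunksize,
    )


# Hypothetical row count, for illustration only.
n_rows = 50_000
print(count_chunks(n_rows, 100))   # 500
print(count_chunks(n_rows, 1000))  # 50
```

With the assumed 50,000 rows, going from chunksize=100 to chunksize=1000 cuts the number of temporary CSV files from 500 to 50.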

Answered 2021-06-21T17:32:46.670