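I am trying to implement a COPY statement to push a pandas DataFrame to a CloudSQL Postgres database from an Airflow DAG. I have one constraint: I can only use the pg8000 driver. I am using this as a reference: https://github.com/tlocke/pg8000#copy-from-and-to-a-file (which I found in this thread: https://news.ycombinator.com/item?id=25402430).
Here is my code: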
def getconn() -> pg8000.native.Connection:
    conn: pg8000.native.Connection = connector.connect(
        PG_CONFIG["host"],
        "pg8000",
        user=PG_CONFIG["user"],
        password=PG_CONFIG["password"],
        db=PG_CONFIG["database"]
    )
    return conn

engine = sqlalchemy.create_engine("postgresql+pg8000://", creator=getconn)
engine.dialect.description_encoding = None

stream_in = StringIO()
csv_writer = csv.writer(stream_in)
csv_writer.writerow([1, "electron"])
csv_writer.writerow([2, "muon"])
csv_writer.writerow([3, "tau"])
stream_in.seek(0)

conn = engine.connect()
conn.execute("CREATE TABLE IF NOT EXISTS temp_table (user_id numeric, user_name text)")
conn.execute("COPY temp_table FROM STDIN WITH (FORMAT CSV)", stream=stream_in)
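I have tried everything I could think of (using the DELIMITER option, passing text instead of CSV, ...), but I keep getting the following error: "could not determine data type of parameter $1"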
[SQL: COPY winappsx.aa FROM STDIN WITH (FORMAT CSV)]
[parameters: {'stream': <_io.StringIO object at 0x7f86a58d7dc8>}]
(Background on this error at: http://sqlalche.me/e/13/4xp6)
Traceback (most recent call last):
  File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
  File "/opt/python3.6/lib/python3.6/site-packages/pg8000/dbapi.py", line 454, in execute
    statement, vals=vals, input_oids=self._input_oids, stream=stream
  File "/opt/python3.6/lib/python3.6/site-packages/pg8000/core.py", line 632, in execute_unnamed
    self.handle_messages(context)
  File "/opt/python3.6/lib/python3.6/site-packages/pg8000/core.py", line 769, in handle_messages
    raise self.error
pg8000.exceptions.DatabaseError: {'S': 'ERROR', 'V': 'ERROR', 'C': '42P18', 'M': 'could not determine data type of parameter $1', 'F': 'postgres.c', 'L': '1363', 'R': 'exec_parse_message'}
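I know the connection works, because the table is created correctly. The error occurs on the COPY statement.
I suspect the problem is in how I am supplying the stream argument, but I cannot find the correct syntax. This may help: https://www.kite.com/python/docs/pg8000.Cursor.execute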
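To make that suspicion concrete: the "[parameters: {'stream': ...}]" line in the error suggests SQLAlchemy forwards stream as a bind parameter of the statement, so it never reaches the driver as a keyword argument. A minimal, untested sketch of one possible workaround is to call the pg8000 DBAPI cursor directly, since the traceback shows its execute() passing a stream keyword through. The helper name copy_rows is hypothetical. It also assumes that the installed pg8000 version accepts a text StringIO for COPY FROM STDIN (a BytesIO might be needed otherwise) and that the table name comes from a trusted source.

```python
import csv
from io import StringIO


def copy_rows(dbapi_conn, table, rows):
    """Stream rows into `table` with COPY ... FROM STDIN via a DBAPI cursor.

    `dbapi_conn` is assumed to be a raw pg8000 DBAPI connection, not a
    SQLAlchemy Connection. `copy_rows` is a hypothetical helper name.
    """
    # Serialise the rows into an in-memory CSV buffer and rewind it.
    buf = StringIO()
    csv.writer(buf).writerows(rows)
    buf.seek(0)

    cursor = dbapi_conn.cursor()
    try:
        # `stream` goes to the cursor as a keyword argument, so it is not
        # bound as parameter $1 of the statement.
        # NOTE: `table` is interpolated directly; pass trusted names only.
        cursor.execute(f"COPY {table} FROM STDIN WITH (FORMAT CSV)", stream=buf)
        dbapi_conn.commit()
    finally:
        cursor.close()
```

With the engine above, this would presumably be called as raw = engine.raw_connection(), then copy_rows(raw, "temp_table", [[1, "electron"], [2, "muon"], [3, "tau"]]), then raw.close(). For a DataFrame, df.to_csv(buf, index=False, header=False) could fill the buffer in place of csv.writer.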
Thanks for your help!