设置:我在连接到具有大 csv (200Gb) 的 HDFS 的 Centos 7.6 VM 上运行 Dremio。现在我通过 pyodbc 查询 Dremio 但是如果我使用该方法:
cnxn = pyodbc.connect("Driver={};ConnectionType=Direct;HOST={};PORT={};AuthenticationType=Plain;UID={};PWD={}".format(driver, host,port,uid,pwd),autocommit=True)
sql = "SELECT * FROM HDFS.data.\"factTable.txt\" WHERE C > 0.0 LIMIT 100"
cursor = cnxn.cursor()
cursor.execute(sql)
它未能给出错误“用户'mpowers'取消查询。使用
cnxn = pyodbc.connect("Driver={};ConnectionType=Direct;HOST={};PORT={};AuthenticationType=Plain;UID={};PWD={}".format(driver, host,port,uid,pwd),autocommit=True)
sql = "SELECT * FROM HDFS.data.\"factTable.txt\" WHERE C > 0.0 LIMIT 100"
dataframe = pandas.read_sql(sql,cnxn)
但是查询成功(但我不想阅读我的查询,我只想计时)。查看我的 server.out 文件比较两个查询:
2020-12-07 19:34:26,749 [UserServer-1] INFO com.dremio.ConnectionLog - [b61585f8-5b73-4806-97e8-28e77caa4a74] Connection opened.
Endpoint: 127.0.0.1:42862
Protocol Version: 5
Record Type: n/a
Record Formats:
Support Complex Types: false
Name: Dremio C++ Client
Version: 1.17.0 (1.17.0)
Application: Unknown
User Properties:
schema=
support_complex_types=false
userName=mpowers
2020-12-07 19:34:26,750 [UserServer-1] WARN c.dremio.sabot.rpc.user.UserSession - Ignoring unknown property: SUPPORT_COMPLEX_TYPES
2020-12-07 19:34:26,750 [UserServer-1] WARN c.dremio.sabot.rpc.user.UserSession - Ignoring unknown property: USERNAME
2020-12-07 19:34:26,754 [out-of-band-observer] INFO query.logger - {"queryId":"20317c3c-d70c-b598-20cd-f038f7647900","queryText":"NA","start":1607369666752,"finish":1607369666753,"outcome":"COMPLETED","username":"mpowers"}
2020-12-07 19:34:26,761 [out-of-band-observer] INFO query.logger - {"queryId":"20317c3d-4ef2-bd14-6949-24781167ea00","queryText":"NA","start":1607369666756,"finish":1607369666759,"outcome":"COMPLETED","username":"mpowers"}
2020-12-07 19:34:26,841 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317c3d-29fa-dec0-730e-654b426bf300:1:0
2020-12-07 19:34:26,841 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317c3d-29fa-dec0-730e-654b426bf300:1:1
2020-12-07 19:34:26,841 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317c3d-29fa-dec0-730e-654b426bf300:1:2
2020-12-07 19:34:26,842 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317c3d-29fa-dec0-730e-654b426bf300:1:3
2020-12-07 19:34:26,842 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317c3d-29fa-dec0-730e-654b426bf300:1:4
2020-12-07 19:34:26,842 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317c3d-29fa-dec0-730e-654b426bf300:1:5
2020-12-07 19:34:26,842 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317c3d-29fa-dec0-730e-654b426bf300:0:0
2020-12-07 19:34:26,904 [e5 - 20317c3d-29fa-dec0-730e-654b426bf300:frag:1:4] INFO c.d.s.e.f.FragmentStatusReporter - 20317c3d-29fa-dec0-730e-654b426bf300:1:4: State to report: FINISHED
2020-12-07 19:34:26,911 [e0 - 20317c3d-29fa-dec0-730e-654b426bf300:frag:0:0] INFO c.d.s.e.f.FragmentStatusReporter - 20317c3d-29fa-dec0-730e-654b426bf300:0:0: State to report: FINISHED
2020-12-07 19:34:26,915 [e1 - 20317c3d-29fa-dec0-730e-654b426bf300:frag:1:2] INFO c.d.s.e.f.FragmentStatusReporter - 20317c3d-29fa-dec0-730e-654b426bf300:1:2: State to report: FINISHED
2020-12-07 19:34:26,916 [e6 - 20317c3d-29fa-dec0-730e-654b426bf300:frag:1:1] INFO c.d.s.e.f.FragmentStatusReporter - 20317c3d-29fa-dec0-730e-654b426bf300:1:1: State to report: FINISHED
2020-12-07 19:34:26,916 [e2 - 20317c3d-29fa-dec0-730e-654b426bf300:frag:1:5] INFO c.d.s.e.f.FragmentStatusReporter - 20317c3d-29fa-dec0-730e-654b426bf300:1:5: State to report: FINISHED
2020-12-07 19:34:26,919 [e4 - 20317c3d-29fa-dec0-730e-654b426bf300:frag:1:3] INFO c.d.s.e.f.FragmentStatusReporter - 20317c3d-29fa-dec0-730e-654b426bf300:1:3: State to report: CANCELLED
2020-12-07 19:34:26,923 [e3 - 20317c3d-29fa-dec0-730e-654b426bf300:frag:1:0] INFO c.d.s.e.f.FragmentStatusReporter - 20317c3d-29fa-dec0-730e-654b426bf300:1:0: State to report: CANCELLED
2020-12-07 19:34:26,927 [out-of-band-observer] INFO query.logger - {"queryId":"20317c3d-29fa-dec0-730e-654b426bf300","queryText":"SELECT * FROM HDFS.data.\"factTable.txt\" WHERE C > 0.0 LIMIT 100","start":1607369666763,"finish":1607369666925,"outcome":"COMPLETED","username":"mpowers"}
2020-12-07 19:34:26,943 [UserServer-1] INFO c.d.sabot.rpc.user.UserRPCServer - [USER]: Channel closed /127.0.0.1:31010 <--> /127.0.0.1:42862 (user client)
2020-12-07 19:34:26,943 [UserServer-1] INFO com.dremio.ConnectionLog - [b61585f8-5b73-4806-97e8-28e77caa4a74] Connection Closed
vs. 失败的方法
2020-12-07 19:35:37,698 [UserServer-1] INFO com.dremio.ConnectionLog - [30c3f4d8-241f-4f3d-a9e6-de7aea93b752] Connection opened.
Endpoint: 127.0.0.1:42904
Protocol Version: 5
Record Type: n/a
Record Formats:
Support Complex Types: false
Name: Dremio C++ Client
Version: 1.17.0 (1.17.0)
Application: Unknown
User Properties:
schema=
support_complex_types=false
userName=mpowers
2020-12-07 19:35:37,698 [UserServer-1] WARN c.dremio.sabot.rpc.user.UserSession - Ignoring unknown property: SUPPORT_COMPLEX_TYPES
2020-12-07 19:35:37,699 [UserServer-1] WARN c.dremio.sabot.rpc.user.UserSession - Ignoring unknown property: USERNAME
2020-12-07 19:35:37,705 [out-of-band-observer] INFO query.logger - {"queryId":"20317bf5-c9fe-5e85-6e81-9ce420af1c00","queryText":"NA","start":1607369737702,"finish":1607369737703,"outcome":"COMPLETED","username":"mpowers"}
2020-12-07 19:35:37,712 [out-of-band-observer] INFO query.logger - {"queryId":"20317bf5-b19e-866c-5da6-2f580816e500","queryText":"NA","start":1607369737708,"finish":1607369737711,"outcome":"COMPLETED","username":"mpowers"}
2020-12-07 19:35:37,763 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317bf6-1026-5a5e-5c04-03d012fa4300:1:0
2020-12-07 19:35:37,763 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317bf6-1026-5a5e-5c04-03d012fa4300:1:1
2020-12-07 19:35:37,763 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317bf6-1026-5a5e-5c04-03d012fa4300:1:2
2020-12-07 19:35:37,763 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317bf6-1026-5a5e-5c04-03d012fa4300:1:3
2020-12-07 19:35:37,764 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317bf6-1026-5a5e-5c04-03d012fa4300:1:4
2020-12-07 19:35:37,764 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317bf6-1026-5a5e-5c04-03d012fa4300:1:5
2020-12-07 19:35:37,764 [FABRIC-rpc-event-queue] INFO c.d.sabot.exec.FragmentExecutors - Received remote fragment start instruction for 20317bf6-1026-5a5e-5c04-03d012fa4300:0:0
2020-12-07 19:35:37,824 [e5 - 20317bf6-1026-5a5e-5c04-03d012fa4300:frag:1:0] INFO c.d.s.e.f.FragmentStatusReporter - 20317bf6-1026-5a5e-5c04-03d012fa4300:1:0: State to report: FINISHED
2020-12-07 19:35:37,826 [e0 - 20317bf6-1026-5a5e-5c04-03d012fa4300:frag:1:3] INFO c.d.s.e.f.FragmentStatusReporter - 20317bf6-1026-5a5e-5c04-03d012fa4300:1:3: State to report: FINISHED
2020-12-07 19:35:37,830 [e0 - 20317bf6-1026-5a5e-5c04-03d012fa4300:frag:0:0] INFO c.d.s.e.f.FragmentStatusReporter - 20317bf6-1026-5a5e-5c04-03d012fa4300:0:0: State to report: FINISHED
2020-12-07 19:35:37,834 [e4 - 20317bf6-1026-5a5e-5c04-03d012fa4300:frag:1:4] INFO c.d.s.e.f.FragmentStatusReporter - 20317bf6-1026-5a5e-5c04-03d012fa4300:1:4: State to report: CANCELLED
2020-12-07 19:35:37,835 [UserServer-1] INFO c.d.sabot.rpc.user.UserRPCServer - [USER]: Channel closed /127.0.0.1:31010 <--> /127.0.0.1:42904 (user client)
2020-12-07 19:35:37,835 [UserServer-1] INFO com.dremio.ConnectionLog - [30c3f4d8-241f-4f3d-a9e6-de7aea93b752] Connection Closed
2020-12-07 19:35:37,835 [e3 - 20317bf6-1026-5a5e-5c04-03d012fa4300:frag:1:5] INFO c.d.s.e.f.FragmentStatusReporter - 20317bf6-1026-5a5e-5c04-03d012fa4300:1:5: State to report: CANCELLED
2020-12-07 19:35:37,838 [e2 - 20317bf6-1026-5a5e-5c04-03d012fa4300:frag:1:2] INFO c.d.s.e.f.FragmentStatusReporter - 20317bf6-1026-5a5e-5c04-03d012fa4300:1:2: State to report: CANCELLED
2020-12-07 19:35:37,840 [e6 - 20317bf6-1026-5a5e-5c04-03d012fa4300:frag:1:1] INFO c.d.s.e.f.FragmentStatusReporter - 20317bf6-1026-5a5e-5c04-03d012fa4300:1:1: State to report: CANCELLED
2020-12-07 19:35:37,844 [out-of-band-observer] INFO query.logger - {"queryId":"20317bf6-1026-5a5e-5c04-03d012fa4300","queryText":"SELECT * FROM HDFS.data.\"factTable.txt\" WHERE C > 0.0 LIMIT 100","start":1607369737716,"finish":1607369737841,"outcome":"CANCELED","outcomeReason":"Query cancelled by user 'mpowers'","username":"mpowers"}
请注意,两者之间的区别是:
2020-12-07 19:34:26,943 [UserServer-1] INFO c.d.sabot.rpc.user.UserRPCServer - [USER]: Channel closed /127.0.0.1:31010 <--> /127.0.0.1:42862 (user client)
2020-12-07 19:34:26,943 [UserServer-1] INFO com.dremio.ConnectionLog - [b61585f8-5b73-4806-97e8-28e77caa4a74] Connection Closed
在失败的方法上记录查询之前出现,而在成功的方法上出现在记录查询之后。
编辑:为了清楚起见,我不想使用 read_sql,因为我想要运行的查询的读取时间太长,所以我无法准确测量数据库速度。