python - Python mysql-connector converts some strings to bytearray

Question

I'm using python3 and pandas to connect to some SQL databases:

import pandas as pd
import mysql.connector

cnx = mysql.connector.connect(user='me', password='***',
                          host='***',
                          database='***')
df=pd.read_sql("select id as uid,refType from user where registrationTime>=1451606400",con=cnx)
cnx.close()

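I get 2 columns: id and refType. Both are strings (varchar in SQL terms). However, for some reason the refType column is imported correctly as a string, while the uid column is imported as a bytearray. This is what they look like: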

df.head()

                                             uid  
0 [49, 54, 54, 57, 55, 54, 50, 55, 64, 97, 110]
1 [49, 54, 54, 57, 55, 54, 50, 56, 64, 105, 111]
2 [ 49, 48, 49, 53, 51, 50, 51, 50, 57, 53, 57, 5...
3 [57, 53, 52, 52, 56, 57, 56, 56, 49, 50, 57, 5...
4 [49, 54, 54, 57, 55, 54, 50, 57, 64, 105, 111]
                                         refType  
0 adx_Facebook.IE_an_ph_u8_-.cc-ch.gf.au-ret7.c...
1 adx_Facebook.IE_io_ph_u4_-.cc-gb.gf.au-toppay...
2 ad_nan_1845589538__CAbroadEOScys_-.cc-ca.gf.a. ..
3 ad_offerTrialPay-DKlvl10-1009
4 adx_Facebook.IE_io_ph_u4_-.cc-us.gf.au-topspe...

This is what the uid column should look like once decoded:

[i.decode() for i in df['uid'][1:5]]

['16697628@io', '10153232959751867@fb', '954489881295911@fb', '16697629@io']

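I understand neither why it is converted to a bytearray nor how to make it come through as a string. I couldn't find anything about this or a similar problem on the internet or in the pandas documentation. Of course, I can always convert the column to a string after the import, but that is not my preferred option: the SQL query shown is just an example, and a real table may have hundreds of columns that get wrongly imported as bytearrays. Finding those columns by hand and converting them to strings would be a real pain.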

The connector itself returns the same bytearray:

cursor = cnx.cursor()
cursor.execute('select id as uid,refType from user where registrationTime>=1451606400 LIMIT 1')
cursor.fetchall()

[(bytearray(b'16697627@an'), 'adx_Facebook.IE_an_ph_u8_-.cc-ch.gf.au-ret7.cr-cys.dt-all.csd-291215.-')]

In the SQL database, the first column (uid) has the data type "Varchar(32)" and the second column (refType) has the data type "Varchar(128)".

score 3 · Accepted Answer

I had the same problem with the "mysql-connector" package. Installing "mysql-connector-python" instead did the trick for me.

pip install mysql-connector-python
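If switching packages is not an option, the post-import conversion the question mentions does not have to be done by hand. Below is a minimal sketch, not a fix for the connector itself: `decode_bytes_columns` is a hypothetical helper name, UTF-8 is an assumed encoding, and the sample frame only stands in for what `pd.read_sql` returned in the question.

```python
import pandas as pd


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with every bytes/bytearray cell decoded to str."""
    out = df.copy()
    for col in out.columns:
        # Only object-dtype columns can hold bytes or bytearray values.
        if out[col].dtype != object:
            continue
        out[col] = out[col].map(
            lambda v: v.decode(encoding) if isinstance(v, (bytes, bytearray)) else v
        )
    return out


# Stand-in rows shaped like the cursor output shown in the question.
rows = [
    (bytearray(b"16697627@an"), "adx_Facebook.IE_an_ph_u8"),
    (bytearray(b"16697628@io"), "adx_Facebook.IE_io_ph_u4"),
]
sample = pd.DataFrame.from_records(rows, columns=["uid", "refType"])

decoded = decode_bytes_columns(sample)
print(decoded["uid"].tolist())
```

Because the helper scans every object column, it should cope with a table that has hundreds of affected columns without listing them one by one. Columns that are already strings pass through unchanged.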

score 0 · Accepted Answer

Maybe try a different approach: use Python to write the SQL result to a CSV file, then read the CSV file into pandas.

import pyodbc
import pandas as pd

# Placeholder connection string; fill in your own driver, server, and credentials.
cnxn = pyodbc.connect('DRIVER={Server Type};SERVER=YourServer;DATABASE=YourDatabase;UID=UserId;PWD=PassWord')
cursor = cnxn.cursor()
cursor.execute("select id as uid,refType from user where registrationTime>=1451606400")
rows = {}
for row in cursor.fetchall():
    rows.update({row.uid: row.refType})
cnxn.close()

# Write one "uid,refType" line per row, encoded as UTF-8.
with open('C:\\file.csv', 'wb') as the_file:
    for key, value in rows.items():
        the_file.write(str(key).encode('utf-8') + ','.encode('utf-8') +
                       str(value).encode('utf-8') + '\n'.encode('utf-8'))

# The file has no header row, so supply the column names explicitly.
df = pd.read_csv('C:\\file.csv', header=None, names=['uid', 'refType'])

score 0 · Accepted Answer

That is strange indeed. I wonder whether passing the parameter "coerce_float=False" to the read_sql function would help in this case.
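For what it's worth, the pandas documentation describes coerce_float as converting non-string, non-numeric objects (such as decimal.Decimal) to floats, so it probably does not touch byte values either way. Here is a quick check that can be run without a MySQL server. It rests on an assumption: an in-memory sqlite3 table with a BLOB column stands in for the MySQL column, and sqlite3 returns bytes rather than bytearray, so this only illustrates the pandas side.

```python
import sqlite3

import pandas as pd

# In-memory stand-in table; the BLOB column plays the role of the uid column.
conn = sqlite3.connect(":memory:")
conn.execute("create table user (uid BLOB, refType TEXT)")
conn.execute(
    "insert into user values (?, ?)",
    (b"16697627@an", "adx_Facebook.IE_an_ph_u8"),
)
conn.commit()

query = "select uid, refType from user"
with_coerce = pd.read_sql(query, con=conn, coerce_float=True)
without_coerce = pd.read_sql(query, con=conn, coerce_float=False)
conn.close()

# Report the Python type of the uid value under each setting.
print(type(with_coerce["uid"].iloc[0]), type(without_coerce["uid"].iloc[0]))
```

If both settings report a bytes type here, the flag is unlikely to change what the MySQL connector hands to pandas, and the conversion has to happen in the connector or after the import.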
