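I am working on a project involving BitTorrent, in which I receive a bitfield as a Python string. For example:

bitfield = "000001110100111000110101100010"

I would like to convert the Python string into a format that can be inserted as-is into a varbinary(max) column of an MSSQL database using PYODBC. If I try to insert it as a string, it of course complains about an illegal conversion error.

Note that PYODBC, according to its documentation, expects a byte array or buffer as the input for a varbinary field.

Any suggestions would be appreciated.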

2 Answers

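Assuming you are using a recent version of Python, you can take advantage of the standard library struct module and the bin function. Here is a simple example: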

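import struct
import pyodbc

con = pyodbc.connect("...")
con.execute("CREATE TABLE bin_test ( bin_col varbinary(max) )")
# Parse the bit string as a base-2 integer and insert that value.
con.execute("INSERT INTO bin_test VALUES (?)",
    (int("000001110100111000110101100010", 2),))
result = con.execute("SELECT * FROM bin_test").fetchone()
# Unpack the stored bytes as a big-endian unsigned 32-bit int and show it in binary.
bin(struct.unpack(">I", result[0])[0])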

The result of the final statement is

'0b1110100111000110101100010'

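which is the initial bitfield (with the leading zeros removed).

You can find the documentation for the struct module on docs.python.org. The documentation for the bin function is available in the same place.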
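The pack and unpack steps can be tried without a database. The following is a minimal sketch, assuming Python 3 and a bitfield of at most 32 bits (the limit of the ">I" format); the variable names are illustrative only. It also shows one way to restore the leading zeros, which requires knowing the original length separately.

```python
import struct

bitfield = "000001110100111000110101100010"

# Parse the bit string as a base-2 integer and pack it as a
# 4-byte big-endian unsigned int (this only fits up to 32 bits).
value = int(bitfield, 2)
packed = struct.pack(">I", value)

# Unpack again, mirroring the last statement of the example above.
unpacked = struct.unpack(">I", packed)[0]
as_bin = bin(unpacked)

# bin() drops leading zeros; pad back to the original length,
# which has to be known or stored separately.
restored = format(unpacked, "0{}b".format(len(bitfield)))

print(repr(packed))
print(as_bin)
print(restored)
```

Bitfields longer than 32 bits would need a different packing step, for example int.to_bytes with an explicit byte count.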

answered 2011-11-04T04:31:31.977

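Before getting to the code, one recommendation: the length of the "bitfield" value is not a multiple of 8, so it cannot be divided evenly into bytes. When working with bit strings, I suggest growing them in whole bytes (for example, if len(bitfield) % 8 != 0: print 'Make sure the bitfield can be fully represented by bytes!') so that there is no ambiguity in how the field is handled across programming languages, across different libraries within a language, and across databases. In other words, the database, Python, the library I am about to recommend, and so on will all either store this bit array as a byte array or be able to represent it as one. If the bit array provided does not divide evenly into bytes, one of three things will happen: 1) an error is raised (this is the optimistic case); 2) the bit array is auto-magically padded on the left; 3) the bit array is auto-magically padded on the right.

I recommend using a bitstring library of some sort. I have used python-bitstring for this purpose. I did not take the time to deal with ODBC here, but the idea is basically the same, and it builds on srgerg's answer: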
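To make the three outcomes concrete, here is a small sketch, assuming Python 3 and the standard library only. The helper names (to_bytes_checked, pad_left, pad_right) are made up for illustration. It shows that left padding and right padding of the same 30-bit string give different bytes, which is why the choice should be explicit.

```python
def to_bytes_checked(bits):
    """Convert a bit string to bytes, refusing lengths that are not whole bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("bitfield length %d is not a multiple of 8" % len(bits))
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def pad_left(bits):
    """Pad with zeros on the left up to the next whole byte."""
    nbytes = (len(bits) + 7) // 8
    return bits.zfill(nbytes * 8)


def pad_right(bits):
    """Pad with zeros on the right up to the next whole byte."""
    nbytes = (len(bits) + 7) // 8
    return bits.ljust(nbytes * 8, "0")


bitfield = "000001110100111000110101100010"  # 30 bits

left_bytes = to_bytes_checked(pad_left(bitfield))
right_bytes = to_bytes_checked(pad_right(bitfield))

print(left_bytes.hex())   # left-padded: same integer value as the bit string
print(right_bytes.hex())  # right-padded: bit 0 of the string stays in the top bit
```

For what it is worth, the bitfield message in the BitTorrent wire protocol uses the right-padded layout: the high bit of the first byte is piece 0 and the spare trailing bits are zero.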

Example:

#!/usr/bin/python
import pymssql
from binascii import hexlify
from bitstring import BitArray
dbconninfo = {'host': 'hostname', 'user': 'username', 'password': 'secret', 'database': 'bitexample', 'as_dict': True}
conn = pymssql.connect(**dbconninfo)
cursor = conn.cursor()

bitfield = "000001110100111000110101100010"

ba = BitArray(bin=bitfield)
print '%32d (bitfield -> BitArray -> int)' % ba.int

cursor.execute("CREATE TABLE bin_test (bin_col varbinary(max) )")
cursor.execute("INSERT INTO bin_test values (%s)", (ba.int,))
cursor.execute("SELECT bin_col FROM bin_test")
results = cursor.fetchone()['bin_col'] # results now contains binary packed data '\x01\xd3\x8db'
conn.rollback()
results_int = int(hexlify(results),16)
print '%32d (bitfield -> BitArray -> int -> DB (where data is binary packed) -> unpacked with hexlify -> int)' % results_int

print '%32s (Original bitfield)' % bitfield
from_db_using_ba_hexlify_and_int_with_length = BitArray(int=int(hexlify(results),16), length=30).bin
print '%32s (From DB, decoded with hexlify, using int to instantiate BitArray, specifying length of int as 30 bits, out as bin)' % from_db_using_ba_hexlify_and_int_with_length
from_db_using_ba_hex = BitArray(hex=hexlify(results)).bin # Can't specify length with hex
print '%32s (From DB, decoded with hexlify, using hex to instantiate BitArray, can not specify length, out as bin)' % from_db_using_ba_hex
from_db_using_ba_bytes_no_length = BitArray(bytes=results).bin # Can specify length with bytes... that's next.
print '%32s (From DB, using bytes to instantiate BitArray, no length specified, out as bin)' % from_db_using_ba_bytes_no_length
from_db_using_ba_bytes = BitArray(bytes=results,length=30).bin
print '%32s (From DB, using bytes to instantiate BitArray, specifying length of bytes as 30 bits, out as bin)' % from_db_using_ba_bytes
from_db_using_hexlify_bin = bin(int(hexlify(results),16))
print '%32s (from DB, decoded with hexlify -> int -> bin)' % from_db_using_hexlify_bin
from_db_using_hexlify_bin_ba = BitArray(bin=bin(int(hexlify(results),16))).bin
# Note: this prints from_db_using_hexlify_bin, not from_db_using_hexlify_bin_ba.
print '%32s (from DB, decoded with hexlify -> int -> bin -> BitArray instantiated with bin)' % from_db_using_hexlify_bin
from_db_using_bin = bin(int(results,16))
print '%32s (from DB, no decoding done, using bin)' % from_db_using_bin

The output of this is:

                        30641506 (bitfield -> BitArray -> int)
                        30641506 (bitfield -> BitArray -> int -> DB (where data is binary packed) -> unpacked with hexlify -> int)
  000001110100111000110101100010 (Original bitfield)
  000001110100111000110101100010 (From DB, decoded with hexlify, using int to instantiate BitArray, specifying length of int as 30 bits, out as bin)
00000001110100111000110101100010 (From DB, decoded with hexlify, using hex to instantiate BitArray, can not specify length, out as bin)
00000001110100111000110101100010 (From DB, using bytes to instantiate BitArray, no length specified, out as bin)
  000000011101001110001101011000 (From DB, using bytes to instantiate BitArray, specifying length of bytes as 30 bits, out as bin)
     0b1110100111000110101100010 (from DB, decoded with hexlify -> int -> bin)
     0b1110100111000110101100010 (from DB, decoded with hexlify -> int -> bin -> BitArray instantiated with bin)
Traceback (most recent call last):
  File "./bitexample.py", line 38, in <module>
    from_db_using_bin = bin(int(results,16))
ValueError: invalid literal for int() with base 16: '\x01\xd3\x8db'

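Note that because your bit string cannot be broken down directly into bytes (it is a string representing 30 bits), the only way to get exactly the same string back is to specify a length, and even then the results are not consistent: they depend on how the BitArray was instantiated.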
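One way to sidestep that inconsistency is to fix a single padding convention and keep the bit length next to the bytes. The sketch below assumes Python 3 and uses only the standard library rather than bitstring. The functions encode_bitfield and decode_bitfield are hypothetical helpers, and storing the length (for example in a separate integer column) is an assumption, not something the original setup includes.

```python
def encode_bitfield(bits):
    """Return (bit_length, payload), with the bits left-aligned in the payload."""
    nbytes = (len(bits) + 7) // 8
    padded = bits.ljust(nbytes * 8, "0")  # right-pad to whole bytes
    payload = int(padded, 2).to_bytes(nbytes, "big") if nbytes else b""
    return len(bits), payload


def decode_bitfield(bit_length, payload):
    """Rebuild the exact bit string from the stored length and bytes."""
    all_bits = "".join(format(byte, "08b") for byte in payload)
    return all_bits[:bit_length]  # drop the padding bits


bitfield = "000001110100111000110101100010"

length, payload = encode_bitfield(bitfield)
roundtrip = decode_bitfield(length, payload)

print(length, payload.hex())
print(roundtrip == bitfield)
```

The payload here is a bytes object, which is the kind of value the question says PYODBC expects for a varbinary field. Whether a given driver version accepts it unchanged should be checked against its documentation.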

answered 2012-02-10T22:58:14.913