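For some strange reason, my Python code stopped working after I switched from Ubuntu 12 to Ubuntu 14. I can no longer unpickle my data. I stored the data in a CouchDB database by converting it to latin1 encoding.
I use latin1 because I read some time ago (I no longer have the link) that it is the only encoding I can use to store and retrieve cPickled binary data from a CouchDB database. This was meant to avoid encoding problems with JSON (couchdbkit uses JSON in the background).
Latin1 is supposed to map 256 characters to 256 characters, which would be exactly one byte to one byte. Now, after the system upgrade, Python seems to complain that there are only 128 valid values and throws a UnicodeDecodeError (see below).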
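The one-to-one property described above can be checked directly. This is a minimal sketch, independent of CouchDB and couchdbkit; the helper name `latin1_round_trip` is made up for illustration, and the code is written so that it runs on Python 3 as well as 2.7.

```python
def latin1_round_trip(raw):
    """Decode bytes as latin1 text, then encode the text back to bytes."""
    text = raw.decode("latin1")
    return text, text.encode("latin1")

# All 256 possible byte values, built in a way that works on Python 2 and 3.
all_bytes = bytes(bytearray(range(256)))

text, restored = latin1_round_trip(all_bytes)

# Each byte maps to exactly one code point with the same numeric value,
# and encoding the text again gives back the identical bytes.
assert len(text) == 256
assert [ord(ch) for ch in text] == list(range(256))
assert restored == all_bytes
```

So latin1 itself covers all 256 values; the limit of 128 in the error message belongs to the ascii codec, not to latin1.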
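- old Python version: 2.7.3
- old CouchDB version: 1.6.1
- old couchdbkit version: 0.5.7
- new Python version: 2.7.6
- new CouchDB version: 1.6.1 (unchanged)
- new couchdbkit version: 0.6.5
I'm not sure whether you need all of these details, but here are some of the definitions I use: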
#deals with all the errors when saving an item
def saveitem(item):
    item.set_db(self.db)
    item["_id"] = key
    error = True
    while error:
        try:
            item.save()
            error = False
        except ResourceConflict:
            try:
                item = DBEntry.get_or_create(key)
            except ResourceConflict:
                pass
        except (NoMoreData) as e:
            print "CouchDB.set.saveitem: NoMoreData error, retrying...", str(e)
        except (RequestError) as e:
            print "CouchDB.set.saveitem: RequestError error. retrying...", str(e)
#deals with most of what could go wrong when adding an attachment
def addattachment(item, content, name = "theattachment"):
    key = item["_id"]
    error = True
    while error:
        try:
            item.put_attachment(content = content, name = name) #, content_type = "application/octet-stream"
            error = False
        except ResourceConflict:
            try:
                item = DBEntry.get_or_create(key)
            except ResourceConflict:
                print "addattachment ResourceConflict, retrying..."
            except NoMoreData:
                print "addattachment NoMoreData, retrying..."
        except (NoMoreData) as e:
            print key, ": no more data exception, waiting 1 sec and retrying... -> ", str(e)
            time.sleep(1)
            item = DBEntry.get_or_create(key)
        except (IOError) as e:
            print "addattachment IOError:", str(e), "repeating..."
            item = DBEntry.get_or_create(key)
        except (KeyError) as e:
            print "addattachment error:", str(e), "repeating..."
            try:
                item = DBEntry.get_or_create(key)
            except ResourceConflict:
                pass
            except (NoMoreData) as e:
                pass
Then I save like this:
pickled = cPickle.dumps(obj = value, protocol = 2)
pickled = pickled.decode('latin1')
item = DBEntry(content={"seeattachment": True, "ispickled" : True},
               creationtm=datetime.datetime.utcnow(),lastaccesstm=datetime.datetime.utcnow())
item = saveitem(item)
addattachment(item, pickled)
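To isolate the encoding step from the database, the pickle-then-latin1 conversion used when saving can be exercised on its own. This is a sketch only: the helper names `to_latin1_text` and `from_latin1_text` are hypothetical, `pickle` stands in for `cPickle` so the code also runs on Python 3, and no CouchDB or couchdbkit call is involved.

```python
import pickle

def to_latin1_text(value):
    """Pickle with protocol 2 and expose the raw bytes as latin1 text."""
    raw = pickle.dumps(value, protocol=2)
    return raw.decode("latin1")

def from_latin1_text(text):
    """Reverse of to_latin1_text: text back to bytes, then unpickle."""
    return pickle.loads(text.encode("latin1"))

sample = {"name": "example", "values": [1, 2, 3]}
text = to_latin1_text(sample)
restored = from_latin1_text(text)

# A protocol 2 pickle starts with the PROTO opcode, byte 0x80, so the
# first character of the text is U+0080, which is outside the ASCII range.
first_code_point = ord(text[0])
```

As long as the text stays text, the round trip is lossless. The first character being U+0080 matters as soon as something re-encodes the text (for example as UTF-8) on its way to or from the database.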
This is how I unpack the data. The data was written under Ubuntu 12, and unpacking fails under Ubuntu 14:
def unpackValue(self, value, therawkey):
    if value is None: return None
    originalval = value
    value = value["content"]
    result = None
    if value.has_key("realcontent"):
        result = value["realcontent"]
    elif value.has_key("seeattachment"):
        if originalval.has_key("_attachments"):
            if originalval["_attachments"].has_key("theattachment"):
                if originalval["_attachments"]["theattachment"].has_key("data"):
                    result = originalval["_attachments"]["theattachment"]["data"]
                    result = base64.b64decode(result)
                else:
                    print "unpackvalue: no data in attachment. Here is how it looks like:"
                    print originalval["_attachments"]["theattachment"].iteritems()
        else:
            error = True
            while error:
                try:
                    result = self.db.fetch_attachment(therawkey, "theattachment")
                    error = False
                except ResourceConflict:
                    print "could not get attachment for", therawkey, "retrying..."
                    time.sleep(1)
                except ResourceNotFound:
                    self.delete(key = therawkey, rawkey = True)
                    return None
        if value["ispickled"]:
            result = cPickle.loads(result.encode('latin1'))
    else:
        result = value
    if isinstance(result, unicode): result = result.encode("utf8")
    return result
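The line result = cPickle.loads(result.encode('latin1'))
succeeds under Ubuntu 12 but fails under Ubuntu 14, with the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128)
I did not get that error under Ubuntu 12!
How can I read my data under Ubuntu 14 while keeping the newer couchdbkit and Python versions? Is this even a versioning problem? Why does this error occur?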
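One possible reading of the error, sketched under assumptions: byte 0xc2 at position 0 is what U+0080 (the first character of a protocol 2 pickle decoded as latin1) looks like once it has been encoded as UTF-8. In Python 2, calling `.encode('latin1')` on a byte string first decodes it implicitly with the ascii codec, and that implicit step would raise exactly this message. Whether `fetch_attachment` in couchdbkit 0.6.5 really returns a UTF-8 byte string where 0.5.7 returned unicode is an assumption and is not confirmed by anything in the post. The helper `attachment_to_pickle_bytes` below is hypothetical, and the ascii decode is written out explicitly so the sketch also runs on Python 3.

```python
import pickle

def attachment_to_pickle_bytes(data):
    """Return raw pickle bytes from an attachment saved as latin1 text.

    Assumption: the attachment arrives either as text, or as bytes
    holding the UTF-8 encoding of that text.
    """
    if isinstance(data, bytes):      # `bytes` is `str` on Python 2
        data = data.decode("utf-8")
    return data.encode("latin1")

original = pickle.dumps([1, 2, 3], protocol=2)
stored_text = original.decode("latin1")      # what the save code sends
utf8_bytes = stored_text.encode("utf-8")     # assumed form on retrieval

# U+0080 becomes the two bytes 0xc2 0x80 in UTF-8.
first_byte = bytearray(utf8_bytes)[0]

# Explicit version of the implicit ascii decode that Python 2 performs
# when .encode() is called on a byte string.
try:
    utf8_bytes.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

from_bytes = pickle.loads(attachment_to_pickle_bytes(utf8_bytes))
from_text = pickle.loads(attachment_to_pickle_bytes(stored_text))
```

If the assumption holds, passing the fetched attachment through a helper like this before `cPickle.loads` would accept both the old and the new return type. If the attachment turns out to be stored in some other form, the helper does not apply as written.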