2

I am trying to convert:

datalist = [u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg'}",
 u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg'}"]

a list of unicode strings, each containing a Python dict literal, into actual dicts. If I try to extract a value by key, I get this error:

for i in datalist:
    print i['smallimage']

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-686ea4feba66> in <module>()
      1 for i in datalist:
----> 2     print i['smallimage']
      3 

TypeError: string indices must be integers

How can I convert a list of unicode strings containing dict literals into a list of dicts?

4

6 Answers

9

You could use the demjson module which has a non-strict mode that handles the data you have:

import demjson

for data in datalist:
    dct = demjson.decode(data)
    print dct['gallery'] # etc...
answered 2013-06-17T13:00:42.410
3

Just as another thought: each string in your list is well-formed YAML.

> import yaml
> yaml.safe_load(u'{foo: "bar"}')['foo']
'bar'

If you want to get really fancy and parse everything at once:

> data = yaml.safe_load('['+','.join(datalist)+']')
> data[0]['smallimage']
'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'
> data[3]['gallery']
'gal1'
answered 2013-06-17T13:09:51.657
3

In this case, I'd hand-craft a regular expression to turn these strings into something you can evaluate as Python:

import re
import ast
from functools import partial

keys = re.compile(r'(gallery|smallimage|largeimage)')
fix_keys = partial(keys.sub, r'"\1"')

for entry in datalist:
    entry = ast.literal_eval(fix_keys(entry))

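Yes, this is limited; but it works for this data set and is robust as long as the keys match. The regular expression is easy to maintain. It also uses no external dependencies; everything relies on the batteries already included.

Result: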

>>> for entry in datalist:
...     print ast.literal_eval(fix_keys(entry))
... 
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg'}
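If the key names are not known in advance, the same idea can be loosened a little. The following is only a sketch, written for Python 3 (the code above is Python 2), and it assumes that every key is a bare identifier that directly follows "{" or "," and that no string value contains such a sequence. The names BARE_KEY, quote_keys and parse_entry, and the shortened example.com URLs, are made up for illustration.

```python
import ast
import re

# Assumed pattern: a bare identifier that follows "{" or "," and precedes ":".
BARE_KEY = re.compile(r'([{,]\s*)([A-Za-z_]\w*)\s*:')

def quote_keys(text):
    """Wrap every bare key in double quotes so the text becomes a Python literal."""
    return BARE_KEY.sub(r'\1"\2":', text)

def parse_entry(text):
    """Quote the keys, then evaluate the result safely as a literal."""
    return ast.literal_eval(quote_keys(text))

# Hypothetical, shortened entry in the same shape as the question's data.
sample = ("{gallery: 'gal1', smallimage: 'http://example.com/small/2_12.jpg',"
          "largeimage: 'http://example.com/large/2_12.jpg'}")
entry = parse_entry(sample)
print(entry["smallimage"])
```

A value that itself contains something like ", name:" would be rewritten too, so this is no more general than the fixed-key version; it just avoids listing the keys.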
answered 2013-06-17T12:57:28.680
2

If your dictionary keys were quoted (and the strings used JSON's double quotes), you could use json.loads to load the strings.

import json
for i in datalist:
   print json.loads(i)['smallimage']

(ast.literal_eval would work as well...)

However, with the data as it is, this will work with an old-school eval:
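To make the "if the keys were quoted" condition concrete, here is a small sketch in Python 3 (the snippet above is Python 2). Both strings are hypothetical, shortened entries, not the question's actual data: one uses JSON-style double quotes throughout, the other has bare keys and single quotes like the question's strings.

```python
import json

# Hypothetical entry where keys and values use JSON's double quotes.
quoted = '{"gallery": "gal1", "smallimage": "http://example.com/small.jpg"}'
record = json.loads(quoted)
print(record["smallimage"])

# Hypothetical entry shaped like the question's data: bare keys, single quotes.
bare = "{gallery: 'gal1', smallimage: 'http://example.com/small.jpg'}"
try:
    json.loads(bare)
    bare_parsed = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    bare_parsed = False
print(bare_parsed)
```

The second string is rejected, which is why json.loads cannot be applied to the original list directly.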

>>> class Mdict(dict):
...     def __missing__(self,key):
...        return key
... 
>>> eval(datalist[0],Mdict(__builtins__=None))
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}

Note that this is probably vulnerable to injection attacks, so only use it if the strings come from a trusted source.


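Finally, for anyone who wants a short, albeit somewhat dense, solution that uses only the standard library and isn't vulnerable to injection attacks... this little gem does the trick (assuming the dictionary keys are valid identifiers)!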

import ast
class RewriteName(ast.NodeTransformer):
    def visit_Name(self,node):
        return ast.Str(s=node.id)

transformer = RewriteName()
for x in datalist:
    tree = ast.parse(x,mode='eval')
    transformer.visit(tree)
    print ast.literal_eval(tree)['smallimage']
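The snippet above targets Python 2. On recent Python 3 versions ast.Str is deprecated or gone, so a rough equivalent might use ast.Constant instead. This is a sketch under that assumption, not the answer's original code; QuoteBareNames, parse_loose_dict and the example.com URLs are illustrative names only.

```python
import ast

class QuoteBareNames(ast.NodeTransformer):
    """Replace every bare name (such as an unquoted key) with a string constant."""
    def visit_Name(self, node):
        return ast.copy_location(ast.Constant(value=node.id), node)

def parse_loose_dict(text):
    # Parse as an expression, rewrite the names, then evaluate as a literal.
    tree = ast.parse(text, mode="eval")
    tree = QuoteBareNames().visit(tree)
    return ast.literal_eval(tree)

# Hypothetical, shortened entry in the same shape as the question's data.
sample = ("{gallery: 'gal1', smallimage: 'http://example.com/s.jpg',"
          "largeimage: 'http://example.com/l.jpg'}")
record = parse_loose_dict(sample)
print(record["smallimage"])
```

Because ast.literal_eval only accepts literal nodes, anything other than names and literals in the input (a function call, for instance) is rejected with a ValueError rather than executed.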
answered 2013-06-17T12:51:33.487
0

Your datalist is a list of unicode strings.

You could use eval, except your keys are not properly quoted. What you can do is requote your keys on the fly with replace:

for i in datalist:
    my_dict = eval(i.replace("gallery", "'gallery'").replace("smallimage", "'smallimage'").replace("largeimage", "'largeimage'"))
    print my_dict["smallimage"]
answered 2013-06-17T13:01:23.890
-4

I don't see why all the extra stuff is needed, like using re or json...

fdict = {str(k): v for (k, v) in udict.items()}

where udict is the dict that has unicode keys; just convert them to str. With your given data, you could simply...

datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]

A simple test:

>>> datalist = [{u'a':1,u'b':2},{u'a':1,u'b':2}]
>>> datalist
[{u'a': 1, u'b': 2}, {u'a': 1, u'b': 2}]
>>> datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]
>>> datalist
[{'a': 1, 'b': 2}, {'a': 1, 'b': 2}]

No import re or import json. Simple and fast.

answered 2013-06-17T13:04:53.117