9

I have a Python string in the following format:

str = "name: srek age :24 description: blah blah"

Is there any way to convert it into a dictionary that looks like this?

{'name': 'srek', 'age': '24', 'description': 'blah blah'}  

where each entry is a (key, value) pair taken from the string. I tried splitting the string into a list with

str.split()  

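and then manually removing the colons, checking each tag name, and adding the entries to a dictionary. This approach has two drawbacks: it is clumsy, because I have to strip the colon by hand for every pair, and a multi-word value (for example blah blah for description) ends up as separate entries in the list, one per word, which is not what I want. Is there a Pythonic way to get the dictionary (using Python 2.7)?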

4

3 Answers

35
>>> r = "name: srek age :24 description: blah blah"
>>> import re
>>> regex = re.compile(r"\b(\w+)\s*:\s*([^:]*)(?=\s+\w+\s*:|$)")
>>> d = dict(regex.findall(r))
>>> d
{'age': '24', 'name': 'srek', 'description': 'blah blah'}

Explanation:

\b           # Start at a word boundary
(\w+)        # Match and capture a single word (1+ alphanumeric or underscore characters)
\s*:\s*      # Match a colon, optionally surrounded by whitespace
([^:]*)      # Match and capture any number of non-colon characters
(?=          # Make sure that we stop when the following can be matched:
 \s+\w+\s*:  #  the next dictionary key
|            # or
 $           #  the end of the string
)            # End of lookahead
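
The same pattern can be written with re.VERBOSE so that the explanation travels with the code. This is only a minimal sketch, not part of the original answer: the names PAIR_RE and parse_pairs are hypothetical, and it assumes every key is a single word and no value contains a colon.

```python
import re

# Same pattern as above, spelled out in verbose mode (whitespace and
# comments inside the pattern are ignored by the regex engine).
PAIR_RE = re.compile(r"""
    \b(\w+)             # key: a single word
    \s*:\s*             # colon, optionally surrounded by whitespace
    ([^:]*)             # value: any run of non-colon characters
    (?=\s+\w+\s*:|$)    # stop before the next key or at the end of the string
""", re.VERBOSE)


def parse_pairs(text):
    """Return a dict built from the key/value pairs in text (hypothetical helper)."""
    return dict(PAIR_RE.findall(text))


print(parse_pairs("name: srek age :24 description: blah blah"))
```

Because the value group excludes colons, a value such as a URL or a time like 09:07 would be split incorrectly; such input would need a different delimiter or an explicit list of known keys.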
answered 2012-04-30T09:07:56.457
3

Without re:

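r = "name: srek age :24 description: blah blah cat: dog stack:overflow"
lis = r.split(':')
dic = {}
try:
    # Walk the pieces from the last one backwards. reversed() reads the list
    # live, so x is a piece whose trailing key has already been stripped off.
    for i, x in enumerate(reversed(lis)):
        i += 1
        # The last word of the preceding piece is the key for x.
        slast = lis[-(i + 1)].split()
        # strip() matters only for the final value, which is never re-joined.
        dic[slast[-1]] = x.strip()
        # Put back the rest of that piece: it is the previous key's value.
        lis[-(i + 1)] = " ".join(slast[:-1])
except IndexError:
    # Raised once we step past the first piece: every pair has been collected.
    pass
print(dic)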

{'age': '24', 'description': 'blah blah', 'stack': 'overflow', 'name': 'srek', 'cat': 'dog'}
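
The same split-on-colon idea can also be scanned forward, which avoids mutating the list and relying on IndexError to end the loop. The following is a sketch under stated assumptions rather than the answerer's method: parse_by_split is a hypothetical name, the input is assumed to be well formed (each piece between two colons ends with the next key), and runs of whitespace inside a value are collapsed to single spaces.

```python
def parse_by_split(text):
    """Parse 'key: value key2: value2' without re (hypothetical helper)."""
    pieces = text.split(':')
    if len(pieces) < 2:
        # No colon at all, so there are no key/value pairs to extract.
        return {}
    result = {}
    # The first piece holds only the first key.
    key = pieces[0].split()[-1]
    # Each middle piece is "<value of the current key> <next key>".
    for piece in pieces[1:-1]:
        words = piece.split()
        result[key] = " ".join(words[:-1])
        key = words[-1]
    # The last piece is the final value; no key follows it.
    result[key] = pieces[-1].strip()
    return result


print(parse_by_split("name: srek age :24 description: blah blah cat: dog stack:overflow"))
```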
answered 2012-04-30T09:07:22.740
0

Another variation of Aswini's program that displays the pairs in their original order:

mystr = "name: srek age :24 description: blah blah cat: dog stack:overflow"
mlist = mystr.split(':')
result = {}
keys = []
values = []
try:
    for i, x in enumerate(reversed(mlist)):
        i = i + 1
        # The last word of the preceding piece is the key for x.
        cut = mlist[-(i + 1)].split()
        key = cut[-1]
        keys.append(key)
        values.append(x.strip())
        result[key] = x.strip()
        # Leave only the previous key's value in that piece.
        mlist[-(i + 1)] = " ".join(cut[:-1])
except IndexError:
    # Stepped past the first piece: all pairs have been collected.
    pass

# Pairs were collected back to front, so reverse them to restore source order.
print(list(zip(keys[::-1], values[::-1])))

Output:

[('name', 'srek'), ('age', '24'), ('description', 'blah blah'), ('cat', 'dog'), ('stack', 'overflow')]
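
If the goal is a dictionary that keeps the original order, one option is to feed the regex matches from the first answer into collections.OrderedDict, which is available from Python 2.7 onward. This is a sketch that combines the two answers, not something either answerer proposed; parse_ordered is a hypothetical name, and the same limits apply (single-word keys, no colons inside values).

```python
import re
from collections import OrderedDict

# Pattern taken from the regex answer in this thread.
PAIR_RE = re.compile(r"\b(\w+)\s*:\s*([^:]*)(?=\s+\w+\s*:|$)")


def parse_ordered(text):
    """Return the pairs as an OrderedDict in source order (hypothetical helper)."""
    # findall yields (key, value) tuples left to right, and OrderedDict
    # remembers the order in which keys are inserted.
    return OrderedDict(PAIR_RE.findall(text))


pairs = parse_ordered("name: srek age :24 description: blah blah cat: dog stack:overflow")
print(list(pairs.items()))
```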

answered 2013-01-25T03:32:48.697