This is my OrderedDict object:

a=OrderedDict([(u'p', [u'"The Exam Room" is a new series in which everyday medical questions are answered by physicians and professors from the Yale School of Medicine.', u'In our second episode: Dr. Stephen Strittmatter, Vincent Coates Professor of Neurology and director of the Adler Memory Clinic in Neurology, explains when memory loss can become a problem and what you can do to boost your brain power.', OrderedDict([(u'em', u'Produced & Hosted by Noah Golden')])])])

What I want to do is get the text out of this object:

>>> a.get('p')

which gives this output:

[u'"The Exam Room" is a new series in which everyday medical questions are answered by physicians and professors from the Yale School of Medicine.', u'In our second episode: Dr. Stephen Strittmatter, Vincent Coates Professor of Neurology and director of the Adler Memory Clinic in Neurology, explains when memory loss can become a problem and what you can do to boost your brain power.', OrderedDict([(u'em', u'Produced & Hosted by Noah Golden')])]

But the result also contains an OrderedDict.

How can I combine the text from both the list and the nested OrderedDict?

Expected output:

"The Exam Room" is a new series in which everyday medical questions are answered by physicians and professors from the Yale School of Medicine. In our second episode: Dr. Stephen Strittmatter, Vincent Coates Professor of Neurology and director of the Adler Memory Clinic in Neurology, explains when memory loss can become a problem and what you can do to boost your brain power. Produced & Hosted by Noah Golden

2 Answers

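If you don't know in advance how the types are nested, the key here is recursion. Here is an example (the text has been reformatted for readability):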

#!/usr/bin/env python

import collections

a = collections.OrderedDict([(u'p', [u""" 
    "The Exam Room" is a new series in
    which everyday medical questions are answered by physicians and 
    professors from the Yale School of Medicine.""", 
    u"""In our second episode: Dr. Stephen Strittmatter,
    Vincent Coates Professor of Neurology and director of
    the Adler Memory Clinic in Neurology, explains when 
    memory loss can become a problem and what you can do to 
    boost your brain power.""", 
    collections.OrderedDict([(u'em',
        u'Produced & Hosted by Noah Golden')])])])

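Now flatten the object, which may be a mapping or a list. Three cases are handled: if the value found is a string, we simply append it to our collector; if it is a list or a Mapping, we call flatten again; anything else raises a ValueError. Note that you can specify which tags are allowed with the allowed kwarg: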

def flatten(obj, allowed=(u'p', u'em')):
    collector = []

    def process(v, collector=collector):
        if isinstance(v, (list, collections.Mapping)):
            collector += flatten(v, allowed=allowed)
        elif isinstance(v, basestring):
            collector.append(v)
        else:
            raise ValueError('Cannot handle type: {t}'.format(t=v.__class__))

    if isinstance(obj, list):
        for v in obj:
            process(v)

    if isinstance(obj, collections.Mapping):
        for k, v in obj.iteritems():
            if k in allowed:
                process(v)

    return collector

if __name__ == '__main__':
    print(flatten(a))

For your example, the result is a three-element list that looks like this:

[u'"The Exam Room" is a new series ...',
 u'In our second episode: ...',
 u'Produced & Hosted by Noah Golden']
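The code above targets Python 2: basestring, dict.iteritems and collections.Mapping are not available in current Python 3 releases. A minimal sketch of the same recursion for Python 3 follows. The name flatten_py3 and the short sample strings are placeholders chosen for illustration, not part of the original answer.

```python
from collections import OrderedDict
from collections.abc import Mapping


def flatten_py3(obj, allowed=('p', 'em')):
    """Collect every string nested in lists/mappings, in document order."""
    collector = []

    def process(value):
        if isinstance(value, str):
            collector.append(value)
        elif isinstance(value, (list, Mapping)):
            # Recurse into nested containers and keep their strings in order.
            collector.extend(flatten_py3(value, allowed=allowed))
        else:
            raise ValueError('Cannot handle type: {t}'.format(t=type(value)))

    if isinstance(obj, list):
        for value in obj:
            process(value)
    elif isinstance(obj, Mapping):
        for key, value in obj.items():
            if key in allowed:  # skip tags that were not asked for
                process(value)

    return collector


# Shortened stand-in for the structure in the question (sample text only).
doc = OrderedDict([('p', ['first paragraph.', 'second paragraph.',
                          OrderedDict([('em', 'credit line')])])])
parts = flatten_py3(doc)
print(parts)
print(' '.join(parts))
```

The string check comes first so that a str is never mistaken for a container, and the allowed filter behaves as in the original: values under keys outside allowed are skipped.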

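Now, if you want a single string, just join the flattened list, using a space as the separator so the pieces do not run together:

print(u' '.join(flatten(a)))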
answered 2014-04-10T14:36:21.943

This is an odd dictionary, but you can get what you want like this:

[a['p'][0],a['p'][1] + u' ' + a['p'][2]['em']]

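Result:

[u'"The Exam Room" is a new series in which everyday medical questions are answered by physicians and professors from the Yale School of Medicine.', u'In our second episode: Dr. Stephen Strittmatter, Vincent Coates Professor of Neurology and director of the Adler Memory Clinic in Neurology, explains when memory loss can become a problem and what you can do to boost your brain power. Produced & Hosted by Noah Golden']

This returns a list, as you asked for in your question. If you want a single string instead: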

import string
string.join([a['p'][0],a['p'][1],a['p'][2]['em']])

This results in:

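"The Exam Room" is a new series in which everyday medical questions are answered by physicians and professors from the Yale School of Medicine. In our second episode: Dr. Stephen Strittmatter, Vincent Coates Professor of Neurology and director of the Adler Memory Clinic in Neurology, explains when memory loss can become a problem and what you can do to boost your brain power. Produced & Hosted by Noah Golden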
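Note that string.join is a Python 2 function (its default separator is a single space) and was removed from the string module in Python 3. A small sketch of the equivalent using the str.join method, which works in both versions, might look like this. The short sample strings stand in for the real text and are only illustrative.

```python
from collections import OrderedDict

# Shortened stand-in for the object in the question (sample text only).
a = OrderedDict([('p', ['First sentence.', 'Second sentence.',
                        OrderedDict([('em', 'Credit line')])])])

p = a['p']
# Pick the two plain strings and the nested 'em' value, then join with spaces.
text = ' '.join([p[0], p[1], p[2]['em']])
print(text)
```

This still hard-codes the positions of the three pieces, so it only fits data shaped exactly like the example.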

answered 2014-04-10T14:24:23.623