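I want to merge two dictionaries A and B so that the result contains:
- all pairs from A whose key is unique to A
- all pairs from B whose key is unique to B
- f(valueA, valueB) for every key that exists in both A and B
For example:
def f(x, y):
    return x * y
A = {1:1, 2:3}
B = {7:3, 2:2}
C = merge(A, B)
Output:
{1:1, 7:3, 2:6}
It feels like there should be a nice one-liner to do this.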
Use dictionary views to achieve this; the result of dict.viewkeys() acts like a set, so you can take intersections and symmetric differences:
def merge(A, B, f):
    # Start with symmetric difference; keys either in A or B, but not both
    merged = {k: A.get(k, B.get(k)) for k in A.viewkeys() ^ B.viewkeys()}
    # Update with `f()` applied to the intersection
    merged.update({k: f(A[k], B[k]) for k in A.viewkeys() & B.viewkeys()})
    return merged
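In Python 3, the .viewkeys() method was renamed to .keys(), replacing the old .keys() behaviour (which returned a list in Python 2).
The merge() function above is a generic solution that works for any given f().
Demo: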
>>> def f(x, y):
...     return x * y
...
>>> A = {1:1, 2:3}
>>> B = {7:3, 2:2}
>>> merge(A, B, f)
{1: 1, 2: 6, 7: 3}
>>> merge(A, B, lambda a, b: '{} merged with {}'.format(a, b))
{1: 1, 2: '3 merged with 2', 7: 3}
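For readers on Python 3 only, the same two-step idea might look like the sketch below. It assumes nothing beyond the built-in dict views; the name merge_py3 is made up here so it does not clash with merge() above.

```python
def merge_py3(A, B, f):
    # Keys found in exactly one of A or B: take whichever value exists.
    merged = {k: A.get(k, B.get(k)) for k in A.keys() ^ B.keys()}
    # Keys found in both: combine the two values with f().
    merged.update({k: f(A[k], B[k]) for k in A.keys() & B.keys()})
    return merged


A = {1: 1, 2: 3}
B = {7: 3, 2: 2}
print(merge_py3(A, B, lambda x, y: x * y))
```

The printed dictionary holds the same pairs as the demo above, although the key order may differ.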
Stealing the A.get(k, B.get(k)) snippet from @MartijnPieters:
>>> def f(x, y):
...     return x * y
...
>>> A = {1:1, 2:3}
>>> B = {7:3, 2:2}
>>> {k: f(A[k], B[k]) if k in A and k in B else A.get(k, B.get(k))
...  for k in A.viewkeys() | B.viewkeys()}
{1: 1, 2: 6, 7: 3}
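Here is my solution code in Python 3 for the general case.
I first wrote the merge function and then extended it to the more general merge_with function, which takes a function and any number of dictionaries. If there are duplicate keys among those dictionaries, the supplied function is applied to the values that share a key.
The merge function can be redefined in terms of merge_with, as is done for merger. The name merger means: merge them all and keep the rightmost value when there are duplicates. The same goes for mergel, which keeps the leftmost value.
All the functions here (merge, merge_with, mergel and merger) are generic in that they take an arbitrary number of dictionary arguments. In particular, merge_with must be given a function that is compatible with the data it will be applied to.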
from functools import reduce
from operator import or_

def merge(*dicts):
    return { k: reduce(lambda d, x: x.get(k, d), dicts, None)
             for k in reduce(or_, map(lambda x: x.keys(), dicts), set()) }

def merge_with(f, *dicts):
    return { k: (lambda x: f(*x) if len(x)>1 else x[0])([ d[k] for d in dicts
                                                          if k in d ])
             for k in reduce(or_, map(lambda x: x.keys(), dicts), set()) }

mergel = lambda *dicts: merge_with(lambda *x: x[0], *dicts)
merger = lambda *dicts: merge_with(lambda *x: x[-1], *dicts)
Tests
>>> squares = { k:k*k for k in range(4) }
>>> squares
{0: 0, 1: 1, 2: 4, 3: 9}
>>> cubes = { k:k**3 for k in range(2,6) }
>>> cubes
{2: 8, 3: 27, 4: 64, 5: 125}
>>> merger(squares, cubes)
{0: 0, 1: 1, 2: 8, 3: 27, 4: 64, 5: 125}
>>> merger(cubes, squares)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 64, 5: 125}
>>> mergel(squares, cubes)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 64, 5: 125}
>>> mergel(cubes, squares)
{0: 0, 1: 1, 2: 8, 3: 27, 4: 64, 5: 125}
>>> merge(squares, cubes)
{0: 0, 1: 1, 2: 8, 3: 27, 4: 64, 5: 125}
>>> merge(cubes, squares)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 64, 5: 125}
>>> merge_with(lambda x, y: x+y, squares, cubes)
{0: 0, 1: 1, 2: 12, 3: 36, 4: 64, 5: 125}
>>> merge_with(lambda x, y: x*y, squares, cubes)
{0: 0, 1: 1, 2: 32, 3: 243, 4: 64, 5: 125}
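One detail worth spelling out: this version of merge_with calls f once per key with all the values for that key at the same time. With three dictionaries, a two-argument function would be called with three arguments for any key present in all of them, which raises TypeError. The sketch below repeats the definition so it runs on its own and passes a variadic function instead; the choice of sum is only an illustration.

```python
from functools import reduce
from operator import or_


def merge_with(f, *dicts):
    # Same definition as above, repeated so this snippet is self-contained.
    return {k: (lambda x: f(*x) if len(x) > 1 else x[0])(
                [d[k] for d in dicts if k in d])
            for k in reduce(or_, map(lambda x: x.keys(), dicts), set())}


squares = {k: k * k for k in range(4)}
cubes = {k: k ** 3 for k in range(2, 6)}

# A variadic function copes with keys that occur in two or in three dicts.
total = merge_with(lambda *xs: sum(xs), squares, cubes, squares)
print(total == {0: 0, 1: 2, 2: 16, 3: 45, 4: 64, 5: 125})  # True
```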
Update
After writing the above, I found there is another way to do this.
from functools import reduce

def merge(*dicts):
    return reduce(lambda d1, d2: reduce(lambda d, t:
                                        dict(list(d.items())+[t]),
                                        d2.items(), d1),
                  dicts, {})

def merge_with(f, *dicts):
    return reduce(lambda d1, d2: reduce(lambda d, t:
                                        dict(list(d.items()) +
                                             [(t[0], f(d[t[0]], t[1])
                                               if t[0] in d else
                                               t[1])]),
                                        d2.items(), d1),
                  dicts, {})

mergel = lambda *dicts: merge_with(lambda x, y: x, *dicts)
merger = lambda *dicts: merge_with(lambda x, y: y, *dicts)
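Note that the definitions of mergel and merger in terms of merge_with have changed: each now passes a new function as the first argument. The function f must now be binary. The tests given above still work. Here are some more tests to show the generality of these functions.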
>>> merge() == {}
True
>>> merge(squares) == squares
True
>>> merge(cubes) == cubes
True
>>> mergel() == {}
True
>>> mergel(squares) == squares
True
>>> mergel(cubes) == cubes
True
>>> merger() == {}
True
>>> merger(squares) == squares
True
>>> merger(cubes) == cubes
True
>>> merge_with(lambda x, y: x+y, squares, cubes, squares)
{0: 0, 1: 2, 2: 16, 3: 45, 4: 64, 5: 125}
>>> merge_with(lambda x, y: x*y, squares, cubes, squares)
{0: 0, 1: 1, 2: 128, 3: 2187, 4: 64, 5: 125}
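The nested reduce above builds a fresh dictionary for every single item, so its running time grows quadratically with the number of items. A sketch of the same left-to-right fold that mutates one accumulator instead is shown below; merge_with_linear is a made-up name, and f is still assumed to be binary.

```python
from functools import reduce


def merge_with_linear(f, *dicts):
    def step(acc, d):
        # Fold one dictionary into the accumulator in place.
        for k, v in d.items():
            acc[k] = f(acc[k], v) if k in acc else v
        return acc
    # The initial {} is created per call, so the inputs are never mutated.
    return reduce(step, dicts, {})


squares = {k: k * k for k in range(4)}
cubes = {k: k ** 3 for k in range(2, 6)}
print(merge_with_linear(lambda x, y: x + y, squares, cubes, squares))
```

It should give the same pairs as the merge_with tests above while touching each item only once.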
dict(list(A.items()) + list(B.items()) + [(k,f(A[k],B[k])) for k in A.keys() & B.keys()])
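In my opinion this is the shortest and most readable code in Python 3. I derived it from DhruvPathak's answer, and realized that optimizing it leads to kampu's answer, specialized for Python 3: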
dict(itertools.chain(A.items(), B.items(), ((k,f(A[k],B[k])) for k in A.keys() & B.keys())))
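I compared the performance of all the answers here and got this ranking:
mergeLZ: 34.0ms (Lei Zhao, a rather bulky one-liner)
mergeJK: 11.6ms (jamylak)
mergeMP: 11.5ms (Martijn Pieters, almost a one-liner)
mergeDP: 6.9ms (DhruvPathak)
mergeDS: 6.8ms (the first one-liner above)
mergeK3: 5.2ms (kampu = the second one-liner above)
mergeS3: 3.5ms (imperative, not a one-liner)
The last one, mergeS3, is naive, imperative, multi-line code. I'm disappointed that the old way wins on performance. This test uses simple integer keys and values, but the ranking is very similar for large string keys and values. Obviously, mileage may vary with dictionary size and the amount of key overlap (1/3 in my test). By the way, Lei Zhao's second implementation, which I haven't tried to understand, seems to perform terribly: about 1000 times slower.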
Code:
import functools
import itertools
import operator
import timeit

def t(x): # transform keys and values
    return x # str(x) * 8

def f(x,y): # merge values
    return x + y

N = 10000
A = {t(k*2): t(k*22) for k in range(N)}
B = {t(k*3): t(k*33) for k in range(N)}

def check(AB):
    assert(len(A) == N)
    assert(len(B) == N)
    assert(len(AB) == 16666)
    assert(AB[t(0)] == f(t(0), t(0)))
    assert(t(1) not in AB)
    assert(AB[t(2)] == t(1*22))
    assert(AB[t(3)] == t(1*33))
    assert(AB[t(4)] == t(2*22))
    assert(t(5) not in AB)
    assert(AB[t(6)] == f(t(3*22), t(2*33)))
    assert(t(7) not in AB)
    assert(AB[t(8)] == t(4*22))
    assert(AB[t(9)] == t(3*33))

def mergeLZ(): # Lei Zhao
    merged = {k: (lambda x: f(*x) if len(x)>1 else x[0])([ d[k] for d in [A, B]
                                                           if k in d ])
              for k in functools.reduce(operator.or_, map(lambda x: x.keys(), [A, B]), set()) }
    check(merged)

def mergeJK(): # jamylak
    merged = {k: f(A[k], B[k]) if k in A and k in B else A.get(k, B.get(k)) for k in A.keys() | B.keys()}
    check(merged)

def mergeMP(): # Martijn Pieters
    merged = {k: A.get(k, B.get(k)) for k in A.keys() ^ B.keys()}
    merged.update({k: f(A[k], B[k]) for k in A.keys() & B.keys()})
    check(merged)

def mergeDP(): # DhruvPathak
    merged = dict([(k,v) for k,v in A.items()] + [ (k,v) if k not in A else (k,f(A[k],B[k])) for k,v in B.items()])
    check(merged)

def mergeDS(): # more elegant (IMO) variation on DhruvPathak
    merged = dict(list(A.items()) + list(B.items()) + [(k,f(A[k],B[k])) for k in A.keys() & B.keys()])
    check(merged)

def mergeK3(): # kampu adapted to Python 3
    merged = dict(itertools.chain(A.items(), B.items(), ((k,f(A[k],B[k])) for k in A.keys() & B.keys())))
    check(merged)

def mergeS3(): # "naive" imperative way
    merged = A.copy()
    for k,v in B.items():
        if k in A:
            merged[k] = f(A[k], v)
        else:
            merged[k] = v
    check(merged)

for m in [mergeLZ, mergeJK, mergeMP, mergeDP, mergeDS, mergeK3, mergeS3]:
    print("{}: {:4.1f}ms".format(m.__name__, timeit.timeit(m, number=1000)))
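Since the numbers depend on dictionary size and overlap, it may be worth re-running the comparison at several sizes. The following is only a rough sketch of how that could be done for the imperative variant; merge_imperative and best_ms are hypothetical helpers, and the sizes and repeat counts are arbitrary.

```python
import timeit


def merge_imperative(A, B, f):
    # Same logic as mergeS3 above, but taking its inputs as parameters.
    merged = A.copy()
    for k, v in B.items():
        merged[k] = f(A[k], v) if k in A else v
    return merged


def best_ms(n, repeat=5, number=20):
    # Hypothetical helper: best per-call time in milliseconds for dicts of size n.
    A = {k * 2: k * 22 for k in range(n)}
    B = {k * 3: k * 33 for k in range(n)}
    timings = timeit.repeat(lambda: merge_imperative(A, B, lambda x, y: x + y),
                            repeat=repeat, number=number)
    # The minimum is usually the least noisy of the repeated measurements.
    return min(timings) / number * 1000


for n in (100, 1000, 10000):
    print("n={}: {:.3f}ms".format(n, best_ms(n)))
```

The other variants could be measured the same way by swapping the function passed to timeit.repeat.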
>>> def f(x,y):
...     return x*y
...
>>> dict([(k,v) for k,v in A.items()] + [ (k,v) if k not in A else (k,f(A[k],B[k])) for k,v in B.items()])
{1: 1, 2: 6, 7: 3}
from itertools import chain
intersection = set(A.keys()).intersection(B.keys())
C = dict(chain(A.items(), B.items(), ((k, f(A[k], B[k])) for k in intersection)))
Technically this can be made into a one-liner. It works in both Py2 and Py3. If you only care about Py3, you can rewrite the 'intersection' line as:
intersection = A.keys() & B.keys()
(For Py2 only, use A.viewkeys() & B.viewkeys() instead.)
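Folded into a single expression and wrapped in a function, the Py3 form might look like this sketch. The name merge_chain is made up for illustration.

```python
from itertools import chain


def merge_chain(A, B, f):
    # dict() keeps the last pair seen for each key, so the combined values
    # for shared keys overwrite the plain values taken from A and B.
    return dict(chain(A.items(), B.items(),
                      ((k, f(A[k], B[k])) for k in A.keys() & B.keys())))


A = {1: 1, 2: 3}
B = {7: 3, 2: 2}
print(merge_chain(A, B, lambda x, y: x * y) == {1: 1, 2: 6, 7: 3})  # True
```

The trick relies on ordering: the pairs for the intersection must come last in the chain.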
A different approach that is (IMHO) more readable for users with a functional programming background:
def merge_with(f):
    def merge(a,b):
        g = lambda l: [x for x in l if x is not None]
        keys = a.keys() | b.keys()
        return {key:f(*g([a.get(key), b.get(key)])) for key in keys}
    return merge
Applying this to the OP's example:
A = {1:1, 2:3}
B = {7:3, 2:2}
merge_with(lambda x,y=1: x*y)(A,B)
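Two things to keep in mind with this version: f needs a default for its second argument (hence y=1), and a stored value of None is treated as if the key were missing. If that matters, one possible variation is to use a private sentinel instead of None, as in the sketch below. The names _MISSING and merge_with_sentinel are made up for illustration.

```python
_MISSING = object()  # hypothetical sentinel, distinct from any stored value


def merge_with_sentinel(f):
    def merge(a, b):
        def combine(key):
            x, y = a.get(key, _MISSING), b.get(key, _MISSING)
            if x is _MISSING:
                return y
            if y is _MISSING:
                return x
            return f(x, y)
        return {key: combine(key) for key in a.keys() | b.keys()}
    return merge


A = {1: 1, 2: 3}
B = {7: 3, 2: 2}
print(merge_with_sentinel(lambda x, y: x * y)(A, B) == {1: 1, 2: 6, 7: 3})  # True
```

Here f is only ever called with two arguments, so it needs no default.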
def merge_dict(dict1, dict2):
    # update() overwrites values for duplicate keys; it does not apply f()
    dict1.update(dict2)
    print('dict3 =', dict1)

dict1 = {1: 'red'}
dict2 = {2: 'black', 3: 'yellow'}
merge_dict(dict1, dict2)
Output:
dict3 = {1: 'red', 2: 'black', 3: 'yellow'}