9

I want to take two dictionaries and print a diff of them. The diff should include differences in both keys and values. I've created this little snippet to get the result using built-in code in the unittest module. However, it's a nasty hack, since I have to subclass unittest.TestCase and provide a runTest() method for it to work. In addition, this code will cause the application to error out, since it raises an AssertionError when there are differences. All I really want is to print the diff.

import unittest

class tmp(unittest.TestCase):
    def __init__(self):
        super(tmp, self).__init__()
        # Show full diff of objects (dicts could be HUGE and output truncated)
        self.maxDiff = None

    def runTest(self):
        pass

_ = tmp()
_.assertDictEqual(d1, d2)

I was hoping to use the difflib module, but it looks to only work for strings. Is there some way to work around this and still use difflib?

4

7 Answers

7

Adapted from the CPython source:

https://github.com/python/cpython/blob/01fd68752e2d2d0a5f90ae8944ca35df0a5ddeaa/Lib/unittest/case.py#L1091

import difflib
import pprint

def compare_dicts(d1, d2):
    return ('\n' + '\n'.join(difflib.ndiff(
                   pprint.pformat(d1).splitlines(),
                   pprint.pformat(d2).splitlines())))
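
As a usage sketch (Python 3; the sample dicts and the `width` parameter are my own additions, not part of the CPython code): pprint formats small dicts on a single line, so the diff then compares one long line against another. Passing `width=1` to `pprint.pformat` should force one key per line, which makes the output easier to read.

```python
import difflib
import pprint

def compare_dicts(d1, d2, width=80):
    """Return an ndiff-style text diff of two pretty-printed dicts."""
    return '\n'.join(difflib.ndiff(
        pprint.pformat(d1, width=width).splitlines(),
        pprint.pformat(d2, width=width).splitlines()))

# Sample data, made up for illustration.
d1 = {'a': 1, 'b': 2}
d2 = {'a': 1, 'b': 3}

# width=1 makes pprint put each key on its own line.
report = compare_dicts(d1, d2, width=1)
print(report)
```

Lines starting with "- " come only from d1, lines starting with "+ " only from d2, and lines starting with "? " are ndiff's hints about where within a line the change is.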
Answered 2015-05-20T18:10:04.593
4

You can use difflib, but the unittest approach seems more appropriate to me. If you do want to use difflib, say the following are the two dicts.

In [50]: dict1
Out[50]: {1: True, 2: False}

In [51]: dict2
Out[51]: {1: False, 2: True}

You may need to convert them to strings (or lists of strings) and then use difflib as usual.

In [43]: a = '\n'.join(['%s:%s' % (key, value) for (key, value) in sorted(dict1.items())])
In [44]: b = '\n'.join(['%s:%s' % (key, value) for (key, value) in sorted(dict2.items())])
In [45]: print a
1:True
2:False
In [46]: print b
1:False
2:True
In [47]: for diffs in difflib.unified_diff(a.splitlines(), b.splitlines(), fromfile='dict1', tofile='dict2'):
    print diffs

The output would be:

--- dict1

+++ dict2

@@ -1,2 +1,2 @@

-1:True
-2:False
+1:False
+2:True
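
For reference, here is a rough Python 3 sketch of the same idea wrapped in a function. The function name `dict_unified_diff` is hypothetical, and it assumes the keys of each dict can be sorted against each other. Passing `lineterm=''` is my addition: it stops difflib from appending a newline to the header lines, so printing them does not produce blank lines between them.

```python
import difflib

def dict_unified_diff(d1, d2, fromfile='dict1', tofile='dict2'):
    """Return unified-diff lines for two dicts, one 'key:value' line per item."""
    # Assumes the keys of each dict are mutually orderable.
    a = ['%s:%s' % (k, v) for k, v in sorted(d1.items())]
    b = ['%s:%s' % (k, v) for k, v in sorted(d2.items())]
    # lineterm='' keeps difflib from adding '\n' to the control lines.
    return list(difflib.unified_diff(
        a, b, fromfile=fromfile, tofile=tofile, lineterm=''))

dict1 = {1: True, 2: False}
dict2 = {1: False, 2: True}

diff_lines = dict_unified_diff(dict1, dict2)
print('\n'.join(diff_lines))
```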
Answered 2012-10-18T14:50:34.630
2

You can use .items() together with sets to do something like this:

>>> d = dict((i,i) for i in range(10))
>>> d2 = dict((i,i) for i in range(1,11))
>>>
>>> set(d.items()) - set(d2.items())
set([(0, 0)])
>>>
>>> set(d2.items()) - set(d.items())
set([(10, 10)])
>>>
>>> set(d2.items()) ^ set(d.items())  #symmetric difference
set([(0, 0), (10, 10)])
>>> set(d2.items()).symmetric_difference(d.items())  #only need to actually create 1 set
set([(0, 0), (10, 10)])
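
One caveat: building a set from .items() only works when every value is hashable, so dicts containing lists or nested dicts will raise a TypeError. A minimal Python 3 sketch that sidesteps this by doing the set operations on the keys only (the helper name `dict_changes` is hypothetical):

```python
def dict_changes(old, new):
    """Return (added, removed, changed) for two dicts.

    Only the keys need to be hashable; values are compared with !=.
    """
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {k: (old[k], new[k])
               for k in old.keys() & new.keys()
               if old[k] != new[k]}
    return added, removed, changed

d = {i: i for i in range(10)}
d2 = {i: i for i in range(1, 11)}
added, removed, changed = dict_changes(d, d2)
print(added, removed, changed)

# Unhashable values (lists here) are fine, since only keys go into sets.
list_changes = dict_changes({'x': [1]}, {'x': [2]})
print(list_changes)
```

Unlike the symmetric difference, this also tells you which side each item came from, and reports a changed value as a single (old, new) pair.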
Answered 2012-10-18T14:30:15.943
1

I found a library called datadiff (not very well documented) that gives diffs of hashable data structures in Python. You can install it with pip or easy_install. Give it a try!

Answered 2013-10-29T11:00:56.410
0

Taking @mgilson's solution one step further, to fit the OP's request to use the unittest module.

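def test_dict_diff(self):
    # Items present in one dict but not the other (values must be hashable).
    dict_diff = list(set(self.dict_A.items()).symmetric_difference(self.dict_B.items()))
    # The items are tuples, so convert them to strings before joining.
    fail_message = ("too many differences:\nThe differences:\n" +
                    "%s" % "\n".join(map(str, dict_diff)))
    # Assumes self.maxDiff is a number (the TestCase default), not None.
    self.assertTrue(len(dict_diff) < self.maxDiff, fail_message)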
Answered 2012-10-18T15:00:26.733
0

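See this Python recipe for creating the difference of two dictionaries (as a dictionary). Can you describe what the output should look like (please attach an example)?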

Answered 2012-10-18T14:36:17.910
0

Check out https://github.com/inveniosoftware/dictdiffer

from dictdiffer import diff

print list(diff(
    {2014: [
        dict(month=6, category=None, sum=672.00),
        dict(month=6, category=1, sum=-8954.00),
        dict(month=7, category=None, sum=7475.17),
        dict(month=7, category=1, sum=-11745.00),
        dict(month=8, category=None, sum=-12140.00),
        dict(month=8, category=1, sum=-11812.00),
        dict(month=9, category=None, sum=-31719.41),
        dict(month=9, category=1, sum=-11663.00),
    ]},

    {2014: [
       dict(month=6, category=None, sum=672.00),
       dict(month=6, category=1, sum=-8954.00),
       dict(month=7, category=None, sum=7475.17),
       dict(month=7, category=1, sum=-11745.00),
       dict(month=8, category=None, sum=-12141.00),
       dict(month=8, category=1, sum=-11812.00),
       dict(month=9, category=None, sum=-31719.41),
       dict(month=9, category=1, sum=-11663.00),
    ]}))

gives this output, which I think is pretty great:

[('change', ['2014', 4, 'sum'], (-12140.0, -12141.0))]

That is, it tells you what happened: a value "changed", at the path "['2014', 4, 'sum']", from -12140.0 to -12141.0.
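
To illustrate the idea of reporting an action, a path, and the old and new values, here is a minimal standard-library sketch in Python 3. It is not dictdiffer's implementation; the function name `nested_diff` and the shortened sample data are my own assumptions. It recurses only into dicts and into lists of equal length, and treats everything else as a plain value.

```python
def nested_diff(old, new, path=()):
    """Yield (action, path, values) tuples for two nested dict/list structures."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in old.keys() - new.keys():
            yield ('remove', list(path) + [key], old[key])
        for key in new.keys() - old.keys():
            yield ('add', list(path) + [key], new[key])
        for key in old.keys() & new.keys():
            # Key exists on both sides: descend and compare the values.
            yield from nested_diff(old[key], new[key], path + (key,))
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        for index, (a, b) in enumerate(zip(old, new)):
            yield from nested_diff(a, b, path + (index,))
    elif old != new:
        # Leaf values (or lists of different length) that differ.
        yield ('change', list(path), (old, new))

# Shortened sample data, made up for illustration.
before = {2014: [dict(month=8, category=None, sum=-12140.00)]}
after = {2014: [dict(month=8, category=None, sum=-12141.00)]}

result = list(nested_diff(before, after))
print(result)
```

Because the keys are visited through set operations, the order of the reported changes is not guaranteed when there is more than one.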

Answered 2014-11-02T14:21:07.173