11

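I think I have a fairly unique problem to solve. At least, I could not find enough information about it using Google. So here goes:

I work on a Java EE SOA application that uses Oracle XML DB to store XML documents as XML. Whenever the XML changes, I increment the version and move the previous version into a different table.

The requirement now is that I should store only the difference between two versions, as XML, rather than the whole XML document.

  1. Is there a Java library that can do XML comparison? (XMLUnit, ...?)
  2. Is there a standard XML schema for capturing XML differences?
  3. What transformation technology can I use to apply the "difference" to the XML to move back and forth between versions? (XSLT, Groovy, ...?)

I appreciate your time.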

4 Answers

10

In my last job, we had a similar problem: We had to detect changes, insertions, and deletions of specific items between two XML files. The files weren't arbitrary XML; they had to adhere to our XSD.

Our solution was to implement a kind of merge sort: Parse the files (using a SAX parser, not a DOM parser, to permit arbitrarily large files), and store the parsed data in separate HashMaps. Then, we compared the contents of the two maps using a merge-sort type of algorithm.
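A minimal sketch of the approach described above, not the original implementation. It assumes a hypothetical document layout (`<items>` containing `<item id="...">` elements with text content); the class name `XmlMapDiff` and its methods are illustrative only. Each file is parsed with SAX into a map keyed by `id`, and the two sorted key sets are then walked in lockstep, the way the merge step of a merge sort walks two sorted runs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class XmlMapDiff {

    // Assumed layout (hypothetical): <items><item id="k">text</item>...</items>
    static Map<String, String> parse(String xml) {
        final Map<String, String> items = new TreeMap<String, String>();
        DefaultHandler handler = new DefaultHandler() {
            private String currentId;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("item".equals(qName)) {
                    currentId = atts.getValue("id");
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (currentId != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("item".equals(qName) && currentId != null) {
                    items.put(currentId, text.toString().trim());
                    currentId = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return items;
    }

    // Merge-style walk over both sorted key sets: one pass, no nested lookups.
    static List<String> diff(Map<String, String> oldItems, Map<String, String> newItems) {
        List<String> changes = new ArrayList<String>();
        Iterator<Map.Entry<String, String>> a =
                new TreeMap<String, String>(oldItems).entrySet().iterator();
        Iterator<Map.Entry<String, String>> b =
                new TreeMap<String, String>(newItems).entrySet().iterator();
        Map.Entry<String, String> x = a.hasNext() ? a.next() : null;
        Map.Entry<String, String> y = b.hasNext() ? b.next() : null;
        while (x != null || y != null) {
            int cmp = (x == null) ? 1 : (y == null) ? -1 : x.getKey().compareTo(y.getKey());
            if (cmp < 0) {            // key only in the old version
                changes.add("deleted " + x.getKey());
                x = a.hasNext() ? a.next() : null;
            } else if (cmp > 0) {     // key only in the new version
                changes.add("inserted " + y.getKey());
                y = b.hasNext() ? b.next() : null;
            } else {                  // key in both: compare the values
                if (!x.getValue().equals(y.getValue())) changes.add("changed " + x.getKey());
                x = a.hasNext() ? a.next() : null;
                y = b.hasNext() ? b.next() : null;
            }
        }
        return changes;
    }
}
```

The sketch keeps both maps in memory, so it does not address the large-file case; it only illustrates the comparison step.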

Naturally, the larger the files got, the more memory pressure we experienced, so I ultimately wrote a FileHashMap class that pushed the HashMap's value space to random access files. While theoretically slower, this solution allowed our comparisons to work with very large files, without thrashing or OutOfMemoryError conditions. (A version of that FileHashMap class is available in this library: http://www.clapper.org/software/java/util/)

I have no idea whether what I just described is even remotely close to what you need, but I thought I'd share it, just in case.

Good luck.

answered 2009-01-10T02:22:56.413
8

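Side note: there is now a standard format for XML-aware "patches", defined in RFC 5261. At least one free software program, xmlpatch, implements it. It is written in C, and you can call it from Java.

answered 2009-01-12T09:25:14.127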
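To give a feel for the format, here is a rough Java sketch that applies a small subset of RFC 5261 operations using only the JDK's DOM and XPath APIs. It is not a conforming implementation and is unrelated to xmlpatch: it assumes a patch whose root element (here `<diff>`, an assumption for this example) contains `<add>`, `<replace>`, and `<remove>` children, it handles only selectors that match elements, it ignores the `pos` and `type` attributes of `<add>`, and it does no namespace handling. The class name `MiniXmlPatch` is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class MiniXmlPatch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the first child of e that is an element, or null if there is none.
    static Node firstElementChild(Element e) {
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) return n;
        }
        return null;
    }

    // Applies the element-only subset of add / replace / remove and serializes the result.
    static String apply(String targetXml, String patchXml) {
        try {
            Document target = parse(targetXml);
            Document patch = parse(patchXml);
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (Node op = patch.getDocumentElement().getFirstChild();
                    op != null; op = op.getNextSibling()) {
                if (op.getNodeType() != Node.ELEMENT_NODE) continue;
                Element e = (Element) op;
                String sel = e.getAttribute("sel");
                Node hit = (Node) xpath.evaluate(sel, target, XPathConstants.NODE);
                if (hit == null || hit.getNodeType() != Node.ELEMENT_NODE) {
                    throw new IllegalArgumentException("sel must match one element: " + sel);
                }
                String name = e.getTagName();
                if ("remove".equals(name)) {
                    hit.getParentNode().removeChild(hit);
                } else if ("replace".equals(name)) {
                    Node replacement = target.importNode(firstElementChild(e), true);
                    hit.getParentNode().replaceChild(replacement, hit);
                } else if ("add".equals(name)) {
                    // Without a pos attribute, new content is appended as the last child.
                    hit.appendChild(target.importNode(firstElementChild(e), true));
                } else {
                    throw new UnsupportedOperationException("unsupported operation: " + name);
                }
            }
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(target), new StreamResult(out));
            return out.toString();
        } catch (RuntimeException ex) {
            throw ex;
        } catch (Exception ex) {
            throw new IllegalStateException("could not apply patch", ex);
        }
    }
}
```

For example, applying `<diff><remove sel="/doc/b"/><replace sel="/doc/a"><a>9</a></replace><add sel="/doc"><c>3</c></add></diff>` to `<doc><a>1</a><b>2</b></doc>` removes `b`, replaces `a`, and appends `c`. Going back from the new version to the old one would need a second, inverse patch; the format itself describes changes in one direction only.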
4

There are a number of open-source XML diff tools written in Java that you can draw on. A list of such tools is here.

answered 2009-01-09T22:43:29.510
1

Try Pretty Diff. It is designed to work with several different extensions of basic XML syntax.

http://prettydiff.com/

answered 2011-12-10T13:45:21.510