
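I am implementing a client-server program that synchronizes files between computers. At some point, when the client connects to the server, an XML file describing the directory structure of the location to be synchronized is sent to the server. If this is not the client's first connection, such an XML file already exists on the server, so the differences between the two files must be found, and the client then requests only the files that are new or have changed. My question is: how can I find the differences between the two XML files? I have two sample XML files below (a single XML file can be as large as 1 GB).

First XML (source.xml)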

<?xml version='1.0'?>
<RootDirectory name="New folder" dateCreated="5/20/2013 7:16:32 PM">
  <Folder  name="New folder1" >
    <File name="New Text Document11.txt" />
    <File name="New Text Document12.txt"  />
  </Folder>
   <File name="New Text Document1.txt"  />
</RootDirectory>

Second XML (changed.xml)

<?xml version="1.0" encoding="UTF-8"?>
<RootDirectory name="New folder" dateCreated="5/20/2013 7:15:50 PM">
   <Folder name="New folder1">
      <File name="New Text Document11.txt" />
      <File name="New Text Document12.txt" />
   </Folder>
   <Folder name="New folder2">
      <Folder name="New folder21">
         <File name="New Text Document211.txt" />
         <Folder name="New folder211">
            <File name="New Text Document2111.txt" />
            <Folder name="New folder2111">
               <File name="New Text Document21111.txt" />
            </Folder>
         </Folder>
      </Folder>
      <File name="New Text Document21.txt" />
   </Folder>
   <File name="New Text Document1.txt" />
</RootDirectory>

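I found an example that uses LINQ to XML, http://www.codeproject.com/Articles/45233/Diff-in-XML-files-with-LINQ, but the XML structure used there is different, so it did not help me much. Given that Folder and File tags can be nested inside other nodes, I really don't know how to find the differences.

Can someone give me an idea?

Thanks!

Best regards, Oana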

1 Answer

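One possible solution is an algorithm like this:

  • Extract all folders from the old file
  • Extract all folders from the new file
  • Use the LINQ Except method to get the new folders => new = list2.Except(list1)
  • Use the LINQ Intersect method to get the folders present in both files => possibleChanges = list2.Intersect(list1)
  • Compare the contents of the folders in possibleChanges
  • Repeat all of this for every level (recursively or iteratively)

Sample code: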

var xml1 = @"<?xml version='1.0'?>
<RootDirectory name=""New folder"" dateCreated=""5/20/2013 7:16:32 PM"">
  <Folder  name=""New folder1"" >
    <File name=""New Text Document11.txt"" />
    <File name=""New Text Document12.txt""  />
  </Folder>
   <File name=""New Text Document1.txt""  />
</RootDirectory>";

var xml2 = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<RootDirectory name=""New folder"" dateCreated=""5/20/2013 7:15:50 PM"">
   <Folder name=""New folder1"">
      <File name=""New Text Document11.txt"" />
      <File name=""New Text Document12.txt"" />
      <File name=""New Text Document13.txt"" />
   </Folder>
   <Folder name=""New folder2"">
      <Folder name=""New folder21"">
         <File name=""New Text Document211.txt"" />
         <Folder name=""New folder211"">
            <File name=""New Text Document2111.txt"" />
            <Folder name=""New folder2111"">
               <File name=""New Text Document21111.txt"" />
            </Folder>
         </Folder>
      </Folder>
      <File name=""New Text Document21.txt"" />
   </Folder>
   <File name=""New Text Document1.txt"" />
</RootDirectory>";

var xmlDoc1 = XDocument.Parse(xml1);
var xmlDoc2 = XDocument.Parse(xml2);

var f1 = allFolders(xmlDoc1.Root);
var f2 = allFolders(xmlDoc2.Root);
var newFolders = f2.Except(f1);
var sameFolders = f2.Intersect(f1);
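// Folders that exist only in the new XML: all of their content is new, so no further checks are needed.
// Dump() is LINQPad's output extension method.
newFolders.Dump();
// Folders present in both files: check whether their file lists have changed.
// Note: folders are matched by name only, so same-named folders at different paths are treated as one.
foreach (var sameFolder in sameFolders)
{
    var oldContent = folderContent(xmlDoc1.Root.Descendants("Folder").Where(r => r.Attribute("name").Value == sameFolder));
    var newContent = folderContent(xmlDoc2.Root.Descendants("Folder").Where(r => r.Attribute("name").Value == sameFolder));
    // Files listed in the new XML but not in the old one.
    var newFiles = newContent.Except(oldContent);
    newFiles.Dump();
}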

// Collects the names of the File elements directly inside the given folder nodes.
private List<string> folderContent(IEnumerable<XElement> nodes)
{
    return nodes.SelectMany(n => n.Elements("File"))
                .Select(x => x.Attribute("name").Value)
                .ToList();
}

// Collects the names of all Folder elements at any depth below the given node.
private List<string> allFolders(XElement node)
{
    return node.Descendants("Folder")
               .Select(e => e.Attribute("name").Value)
               .ToList();
}

Output:

// new folders
New folder2
New folder21
New folder211
New folder2111

// new files for New folder1
New Text Document13.txt
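
The sample above matches folders by name only, so two folders with the same name at different depths would be treated as one. A possible variation is to compare full paths instead. The following is only a minimal sketch under some assumptions: `AllPaths` is a hypothetical helper (not part of the code above), the two small documents are made-up test data, and it is written as C# 9+ top-level statements. It loads the whole document with `XDocument`, so for files approaching 1 GB a streaming reader such as `XmlReader` would probably be a better fit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: builds a "root/folder/file" path for every Folder and File element.
static HashSet<string> AllPaths(XElement root)
{
    var paths = new HashSet<string>();
    foreach (var e in root.Descendants())
    {
        if (e.Name != "Folder" && e.Name != "File") continue;
        // AncestorsAndSelf() walks from the element up to the root, so reverse it.
        var names = e.AncestorsAndSelf()
                     .Reverse()
                     .Select(a => (string)a.Attribute("name") ?? "");
        // A trailing slash marks folders so they cannot collide with file paths.
        paths.Add(string.Join("/", names) + (e.Name == "Folder" ? "/" : ""));
    }
    return paths;
}

// Made-up sample data: folder "a" appears at two different paths in the new document.
var oldDoc = XDocument.Parse(@"<RootDirectory name=""root"">
  <Folder name=""a""><File name=""x.txt"" /></Folder>
</RootDirectory>");
var newDoc = XDocument.Parse(@"<RootDirectory name=""root"">
  <Folder name=""a""><File name=""x.txt"" /><File name=""y.txt"" /></Folder>
  <Folder name=""b""><Folder name=""a""><File name=""x.txt"" /></Folder></Folder>
</RootDirectory>");

var oldPaths = AllPaths(oldDoc.Root);
var newPaths = AllPaths(newDoc.Root);

// Paths present only in the new document (new folders and new files).
var added = newPaths.Except(oldPaths).OrderBy(p => p, StringComparer.Ordinal).ToList();
// Paths present only in the old document (deleted folders and files).
var removed = oldPaths.Except(newPaths).OrderBy(p => p, StringComparer.Ordinal).ToList();

foreach (var p in added) Console.WriteLine("+ " + p);
foreach (var p in removed) Console.WriteLine("- " + p);
```

With this data, `root/b/a/x.txt` is reported as new even though a file `x.txt` in a folder named `a` already exists elsewhere, which name-only matching would miss. This sketch detects only added and removed entries; detecting modified files would need an extra attribute in the XML (for example a timestamp or hash), which the samples in the question do not include.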
answered 2013-05-25T18:47:52.100