
I love the Linq to Xml API. It is the easiest one I have ever used.

I also remember that it is implemented atop XmlReader, which is a non-caching reader, meaning that:

var rdr = XmlReader.Create("path/to/huge_3Gb.xml");

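...will return immediately (reading at most the xml header, probably).

The documentation does state that XDocument.Load() uses XmlReader.Create().

So I hoped that, as with all of Linq, I would get deferred execution behavior with Linq2Xml.
But then I tried this, as I usually do with anything that involves files: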

using(var xdoc = XDocument.Load("file")){ ... }

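And surprise! It does not compile, because XDocument does not implement IDisposable.

Hmm, that is peculiar! How am I going to release the file handle once I am done with the XDocument?

Then it dawned on me: could it be that XDocument.Load() eagerly slurps the whole Xml into memory (and closes the file handle right away)?

So I tried: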

var xdoc = XDocument.Load("path/to/huge_3Gb.xml");

Waited, and waited, and then the process said:

Unhandled Exception: OutOfMemoryException.

So Linq to Xml is close to perfect (great API), but no cigar (when used on large Xml files).

</rant>

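My questions are:

  1. Am I missing something, and is there a way to use Linq to Xml lazily?

  2. If the answer to the previous question is "no":

Is there an objective reason why the Linq to Xml API cannot have deferred behavior similar to Linq to Objects? It looks to me like at least some operations (e.g., things that are possible with a forward-only XmlReader) could be implemented lazily.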

...Or is it not implemented like this, quoting Eric Lippert,

"because no one ever designed, specified, implemented, tested, documented and shipped that feature"?


1 Answer


Actually, Linq to Xml does use deferred execution, but it queries in-memory data, not data in a file. You can load the data from a file, a stream, or a string, or build the document manually; it does not matter how the in-memory node graph is constructed. Linq to Xml is used to query the in-memory representation of the xml tree (i.e. the object graph).

Here is a sample which shows how deferred execution works with Linq to Xml. Suppose you have an XDocument containing an object graph with the following data:

<movies>
  <movie id="1" name="Avatar"/>
  <movie id="2" name="Doctor Who"/>
</movies>

It does not matter how you create the in-memory representation of this xml data. E.g.

var xdoc = XDocument.Parse(xml_string);
// or XDocument.Load(file_name);
// or new XDocument(new XElement("movies", ...))

Now define query:

var query = xdoc.Descendants("movie");

You can then modify the in-memory xml representation that the document contains:

xdoc.Root.Add(new XElement("movie", new XAttribute("id", 3)));

Now execute the query:

int moviesCount = query.Count(); // returns 3

As you can see, Linq to Xml uses deferred execution, but it works the same way as Linq to Objects: what gets queried is the in-memory data.
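To make the deferral visible, the same query object can be counted both before and after the tree is modified. This is a minimal sketch of the steps above as one console program; the class name DeferredDemo is only illustrative.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class DeferredDemo
{
    static void Main()
    {
        var xdoc = XDocument.Parse(
            "<movies>" +
            "<movie id=\"1\" name=\"Avatar\"/>" +
            "<movie id=\"2\" name=\"Doctor Who\"/>" +
            "</movies>");

        // Defining the query does not walk the tree yet.
        var query = xdoc.Descendants("movie");

        // First enumeration sees the two original elements.
        Console.WriteLine(query.Count()); // prints 2

        // Modify the in-memory tree after the query was defined.
        xdoc.Root.Add(new XElement("movie", new XAttribute("id", 3)));

        // The same query object is re-evaluated against the current tree.
        Console.WriteLine(query.Count()); // prints 3
    }
}
```

The query holds no snapshot of the results; each enumeration walks the node graph as it exists at that moment.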

NOTE: XDocument does not implement IDisposable because it does not hold any unmanaged resources once the node graph has been constructed.
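For a file too large to load in one go, one possible workaround is to drive an XmlReader yourself and materialize only one element at a time with XNode.ReadFrom, then run Linq over the resulting lazy sequence. The following is a sketch under assumptions, not something built into XDocument: StreamElements and XmlStreaming are hypothetical names, and the input is assumed to be a list of movie elements like the one above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class XmlStreaming
{
    // Hypothetical helper: lazily yields every element called `name`,
    // keeping only one element's subtree in memory at a time.
    public static IEnumerable<XElement> StreamElements(TextReader input, string name)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == name)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}

static class Program
{
    static void Main()
    {
        const string xml =
            "<movies>" +
            "<movie id=\"1\" name=\"Avatar\"/>" +
            "<movie id=\"2\" name=\"Doctor Who\"/>" +
            "</movies>";

        // Nothing is read until the foreach starts pulling elements.
        var names = XmlStreaming
            .StreamElements(new StringReader(xml), "movie")
            .Select(m => (string)m.Attribute("name"));

        foreach (var n in names)
            Console.WriteLine(n);
    }
}
```

For a real file, a reader such as File.OpenText(path) can be passed in instead of the StringReader; the caller keeps ownership of it and should dispose it only after enumeration has finished. The trade-off is that this gives forward-only access: each yielded XElement is a detached fragment, so queries that need the parent or sibling context of the whole document are not available.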

answered 2013-10-18T11:14:10.230