I want to find any duplicates in my Google sitemap, based on the loc element.

Sample XML:

<?xml version="1.0" encoding="UTF-8"?>
 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <url><loc>http://mysite.net/Members.aspx</loc><lastmod>2011-07-01</lastmod></url>      
  <url><loc>http://mysite.net/Topics.aspx</loc><lastmod>2011-05-27</lastmod></url>
  <url><loc>http://mysite.net/Members.aspx</loc><lastmod>2011-07-02</lastmod></url>      
</urlset>

Sample LINQ:

            var duplicates = (from req in doc.Descendants("urlset")
                          group req by req.Descendants("//loc").First().Value
                              into g
                              where g.Count() > 1
                          select g.Skip(1)).SelectMany(elements => elements
                        );

Why does duplicates come back empty?

2 Answers

doc.Descendants("urlset") will only ever return a single element (the <urlset> root), so there is nothing to group.

You need to select the <url> elements instead:

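// Element names must be qualified with the sitemap's default namespace
XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
var duplicates =
    from u in doc.Descendants(ns + "url")
    group u by u.Element(ns + "loc").Value into g
    from elem in g.Skip(1)    // This is the SelectMany()
    select elem;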
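
For comparison, the same grouping can be written in method syntax, where the second `from` clause becomes an explicit `SelectMany`. This is only a sketch: the class name `SitemapExtras` and the helper `FindRedundantUrls` are made up for illustration, and it assumes the elements are looked up with the sitemap namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapExtras
{
    // Default namespace declared on <urlset> in the sample XML.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Hypothetical helper: returns every <url> element after the first one
    // that shares the same <loc> value.
    public static List<XElement> FindRedundantUrls(XDocument doc)
    {
        return doc.Descendants(Ns + "url")
            // The string cast yields null instead of throwing when <loc> is missing.
            .GroupBy(u => (string)u.Element(Ns + "loc"))
            // Skip(1) keeps the first occurrence; SelectMany flattens the rest.
            .SelectMany(g => g.Skip(1))
            .ToList();
    }
}
```

With the sample XML from the question, this should return only the second Members.aspx entry (the one with lastmod 2011-07-02), which is handy if the goal is to remove the extra entries rather than just list the repeated URLs.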
answered 2012-05-14T15:17:48.330

Your query finds no elements because you did not specify the namespace. It is also more complicated than it needs to be:

XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
var duplicates =
    from loc in doc.Root.Elements(ns + "url").Elements(ns + "loc")
    group loc by loc.Value into g
    where g.Count() > 1
    select g.Key;
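
To see the namespace effect in isolation, here is a small self-contained sketch that runs against the sample XML (with the schema attributes trimmed for brevity). The class name `SitemapDuplicates` and the helper `FindDuplicateLocs` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapDuplicates
{
    // Default namespace declared on <urlset>; every child element inherits it.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Hypothetical helper: returns each <loc> value that occurs more than once.
    public static List<string> FindDuplicateLocs(XDocument doc)
    {
        return doc.Root
            .Elements(Ns + "url")
            .Elements(Ns + "loc")
            .GroupBy(loc => loc.Value)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        const string xml = @"<urlset xmlns=""http://www.sitemaps.org/schemas/sitemap/0.9"">
  <url><loc>http://mysite.net/Members.aspx</loc><lastmod>2011-07-01</lastmod></url>
  <url><loc>http://mysite.net/Topics.aspx</loc><lastmod>2011-05-27</lastmod></url>
  <url><loc>http://mysite.net/Members.aspx</loc><lastmod>2011-07-02</lastmod></url>
</urlset>";
        XDocument doc = XDocument.Parse(xml);

        // An unqualified name matches nothing: the elements live in the sitemap namespace.
        Console.WriteLine(doc.Descendants("url").Count());      // prints 0
        Console.WriteLine(doc.Descendants(Ns + "url").Count()); // prints 3

        foreach (string loc in FindDuplicateLocs(doc))
            Console.WriteLine(loc); // prints http://mysite.net/Members.aspx
    }
}
```

The helper returns the duplicated URL strings rather than the elements themselves, matching the `select g.Key` in the query above.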
answered 2012-05-14T15:18:48.890