I have an XML file generated by the Microsoft SEO toolkit in the following format:

<?xml version="1.0" encoding="utf-8"?>
<urls>
<url url="First URL">
<violations>
  <violation code="HasBrokenLinks" url2="First URL - 1st Broken Link" />
</violations>
</url>
<url url="Second URL">
<violations>
  <violation code="HasBrokenLinks" url2="Second URL - 1st Broken Link" />
  <violation code="HasBrokenLinks" url2="Second URL - 2nd Broken Link" />
  <violation code="HasBrokenLinks" url2="Second URL - 3rd Broken Link" />
  <violation code="HasBrokenLinks" url2="Second URL - 4th Broken Link" />
  <violation code="HasBrokenLinks" url2="Second URL - 5th Broken Link" />
  <violation code="HasBrokenLinks" url2="Second URL - 6th Broken Link" />
</violations>
</url>
</urls>

I am trying to parse the results with a C# app and produce output similar to this:

URL:  First URL
Broken Links: First URL - 1st Broken Link

URL: Second URL
Broken Links: Second URL - 1st Broken Link
Second URL - 2nd Broken Link
Second URL - 3rd Broken Link
Second URL - 4th Broken Link
Second URL - 5th Broken Link
Second URL - 6th Broken Link

I have a class defined as follows:

public class WebpageErrors
{
    public String SourceURL { get; set; }
    private static List<string> BrokenLinkList = new List<string>();

    public void BrokenLinkStore(string BrokenLink)
    {
        BrokenLinkList.Add(BrokenLink);
    }
    public List<string> BrokenLinkReturner
    {
        get { return BrokenLinkList; }
    }

}

Then I start by iterating through the XML:

        // Create a list to store the errors found for each URL
        List<WebpageErrors> errorList = new List<WebpageErrors>();

        // File to open; can be a URL too
        string XmlFileUrl = @path;
        using (XmlReader reader = new XmlTextReader(XmlFileUrl))
        {
            //Define a new object to store errors in
            WebpageErrors Error = new WebpageErrors();

            // Loop until the reader can't read any more
            while (reader.Read())
            {
                // An object with the type Element was found.
                if (reader.NodeType == XmlNodeType.Element)
                {

                    // Check name of the node and write the contents in the object accordingly.
                    if (reader.Name == "url")
                    {
                        //Define a new object to store errors in
                        Error = new WebpageErrors();

                        Error.SourceURL = reader["url"];
                    }

                    // Check name of the node and write the contents in the object accordingly.
                    if (reader.Name == "violation")
                    {
                        // Only record violations whose code is HasBrokenLinks.
                        if (reader["code"] == "HasBrokenLinks")
                        {
                            Error.BrokenLinkStore(reader["url2"]);

                        }
                    }
                } 
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    if (Error.BrokenLinkReturner.Count > 0) 
                    {
                        errorList.Add(Error);
                    }
                }


            }
        }
        return errorList;

After that I iterate through the list of Errors and print them out:

    private static void PrintErrors(List<WebpageErrors> Errors)
    {

        StringBuilder Output = new StringBuilder();

        for (int i = 0; i < Errors.Count; i++)
        {
            Output.Append("Source URL: " + Errors[i].SourceURL + Environment.NewLine);

            List<string> BrokenLinkList = Errors[i].BrokenLinkReturner;
            foreach (String BrokenLink in BrokenLinkList)
            {
                Output.Append("Broken Link: " + BrokenLink + Environment.NewLine);
            }

            Output.Append(Environment.NewLine);
        }

Instead of the expected output, though, I get this:

 Source URL: First URL
 Broken Link: First URL - 1st Broken Link
 Broken Link: Second URL - 1st Broken Link
 Broken Link: Second URL - 2nd Broken Link
 Broken Link: Second URL - 3rd Broken Link
 Broken Link: Second URL - 4th Broken Link
 Broken Link: Second URL - 5th Broken Link
 Broken Link: Second URL - 6th Broken Link

 Source URL: First URL
 Broken Link: First URL - 1st Broken Link
 Broken Link: Second URL - 1st Broken Link
 Broken Link: Second URL - 2nd Broken Link
 Broken Link: Second URL - 3rd Broken Link
 Broken Link: Second URL - 4th Broken Link
 Broken Link: Second URL - 5th Broken Link
 Broken Link: Second URL - 6th Broken Link

 Source URL: Second URL
 Broken Link: First URL - 1st Broken Link
 Broken Link: Second URL - 1st Broken Link
 Broken Link: Second URL - 2nd Broken Link
 Broken Link: Second URL - 3rd Broken Link
 Broken Link: Second URL - 4th Broken Link
 Broken Link: Second URL - 5th Broken Link
 Broken Link: Second URL - 6th Broken Link

 Source URL: Second URL
 Broken Link: First URL - 1st Broken Link
 Broken Link: Second URL - 1st Broken Link
 Broken Link: Second URL - 2nd Broken Link
 Broken Link: Second URL - 3rd Broken Link
 Broken Link: Second URL - 4th Broken Link
 Broken Link: Second URL - 5th Broken Link
 Broken Link: Second URL - 6th Broken Link

 Source URL: Second URL
 Broken Link: First URL - 1st Broken Link
 Broken Link: Second URL - 1st Broken Link
 Broken Link: Second URL - 2nd Broken Link
 Broken Link: Second URL - 3rd Broken Link
 Broken Link: Second URL - 4th Broken Link
 Broken Link: Second URL - 5th Broken Link
 Broken Link: Second URL - 6th Broken Link

I can't seem to figure out why my output is so screwed up. It has to have something to do with the creation of the WebpageErrors object? Can anyone help me understand what I'm doing wrong?

Thanks, Brad

1 Answer

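My first problem seems to be that I don't want:

private static List<string> BrokenLinkList = new List<string>();

but rather:

private List<string> BrokenLinkList = new List<string>();

(without the static declaration in my class). This creates a unique list for each object instead of one list of violations shared by all of them.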
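To make the difference concrete, here is a minimal sketch that contrasts the two declarations. The class names SharedLinks, OwnLinks and StaticFieldDemo are hypothetical and exist only for this illustration; they are not part of the original code.

```csharp
using System.Collections.Generic;

// Hypothetical class: mirrors the original declaration with "static".
public class SharedLinks
{
    // One list for the whole type, shared by every instance.
    private static readonly List<string> links = new List<string>();
    public void Add(string link) { links.Add(link); }
    public int Count { get { return links.Count; } }
}

// Hypothetical class: the same thing without "static".
public class OwnLinks
{
    // Each object gets its own list.
    private readonly List<string> links = new List<string>();
    public void Add(string link) { links.Add(link); }
    public int Count { get { return links.Count; } }
}

public static class StaticFieldDemo
{
    // Adds a link through one object, then reports the count
    // seen by a second object that never had anything added.
    public static int CountSeenByOtherShared()
    {
        var first = new SharedLinks();
        var second = new SharedLinks();
        first.Add("First URL - 1st Broken Link");
        return second.Count; // non-zero: the list is shared
    }

    public static int CountSeenByOtherOwn()
    {
        var first = new OwnLinks();
        var second = new OwnLinks();
        first.Add("First URL - 1st Broken Link");
        return second.Count; // zero: second has its own empty list
    }
}
```

With the static field, a link added through one object shows up in every other object, which is why every URL in the output listed all seven broken links.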

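The second problem is that I had:

else if (reader.NodeType == XmlNodeType.EndElement)

which matched more EndElements than I expected: it fired for the closing violations and urls tags as well as for the closing url tag, so the same object was added to the list several times. Instead, I needed to change it to:

else if (reader.Name == "url" && reader.NodeType == XmlNodeType.EndElement)

which only matches when the element is url and the node is also an EndElement.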
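Putting both fixes together, the parsing loop could look roughly like the sketch below. It is not the exact final code: BrokenLinkParser and Parse are hypothetical names, the XML is read from a string through XmlReader.Create rather than from a file path so the example is self-contained, and the null checks on the current object are an added precaution.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public class WebpageErrors
{
    public string SourceURL { get; set; }

    // Instance field (no "static"): one list per WebpageErrors object.
    private readonly List<string> brokenLinkList = new List<string>();

    public void BrokenLinkStore(string brokenLink) { brokenLinkList.Add(brokenLink); }
    public List<string> BrokenLinkReturner { get { return brokenLinkList; } }
}

// Hypothetical helper class, introduced only for this sketch.
public static class BrokenLinkParser
{
    public static List<WebpageErrors> Parse(string xml)
    {
        var errorList = new List<WebpageErrors>();
        WebpageErrors current = null;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    if (reader.Name == "url")
                    {
                        // A new <url> element starts a new error object.
                        current = new WebpageErrors { SourceURL = reader["url"] };
                    }
                    else if (reader.Name == "violation" && current != null
                             && reader["code"] == "HasBrokenLinks")
                    {
                        current.BrokenLinkStore(reader["url2"]);
                    }
                }
                else if (reader.NodeType == XmlNodeType.EndElement
                         && reader.Name == "url" && current != null)
                {
                    // Only the closing </url> tag stores the object, once.
                    if (current.BrokenLinkReturner.Count > 0)
                    {
                        errorList.Add(current);
                    }
                    current = null;
                }
            }
        }
        return errorList;
    }
}
```

Passing the sample report from the question to Parse should yield two entries, one per URL, each holding only its own broken links.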

Answered 2013-11-05T13:19:03.030