c# - Try-catch statement ends the while loop when reading an XML file in C#

Question

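I have a while loop that walks through an XML file, and one of its nodes, "url", sometimes contains invalid values. I put a try-catch statement around it to catch any invalid value. The problem is that whenever an invalid value is caught, the while loop is terminated and the program continues outside the loop. I need the while loop to keep reading the rest of the XML file when an invalid value is found.

Here is my code: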

        XmlTextReader reader = new XmlTextReader(fileName);
        int tempInt;

        while (reader.Read())
        {
            switch (reader.Name)
            {
                case "url":
                    try
                    {
                        reader.Read();
                        if (!reader.Value.Equals("\r\n"))
                        {
                            urlList.Add(reader.Value);
                        }
                    }
                    catch
                    {                            
                        invalidUrls.Add(urlList.Count);   
                    }
                    break;
            }
        }

I left out the rest of the switch statement because it is not relevant. Here is a sample of my XML:

<?xml version="1.0"  encoding="ISO-8859-1" ?>
<visited_links_list>
    <item>
        <url>http://www.grcc.edu/error.cfm</url>
        <title>Grand Rapids Community College</title>
        <hits>20</hits>
        <modified_date>10/16/2012 12:22:37 PM</modified_date>
        <expiration_date>11/11/2012 12:22:38 PM</expiration_date>
        <user_name>testuser</user_name>
        <subfolder></subfolder>
        <low_folder>No</low_folder>
        <file_position>834816</file_position>
     </item>
</visited_links_list>

The exception I get throughout the code looks like this:

"'', hexadecimal value 0x05, is an invalid character. Line 3887, position 13."

score 3 · Accepted Answer

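Observations:

You are calling reader.Read() twice for each entry: once in the while() and once in the case. Do you really mean to skip records? If the source XML has an odd number of entries, this will lead to an exception (because reader.Read() advances the pointer in the XML stream to the next item), and that exception will not be caught because it happens outside your try...catch.

Beyond that: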

reader.Read(); /// might return false, but no exception, so keep going...

if (!reader.Value.Equals("\r\n")) /// BOOM if the previous line returned false, which you ignored
{ 
    urlList.Add(reader.Value); 
} 
/// reader is now in unpredictable state
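
One way to act on the observations above is to check the result of every Read() call and the node type before touching reader.Value. The following is only a minimal sketch: the class and method names (UrlReaderSketch, ReadUrls) are made up for illustration, and it assumes each url element holds plain text.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class UrlReaderSketch
{
    // Hypothetical helper: collects the text of every <url> element,
    // checking the result of each Read() call instead of ignoring it.
    public static List<string> ReadUrls(TextReader input)
    {
        var urls = new List<string>();
        using (var reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                bool isUrlStart = reader.NodeType == XmlNodeType.Element
                                  && reader.Name == "url"
                                  && !reader.IsEmptyElement;
                if (!isUrlStart)
                    continue;

                // Advance to the element's content; stop if the stream ended.
                if (!reader.Read())
                    break;

                // Only a text node carries a URL; <url></url> lands on EndElement,
                // and whitespace-only content is reported as a Whitespace node.
                if (reader.NodeType == XmlNodeType.Text)
                    urls.Add(reader.Value);
            }
        }
        return urls;
    }

    public static void Main()
    {
        const string xml =
            "<visited_links_list><item><url>http://www.grcc.edu/error.cfm</url>" +
            "<title>Grand Rapids Community College</title></item>" +
            "<item><url></url></item></visited_links_list>";

        foreach (var url in ReadUrls(new StringReader(xml)))
            Console.WriteLine(url);
    }
}
```

This does not make the reader survive malformed input; an invalid character still raises an XmlException. It only removes the unchecked second Read().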

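Edit

At the risk of writing a long-winded answer...

The error you are receiving

"'', hexadecimal value 0x05, is an invalid character. Line 3887, position 13."

means that your source XML is malformed and has somehow ended up with a ^E (ASCII 0x05) at the specified position. I would have a look at that line. If you are getting this file from a vendor or a service, you should have them fix their code. Correcting that, and any other malformed content in the XML, should correct the problem you are seeing.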
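
If the producer of the file cannot be fixed, a possible stopgap is to strip characters that XML 1.0 forbids before parsing. This is a sketch under that assumption, not something the answer proposes; the names XmlSanitizerSketch and StripInvalidXmlChars are hypothetical, and silently dropping characters changes the data, so it may be worth logging what was removed.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlSanitizerSketch
{
    // Hypothetical helper: returns the input without characters that
    // XML 1.0 does not allow (for example the control character 0x05).
    public static string StripInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c)
                     && i + 1 < text.Length
                     && char.IsLowSurrogate(text[i + 1]))
            {
                // Keep well-formed surrogate pairs together.
                sb.Append(c).Append(text[++i]);
            }
            // Anything else is dropped.
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Sample data with an embedded 0x05, similar to the reported error.
        string dirty = "<item><url>http://www.grcc.edu/\u0005error.cfm</url></item>";
        string clean = StripInvalidXmlChars(dirty);

        var doc = new XmlDocument();
        doc.LoadXml(clean); // would throw XmlException on the unsanitized string
        Console.WriteLine(doc.DocumentElement.InnerText);
    }
}
```

For a file on disk, the text would first have to be read with the encoding the file declares (ISO-8859-1 in the sample) and then passed through the helper.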

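Once that is fixed, your original code should work. However, using XmlTextReader is not the most robust solution, and it involves building some code that Visual Studio will happily generate for you:

In VS2012 (I no longer have VS2010 installed, but the process should be the same):

Add the XML sample to your solution.
In the file's properties, set CustomTool to "MSDataSetGenerator" (without the quotes).
The IDE should generate a .designer.cs file containing a serializable class with one field for each item in the XML. (If it does not, right-click the XML file in Solution Explorer and choose "Run Custom Tool".)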


Use code like the following to load XML at runtime that has the same schema as the sample:

/// make sure the XML doesn't have errors, such as non-printable characters
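private static bool IsXmlMalformed(string fileName)
{
    // Dispose the reader so the file handle is released before the file is reopened.
    using (var reader = new XmlTextReader(fileName))
    {
        try
        {
            // Read to the end; malformed content throws an XmlException.
            while (reader.Read()) { }
            return false;
        }
        catch (XmlException)
        {
            return true;
        }
    }
}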

/// Process the XML using deserializer and VS-generated XML proxy classes
private static void ParseVisitedLinksListXml(string fileName, List<string> urlList, List<int> invalidUrls)
{
    if (IsXmlMalformed(fileName))
        throw new Exception("XML is not well-formed.");

    using (var textReader = new XmlTextReader(fileName))
    {
        var serializer = new XmlSerializer(typeof(visited_links_list));

        if (!serializer.CanDeserialize(textReader))
            throw new Exception("Can't deserialize this XML. Make sure the XML schema is up to date.");

        var list = (visited_links_list)serializer.Deserialize(textReader);

        foreach (var item in list.item)
        {
            if (!string.IsNullOrEmpty(item.url) && !item.url.Contains(Environment.NewLine))
                urlList.Add(item.url);
            else
                invalidUrls.Add(urlList.Count);
        }
    }
}

You can also do this with the XSD.exe tool included in the Windows SDK.

score 1 · Accepted Answer

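I have a feeling that the reader is in an error state after the exception is thrown, since the reader.Read() inside the switch (not the one in the while) is most likely the line where the exception occurs. The reader.Read() in the while then returns nothing, and the loop exits.

I wrote a simple switch in a console application, caught an exception inside it, and the enclosing loop kept going.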

var s = "abcdefg";
foreach (var character in s)
{
    switch (character)
    {
        case 'c':
            try
            {
                throw new Exception("c sucks");
            }
            catch
            {
                // Swallow the exception and move on?
            }
            break;
        default:
            Console.WriteLine(character);
            break;
    }
}

If you step through your code, does it try to run the reader.Read() in the while after the exception is caught?
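
The suspicion about the reader's state can be checked with a small probe. The sketch below is an illustration with hypothetical names (ReaderStateSketch, ProbeAfterError): it reads until the first XmlException, then calls Read() once more and reports ReadState together with that call's return value. As far as I know, XmlTextReader moves to ReadState.Error after a parse error and further Read() calls return false, which would explain why the while loop ends.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReaderStateSketch
{
    // Hypothetical probe: reads until the first XmlException, then reports
    // the reader's state and what the next Read() call returns.
    public static Tuple<ReadState, bool> ProbeAfterError(string xml)
    {
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            try
            {
                while (reader.Read()) { }
            }
            catch (XmlException)
            {
                bool nextRead = reader.Read();
                return Tuple.Create(reader.ReadState, nextRead);
            }

            // No error occurred: the reader simply reached the end of the input.
            return Tuple.Create(reader.ReadState, false);
        }
    }

    public static void Main()
    {
        // The first <url> contains the invalid character 0x05.
        string xml = "<a><url>x\u0005y</url><url>ok</url></a>";
        var result = ProbeAfterError(xml);
        Console.WriteLine("ReadState after error: " + result.Item1);
        Console.WriteLine("Next Read() returned: " + result.Item2);
    }
}
```

If the probe behaves as described, swallowing the exception cannot make the loop resume, because the reader itself refuses to advance past the malformed content.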

score 0 · Accepted Answer

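I am assuming that you are reading a valid XML document, for example myFile.xml. I am also assuming that "url" is the element you want to get.

Load the document into an XmlDocument and use it to walk the nodes. This should get rid of the bad characters, because it converts them to the correct form; for example, & becomes &amp; and so on.

The method below should work for the sample you provided.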

        //get the text of the file into a string
        System.IO.StreamReader sr = new System.IO.StreamReader(@"C:\test.xml");
        String xmlText = sr.ReadToEnd();
        sr.Close();  
        //Create a List of strings and call the method
        List<String> urls = readXMLDoc(xmlText);
        //check to see if we have a list
        if (urls != null)
        {
            //do something
        }


    private List<String> readXMLDoc(String fileText)
    {
        //create a list of Strings to hold our Urls
        List<String> urlList = new List<String>();
        try
        {
            //create a XmlDocument Object
            XmlDocument xDoc = new XmlDocument();
            //load the text of the file into the XmlDocument Object
            xDoc.LoadXml(fileText);
            //Create a XmlNode object to hold the root node of the XmlDocument
            XmlNode rootNode = null;
            //get the root element in the xml document
            for (int i = 0; i < xDoc.ChildNodes.Count; i++)
            {
                //check to see if it is the root element
                if (xDoc.ChildNodes[i].Name == "visited_links_list")
                {
                    //assign the root node
                    rootNode = xDoc.ChildNodes[i];
                    break;
                }
            }

            //Loop through each of the child nodes of the root node
            for (int j = 0; j < rootNode.ChildNodes.Count; j++)
            {
                //check for the item tag
                if (rootNode.ChildNodes[j].Name == "item")
                {
                    //assign the item node
                    XmlNode itemNode = rootNode.ChildNodes[j];
                    //loop through each of the item tag's elements
                    foreach (XmlNode subNode in itemNode.ChildNodes)
                    {
                        //check for the url tag
                        if (subNode.Name == "url")
                        {
                            //add the url string to the list
                            urlList.Add(subNode.InnerText);
                        }
                    }
                }
            }
        }
        catch (Exception e)
        {
            System.Windows.Forms.MessageBox.Show(e.Message);
            return null;
        }
        //return the list
        return urlList;
    }
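
The nested loops above can also be written with an XPath query. This is a compact sketch of the same idea, with hypothetical names (XPathUrlSketch, ReadUrls); it assumes the element layout from the sample in the question. Note that XmlDocument.LoadXml throws an XmlException when the input is not well-formed, so the caller still has to decide how to handle that case.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathUrlSketch
{
    // Hypothetical helper: returns the text of every url element found under
    // /visited_links_list/item, the layout used by the sample document.
    public static List<string> ReadUrls(string xmlText)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xmlText); // throws XmlException on malformed input

        var urls = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("/visited_links_list/item/url"))
            urls.Add(node.InnerText);
        return urls;
    }

    public static void Main()
    {
        const string xml =
            "<visited_links_list><item>" +
            "<url>http://www.grcc.edu/error.cfm</url>" +
            "<title>Grand Rapids Community College</title>" +
            "</item></visited_links_list>";

        foreach (var url in ReadUrls(xml))
            Console.WriteLine(url);
    }
}
```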

score -1 · Accepted Answer

Use continue

while (reader.Read())
        {
            switch (reader.Name)
            {
                case "url":
                    try
                    {
                        reader.Read();
                        if (!reader.Value.Equals("\r\n"))
                        {
                            urlList.Add(reader.Value);
                        }
                    }
                    catch
                    {
                        invalidUrls.Add(urlList.Count);
                        continue;
                    }
                    break;
            }
        }
