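I'm using the code below to search for data in a web page and return it to a DataGridView.

When I use it with a web page that has many rows (for example, 100), it sometimes returns a wrong row like this: CaucaiaCE

It should be just Caucaia.

Why does it happen in only 2 of the 100 rows?

This is the HTML I'm searching: http://pastie.org/8220836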

{
    int i = 0;
    Match matchLogradouro = Regex.Match(pagina, "<td width=\"268\" style=\"padding: 2px\">(.*)</td>");
    Match matchBairroCidade = Regex.Match(pagina, "<td width=\"140\" style=\"padding: 2px\">(.*)</td>");
    Match matchEstado = Regex.Match(pagina, "<td width=\"25\" style=\"padding: 2px\">([A-Z]{2})</td>");
    Match matchCep = Regex.Match(pagina, "<td width=\"65\" style=\"padding: 2px\">(.*)</td>");
    int z = Regex.Matches(pagina, "detalharCep").Count;
    while (z > i -1)
    {    
        dataGridView1.Rows.Add(matchLogradouro.Groups[1].Value);
        matchLogradouro = matchLogradouro.NextMatch();
        dataGridView1.Rows[i].Cells[1].Value = matchBairroCidade.Groups[1].Value;
        matchBairroCidade = matchBairroCidade.NextMatch();
        dataGridView1.Rows[i].Cells[2].Value = matchBairroCidade.Groups[1].Value;
        matchBairroCidade = matchBairroCidade.NextMatch();
        dataGridView1.Rows[i].Cells[3].Value = matchEstado.Groups[1].Value;
        matchEstado = matchEstado.NextMatch();

        dataGridView1.Rows[i].Cells[4].Value = matchCep.Groups[1].Value;
        matchCep = matchCep.NextMatch();
        i++;
    }
}

1 Answer

Create a class (sorry, I don't know Portuguese, so I can't tell what kind of data your class should hold):

public class Foo // I believe it should be something like Address
{
    public string Logradouro { get; set; }
    public string BairroCidade1 { get; set; }
    public string BairroCidade2 { get; set; }
    public string Estado { get; set; } // this should be State
    public string Cep { get; set; }
}

And use HtmlAgilityPack to parse your HTML document:

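// Requires the HtmlAgilityPack package, plus: using System.Linq; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load(html_file_name); // or doc.LoadHtml(html_string)

// Take every <tr> that has <td> children and read the text of each cell.
// Note: SelectNodes returns null (not an empty collection) when nothing matches.
var foos = from row in doc.DocumentNode.SelectNodes("//tr[td]")
           let cells = row.SelectNodes("td").Select(td => td.InnerText).ToArray()
           where cells.Length > 4 // skip rows that do not have all five columns
           select new Foo {
               Logradouro = cells[0],
               BairroCidade1 = cells[1],
               BairroCidade2 = cells[2],
               Estado = cells[3],
               Cep = cells[4]
           };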
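
As for why only a couple of rows go wrong: I can't say for sure without the exact markup, but a likely cause is that `(.*)` is greedy. By default `.` does not match a newline, so rows where each `<td>` sits on its own line work fine. If two cells happen to share a line, `(.*)</td>` runs on to the last `</td>` on that line and swallows the next cell too. A minimal sketch of that effect follows; the `Row` string is a made-up fragment, not taken from your page, and `GreedyDemo`, `Prefix` and `FirstCell` are names invented for the illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Hypothetical fragment: two cells on ONE line (an assumption about the real page).
    public const string Row =
        "<td width=\"140\" style=\"padding: 2px\">Caucaia</td>" +
        "<td width=\"25\" style=\"padding: 2px\">CE</td>";

    // Opening tag used in the question's pattern for the city column.
    public const string Prefix = "<td width=\"140\" style=\"padding: 2px\">";

    // Returns capture group 1 of the first match of 'pattern' in 'html'.
    public static string FirstCell(string html, string pattern)
    {
        return Regex.Match(html, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        // Greedy: extends to the LAST </td> on the line, so it includes the next cell.
        Console.WriteLine(FirstCell(Row, Prefix + "(.*)</td>"));

        // Lazy: stops at the FIRST </td>, so it yields only "Caucaia".
        Console.WriteLine(FirstCell(Row, Prefix + "(.*?)</td>"));
    }
}
```

Switching to `(.*?)` or `([^<]*)` would probably fix those rows, but an HTML parser avoids this whole class of problem, which is why I'd still go with HtmlAgilityPack.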
answered 2013-08-09T06:20:49.787