c# - Extracting text values from an HTML source file

Question

In this code, var TempTxt holds the HTML body content as a string. How can I use lambda syntax to extract the inner text/HTML of the <table> or <td> elements?

    public string ExtractPageValue(IWebDriver DDriver, string url = "")
    {
        if (string.IsNullOrEmpty(url))
            url = @"http://www.boi.org.il/he/Markets/ExchangeRates/Pages/Default.aspx";

        // Note: `directory` is not defined in this snippet.
        var service = InternetExplorerDriverService.CreateDefaultService(directory);
        service.LogFile = directory + @"\seleniumlog.txt";
        service.LoggingLevel = InternetExplorerDriverLogLevel.Trace;

        var options = new InternetExplorerOptions();
        options.IntroduceInstabilityByIgnoringProtectedModeSettings = true;

        // Note: this replaces the driver instance passed in as a parameter.
        DDriver = new InternetExplorerDriver(service, options, TimeSpan.FromSeconds(60));
        DDriver.Navigate().GoToUrl(url);
        var TempTxt = DDriver.PageSource;
        return ""; // Math.Round(Convert.ToDouble(TempTxt.Split(' ')[10]), 2).ToString();
    }

score 1 · Accepted Answer

If you are willing to try HtmlAgilityPack:

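    // Requires `using System.Linq;`. `html` is the page source string (TempTxt in the question).
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);

    // One list of cell texts per row; each lambda projects a node to its inner text.
    var table = doc.DocumentNode.SelectNodes("//table/tr")
                   .Select(tr => tr.Elements("td").Select(td => td.InnerText).ToList())
                   .ToList();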
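
The snippet above can be wrapped into a reusable helper. What follows is only a sketch under some assumptions: the class name TableExtractor and method name ExtractRows are hypothetical, the XPath is widened to "//table//tr" so that rows inside <thead> or <tbody> are also matched (this also picks up rows of nested tables), and a null check is added because SelectNodes returns null rather than an empty collection when nothing matches.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party NuGet package

// Hypothetical helper, not part of HtmlAgilityPack or Selenium.
public static class TableExtractor
{
    // Returns one list of trimmed cell texts per <tr> that contains <td> cells.
    public static List<List<string>> ExtractRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes yields null when no node matches the XPath.
        var rows = doc.DocumentNode.SelectNodes("//table//tr");
        if (rows == null)
            return new List<List<string>>();

        return rows
            .Select(tr => tr.Elements("td")
                            // Decode entities such as &nbsp; and strip padding.
                            .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                            .ToList())
            .Where(cells => cells.Count > 0) // skip rows that only contain <th>
            .ToList();
    }
}
```

In the question's method this could be called as TableExtractor.ExtractRows(DDriver.PageSource). If the markup of each cell is wanted instead of its text, td.InnerHtml can be used in place of td.InnerText.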
