powershell - Parsing an HTML table in PowerShell V3

Question

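I have the following HTML table: link to HTML

I want to parse it and convert it to XML/CSV/PS objects. I tried using HtmlAgilityPack.dll without success. Can anyone give me some pointers?

I want to convert the table to a PSObject and export it to CSV. So far I only have the beginning of the code: I can access the rows, but I cannot access the values inside them.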

Add-Type -Path C:\Windows\system32\HtmlAgilityPack.dll
$HTML = New-Object HtmlAgilityPack.HtmlDocument
$res = $HTML.Load("C:\Test\Test.html")
$table = $HTML.DocumentNode.SelectNodes("//table/tr/td/nobr")

When I access $table[0..47].InnerHtml I only get the first column of the file; I cannot access the second column and so on.

Thanks,
Ohad

score 3 · Accepted Answer

You can try this to get all the HTML inside the <nobr> tags. I'll let you work out the logic to produce the output you want...

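$ie = New-Object -ComObject "InternetExplorer.Application"
$ie.Navigate("http://urltoyourfile.html")
# Navigate is asynchronous: wait until the page has finished loading
while ($ie.Busy) { Start-Sleep -Milliseconds 100 }
$doc = $ie.Document
# Print the inner HTML of every <nobr> element
$doc.getElementsByTagName("nobr") | ForEach-Object { $_.innerHTML }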

Output:

Lead User&nbsp;&nbsp;
Accesses&nbsp;&nbsp;
Last Accessed&nbsp;&nbsp;
Average&nbsp;&nbsp;
Max&nbsp;&nbsp;
Min&nbsp;&nbsp;
Total&nbsp;&nbsp;
amirt
2
01/20/2013 09:40:47
04:18:17
06:19:26
02:17:09
08:36:35
andream
1
01/20/2013 10:33:01
02:34:37
02:34:37
02:34:37
02:34:37
avnerm
1
01/17/2013 11:34:16
00:30:44
00:30:44
00:30:44
00:30:44
brouria

One way to parse it:

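$cpt = 0
$doc.getElementsByTagName("nobr") | ForEach-Object {
    # Print each cell followed by ";" and stay on the same line
    Write-Host -NoNewline "$($_.innerHTML);"
    $cpt++
    # The table has 7 columns: start a new line after every 7th cell
    if ($cpt % 7 -eq 0) { Write-Host "" }
}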
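
Since the goal in the question was PSObjects and a CSV file rather than console text, the same "seven cells per row" idea can be sketched as follows. This is only an illustration, not part of the original answer: it assumes the table really has 7 columns, that the first 7 cells are the header row, and it uses a hard-coded `$cells` array (copied from the output above) in place of the live `$doc`. The variable names and the CSV path are hypothetical.

```powershell
# Hypothetical sample: the first cell values from the output above, in document order.
$cells = @(
    'Lead User&nbsp;&nbsp;', 'Accesses&nbsp;&nbsp;', 'Last Accessed&nbsp;&nbsp;',
    'Average&nbsp;&nbsp;', 'Max&nbsp;&nbsp;', 'Min&nbsp;&nbsp;', 'Total&nbsp;&nbsp;',
    'amirt', '2', '01/20/2013 09:40:47', '04:18:17', '06:19:26', '02:17:09', '08:36:35',
    'andream', '1', '01/20/2013 10:33:01', '02:34:37', '02:34:37', '02:34:37', '02:34:37'
)

# Strip the &nbsp; entities and surrounding whitespace from every cell.
$clean = @($cells | ForEach-Object { ($_ -replace '&nbsp;', '').Trim() })

$columns = 7                            # assumed column count, taken from the header row
$headers = $clean[0..($columns - 1)]    # the first 7 cells are treated as column names

# Walk the remaining cells 7 at a time and build one object per table row.
# An incomplete trailing row (fewer than 7 cells) is skipped.
$rows = @(for ($i = $columns; $i + $columns -le $clean.Count; $i += $columns) {
    $props = [ordered]@{}
    for ($c = 0; $c -lt $columns; $c++) {
        $props[$headers[$c]] = $clean[$i + $c]
    }
    [pscustomobject]$props
})

$rows | Format-Table -AutoSize
# To write a CSV instead (the path is an assumption):
# $rows | Export-Csv -Path C:\Test\Test.csv -NoTypeInformation
```

With the real page, `$cells` would presumably be filled from `$doc.getElementsByTagName("nobr") | ForEach-Object { $_.innerHTML }`. Both `[ordered]` and `[pscustomobject]` require PowerShell V3 or later.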
