Using VB.NET or C#, how do I get the generated HTML source?
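To get a page's HTML source I can use the code below, but it does not return the generated source: it leaves out any HTML that JavaScript adds dynamically in the browser. How can I get the final, generated HTML source?
Thanks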
WebRequest req = WebRequest.Create("http://www.asp.net");
WebResponse res = req.GetResponse();
StreamReader sr = new StreamReader(res.GetResponseStream());
string html = sr.ReadToEnd();
If I try the following instead, it returns the document without the content injected by JavaScript:
Public Class Form1
    Dim WB As WebBrowser = Nothing

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        WB = New WebBrowser()
        Me.Controls.Add(WB)
        AddHandler WB.DocumentCompleted, AddressOf WebBrowser1_DocumentCompleted
        WB.Navigate("mysite/Default.aspx")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        'Dim htmlcode As String = WebBrowser1.Document.Body.OuterHtml()
        Dim s As String = WB.DocumentText
    End Sub
End Class
The HTML returned:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<title></title>
</head>
<body>
<form id="form1" runat="server">
<div id="center_text_panel">
//test text this text should be here
</div>
</form>
</body>
</html>
<script type="text/javascript">
document.getElementById("center_text_panel").innerText = "test text";
</script>
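
One thing that may be worth trying: `WebBrowser.DocumentText` holds the markup as it was downloaded, whereas the `HtmlDocument` exposed by `WebBrowser.Document` reflects the live DOM. The sketch below reads `OuterHtml` from the root `<html>` element in `DocumentCompleted`. It is a minimal sketch, not a confirmed fix: the class name `RenderedSourceForm` and the handler names are made up for illustration, the URL is the placeholder from the question, and it assumes the script has already run by the time `DocumentCompleted` fires (content added later by timers or AJAX calls would still be missing).

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Windows.Forms

' Hypothetical form name, used only for this illustration.
Public Class RenderedSourceForm
    Inherits Form

    Private ReadOnly WB As New WebBrowser()

    Public Sub New()
        WB.Dock = DockStyle.Fill
        Controls.Add(WB)
        AddHandler WB.DocumentCompleted, AddressOf OnDocumentCompleted
        AddHandler Load, AddressOf OnFormLoad
    End Sub

    Private Sub OnFormLoad(sender As Object, e As EventArgs)
        ' Placeholder URL taken from the question; replace with the real page address.
        WB.Navigate("mysite/Default.aspx")
    End Sub

    Private Sub OnDocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        ' DocumentText is the markup as downloaded; the Document object is the live DOM.
        Dim roots As HtmlElementCollection = WB.Document.GetElementsByTagName("html")
        If roots.Count > 0 Then
            ' Assumption: scripts that modify the page have already executed at this point.
            Dim renderedHtml As String = roots(0).OuterHtml
            Debug.WriteLine(renderedHtml)
        End If
    End Sub
End Class

Module Program
    <STAThread>
    Sub Main()
        Application.Run(New RenderedSourceForm())
    End Sub
End Module
```

If the page changes the DOM only after the load event, the same `OuterHtml` read would have to happen later, for example from a timer or a button click. The control renders with the Internet Explorer engine, so scripts that rely on newer browser features may behave differently than in a current browser.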