I'm using the code below to save a UTF-8 web page:

    HttpWebRequest myWebRequest = (HttpWebRequest) WebRequest.Create(txtUrl.Text);
    myWebRequest.UserAgent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1);Accept-Language:fa";
    WebResponse myWebResponse = myWebRequest.GetResponse();
    Stream ReceiveStream = myWebResponse.GetResponseStream();
    Encoding encode = System.Text.Encoding.GetEncoding("utf-8");
    StreamReader readStream = new StreamReader(ReceiveStream, encode);
    string strResponse = readStream.ReadToEnd();
    StreamWriter oSw = new StreamWriter(@"c:\ehsan.html");
    oSw.WriteLine(strResponse);
    oSw.Close();
    readStream.Close();
    myWebResponse.Close();  
    txtUrl.Text = strResponse;

However, in both the ehsan.html file and txtUrl, all the Unicode characters show up as symbols. Is my approach correct? Does anyone have any ideas?

2 Answers

Use "Arabic" instead of UTF-8 for your encoding.
Answered 2012-09-26T11:42:41.583

The page you are loading contains:

    <meta http-equiv="Content-Type" content="text/html; charset=windows-1256">

So if you change your code to:

    Encoding encode = System.Text.Encoding.GetEncoding("windows-1256");

it works. (I have tested it.) :-)
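To see why the wrong encoding produces symbols, here is a minimal sketch that decodes the same bytes two ways. The `MojibakeDemo` class, the `Decode` helper and the sample word are made up for illustration. It assumes .NET Framework, where `windows-1256` is available out of the box; on .NET Core or .NET 5+ you would probably need to call `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)` first.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Hypothetical helper: decode raw bytes with the named encoding.
    public static string Decode(byte[] raw, string encodingName)
    {
        return Encoding.GetEncoding(encodingName).GetString(raw);
    }

    static void Main()
    {
        // The word "سلام" as a windows-1256 server would send it.
        byte[] raw = { 0xD3, 0xE1, 0xC7, 0xE3 };

        // These bytes are not valid UTF-8, so the decoder substitutes
        // U+FFFD replacement characters: the "symbols" from the question.
        string wrong = Decode(raw, "utf-8");

        // Assumption: windows-1256 is available (see the note above).
        string right = Decode(raw, "windows-1256");

        Console.WriteLine(wrong == "سلام"); // False
        Console.WriteLine(right == "سلام"); // True
    }
}
```

Once the text has been decoded with the wrong encoding the original bytes are lost, so writing `strResponse` back out cannot repair the file.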

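Whether you hard-code the encoding is up to you; it depends on whether you only ever load this one page or also load pages with different encodings.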
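If you do need to handle pages with different encodings, one option is to read the declared charset instead of hard-coding it. The following is only a rough sketch, not a complete detector: `PageEncoding`, `Detect` and `Download` are hypothetical names, the order of checks (Content-Type header, then `<meta>` tag, then UTF-8) is an assumption, and it ignores byte-order marks and UTF-16 pages.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

static class PageEncoding
{
    // Matches charset=... in either a Content-Type header or a <meta> tag.
    static readonly Regex CharsetPattern =
        new Regex(@"charset\s*=\s*[""']?([\w-]+)", RegexOptions.IgnoreCase);

    // Hypothetical helper: header charset first, then <meta>, then UTF-8.
    public static Encoding Detect(string contentTypeHeader, byte[] body)
    {
        // Scan only the first few KB; the markup itself is assumed to be ASCII.
        string head = Encoding.ASCII.GetString(body, 0, Math.Min(body.Length, 4096));
        foreach (string source in new[] { contentTypeHeader ?? "", head })
        {
            Match m = CharsetPattern.Match(source);
            if (!m.Success) continue;
            try { return Encoding.GetEncoding(m.Groups[1].Value); }
            catch (ArgumentException) { /* unknown name: try the next source */ }
        }
        return Encoding.UTF8; // assumed default when nothing is declared
    }

    // Downloads the page, saves the raw bytes, and returns the decoded text.
    public static string Download(string url, string savePath)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (MemoryStream buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            byte[] body = buffer.ToArray();
            // Save the bytes untouched so the page's own charset
            // declaration still describes the file correctly.
            File.WriteAllBytes(savePath, body);
            return Detect(response.ContentType, body).GetString(body);
        }
    }
}
```

With this in place, the original code could shrink to something like `txtUrl.Text = PageEncoding.Download(txtUrl.Text, @"c:\ehsan.html");`. Writing the raw bytes rather than the decoded string is a deliberate choice here: re-encoding the text as UTF-8 would contradict the `windows-1256` declaration inside the saved HTML.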

Answered 2012-09-26T11:45:58.873