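I am trying to use DownloadData with WebClient. My current problem is that I cannot figure out how to convert the ASCII result returned by Encoding.ASCII.GetString(myDataBuffer); (&lt; to <, &gt; to >, and \n to a newline) into the result shown in the page source.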

Page source
(Source: iforce.co.nz)

    /// <summary>
    /// Curl data from the PMID
    /// </summary>
    private void ClientPMID(int pmid)
    {
        //generate the URL for the client
        StringBuilder pmid_url_string = new StringBuilder();
        pmid_url_string.Append("http://www.ncbi.nlm.nih.gov/pubmed/").Append(pmid.ToString()).Append("?report=xml");
        Uri PMIDUri = new Uri(pmid_url_string.ToString());
        //declare and initialize the client
        WebClient client = new WebClient();
        // Download the Web resource and save it into a data buffer. 
        byte[] myDataBuffer = client.DownloadData(PMIDUri);
        this.DownloadCompleted(myDataBuffer);
    }
    /// <summary>
    /// Crawl over the binary from myDataBuffer
    /// </summary>
    /// <param name="myDataBuffer">Binary Buffer</param>
    private void DownloadCompleted(byte[] myDataBuffer)
    {
        string download = Encoding.ASCII.GetString(myDataBuffer);
        PMIDCrawler pmc = new PMIDCrawler(download, "/pre/PubmedArticle/MedlineCitation/Article");
        //iterate over each node in the file
        foreach (XmlNode xmlNode in pmc.crawl)
        {
            string AbstractTitle = xmlNode["ArticleTitle"].InnerText;
            string AbstractText = xmlNode["Abstract"]["AbstractText"].InnerText;
        }
    }

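The code for PMIDCrawler is available in my question about DownloadStringCompletedEventHandler. In that version, string html = HttpUtility.HtmlDecode(nHtml); operated on the downloaded xml, whereas here the input comes from Encoding.ASCII.GetString.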

1 Answer

Unfortunately, that server does not respond properly to Accept: text/xml or Accept: application/xml, so you have to do this the hard way (HttpUtility.HtmlDecode):

    string download = HttpUtility.HtmlDecode(Encoding.ASCII.GetString(myDataBuffer));

(or WebUtility.HtmlDecode on .NET Fx 4.5+)
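
As a minimal sketch of the decode-then-parse step, assuming the response body is an HTML pre element wrapping entity-escaped XML: the sample string below is made up for illustration and is not real PubMed output, and DecodeAndLoad is a hypothetical helper name. It uses WebUtility.HtmlDecode from System.Net so that no System.Web reference is needed.

```csharp
using System;
using System.Net;
using System.Xml;

public static class EscapedXmlDemo
{
    // Hypothetical helper: decode the HTML entities once, then load the result as XML.
    public static XmlDocument DecodeAndLoad(string escaped)
    {
        string decoded = WebUtility.HtmlDecode(escaped);
        var doc = new XmlDocument();
        doc.LoadXml(decoded);
        return doc;
    }

    public static void Main()
    {
        // Made-up sample shaped like the page described in the question (assumption).
        string sample =
            "<pre>&lt;PubmedArticle&gt;&lt;MedlineCitation&gt;&lt;Article&gt;" +
            "&lt;ArticleTitle&gt;Sample title&lt;/ArticleTitle&gt;" +
            "&lt;/Article&gt;&lt;/MedlineCitation&gt;&lt;/PubmedArticle&gt;</pre>";

        XmlDocument doc = DecodeAndLoad(sample);

        // Same XPath as the one passed to PMIDCrawler in the question.
        XmlNode article = doc.SelectSingleNode("/pre/PubmedArticle/MedlineCitation/Article");
        Console.WriteLine(article["ArticleTitle"].InnerText); // prints "Sample title"
    }
}
```

Once the entities are decoded, the /pre/PubmedArticle/MedlineCitation/Article path from the question resolves against ordinary XML nodes.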

Or:

    string download = Encoding.ASCII.GetString(myDataBuffer);
    if (download != null) { // this won't get all HTML escaped characters...
        download = download.Replace("&lt;", "<").Replace("&gt;", ">");
    }
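
The comment in the snippet above matters in practice: a hand-rolled replacement only covers the entities it lists. If you go this route anyway, a slightly fuller sketch might cover the five predefined XML entities and replace &amp;amp; last, so that a double-escaped sequence is only unescaped one level. ManualDecode is a hypothetical name, and numeric character references such as &amp;#39; are still not handled here.

```csharp
using System;

public static class ManualEntityDecoder
{
    // Hypothetical helper: unescape the five predefined XML entities by hand.
    public static string ManualDecode(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        return text
            .Replace("&lt;", "<")
            .Replace("&gt;", ">")
            .Replace("&quot;", "\"")
            .Replace("&apos;", "'")
            .Replace("&amp;", "&"); // must come last to avoid decoding twice
    }

    public static void Main()
    {
        // Made-up input: the inner &amp;amp; becomes &amp;, which is still valid XML.
        Console.WriteLine(ManualDecode("&lt;ArticleTitle&gt;A &amp;amp; B&lt;/ArticleTitle&gt;"));
        // prints: <ArticleTitle>A &amp; B</ArticleTitle>
    }
}
```

If &amp;amp; were replaced first, the text &amp;amp;lt; would end up as a literal <, which could corrupt the XML before it is parsed.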

See also this question for more information.

answered 2013-03-13T03:13:21.040