2

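I have a set of URLs to scrape, and I want to download the resources in parallel and return a collection of strongly typed results.

I have WebClient.DownloadString() and a method MyTypedResult Process(string s).

How do I wrap these up to perform a string[] urls => IEnumerable<MyTypedResult> transformation?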

string[] urls = {"url1","url2","url3"};
List<MyTypedResult> ResultCollection = new List<MyTypedResult>();
foreach (var u in urls)
{
    WebClient wc = new WebClient();
    var content = wc.DownloadString(u);
    MyTypedResult r = Process(content);
    ResultCollection.Add(r);
}

I want the web requests to run in parallel, but I need the results collected in a List.

4

3 Answers

4

You can use HttpClient, the new toy in .NET 4.5, to get the results in parallel:

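var httpClient = new HttpClient();

// ToArray() materializes the query so each request is started exactly once;
// re-enumerating a lazy Select would issue the downloads a second time.
var tasks = urls.Select(url => httpClient.GetStringAsync(url)
                        .ContinueWith(task =>
                        {
                            string response = task.Result;
                            return ConvertToStrongType(response);
                        }))
                .ToArray();

// Block until every download and conversion has finished.
Task.WaitAll(tasks);
var results = tasks.Select(t => t.Result).ToList();
answered 2013-02-07T09:40:16.950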
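The same idea can also be written with async/await and Task.WhenAll instead of ContinueWith and Task.WaitAll. The following is only a sketch under some assumptions: ParallelFetch, FetchAllAsync, download and convert are hypothetical names introduced for illustration, with download standing in for httpClient.GetStringAsync and convert standing in for ConvertToStrongType.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelFetch
{
    // Hypothetical helper: 'download' and 'convert' are injected so the
    // sketch does not depend on a particular HTTP client or result type.
    public static async Task<List<TResult>> FetchAllAsync<TResult>(
        IEnumerable<string> urls,
        Func<string, Task<string>> download,
        Func<string, TResult> convert)
    {
        // ToArray() forces the Select to run now, so every request
        // is started once and all of them are in flight together.
        Task<TResult>[] tasks = urls
            .Select(async url => convert(await download(url)))
            .ToArray();

        // Task.WhenAll yields the results in the order of the input tasks,
        // regardless of which download finishes first.
        TResult[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }
}
```

From an async method, a call could look like: var results = await ParallelFetch.FetchAllAsync(urls, url => httpClient.GetStringAsync(url), s => ConvertToStrongType(s)); Because nothing blocks a thread while the downloads are pending, this form tends to suit I/O-bound work such as web requests.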
2

Here is an Rx version using HttpClient:

var urls = new[] { "url1", "url2", "url3" };
var client = new HttpClient();
var results = from url in urls.ToObservable()
              from content in client.GetStringAsync(url).ToObservable()
              select Process(content);
var enumerable = results.ToEnumerable();
answered 2013-02-08T10:28:26.930
1

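The code below uses Parallel.ForEach to download the content from the URLs in parallel. You need a thread-safe collection so that it can be filled from several threads without explicit locking. The framework has no ConcurrentList, so this uses ConcurrentBag from System.Collections.Concurrent; note that a ConcurrentBag does not preserve insertion order.

void YourTask()
{
    string[] urls = {"url1","url2","url3"};
    ConcurrentBag<MyTypedResult> ResultCollection = new ConcurrentBag<MyTypedResult>();

    Parallel.ForEach(urls, url =>
    {
        // Download and convert on a worker thread, then store the result.
        MyTypedResult myTypedResult = GetData(url);
        ResultCollection.Add(myTypedResult);
    });

    // At this point all parallel tasks have completed and ResultCollection holds the results.
    // Call ResultCollection.ToList() if a List<MyTypedResult> is needed.
}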

MyTypedResult GetData(string url)
{
    // WebClient is IDisposable, so release it once the download is done.
    using (WebClient wc = new WebClient())
    {
        var content = wc.DownloadString(url);
        return Process(content);
    }
}
answered 2013-02-07T10:37:14.697
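If the results must come back as a List in the same order as the input URLs, PLINQ is one alternative to Parallel.ForEach with a concurrent collection. This is a sketch, not a drop-in replacement: PlinqFetch, FetchAll and maxParallel are hypothetical names, and getData stands in for the GetData(url) method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqFetch
{
    // Hypothetical helper: 'getData' performs the blocking download and
    // conversion for one URL; 'maxParallel' is an assumed tuning knob.
    public static List<TResult> FetchAll<TResult>(
        IEnumerable<string> urls,
        Func<string, TResult> getData,
        int maxParallel = 4)
    {
        return urls
            .AsParallel()
            .AsOrdered()                          // keep results in input order
            .WithDegreeOfParallelism(maxParallel) // upper bound on concurrent workers
            .Select(getData)
            .ToList();                            // blocks until every item is done
    }
}
```

A call could look like: List<MyTypedResult> results = PlinqFetch.FetchAll(urls, GetData); No shared collection is written to from the worker threads here, so no concurrent collection is needed.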