c# - Multithreaded C# queue in .NET 4

Question

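I'm developing a simple crawler for web pages. I have searched and found many solutions for implementing a multithreaded crawler. What is the best way to create a thread-safe queue that holds unique URLs?

EDIT: Is there a better solution in .NET 4.5?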

score 2 · Accepted Answer

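Use the Task Parallel Library with the default scheduler, which runs tasks on the ThreadPool.

OK, here is a minimal implementation that keeps at most maxQueueLength URLs in flight at a time (50 in the code below):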

    // Requires: using System; using System.Threading; using System.Threading.Tasks;
    public static void WebCrawl(Func<string> getNextUrlToCrawl, // returns a URL or null if no more URLs
        Action<string> crawlUrl, // action to crawl the URL
        int pauseInMilli // pause between checks while all slots are busy
        )
    {
        const int maxQueueLength = 50;
        string currentUrl = null;
        int queueLength = 0;

        while ((currentUrl = getNextUrlToCrawl()) != null)
        {
            string temp = currentUrl; // per-iteration copy captured by the lambda

            // Wait for a free slot instead of skipping the current URL.
            // CompareExchange(ref x, 0, 0) is an atomic read that also works on .NET 4.
            while (Interlocked.CompareExchange(ref queueLength, 0, 0) >= maxQueueLength)
            {
                Thread.Sleep(pauseInMilli);
            }

            // Count the task before it starts so the limit check above sees it.
            Interlocked.Increment(ref queueLength);
            Task.Factory.StartNew(() => crawlUrl(temp))
                .ContinueWith((t) =>
                {
                    if (t.IsFaulted)
                        Console.WriteLine(t.Exception.ToString());
                    else
                        Console.WriteLine("Successfully done!");
                    Interlocked.Decrement(ref queueLength);
                });
        }
    }

Dummy usage:

    static void Main(string[] args)
    {
        Random r = new Random();
        int i = 0;
        WebCrawl(() => (i = r.Next()) % 100 == 0 ? null : ("Some URL: " + i.ToString()),
            (url) => Console.WriteLine(url),
            500);

        Console.Read();

    }

score 2 · Accepted Answer

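ConcurrentQueue is indeed the framework's thread-safe queue implementation. But since you are likely to use it in a producer-consumer scenario, the class you are really after may be the infinitely useful BlockingCollection.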
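As a rough illustration of that producer-consumer shape, here is a minimal sketch that wraps a ConcurrentQueue in a BlockingCollection and drains it from several consumer tasks. The class name BlockingCollectionSketch, the method CrawlAll, and the "crawl" step (a Console.WriteLine) are placeholders invented for this example, not part of the framework or of the question's crawler.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class BlockingCollectionSketch
{
    // Hypothetical helper: feeds seedUrls to consumerCount consumers
    // and returns how many URLs were processed.
    public static int CrawlAll(string[] seedUrls, int consumerCount)
    {
        int processed = 0;
        using (var urls = new BlockingCollection<string>(new ConcurrentQueue<string>()))
        {
            var consumers = new Task[consumerCount];
            for (int c = 0; c < consumerCount; c++)
            {
                consumers[c] = Task.Factory.StartNew(() =>
                {
                    // Blocks while the collection is empty; the loop ends
                    // once CompleteAdding has been called and it is drained.
                    foreach (string url in urls.GetConsumingEnumerable())
                    {
                        Console.WriteLine("Crawling " + url); // placeholder for real work
                        Interlocked.Increment(ref processed);
                    }
                }, TaskCreationOptions.LongRunning);
            }

            foreach (string url in seedUrls)
                urls.Add(url);          // producer side

            urls.CompleteAdding();      // signal that no more URLs will arrive
            Task.WaitAll(consumers);
        }
        return processed;
    }
}
```

In a real crawler the consumers would also add newly discovered links, so deciding when to call CompleteAdding needs more care than this sketch shows.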

score 1 · Accepted Answer

Would System.Collections.Concurrent.ConcurrentQueue<T> fit the bill?

answered 2012-04-10T10:51:41.237

score 1 · Accepted Answer

I would use System.Collections.Concurrent.ConcurrentQueue.

You can safely enqueue and dequeue from multiple threads.
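ConcurrentQueue does not by itself reject duplicates, which the question asks for. One possible approach, sketched below under that assumption, pairs it with a ConcurrentDictionary used as a "seen" set. UniqueUrlQueue and its members are hypothetical names made up for this example, and the ordinal string comparison is a simplification: real URL normalization is out of scope here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical wrapper, not a framework type.
sealed class UniqueUrlQueue
{
    private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();
    private readonly ConcurrentDictionary<string, bool> _seen =
        new ConcurrentDictionary<string, bool>(StringComparer.Ordinal);

    // Returns false if the URL was already enqueued at some point.
    public bool TryEnqueue(string url)
    {
        // TryAdd is atomic, so only one thread wins for a given URL.
        if (!_seen.TryAdd(url, true))
            return false;
        _queue.Enqueue(url);
        return true;
    }

    public bool TryDequeue(out string url)
    {
        return _queue.TryDequeue(out url);
    }

    public int Count
    {
        get { return _queue.Count; }
    }
}

static class UniqueUrlQueueDemo
{
    // Enqueues 1000 URLs from many threads, only 10 of which are distinct.
    public static int Run()
    {
        var queue = new UniqueUrlQueue();
        Parallel.For(0, 1000, i => queue.TryEnqueue("http://example.com/page" + (i % 10)));
        return queue.Count;
    }
}
```

The "seen" set is kept even after a URL is dequeued, so a page that has already been crawled is not queued a second time.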

score 1 · Accepted Answer

Take a look at System.Collections.Concurrent.ConcurrentQueue. If you need to wait for items, you can use System.Collections.Concurrent.BlockingCollection.
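To make the "need to wait" case concrete, the sketch below waits a bounded time for the next item using BlockingCollection's TryTake overload that takes a timeout in milliseconds. WaitingSketch and TakeOrNull are hypothetical names for this example only.

```csharp
using System;
using System.Collections.Concurrent;

static class WaitingSketch
{
    // Hypothetical helper: waits up to timeoutMs for a URL,
    // returning null if nothing arrives in time.
    public static string TakeOrNull(BlockingCollection<string> urls, int timeoutMs)
    {
        string url;
        return urls.TryTake(out url, timeoutMs) ? url : null;
    }

    public static void Demo()
    {
        // With no explicit backing store, BlockingCollection uses a ConcurrentQueue.
        using (var urls = new BlockingCollection<string>())
        {
            urls.Add("http://example.com/");
            Console.WriteLine(TakeOrNull(urls, 100) ?? "(timed out)"); // item is available
            Console.WriteLine(TakeOrNull(urls, 100) ?? "(timed out)"); // empty, so it times out
        }
    }
}
```

A plain Take() call would instead block until an item arrives or the collection is marked complete.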
