I'm trying to call `Browser.NewPageAsync()` from another static method, but as soon as that call is reached, the method that made the call exits.

    partial class Program
    {
        static Browser Browser;

        static async Task StartBrowser()
        {
            Browser = await Puppeteer.LaunchAsync
               (
                   new LaunchOptions
                   {
                       Headless = true,
                       ExecutablePath = "Chromium\\chrome.exe"
                   }
               );
            Console.WriteLine("Browser launched");
        }

        static void StartScraping(int threads)
        {
            for (int i = 0; i < threads; i++)
            {
                Task.Run(async () =>
                {
                    int ThreadNumber = i;
                    Console.WriteLine("Thread #" + ThreadNumber + " started");
                    Page p = await Browser.NewPageAsync(); //exits here
                    await p.GoToAsync("https://www.google.com");
                    Console.WriteLine("Content:\n" + await p.GetContentAsync());
                });
            }
        }

        static async Task MainAsync()
        {
            await StartBrowser();
            StartScraping(1);
        }

        static void Main(string[] args)
        {
            MainAsync().GetAwaiter().GetResult();
        }
    }

For example, if I call `Browser.NewPageAsync()` directly inside `MainAsync()`, then `Browser.NewPageAsync()` is called as expected.

2 Answers

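I found a workaround: if the page is created in the same scope as its browser instance, the page is created as expected. Otherwise, `Task.Run()` gets stuck on the `NewPageAsync()` call.

Bad behavior: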

    Task[] Tasks = new Task[1];
    Browser browser = await Puppeteer.LaunchAsync
    (
        new LaunchOptions
        {
            Headless = true,
            ExecutablePath = "Chromium\\chrome.exe"
        }
    );
    for (int i = 0; i < Tasks.Length; i++)
    {
        int ThreadNumber = i;
        Tasks[i] = Task.Run(async () =>
        {
            Page page = await browser.NewPageAsync(); // gets stuck here
        });
    }

    Task.WaitAll(Tasks);

Works as expected:

    Task[] Tasks = new Task[1];
    for (int i = 0; i < Tasks.Length; i++)
    {
        int ThreadNumber = i;
        Tasks[i] = Task.Run(async () =>
        {
            Browser browser = await Puppeteer.LaunchAsync
            (
                new LaunchOptions
                {
                    Headless = true,
                    ExecutablePath = "Chromium\\chrome.exe"
                }
            );
            Page page = await browser.NewPageAsync(); // created as expected
        });
    }

    Task.WaitAll(Tasks);

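Either way, this is not the best solution, because I have to launch a separate browser for every async task instead of sharing one browser across all of them. Hopefully someone can explain what is going on. Thanks to everyone for the help!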
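
For comparison, in plain .NET an object created outside `Task.Run` can be captured and used by several tasks, so lexical scope on its own would not normally be expected to cause a hang. The sketch below illustrates only that general point. `FakeBrowser` is a hypothetical stand-in that simulates page creation with `Task.Delay`; it is not the PuppeteerSharp `Browser` type, and the sketch neither reproduces nor explains the PuppeteerSharp behavior described above. It also awaits the tasks with `Task.WhenAll` instead of blocking on `Task.WaitAll`; whether that change affects the real hang is an assumption that would need testing.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a browser; NOT the PuppeteerSharp type.
class FakeBrowser
{
    private int _pagesCreated;

    // Simulates asynchronous page creation and returns the new page's id.
    public async Task<int> NewPageAsync()
    {
        await Task.Delay(50);
        return Interlocked.Increment(ref _pagesCreated);
    }
}

static class SharedInstanceDemo
{
    // Creates one shared instance, then uses it from several tasks.
    public static async Task<int[]> OpenPagesAsync(int taskCount)
    {
        var browser = new FakeBrowser(); // created outside Task.Run
        var tasks = new Task<int>[taskCount];
        for (int i = 0; i < taskCount; i++)
        {
            int threadNumber = i; // copy the loop variable before capturing it
            tasks[i] = Task.Run(async () =>
            {
                int pageId = await browser.NewPageAsync();
                Console.WriteLine("Thread #" + threadNumber + " got page " + pageId);
                return pageId;
            });
        }
        // Asynchronously wait for every task instead of blocking a thread.
        return await Task.WhenAll(tasks);
    }

    static async Task Main()
    {
        int[] ids = await OpenPagesAsync(3);
        Console.WriteLine("Pages created: " + ids.Length);
    }
}
```

The order of the "Thread #" lines can vary from run to run, since the tasks run concurrently. `async Task Main` requires C# 7.1 or later.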

answered 2019-09-17T21:02:26.887

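You are starting the tasks but not waiting for them to finish, so `MainAsync()` returns and the process can end while they are still running. You need to wait for them: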

    ...
    static void StartScraping(int threads)
    {
        Task.WaitAll(
            Enumerable.Range(0, threads)
            .Select(async ThreadNumber =>
            {
                try
                {
                    Console.WriteLine("Thread #" + ThreadNumber + " started");
                    Page p = await Browser.NewPageAsync(); //exits here
                    await p.GoToAsync("https://www.google.com");
                    Console.WriteLine("Content:\n" + await p.GetContentAsync());
                }
                catch (Exception e)
                {
                    Console.WriteLine("Thread #" + ThreadNumber + " failed. " + e);
                    throw;
                }
            }).ToArray());
    }

    static async Task MainAsync()
    {
        await StartBrowser();
        StartScraping(1);
    }

Also check this Puppeteer issue: link. And make sure the Chromium version matches the one listed here: link
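
The difference between starting tasks and waiting for them can be seen without PuppeteerSharp at all. The following is a minimal sketch using only the .NET base class library: `WorkAsync` is a hypothetical stand-in for the page calls (it just delays for 200 ms), and `StartWithoutWaiting` and `StartAndAwaitAsync` are illustrative names, not part of any library. It uses `await Task.WhenAll`, a non-blocking alternative to the `Task.WaitAll` call shown above.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class WaitDemo
{
    // Simulated unit of work; a stand-in for NewPageAsync/GoToAsync.
    static async Task WorkAsync(int[] completed)
    {
        await Task.Delay(200);
        Interlocked.Increment(ref completed[0]);
    }

    // Starts the tasks and returns immediately, like the question's StartScraping.
    public static int StartWithoutWaiting(int threads)
    {
        var completed = new int[1];
        for (int i = 0; i < threads; i++)
        {
            _ = Task.Run(() => WorkAsync(completed)); // fire and forget
        }
        // Read right away, before the 200 ms of work has had time to finish.
        return Volatile.Read(ref completed[0]);
    }

    // Starts the tasks and returns only after all of them have completed.
    public static async Task<int> StartAndAwaitAsync(int threads)
    {
        var completed = new int[1];
        Task[] tasks = Enumerable.Range(0, threads)
            .Select(_ => Task.Run(() => WorkAsync(completed)))
            .ToArray();
        await Task.WhenAll(tasks);
        return Volatile.Read(ref completed[0]);
    }

    static async Task Main()
    {
        Console.WriteLine("Without waiting: " + StartWithoutWaiting(3) + " finished on return");
        Console.WriteLine("With await: " + await StartAndAwaitAsync(3) + " finished on return");
    }
}
```

`StartAndAwaitAsync(3)` returns 3 because every task has completed before it returns. `StartWithoutWaiting(3)` is expected to return 0 in practice, although that result is timing-dependent. In the question, the same pattern means `Main` can return before the scraping tasks do any visible work.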

answered 2019-09-17T08:28:30.220