I'm trying to build a web scraper, so I need to automate a few things. One of them is navigation between pages.

I set up an event handler to catch responses from a WebSocket, and everything works until I have to go to another page and start listening for messages there.

I arrive on the page, but the handler doesn't pick up anything.

This piece of code works on its own.

private async Task NavigateToSportAsync()
        {
            try
            {
                var container = await this._page.WaitForSelectorAsync(".ipo-ClassificationBar_ButtonContainer");
                var sports = await container.QuerySelectorAllAsync(".ipo-ClassificationBarButtonBase");
                bool clicked = false;
                var options = new NavigationOptions();
                options.WaitUntil = new[] { WaitUntilNavigation.Networkidle0 };
                options.Timeout = 4000;

                foreach (var sport in sports)
                {
                    var classesHandle = await sport.GetPropertyAsync("className");
                    var nameHandle = await sport.GetPropertyAsync("innerText");
                    var classes = classesHandle.ToString();
                    var name = nameHandle.ToString();

                    if (name.Contains(this._sport) && !classes.Contains("ipo-ClassificationBarButtonBase_Selected"))
                    {
                        await sport.ClickAsync();
                        await this._page.WaitForNavigationAsync(options);
                        clicked = true;
                        break;
                    }
                    else if (name.Contains(this._sport) && classes.Contains("ipo-ClassificationBarButtonBase_Selected"))
                    {
                        clicked = true;
                        break;
                    }
                }

                if (!clicked)
                {
                    throw new Exception($"There aren't any matches currently present in {this._sport}");
                }
            }
            catch (Exception ex)
            {
                this._logger.LogWarning($"SPORT NAVIGATION ERROR: " +
                                $"{Environment.NewLine + ex.Message}", true);
            }
        }
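
A note on the click-then-wait sequence above: the Puppeteer documentation recommends starting the navigation wait before the click, because otherwise a fast navigation can finish before anything is listening for it. A minimal sketch of that pattern, assuming PuppeteerSharp's `Page` and `ElementHandle` classes (newer releases expose `IPage` and `IElementHandle` instead); `NavigationHelpers` and `ClickAndWaitAsync` are hypothetical names, not part of the library:

```csharp
using System.Threading.Tasks;
using PuppeteerSharp;

public static class NavigationHelpers
{
    // Hypothetical helper: clicks an element and waits for the navigation
    // that the click is expected to trigger.
    public static async Task ClickAndWaitAsync(Page page, ElementHandle element, int timeoutMs = 4000)
    {
        var options = new NavigationOptions
        {
            WaitUntil = new[] { WaitUntilNavigation.Networkidle0 },
            Timeout = timeoutMs
        };

        // Create the navigation task first so the listener is already
        // registered when the click happens, then await both together.
        var navigation = page.WaitForNavigationAsync(options);
        await Task.WhenAll(navigation, element.ClickAsync());
    }
}
```

If the site swaps content in place without a real navigation, the wait would probably still time out; in that case waiting for a selector on the target view with `WaitForSelectorAsync` may be a better fit.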


This function seems to be the problem.

private async Task CheckPageAsync()
        {
            try
            {
                var header = await this._page.EvaluateFunctionAsync<string>(@"() => {
                    let header = document.querySelector('.ipo-ClassificationHeader_HeaderLabel');
                    if (header === null) {
                          return '';
                        }
                    return header.innerText;
                    }");

                while (header != this._sport)
                {
                    this._logger.LogWarning("Not on the right page, trying to navigate...", true);
                    await this.NavigateToSportAsync();
                    //await Task.Delay(2000);
                    header = await this._page.EvaluateFunctionAsync<string>(@"() => {
                    let header = document.querySelector('.ipo-ClassificationHeader_HeaderLabel');
                    if (header === null) {
                          return '';
                        }
                    return header.innerText;
                    }");

                    if (header != this._sport)
                        await Task.Delay(30 * 1000);
                }
            }
            catch (Exception ex)
            {
                this._logger.LogError($"ERROR DURING PAGE CHECK, ERROR: " +
                        $"{Environment.NewLine + ex.Message + Environment.NewLine + ex.StackTrace }", true);
            }
        }
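
The loop in CheckPageAsync retries forever and compares the header with an exact string match. A bounded variant might look like the sketch below. It is plain C# with no PuppeteerSharp dependency; `Polling`, `WaitForValueAsync`, and all parameter names are hypothetical, and the trimmed, case-insensitive comparison is an assumption about how the header text should be matched:

```csharp
using System;
using System.Threading.Tasks;

public static class Polling
{
    // Calls probe until it returns the expected value or maxAttempts is used up.
    // Returns true on a match, false if every attempt failed.
    public static async Task<bool> WaitForValueAsync(
        Func<Task<string>> probe,
        string expected,
        int maxAttempts,
        TimeSpan delay,
        Func<Task> onMismatch = null)
    {
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            // Trim because innerText often carries surrounding whitespace.
            var actual = (await probe())?.Trim();
            if (string.Equals(actual, expected, StringComparison.OrdinalIgnoreCase))
                return true;

            if (attempt == maxAttempts)
                break;

            if (onMismatch != null)
                await onMismatch(); // e.g. re-run the navigation step

            await Task.Delay(delay);
        }

        return false;
    }
}
```

In the code above this could be called as `await Polling.WaitForValueAsync(ReadHeaderAsync, this._sport, 5, TimeSpan.FromSeconds(30), NavigateToSportAsync)`, where `ReadHeaderAsync` is a hypothetical method wrapping the `EvaluateFunctionAsync` call. Unlike the original loop, this version waits between the navigation attempt and the next header check.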


What I have tried:

I tried initializing a new CDP session, which doesn't seem to do anything. Then I tried waiting for various amounts of time in case the page was not fully loaded. That didn't work either. I would be grateful for any advice. Thank you for your patience and time.
Comments
Richard MacCutchan 27-Dec-19 8:53am
   
I have a web scraper app, but I use an event handler to capture when a page load completes. With your approach you have to look for content in the returned data, and it is difficult to know when the page has completely loaded.
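
For illustration only, an event-based approach along those lines could be sketched in PuppeteerSharp as follows. This assumes the `Load` and `FrameNavigated` events on the `Page` class (`IPage` in newer releases); `PageEvents`, `LogNavigation`, and the `log` callback are hypothetical names, and this is not the commenter's actual code:

```csharp
using System;
using PuppeteerSharp;

public static class PageEvents
{
    // Hypothetical helper: reports page-level load and frame navigation events.
    public static void LogNavigation(Page page, Action<string> log)
    {
        // Raised when the page's load event fires.
        page.Load += (sender, e) => log($"Load fired: {page.Url}");

        // Raised when a frame navigates to a new URL.
        page.FrameNavigated += (sender, e) => log($"Frame navigated: {e.Frame.Url}");
    }
}
```

Subscribing once, right after the page is created, means the handlers are already attached when later navigations happen.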

1 solution


Solution 1

Some Puppeteer/PuppeteerSharp versions have freezing issues. I fixed my problem by downgrading to version 1.11.0.

More information here.
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



