Good afternoon,
I am trying to scrape the total number of unread messages on my Instagram profile, using Selenium through Python to access it. I have managed to reach my mailbox, where I have 5 unread messages, each marked with the classic 'blue' dot next to it.
The issue I am facing is that BeautifulSoup does not find the relevant div by its classes, so I cannot count the unread messages.

counter = 0
#count messages
soup = BeautifulSoup(browser.page_source, features='html.parser')
new_message = soup.find_all(lambda tag:"div" and tag.get("class") == "Igw0E   rBNOH          YBx95   ybXk5    _4EzTm                      soMvl")
for i in new_message:
    counter += 1
print('Unread messages: ', counter)

The class, as shown in the browser console, is as follows. However, I suspect that Instagram builds the page with JavaScript and that this is why I cannot count the divs. Any ideas?

<div class="                     Igw0E   rBNOH          YBx95   ybXk5    _4EzTm                      soMvl                                                                                        "><div class=" _41V_T   Sapc9                 Igw0E     IwRSH      eGOV_         _4EzTm                                                                                                              " style="height: 8px; width: 8px;"></div></div>

What I have tried:

I have tried numerous variations of new_message, such as:

new_message = soup.find_all("div", {"class" : "Igw0E   rBNOH          YBx95   ybXk5    _4EzTm                      soMvl"})

new_message = soup.find_all("div", {"class" : "_41V_T   Sapc9                 Igw0E     IwRSH      eGOV_         _4EzTm"})

and by its style, but to no avail.

new_message = soup.find_all("div",{"style" : "height: 8px; width: 8px;"})
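One likely reason the class-based attempts above return nothing: BeautifulSoup treats `class` as a multi-valued attribute, so `tag.get("class")` is a list of names and never equals a single string, and a string passed to `find_all` is not compared against the padded attribute text as written. The sketch below reuses the markup from the question (padding shortened) and matches on individual class names with a CSS selector instead. The helper name `count_unread` is hypothetical, and it assumes the `_41V_T Sapc9` div is the blue dot; these class names look auto-generated and may change. Note that the `style` lookup should match this static snippet, so if it found nothing, the dot may simply not have been in `browser.page_source` when it was read.

```python
from bs4 import BeautifulSoup

# Markup copied from the question, with the runs of padding spaces shortened.
SAMPLE = (
    '<div class="   Igw0E   rBNOH   YBx95   ybXk5   _4EzTm   soMvl   ">'
    '<div class=" _41V_T   Sapc9   Igw0E   IwRSH   eGOV_   _4EzTm   "'
    ' style="height: 8px; width: 8px;"></div></div>'
)


def count_unread(page_source):
    """Count divs carrying both dot classes (assumed to mark unread threads)."""
    soup = BeautifulSoup(page_source, features="html.parser")
    # A CSS selector matches on individual class names, so the order and
    # spacing inside the class attribute do not matter.
    return len(soup.select("div._41V_T.Sapc9"))


soup = BeautifulSoup(SAMPLE, features="html.parser")
outer = soup.find("div")
print(outer.get("class"))  # a list of class names, not one string
print(outer.get("class") == "Igw0E   rBNOH")  # False: a list never equals a str
print("Unread messages:", count_unread(SAMPLE * 3))
```

With Selenium, the same helper would be called as `count_unread(browser.page_source)`, once the inbox has finished rendering.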

I also checked whether it locates anything to print, and it does, but I am unsure why the counter is not working:

new_message = soup.find_all(lambda tag:"div" and tag.get("class") == "_4EzTm")
for message in new_message:
    counter =+ 1
    print('Unread messages: ', counter)
if new_message is not None:
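A note on the counter in the snippet above: `counter =+ 1` is parsed as `counter = (+1)`, so it reassigns 1 on every pass instead of accumulating. Separately, `find_all` returns a list-like result that is empty when nothing matches, never `None`, so an `is not None` check always passes. A minimal sketch of the difference, using a plain list as a stand-in for the `find_all` result (the function names are hypothetical):

```python
def count_with_typo(items):
    counter = 0
    for _ in items:
        counter =+ 1  # parsed as counter = (+1): reassigns, never accumulates
    return counter


def count_fixed(items):
    counter = 0
    for _ in items:
        counter += 1  # augmented assignment: adds one per item
    return counter


matches = ["a", "b", "c", "d", "e"]  # stand-in for a find_all() result

print(count_with_typo(matches))  # 1
print(count_fixed(matches))      # 5
print(len(matches))              # 5, the simplest way to count matches

if not []:  # an empty result is falsy, so test emptiness rather than None
    print("no matches")
```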
Updated 31-Jan-21 7:56am
Dave Kreskowiak 1-Feb-21 0:49am
Possibly because the data is filled in using JavaScript. You'll have to use Selenium's JavaScriptExecutor to fill in the data.

JavaScriptExecutor in Selenium WebDriver with Example
Giorgio Anagio 1-Feb-21 9:30am
Could you please elaborate on how I could incorporate that into my code?
I have tried finding the element by XPath, but my limited knowledge is keeping me from getting this to work at the moment.
Dave Kreskowiak 1-Feb-21 10:15am
Nope. I already gave you a link that does that very thing, with examples, and I have no use for Selenium myself.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
