I am trying to pull only the links and their text from a web page, line by line, and insert each text and link into a dictionary, without using Beautiful Soup or a regex.

I keep getting this error:

Traceback (most recent call last):
File "F:/Homework7-2.py", line 13, in <module>
link2 = link1.split("href=")[1]
IndexError: list index out of range


Code (Python):
import urllib.request
url = "http://www.facebook.com" 
page = urllib.request.urlopen(url)
mylinks = {}
links = page.readline().decode('utf-8')


for items in links:
  links = page.readline().decode('utf-8')
  if "a href=" in links:
     links = page.readline().decode('utf-8')
     link1 = links.split(">")[0]
     link2 = link1.split("href=")[1]
     mylinks = link2
     print(mylinks)

Updated 31-Mar-15 22:21pm

1 solution

IndexError: list index out of range

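The message is telling you that the index is outside the range of the list, i.e. the list has fewer items than you assumed. Here, link1.split("href=") returns a single-item list whenever "href=" does not occur in link1, so element [1] does not exist. That can easily happen in your code: you test one line for "a href=", then read the next line before splitting, so the line being split is not the one you tested. Even on the right line, split(">")[0] keeps only the text before the first ">", which need not contain the href. You should not make assumptions about the results of commands, but check first. If the returned list has only one item, then you need to code for that.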
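
A minimal sketch of that check, sticking to plain string methods as the question asks. The names extract_links and sample are hypothetical, and the sample lines stand in for the decoded lines of a page. It assumes each link sits on a single line, its text contains no nested tags, and its href value contains no spaces or ">" characters; real pages will often break those assumptions.

```python
def extract_links(lines):
    """Collect {link text: href} from an iterable of HTML text lines."""
    links = {}
    for line in lines:
        # Every piece after the first begins just after an opening "<a ".
        for piece in line.split("<a ")[1:]:
            tag, closed, rest = piece.partition(">")
            parts = tag.split("href=")
            # split() returns a one-item list when "href=" is absent,
            # so check the length before indexing with [1].
            if not closed or len(parts) < 2:
                continue
            tokens = parts[1].split()
            if not tokens:
                continue
            href = tokens[0].strip("\"'")
            text = rest.split("</a>")[0].strip()
            links[text] = href
    return links


# Hypothetical sample input: one line without links, one anchor
# without an href, and one line holding two links.
sample = [
    '<p>No links here</p>',
    '<a name="top">anchor without href</a>',
    '<a href="http://example.com/a">First</a> and '
    '<a class="x" href=\'/b\'>Second</a>',
]
print(extract_links(sample))
```

To run this against a real page, you could loop over the object returned by urllib.request.urlopen(url) directly, decode each line, and pass the decoded lines in. Lines that contain no "href=" are skipped instead of raising IndexError.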
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


