Hi,
I want to skim (scrape) data from some websites. My code gets the data correctly the first time. When I run it again, the request still returns a body with status code 200, but the data does not look like what I got the first time. I think the site may be blocking my connection. How can I solve this?

What I have tried:

This is my code:
import requests
from lxml import html

url = "http://www.carparts.com/results/?N=0&Nr=AND%28universal%3A0%29&Ntk=Main&Ntx=mode+matchallany&Nty=1&PN=0+5727&VN=4294953018+4294962799+4294962221+4294957507+4294965468&universal=0"
# Request headers only. Content-Length, Content-Type and Vary were removed:
# they describe a response (or a request body), and a GET request has no body.
request_headers = {
    "Accept-Language": "en-US,en;q=0.5",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Cache-Control": "no-cache, no-store, must-revalidate,post-check=0, pre-check=0",
    "Connection": "keep-alive",  # was "keep_alive", which is not a valid token
    "Pragma": "no-cache",
}
pageSkimData = requests.get(url, headers=request_headers)
treeSkimData = html.fromstring(pageSkimData.content)
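
Before concluding that the site is blocking the connection, it may help to see exactly how the second body differs from the first. The following is only a diagnostic sketch using the standard library; the helper names `body_fingerprint` and `summarize_difference` are hypothetical and not part of the code above, and it assumes both bodies have been kept as bytes (for example, `pageSkimData.content` from two runs).

```python
import difflib
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return a short SHA-256 fingerprint of a response body."""
    return hashlib.sha256(body).hexdigest()[:16]


def summarize_difference(first: bytes, second: bytes, max_lines: int = 20) -> list:
    """Return up to max_lines added/removed lines between two bodies.

    Removed lines are prefixed with "-", added lines with "+".
    An empty list means the bodies are identical line by line.
    """
    if body_fingerprint(first) == body_fingerprint(second):
        return []
    a = first.decode("utf-8", errors="replace").splitlines()
    b = second.decode("utf-8", errors="replace").splitlines()
    # n=0 drops context lines, so only changed lines and hunk headers remain.
    diff = list(difflib.unified_diff(a, b, fromfile="first", tofile="second",
                                     lineterm="", n=0))
    # The first two entries are the "---"/"+++" file headers; "@@" lines are hunk headers.
    changed = [line for line in diff[2:] if not line.startswith("@@")]
    return changed[:max_lines]


if __name__ == "__main__":
    # Hypothetical sample bodies standing in for two saved responses.
    run1 = b"<html>\n<title>Results</title>\n</html>"
    run2 = b"<html>\n<title>Access Denied</title>\n</html>"
    print(body_fingerprint(run1), body_fingerprint(run2))
    for line in summarize_difference(run1, run2):
        print(line)
```

If the changed lines turn out to be a captcha or "access denied" page, that would support the blocking theory; if they are only session tokens or timestamps, the difference may be harmless.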

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


