Hi, how can I extract a part of a website and show it in a web browser? (It must be updated every hour.)
What should I do?
Please help me.
With Respect "Spaceman"
[no name] 1-Sep-14 17:50pm    
You do some research on "web page scraping" and write some code.
Gihan Liyanage 2-Sep-14 0:39am

1 solution

This is not exactly how the Web works. There is no such thing as a "part" of a site. In the general case, the concept of a "part" is undefined.

If you considered only a 100% static Web site, it could be viewed as a general graph whose nodes are files ("resources") interconnected by links (such as anchors). Then you could define some subset of that set of files and extract them using the techniques of Web scraping.
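To make the graph picture concrete, here is a minimal sketch in Python (my choice of language; the question does not name one). It assumes the pages are already available as a dictionary mapping URL to HTML text, so no network access is involved. The names `LinkCollector` and `crawl` are made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags found in one HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop the "#fragment" part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def crawl(pages, start_url):
    """Breadth-first walk over a dict {url: html}.

    Returns every URL reachable from start_url, in discovery order.
    URLs that are not keys of `pages` are recorded but not followed.
    """
    seen = [start_url]
    queue = [start_url]
    while queue:
        url = queue.pop(0)
        html = pages.get(url)
        if html is None:
            continue  # a resource outside the known set of nodes
        collector = LinkCollector(url)
        collector.feed(html)
        for link in collector.links:
            if link not in seen:
                seen.append(link)
                queue.append(link)
    return seen
```

The "part" of the site is then whatever subset of the returned URLs you decide to keep; the code does not make that decision for you.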

But the trouble is: 1) the links don't have to point to resources on the same site, and it's not always obvious what "the same site" is; 2) you don't know the set of nodes in advance, before you start scraping; 3) the site does not have to be static: resources can be generated on request, and there is no predefined correspondence between the set of URLs and the set of resources; 4) even scraping the same URL can give different (possibly random) results each time, as is the case with games, for example.
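Point 1 shows up as soon as you try to write the check down. The sketch below uses one possible definition of "the same site": equal host names, optionally ignoring a leading "www.". That definition is an assumption of mine, not a standard; subdomains, ports and schemes could reasonably be treated differently. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def normalized_host(url, ignore_www=True):
    """Returns the lower-cased host name of url, or "" if it has none."""
    host = urlsplit(url).hostname or ""
    if ignore_www and host.startswith("www."):
        host = host[len("www."):]
    return host


def same_site(url, base_url, ignore_www=True):
    """True if both URLs have the same (normalized) host name.

    Assumption: the host name alone identifies the site. The scheme,
    port and path are ignored, and subdomains count as other sites.
    """
    host = normalized_host(url, ignore_www)
    return host != "" and host == normalized_host(base_url, ignore_www)
```

Flipping `ignore_www` already changes the answer for some links, which is exactly why "the same site" needs a definition before any scraping starts.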

So, only in a special simple case, when the set of URLs is known from the very beginning, can you give a reasonable definition of the "part" and actually scrape it. How to do it? Please see my past answers:
get specific data from web page,
How to get the data from another site.
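For the simple case described above (one known URL, one well-defined "part"), a rough sketch could look like the following. It is Python with only the standard library, and it rests on several assumptions: the wanted part is an element with a known `id` attribute, the page's HTML is well-formed enough that tags are properly closed, and showing the result is left to a callback you supply. `FragmentExtractor`, `extract_fragment`, `fetch` and `refresh_forever` are names invented for this example.

```python
import time
import urllib.request
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class FragmentExtractor(HTMLParser):
    """Collects the text inside the first element whose id matches."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # greater than 0 while inside the wanted element
        self.done = False   # set once the wanted element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_fragment(html, element_id):
    """Returns the whitespace-normalized text of the element, or ""."""
    parser = FragmentExtractor(element_id)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def fetch(url, timeout=30):
    """Downloads url and decodes it using the charset the server declares."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def refresh_forever(url, element_id, show, interval_seconds=3600):
    """Fetches the fragment and passes it to show(), once per interval."""
    while True:
        try:
            show(extract_fragment(fetch(url), element_id))
        except OSError as error:  # urllib's URLError is a subclass of OSError
            print("fetch failed:", error)
        time.sleep(interval_seconds)
```

A call such as `refresh_forever("http://example.com/", "news", print)` would print the text once an hour; the URL and the id are placeholders. To show the result in a browser control instead, pass a different `show` callback. If the page is generated dynamically or its markup changes, the extraction can silently return an empty string, which is the limitation described in points 3 and 4 above.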

Avenger1 2-Sep-14 1:07am    
I didn't understand all these sentences.
Can you give me a project about this, please?
With Respect "Spaceman"
Sergey Alexandrovich Kryukov 2-Sep-14 2:21am    
A project would be too much. If you don't understand some part of this post, please ask follow-up questions.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
