How can I search the HTML or text of my website's pages, the way Google search does?
I tried searchDotNet.dll. It does what I need, but it does not work in my website.
Please help!
Updated 4-Jul-10 23:09pm

The obvious way is to use Google. Beyond that, the HTML does not exist in any form for your C# code to look at, so you really can't search it from there. Of course, any text you have, you can search any way you want, depending on where it comes from. Putting it in a database is a good first step.
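As a rough illustration of the "put it in a database" idea, here is a minimal sketch in Python using the standard sqlite3 module. The table name, the sample page texts, and the search function are all assumptions made up for this example; a C# site would use its own database and data access layer instead.

```python
import sqlite3

# Hypothetical page texts; in practice these would come from your own content.
pages = {
    "page1.asp": "This is a simple text for testing.",
    "page2.asp": "This test page is simple.",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO pages VALUES (?, ?)", pages.items())

def search(term):
    """Return the URLs whose stored text contains term as a substring."""
    rows = conn.execute(
        "SELECT url FROM pages WHERE body LIKE ? ORDER BY url",
        ("%" + term + "%",),
    )
    return [url for (url,) in rows]

print(search("testing"))
```

A plain LIKE query is only substring matching, and any % or _ typed by the user acts as a wildcard, so this is a starting point rather than a real search engine.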
 
 
I think I understand what you want, and I think you are currently looking in the wrong direction. I guess your current idea is to search your web pages from the point of view of the server that hosts the website. For security reasons and some practical reasons this is very hard to do: you would, for example, be indexing your database and would also have to figure out which page each piece of data belongs to. This is very hard to maintain because you need a lot of structural information about the pages.

How to do it then? Search engines crawl your website and go from there. They interpret the HTML and look for links to follow. The terms found on each page are linked to that page by an indexing function, which can also connect related terms to each other. For example, suppose you have two pages with some simple text:
page1.asp
This is a simple text for testing.
page2.asp
This test page is simple.

Assume that the crawler can find both pages. The search indexer would take a word like "simple" and connect it to page1.asp, and later do the same for page2.asp. A somewhat more complex link is created for the term "testing" on page1.asp. The search indexer must be smart enough to link "test" on page2.asp to "testing" on page1.asp, and also "testing" back to "test" on page2.asp. This creates a link both ways and makes the search engine more useful, because users do not have to specify terms precisely to find what they want.
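The two-way link between "test" and "testing" can be sketched as an inverted index keyed by word stems. The Python below is only an illustration under simple assumptions: the stem function is a deliberately crude suffix stripper invented for this example (real indexers use proper stemming algorithms), and build_index and lookup are hypothetical names.

```python
import re
from collections import defaultdict

def stem(word):
    """Very crude stemmer: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_index(pages):
    """Map each stem to the set of page URLs on which it occurs."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[stem(word)].add(url)
    return index

def lookup(index, term):
    """Return the sorted URLs whose text contains a word sharing term's stem."""
    return sorted(index.get(stem(term.lower()), set()))

pages = {
    "page1.asp": "This is a simple text for testing.",
    "page2.asp": "This test page is simple.",
}
index = build_index(pages)
print(lookup(index, "testing"))
```

Because both "test" and "testing" reduce to the same key, a query for either word finds both pages, which is the two-way link described above.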

Furthermore, it is up to you to define the search indexer function and how smart it is. For example, you could add information to the link between "testing" and page2.asp indicating that this term is not found literally on that page. More information can be found all over the internet, of course.
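One possible way to record whether a term appears literally on a page is to store, per stem and per page, the exact word forms that were seen. This Python sketch assumes a toy stemmer that only strips "ing"; the data layout and function names are made up for illustration.

```python
import re

def stem(word):
    """Toy stemmer for illustration: only strips a trailing 'ing'."""
    return word[:-3] if word.endswith("ing") and len(word) > 5 else word

def build_index(pages):
    """Map stem -> {url -> set of literal word forms found on that page}."""
    index = {}
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(stem(word), {}).setdefault(url, set()).add(word)
    return index

def search(index, term):
    """Return sorted (url, is_literal) pairs for pages matching term's stem."""
    term = term.lower()
    hits = index.get(stem(term), {})
    return sorted((url, term in forms) for url, forms in hits.items())

pages = {
    "page1.asp": "This is a simple text for testing.",
    "page2.asp": "This test page is simple.",
}
index = build_index(pages)
print(search(index, "testing"))
```

The boolean in each result says whether the query word occurs on the page exactly as typed, so a results page could rank literal matches above stem-only matches.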

The most important thing is to keep the search engine (search indexer) separate from your website and let it operate on its own. This also makes it more flexible and easier to reuse. Also, don't forget that indexing the site is a continuous process: the indexer must check for changes on a regular basis and update itself to stay up to date.
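A stand-alone indexer along these lines could be sketched as a small crawler plus a change check. The Python below is a minimal sketch, not a production crawler: the fetch callable is an assumption supplied by the caller (here a dictionary lookup stands in for an HTTP request), the example.test URLs are invented, and the parser also collects text inside script and style tags, which a real indexer would skip.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageParser(HTMLParser):
    """Collects text fragments and href targets from one HTML document."""
    def __init__(self):
        super().__init__()
        self.links, self.text = [], []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

def crawl(start_url, fetch):
    """Breadth-first crawl; fetch(url) returns HTML, or None if unavailable."""
    pages, seen, queue = {}, set(), [start_url]
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()
        pages[url] = " ".join(parser.text)
        queue.extend(urljoin(url, link) for link in parser.links)
    return pages

def changed_urls(pages, fingerprints):
    """Return URLs whose text is new or changed, updating fingerprints in place."""
    changed = []
    for url, text in sorted(pages.items()):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if fingerprints.get(url) != digest:
            fingerprints[url] = digest
            changed.append(url)
    return changed

# Hypothetical in-memory site standing in for real HTTP requests.
site = {
    "http://example.test/page1.asp":
        '<p>This is a simple text for testing.</p><a href="page2.asp">next</a>',
    "http://example.test/page2.asp":
        '<p>This test page is simple.</p><a href="page1.asp">back</a>',
}
pages = crawl("http://example.test/page1.asp", site.get)
fingerprints = {}
print(changed_urls(pages, fingerprints))
```

Running crawl and changed_urls on a schedule, and re-indexing only the URLs that come back, is one way to keep the index current without touching the website's own code.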

Good luck!
 
 
Comments
faezeh66 5-Jul-10 5:12am    
I use searchDotNet.dll. It does what I need, but it does not work in my project.
Christian Graus 5-Jul-10 5:26am    
This person has given you a good answer, essentially a more in-depth version of what I said. If you're too stupid to understand it, consider paying a real developer to do this job for you. Your responses are nonsensical, and add nothing to the discussion. They don't give anyone any reason to respond any differently to what you've been told.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


