Web scraping with regular expressions

Using regular expressions for web scraping is sometimes criticized, but I believe they still have their place, particularly for one-off scrapes. Let's say I want to extract the title of a particular webpage. Here is an implementation using BeautifulSoup, lxml, and regular expressions:

import re
impor
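
The listing above is cut off in this copy, so what follows is only a minimal sketch of the kind of comparison described, not the original code. The sample HTML, the regular expression, and the function names (`title_with_regex`, `title_with_beautifulsoup`, `title_with_lxml`) are assumptions made for illustration. The BeautifulSoup and lxml variants assume the third-party `bs4` and `lxml` packages are installed; the regex variant needs only the standard library.

```python
import re

# Hypothetical sample page, used here instead of downloading a real URL.
SAMPLE_HTML = """<html>
<head><title>Example Domain</title></head>
<body><h1>Hello</h1></body>
</html>"""

# Non-greedy match between the opening and closing title tags.
# IGNORECASE handles <TITLE>, DOTALL lets the title span several lines.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def title_with_regex(html):
    """Return the page title using a regular expression, or None if absent."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def title_with_beautifulsoup(html):
    """Return the page title using BeautifulSoup (third-party bs4 package)."""
    from bs4 import BeautifulSoup  # imported lazily so the regex path has no dependency
    soup = BeautifulSoup(html, "html.parser")
    if soup.title is None or soup.title.string is None:
        return None
    return soup.title.string.strip()


def title_with_lxml(html):
    """Return the page title using lxml (third-party lxml package)."""
    import lxml.html  # imported lazily for the same reason
    text = lxml.html.fromstring(html).findtext(".//title")
    return text.strip() if text else None


if __name__ == "__main__":
    print(title_with_regex(SAMPLE_HTML))
```

For a one-off scrape the regex version is the shortest to write and has no dependencies, which is the trade-off the paragraph above argues for. The parser-based versions are likely to cope better with unusual markup.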
License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Richard Penman

Australia

Last Updated 18 Jan 2013
Article Copyright 2013 by Richard Penman