Posted 30 Nov 2004

Retrieving the HTML source code

Geno Carman, 30 Nov 2004
An article on how to retrieve the full source code of a web page.


An app I was writing needed to store the full HTML of a web page. I searched the web and the MSDN library for a way to get the complete HTML from a CHtmlView. I found out how to get the <BODY></BODY> data, but not the <HTML></HTML> data. After a lot of stumbling, I hit on the following very simple technique.

Examples of getting the outer HTML of the <BODY> tag abound. While exploring the MSHTML interfaces, I noticed the get_parentElement method on IHTMLElement, which is the element type that IHTMLDocument2::get_body returns. I realized that the parent of <BODY> is <HTML>.

This function took care of my problem:

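// Requires <mshtml.h> for IHTMLDocument2 and IHTMLElement.
bool CMyHtmlView::GetDocumentHTML(CString &str)
{
    // GetHtmlDocument() hands back the document's IDispatch, which we must release.
    LPDISPATCH lpDispatch = GetHtmlDocument();
    if (lpDispatch == NULL)
        return false;

    IHTMLDocument2 *lpHtmlDocument = NULL;
    HRESULT hr = lpDispatch->QueryInterface(IID_IHTMLDocument2, (void**)&lpHtmlDocument);
    lpDispatch->Release();
    if (FAILED(hr) || lpHtmlDocument == NULL)
        return false;

    // get_body returns the element covering all between <BODY> and </BODY>.
    // I need all between <HTML> and </HTML>.
    IHTMLElement *lpBodyElm = NULL;
    hr = lpHtmlDocument->get_body(&lpBodyElm);
    lpHtmlDocument->Release();
    if (FAILED(hr) || lpBodyElm == NULL)
        return false;

    // the parent of BODY is HTML
    IHTMLElement *lpParentElm = NULL;
    hr = lpBodyElm->get_parentElement(&lpParentElm);
    lpBodyElm->Release();
    if (FAILED(hr) || lpParentElm == NULL)
        return false;

    // the outer HTML of the HTML element runs from <HTML> through </HTML>
    BSTR bstr = NULL;
    hr = lpParentElm->get_outerHTML(&bstr);
    lpParentElm->Release();
    if (FAILED(hr))
        return false;

    str = bstr;             // CString makes its own copy of the text
    ::SysFreeString(bstr);  // the BSTR belongs to us, so free it
    return true;
}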
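
To see why stepping up one level from BODY yields the whole page, here is a small portable model of the idea. It is only a sketch and does not use MSHTML: the Element struct, its parentElement and outerHTML members, and the DemoPageSource helper are hypothetical stand-ins that merely echo the names get_parentElement and get_outerHTML.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an HTML element; this is NOT the MSHTML IHTMLElement.
struct Element {
    std::string tag;                                  // e.g. "HTML", "BODY"
    std::string text;                                 // text emitted before any children
    Element *parent = nullptr;                        // nullptr for the root element
    std::vector<std::unique_ptr<Element>> children;

    explicit Element(std::string t, std::string txt = "")
        : tag(std::move(t)), text(std::move(txt)) {}

    // Append a child and record this element as its parent.
    Element *append(std::string t, std::string txt = "") {
        children.push_back(std::make_unique<Element>(std::move(t), std::move(txt)));
        children.back()->parent = this;
        return children.back().get();
    }

    Element *parentElement() const { return parent; }

    // Serialize this element including its own start and end tags.
    std::string outerHTML() const {
        std::string out = "<" + tag + ">" + text;
        for (const auto &child : children)
            out += child->outerHTML();
        return out + "</" + tag + ">";
    }
};

// Same shape as the article's function: start at BODY, step up to its parent,
// and take the parent's outer HTML. Returns false if BODY or its parent is missing.
bool GetDocumentHTML(const Element *body, std::string &str) {
    if (body == nullptr || body->parentElement() == nullptr)
        return false;
    str = body->parentElement()->outerHTML();
    return true;
}

// Build a tiny page and fetch its source by way of the BODY element.
std::string DemoPageSource() {
    Element html("HTML");
    Element *head = html.append("HEAD");
    head->append("TITLE", "Demo");
    Element *body = html.append("BODY");
    body->append("P", "Hello");

    std::string source;
    const bool ok = GetDocumentHTML(body, source);
    assert(ok);  // BODY has a parent in this demo tree
    (void)ok;
    return source;
}
```

In this model, body->outerHTML() stops at the BODY tags, while DemoPageSource() returns the HEAD section as well, because the parent's serialization covers everything from <HTML> to </HTML>.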

Points of Interest

There is bound to be a better way of doing this. If you know it, please share it with me.


This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



About the Author

Geno Carman
United States
Last Updated 1 Dec 2004
Article Copyright 2004 by Geno Carman