Creating a Yahoo! Mail Client using IWebBrowser2 and DHTML






Mar 3, 2004
Describes a practical use of the HTML interfaces available in C++.
Introduction
This is another example of using the DHTML support provided by the web browser control, which can automate any operation that has to be repeated on web pages without much decision-making by the user. It is an attempt to automate the otherwise cumbersome routine of opening the mail site, typing in your password (often making mistakes), clicking on Inbox, opening each mail, and finally logging out. I thought it would be nice if some software did all this for me and saved the mails so that I could read them later. What I present here is only a framework; you can build a more complete tool on top of it.
There are two ways of doing this. The first is to download each page over HTTP, find the links in it, and fetch the mails the same way. The second method is rather more exciting: it simulates your mouse clicks and navigates across pages just as you would, and then saves the pages before logging out. I will show the second method here.
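For comparison, the link-finding step of the first method might look roughly like the sketch below. This is not the approach the article takes, and it assumes the page has already been downloaded into a string. `ExtractLinks` is a hypothetical helper name, and the regular expression only handles quoted `href` values, so treat it as an illustration rather than a real HTML parser.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Collect the href value of every <a ...> tag in an HTML string.
// Only quoted attribute values are recognised; tag and attribute
// names are matched case-insensitively.
std::vector<std::string> ExtractLinks(const std::string& html)
{
    static const std::regex anchor(
        R"rx(<a\s[^>]*?href\s*=\s*["']([^"']*)["'])rx",
        std::regex::icase);

    std::vector<std::string> links;
    for (std::sregex_iterator it(html.begin(), html.end(), anchor), end;
         it != end; ++it)
    {
        links.push_back((*it)[1].str());   // capture group 1 is the URL
    }
    return links;
}
```

Each returned string could then be checked for keywords in the same way the handlers in this article check the `href` of an anchor element.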
The code should be self-explanatory. I have added comments for the important points in the code. Here’s the summary of the entire operation.
- Create an instance of the web browser control
- Navigate to http://mail.yahoo.com
- When the log-in page is loaded, find the input fields for the username and password
- Paste your User-Id and Password into the fields
- Click on Login button
- The next page shows you the inbox information with links to the ‘Inbox’
- Click on ‘Inbox’ to see the list of mails
- Once the mail list is loaded, click on the first unread mail
- When the mail window opens, find where the content of the mail sits in the underlying HTML code. For HTML mails, the content was found to appear between <XBODY> and </XBODY> tags. Text messages appear within a different pair of tags.
- Get the message and save it.
- Go back and get the next unread message
- When all unread mails are downloaded, don’t even think for a second, Sign-out!!!
- That’s it!
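The step that pulls the message out from between the <XBODY> tags can be sketched on its own, without the browser. The version below works on a plain `std::string` instead of the BSTR values returned by the HTML interfaces. `ExtractXBody` and `ToLower` are hypothetical names that do not appear in the article's code. It assumes the first <XBODY> pair on the page holds the mail.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Return a lower-cased copy so that tag matching ignores case.
static std::string ToLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Return the text between the first <xbody ...> tag and the following
// </xbody> tag, or an empty string when no complete pair is found.
std::string ExtractXBody(const std::string& html)
{
    const std::string lower = ToLower(html);   // same length as html

    const std::size_t open = lower.find("<xbody");
    if (open == std::string::npos) return "";

    const std::size_t start = lower.find('>', open);   // end of the opening tag
    if (start == std::string::npos) return "";

    const std::size_t close = lower.find("</xbody", start);
    if (close == std::string::npos) return "";

    return html.substr(start + 1, close - start - 1);
}
```

The search runs on a lower-cased copy and the result is cut from the original string, so the case of the message itself is preserved.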
Using the code
In the code, I have used many of the interfaces that correspond to HTML elements. As you can see, they are all used in much the same way. I haven't done any optimization or error/exception handling; that is left to the users of the code. Please refer to the comments placed alongside the lines of code. Although I built this on VS 7, it should work on VC 6 as well.
All you need to do is to change the method OnBnClickedButton1() to the corresponding button handler in VC 6.
void CYahooDlg::OnBnClickedButton1()
{
    BSTR bsStatus;
    bReady=0;
    CString mPass("*****"); //<-------- Your password here
    CString mUser("*****"); //<-------- Your user ID here
    CString mStr;
    HRESULT hr1;
    hr1 = CoInitialize(NULL);
    if(!SUCCEEDED(hr1))
        return;
    //Create an instance of the web browser
    hr1 = CoCreateInstance(CLSID_InternetExplorer, NULL, CLSCTX_LOCAL_SERVER,
        IID_IWebBrowser2, (LPVOID *)&pBrowser);
    if(hr1!=S_OK)
    {
        //Without a browser there is nothing to automate
        CoUninitialize();
        return;
    }
    BSTR bsPW = mPass.AllocSysString();
    BSTR bsUser = mUser.AllocSysString();
    VARIANT_BOOL pBool=VARIANT_TRUE;
    //Comment out this line if you don't
    //want the browser to be displayed
    pBrowser->put_Visible( pBool );
    //The Yahoo mail site
    COleVariant vaURL("http://mail.yahoo.com");
    COleVariant null;
    //Open the mail login page
    pBrowser->Navigate2(vaURL,null,null,null,null);
    //This while loop makes sure that the page
    //is fully loaded before we go to the next page
    while(!bReady)
    {
        pBrowser->get_StatusText(&bsStatus);
        mStr=bsStatus;
        if(mStr=="Done")bReady=1;
    }
    IDispatch* pDisp;
    //Get the underlying document object of the browser
    hr1=pBrowser->get_Document(&pDisp);
    if (pDisp != NULL )
    {
        IHTMLDocument2* pHTMLDocument2;
        HRESULT hr;
        //Ask for an HTMLDocument2 interface
        hr = pDisp->QueryInterface( IID_IHTMLDocument2,
            (void**)&pHTMLDocument2 );
        if (hr == S_OK)
        {
            //Enumerate the HTML elements
            IHTMLElementCollection* pColl = NULL;
            hr = pHTMLDocument2->get_all( &pColl );
            if (hr == S_OK && pColl != NULL)
            {
                LONG celem;
                //Find the count of the elements
                hr = pColl->get_length( &celem );
                if ( hr == S_OK )
                {
                    //Loop through each element
                    for ( int i=0; i< celem; i++ )
                    {
                        VARIANT varIndex;
                        varIndex.vt = VT_I4;
                        varIndex.lVal = i;
                        VARIANT var2;
                        VariantInit( &var2 );
                        IDispatch* pDisp;
                        hr = pColl->item( varIndex, var2, &pDisp ); //Get an element
                        if ( hr == S_OK )
                        {
                            IHTMLElement* pElem;
                            //Ask for an HTMLElement interface
                            hr = pDisp->QueryInterface( IID_IHTMLElement,
                                (void **)&pElem);
                            if ( hr == S_OK )
                            {
                                BSTR bstr;
                                //Get the tag name for the element
                                hr = pElem->get_tagName(&bstr);
                                CString strTag;
                                strTag = bstr;
                                //We need to check for
                                //input elements on the login screen
                                IHTMLInputTextElement* pUser;
                                hr = pDisp->QueryInterface( IID_IHTMLInputTextElement,
                                    (void **)&pUser );
                                if ( hr == S_OK )
                                {
                                    pUser->get_name(&bstr);
                                    mStr=bstr;
                                    if(mStr=="login") //Is this the user ID field?
                                        pUser->put_value(bsUser); //Paste the user ID
                                    else if(mStr=="passwd") //Or is this the password field?
                                        pUser->put_value(bsPW); //Paste your password into the field
                                    pUser->Release();
                                }
                                else{
                                    //If not an input field,
                                    //is this a submit button?
                                    IHTMLInputButtonElement* pButton;
                                    hr = pDisp->QueryInterface( IID_IHTMLInputButtonElement,
                                        (void **)&pButton);
                                    if ( hr == S_OK )
                                    {
                                        //We will submit the form that
                                        //contains the button rather
                                        //than clicking it.
                                        //This sends all the
                                        //information in the correct format
                                        IHTMLFormElement* pForm;
                                        hr=pButton->get_form(&pForm);
                                        if ( hr == S_OK )
                                        {
                                            //Submit the form
                                            pForm->submit();
                                            //Now we don't have to look at
                                            //the other elements,
                                            //so stop looping.
                                            i=celem;
                                            pForm->Release();
                                        }
                                        pButton->Release();
                                    }
                                }
                                pElem->Release();
                            }
                            pDisp->Release();
                        }
                    }
                }
                pColl->Release();
            }
            //For the next page, open a fresh document
            pHTMLDocument2->Release();
        }
        pDisp->Release();
    }
    //Let's change the status text, so as to
    //make sure that the next page is loaded
    CString statustext="OK";
    pBrowser->put_StatusText(statustext.AllocSysString());
    bReady=0;
    while(!bReady)
    {
        pBrowser->get_StatusText(&bsStatus);
        mStr=bsStatus;
        if(mStr=="Done")bReady=1;
    }
    //The credentials are no longer needed
    SysFreeString(bsPW);
    SysFreeString(bsUser);
    //OK, the next page (the one with the Inbox link) is loaded
    GetInbox(); //Call the inbox handler
}

//This will check for the link "Inbox"
//in the next page after login.
int CYahooDlg::GetInbox(void)
{
    BSTR bsStatus;
    CString mStr,mLocation;
    bReady=0;
    HRESULT hr1;
    IDispatch* pDisp;
    hr1=pBrowser->get_Document(&pDisp); //Again get the document object
    if (pDisp != NULL )
    {
        IHTMLDocument2* pHTMLDocument2;
        HRESULT hr;
        hr = pDisp->QueryInterface( IID_IHTMLDocument2,
            (void**)&pHTMLDocument2 );
        if (hr == S_OK)
        {
            IHTMLElementCollection* pColl = NULL;
            hr = pHTMLDocument2->get_all( &pColl );
            if (hr == S_OK && pColl != NULL)
            {
                LONG celem;
                hr = pColl->get_length( &celem );
                if ( hr == S_OK )
                {
                    for ( int i=0; i< celem; i++ )
                    {
                        VARIANT varIndex;
                        varIndex.vt = VT_I4;
                        varIndex.lVal = i;
                        VARIANT var2;
                        VariantInit( &var2 );
                        IDispatch* pDisp;
                        hr = pColl->item( varIndex, var2, &pDisp );
                        if ( hr == S_OK )
                        {
                            IHTMLElement* pElem;
                            hr = pDisp->QueryInterface( IID_IHTMLElement,
                                (void **)&pElem);
                            if ( hr == S_OK)
                            {
                                //Look for an anchor element, since
                                //"Inbox" has a hyperlink
                                //associated with it
                                IHTMLAnchorElement* pLink;
                                hr = pDisp->QueryInterface( IID_IHTMLAnchorElement,
                                    (void **)&pLink);
                                if ( hr == S_OK)
                                {
                                    BSTR bstr;
                                    //Get the HREF value
                                    hr = pLink->get_href(&bstr);
                                    if(hr == S_OK){
                                        CString strTag;
                                        strTag = bstr;
                                        //Does it contain the
                                        //keywords of a proper Inbox link?
                                        if(strTag.Find("Inbox")>=0 &&
                                            strTag.Find("ym/ShowFolder")>0)
                                        {
                                            pElem->click(); //If so, click on the link
                                            i=celem; //Quit searching for other links
                                        }
                                    }
                                    pLink->Release();
                                }
                                pElem->Release();
                            }
                            pDisp->Release();
                        }
                    }
                }
                pColl->Release();
            }
            pHTMLDocument2->Release();
        }
        pDisp->Release();
    }
    CString statustext="OK"; //Change the status text again
    pBrowser->put_StatusText(statustext.AllocSysString());
    bReady=0;
    while(!bReady)
    {
        pBrowser->get_StatusText(&bsStatus);
        mStr=bsStatus;
        if(mStr=="Done")bReady=1;
    }
    //OK, the mail list is loaded, get each mail
    OnBnClickedGetmail();
    return 0;
}

//Opens each unread mail (well, no support yet for
//the next page of the list if the mail list is big)
void CYahooDlg::OnBnClickedGetmail()
{
    BSTR bsStatus=0;
    CString mStr,mLocation;
    CString statustext="OK";
    bReady=0;
    BOOL bMailAhead=0;
    HRESULT hr1;
    IDispatch* pDisp;
    hr1=pBrowser->get_Document(&pDisp);
    if (pDisp != NULL )
    {
        IHTMLDocument2* pHTMLDocument2;
        HRESULT hr;
        hr = pDisp->QueryInterface( IID_IHTMLDocument2,
            (void**)&pHTMLDocument2 );
        if (hr == S_OK)
        {
            IHTMLElementCollection* pColl = NULL;
            hr = pHTMLDocument2->get_all( &pColl );
            if (hr == S_OK && pColl != NULL)
            {
                LONG celem;
                hr = pColl->get_length( &celem );
                if ( hr == S_OK )
                {
                    for ( int i=0; i< celem; i++ )
                    {
                        VARIANT varIndex;
                        varIndex.vt = VT_I4;
                        varIndex.lVal = i;
                        VARIANT var2;
                        VariantInit( &var2 );
                        IDispatch* pDisp;
                        hr = pColl->item( varIndex, var2, &pDisp );
                        if ( hr == S_OK )
                        {
                            IHTMLElement* pElem;
                            //Get each element
                            hr = pDisp->QueryInterface( IID_IHTMLElement,
                                (void **)&pElem);
                            if ( hr == S_OK)
                            {
                                //Each unread mail link appears
                                //within <TR classname=msgnew>
                                //and </TR>
                                BSTR bstr;
                                CString classname;
                                //So, let's check if the
                                //class name is msgnew
                                pElem->get_className(&bstr);
                                classname = bstr;
                                //If it is found, it means
                                //there's an unread mail following
                                if(classname=="msgnew")
                                {bMailAhead=1;}
                                //Otherwise, it is msgold
                                else if(classname=="msgold")
                                {bMailAhead=0;}
                                IHTMLAnchorElement* pLink;
                                hr = pDisp->QueryInterface( IID_IHTMLAnchorElement,
                                    (void **)&pLink);
                                if ( hr == S_OK)
                                {
                                    BSTR bstr;
                                    //Find the link for the mail
                                    hr = pLink->get_href(&bstr);
                                    if(hr == S_OK){
                                        CString strTag;
                                        strTag = bstr;
                                        //Is this a proper mail link?
                                        if(strTag.Find("ShowLetter")>=0 &&
                                            strTag.Find("Inbox")>0 && bMailAhead)
                                        {
                                            //Click-open the mail
                                            pElem->click();
                                            pBrowser->put_StatusText(
                                                statustext.AllocSysString());
                                            bReady=0;
                                            while(!bReady)
                                            {
                                                pBrowser->get_StatusText(&bsStatus);
                                                mStr=bsStatus;
                                                if(mStr=="Done")bReady=1;
                                            }
                                            //Save the mail that
                                            //was click-opened
                                            OnBnClickedSavemail();
                                            //Check again before the next
                                            //download, or else it might
                                            //re-enter the same mail
                                            bMailAhead=0;
                                            //Come back to
                                            //the same page
                                            OnBnClickedBack();
                                            pBrowser->put_StatusText(
                                                statustext.AllocSysString());
                                            bReady=0;
                                            while(!bReady)
                                            {
                                                pBrowser->get_StatusText(&bsStatus);
                                                mStr=bsStatus;
                                                if(mStr=="Done")bReady=1;
                                            }
                                        }
                                    }
                                    pLink->Release();
                                }
                                pElem->Release();
                            }
                            pDisp->Release();
                        }
                    }
                }
                pColl->Release();
            }
            pHTMLDocument2->Release();
        }
        pDisp->Release();
    }
    OnBnClickedSignout(); //OK, done, sign out
    return;
}

void CYahooDlg::OnBnClickedSavemail()
{
    BOOL bAdd=0,bHeader=0;
    CString mStr,mLocation;
    bReady=0;
    m_Body.Empty(); //Start each mail with an empty buffer
    HRESULT hr1;
    IDispatch* pDisp;
    hr1=pBrowser->get_Document(&pDisp);
    if (pDisp != NULL )
    {
        IHTMLDocument2* pHTMLDocument2;
        HRESULT hr;
        hr = pDisp->QueryInterface( IID_IHTMLDocument2,
            (void**)&pHTMLDocument2 );
        if (hr == S_OK)
        {
            IHTMLElementCollection* pColl = NULL;
            hr = pHTMLDocument2->get_all( &pColl );
            if (hr == S_OK && pColl != NULL)
            {
                LONG celem;
                hr = pColl->get_length( &celem );
                if ( hr == S_OK )
                {
                    for ( int i=0; i< celem; i++ )
                    {
                        VARIANT varIndex;
                        varIndex.vt = VT_I4;
                        varIndex.lVal = i;
                        VARIANT var2;
                        VariantInit( &var2 );
                        IDispatch* pDisp;
                        hr = pColl->item( varIndex, var2, &pDisp );
                        if ( hr == S_OK )
                        {
                            IHTMLElement* pElem;
                            hr = pDisp->QueryInterface( IID_IHTMLElement,
                                (void **)&pElem);
                            if ( hr == S_OK)
                            {
                                BSTR tagName;
                                CString tag,tempStr;
                                COleVariant attrb;
                                if(!bHeader){
                                    pElem->get_tagName(&tagName);
                                    tag=tagName;
                                    //A mail begins with
                                    //<TR classname="bge">
                                    if(tag=="TR" ||tag=="tr")
                                    {
                                        tag="className";
                                        tagName=tag.AllocSysString();
                                        pElem->getAttribute(tagName,0,attrb);
                                        tag=attrb;
                                        if(tag=="bge"){
                                            //Let's add some
                                            //header tags to the mail
                                            //that can be processed
                                            //later by the client
                                            pElem->get_innerText(&tagName);
                                            tag=tagName;
                                            if(tag.Find("To:")>-1){
                                                pElem->get_outerText(&tagName);
                                                tempStr=tagName;
                                                m_Body+="<YMailTo>"+
                                                    tempStr+"</YMailTo><BR>";
                                            }
                                            if(tag.Find("From:")>-1){
                                                pElem->get_outerText(&tagName);
                                                tempStr=tagName;
                                                m_Body+="<YMailFrom>"+
                                                    tempStr+"</YMailFrom><BR>";
                                            }
                                            if(tag.Find("Date:")>-1){
                                                pElem->get_outerText(&tagName);
                                                tempStr=tagName;
                                                m_Body+="<YMailDate>"+
                                                    tempStr+"</YMailDate><BR>";
                                            }
                                            if(tag.Find("Subject:")>-1){
                                                pElem->get_outerText(&tagName);
                                                tempStr=tagName;
                                                m_Body+="<YMailSub>"+tempStr+
                                                    "</YMailSub><BR><BR>";
                                                bHeader=1;
                                            }
                                        }
                                    }
                                }
                                pElem->get_tagName(&tagName);
                                tag=tagName;
                                //All HTML messages appear
                                //inside <XBODY> tags
                                if((tag.Find("xbody")>-1) ||
                                    (tag.Find("XBODY")>-1) )bAdd=1;
                                if(bAdd)
                                {
                                    //This will copy the contents
                                    //between the <XBODY> tags
                                    pElem->get_outerHTML(&tagName);
                                    m_Body+=tagName;
                                    if((m_Body.Find("/xbody")>-1)
                                        ||(m_Body.Find("/XBODY")>-1) )
                                    {
                                        //End of the message
                                        bAdd=0;
                                        //Done with copying the
                                        //message, so no
                                        //more iterations
                                        i=celem;
                                    }
                                }
                                pElem->Release();
                            }
                            pDisp->Release();
                        }
                    }
                }
                pColl->Release();
            }
            pHTMLDocument2->Release();
        }
        pDisp->Release();
    }
    WriteMailFile(); //Write the message to a file
    return;
}

//Create a unique file name from the time information
void CYahooDlg::WriteMailFile()
{
    SYSTEMTIME sysTime; //Win32 time information
    GetSystemTime(&sysTime);
    COleDateTime Today(sysTime);
    CString filename,mStr;
    CFile datafile;
    filename.Format("m%d%d%d%d%d.htm",
        Today.GetDayOfYear(),Today.GetYear(),
        Today.GetHour(),Today.GetMinute(),Today.GetSecond());
    if(datafile.Open(filename,CFile::modeCreate | CFile::modeWrite)){
        datafile.Write(m_Body,m_Body.GetLength());
        datafile.Close();
    }
    m_DLStatus="Download Complete";
    UpdateData(0);
}

CYahooDlg::~CYahooDlg(void)
{
}

//Sign out ... bye!!!
void CYahooDlg::OnBnClickedSignout()
{
    BSTR bsStatus=0;
    CString mStr,mLocation;
    bReady=0;
    HRESULT hr1;
    IDispatch* pDisp;
    hr1=pBrowser->get_Document(&pDisp);
    if (pDisp != NULL )
    {
        IHTMLDocument2* pHTMLDocument2;
        HRESULT hr;
        hr = pDisp->QueryInterface( IID_IHTMLDocument2,
            (void**)&pHTMLDocument2 );
        if (hr == S_OK)
        {
            IHTMLElementCollection* pColl = NULL;
            hr = pHTMLDocument2->get_all( &pColl );
            if (hr == S_OK && pColl != NULL)
            {
                LONG celem;
                hr = pColl->get_length( &celem );
                if ( hr == S_OK )
                {
                    for ( int i=0; i< celem; i++ )
                    {
                        VARIANT varIndex;
                        varIndex.vt = VT_I4;
                        varIndex.lVal = i;
                        VARIANT var2;
                        VariantInit( &var2 );
                        IDispatch* pDisp;
                        hr = pColl->item( varIndex, var2, &pDisp );
                        if ( hr == S_OK )
                        {
                            IHTMLElement* pElem;
                            hr = pDisp->QueryInterface( IID_IHTMLElement,
                                (void **)&pElem);
                            if ( hr == S_OK)
                            {
                                IHTMLAnchorElement* pLink;
                                hr = pDisp->QueryInterface( IID_IHTMLAnchorElement,
                                    (void **)&pLink);
                                if ( hr == S_OK)
                                {
                                    BSTR bstr;
                                    hr = pLink->get_href(&bstr);
                                    if(hr == S_OK){
                                        CString strTag;
                                        strTag = bstr;
                                        //Find a link that
                                        //lets us log off
                                        if(strTag.Find("Logout")>=0 ||
                                            strTag.Find("Sign Out")>=0){
                                            pElem->click();
                                            i=celem;
                                        }
                                    }
                                    pLink->Release();
                                }
                                pElem->Release();
                            }
                            pDisp->Release();
                        }
                    }
                }
                pColl->Release();
            }
            pHTMLDocument2->Release();
        }
        pDisp->Release();
    }
    //Release the browser and shut down COM
    pBrowser->Release();
    CoUninitialize();
    return;
}

void CYahooDlg::OnBnClickedBack()
{
    pBrowser->GoBack(); //Ask the browser to go back to the previous page
}
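The handlers above wait for each page by spinning until the status text reads "Done", which never returns if the page fails to load. A bounded wait is one possible alternative. The sketch below is plain standard C++ and is independent of the browser control. `WaitUntil`, its parameters, and the default 50 ms polling interval are assumptions made for illustration and are not part of the article's code.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Poll isReady until it returns true or the timeout elapses.
// Returns true when the condition was met, false on timeout.
bool WaitUntil(const std::function<bool()>& isReady,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds pollInterval = std::chrono::milliseconds(50))
{
    assert(static_cast<bool>(isReady));   // a callable must be supplied

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;)
    {
        if (isReady()) return true;
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(pollInterval);   // avoid a hot spin
    }
}
```

In the dialog, the predicate could wrap the same status-text comparison used in the while loops, or a check of `IWebBrowser2::get_ReadyState` against `READYSTATE_COMPLETE`. A caller that gets `false` back can stop instead of hanging.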