
CSplitURL - split a URL into component parts

By Rob Caldecott, 10 Nov 2002
 

Introduction

Recently I needed some code that would split a URL into its component parts (scheme, host, folder, etc.) and I came across a WinInet function called InternetCrackUrl which does the job. The function itself is fairly straightforward, but I thought the demo supplied on MSDN might be a tad confusing for new users - so I have created CSplitURL to wrap the call.

This class is not going to win any prizes, but it is very easy to use - first include the header:

#include "url.h"

Then, to use the class, simply declare a CSplitURL object, passing the URL to split, e.g.:

CSplitURL url(_T("http://www.codeproject.com"));

Once you have created a CSplitURL, you can access the various URL components using the following methods:

  • GetScheme
  • GetPort
  • GetSchemeName
  • GetHostName
  • GetUserName
  • GetPassword
  • GetURLPath
  • GetExtraInfo

All pretty self-explanatory. Note that using this class in your application will create a dependency on WININET.DLL, which hopefully won't be a problem. Note also that this class can be used with any framework that includes a CString class - including MFC, WTL and ATL7.
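
To make the list above a little more concrete, here is a rough, portable sketch of which part of a URL each accessor corresponds to. It is not the WinInet implementation and not part of url.h: UrlParts and SplitSketch are hypothetical names, the parsing is deliberately simplified (it assumes a well-formed scheme://user:password@host:port/path?extra layout and does no validation or unescaping), and it reports port 0 when no port is written in the URL, whereas WinInet may substitute the default port for the scheme.

```cpp
#include <string>

// Hypothetical stand-in for the values CSplitURL exposes; not part of url.h.
struct UrlParts {
    std::string scheme;    // cf. GetSchemeName
    std::string user;      // cf. GetUserName
    std::string password;  // cf. GetPassword
    std::string host;      // cf. GetHostName
    std::string path;      // cf. GetURLPath
    std::string extra;     // cf. GetExtraInfo (keeps the leading '?' or '#')
    int port = 0;          // cf. GetPort; 0 means "not written in the URL"
};

// Simplified split, for illustration only. Assumes a well-formed URL that
// contains "://"; IPv6 literals and non-numeric ports are not handled.
UrlParts SplitSketch(const std::string& url) {
    UrlParts p;
    std::size_t pos = url.find("://");
    if (pos == std::string::npos) return p;  // no scheme: give up
    p.scheme = url.substr(0, pos);
    std::size_t start = pos + 3;

    // The authority (user, password, host, port) ends at the first '/', '?' or '#'.
    std::size_t authEnd = url.find_first_of("/?#", start);
    if (authEnd == std::string::npos) authEnd = url.size();
    std::string auth = url.substr(start, authEnd - start);

    // Optional "user[:password]@" prefix.
    std::size_t at = auth.rfind('@');
    if (at != std::string::npos) {
        std::string cred = auth.substr(0, at);
        auth = auth.substr(at + 1);
        std::size_t sep = cred.find(':');
        p.user = cred.substr(0, sep);
        if (sep != std::string::npos) p.password = cred.substr(sep + 1);
    }

    // Optional ":port" suffix (std::stoi throws if the text is not numeric).
    std::size_t colon = auth.rfind(':');
    if (colon != std::string::npos) {
        p.port = std::stoi(auth.substr(colon + 1));
        auth = auth.substr(0, colon);
    }
    p.host = auth;

    // The path runs up to the first '?' or '#'; everything after is extra info.
    std::size_t extraPos = url.find_first_of("?#", authEnd);
    if (extraPos == std::string::npos) extraPos = url.size();
    p.path = url.substr(authEnd, extraPos - authEnd);
    p.extra = url.substr(extraPos);
    return p;
}
```

Calling SplitSketch("http://user:secret@www.example.com:8080/docs/index.html?id=1#top") fills in scheme "http", user "user", password "secret", host "www.example.com", port 8080, path "/docs/index.html" and extra info "?id=1#top". In real code you would use CSplitURL itself and let InternetCrackUrl do the parsing.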

CSplitURL

// Implementation of the CURLComponents and CSplitURL classes.

#pragma once
#pragma comment(lib, "wininet.lib")

#include "wininet.h"

// Class to wrap the Win32 URL_COMPONENTS structure
class CURLComponents : public URL_COMPONENTS
{
public:    
    CURLComponents(void)
    {
        memset(this, 0, sizeof(URL_COMPONENTS));
        dwStructSize = sizeof(URL_COMPONENTS);
    }
};

// Class used to split a URL into component parts.
// Note: Uses WININET InternetCrackUrl function.
class CSplitURL
{
private:
    CString m_strScheme;
    INTERNET_SCHEME m_nScheme;
    CString m_strHostName;
    INTERNET_PORT m_nPort;
    CString m_strUserName;
    CString m_strPassword;
    CString m_strURLPath;
    CString m_strExtraInfo;
public:    
    CSplitURL(void)
        : m_nScheme(INTERNET_SCHEME_DEFAULT)
        , m_nPort(0)
    {
    }
    
    CSplitURL(LPCTSTR lpsz)
        : m_nScheme(INTERNET_SCHEME_DEFAULT)
        , m_nPort(0)
    {
        Split(lpsz);
    }

    ~CSplitURL(void)
    {
    }

    // Split a URL into component parts
    bool Split(LPCTSTR lpsz)
    {
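        // Be defensive: assert in debug builds, fail cleanly in release builds
        ATLASSERT(lpsz != NULL && *lpsz != _T('\0'));
        if (lpsz == NULL || *lpsz == _T('\0'))
            return false;
        // Get the URL length (no component can be longer than the whole URL)
        DWORD dwLength = static_cast<DWORD>(_tcslen(lpsz));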

        CURLComponents url;        
        // Fill structure
        url.lpszScheme = m_strScheme.GetBuffer(dwLength);
        url.dwSchemeLength = dwLength;
        url.lpszHostName = m_strHostName.GetBuffer(dwLength);
        url.dwHostNameLength = dwLength;
        url.lpszUserName = m_strUserName.GetBuffer(dwLength);
        url.dwUserNameLength = dwLength;
        url.lpszPassword = m_strPassword.GetBuffer(dwLength);
        url.dwPasswordLength = dwLength;
        url.lpszUrlPath = m_strURLPath.GetBuffer(dwLength);
        url.dwUrlPathLength = dwLength;
        url.lpszExtraInfo = m_strExtraInfo.GetBuffer(dwLength);
        url.dwExtraInfoLength = dwLength;
        // Split
        bool bRet = InternetCrackUrl(lpsz, 0, 0, &url) != FALSE;
        // Release buffers
        m_strScheme.ReleaseBuffer();
        m_strHostName.ReleaseBuffer();
        m_strUserName.ReleaseBuffer();
        m_strPassword.ReleaseBuffer();
        m_strURLPath.ReleaseBuffer();
        m_strExtraInfo.ReleaseBuffer();
        // Get the scheme/port
        m_nScheme = url.nScheme;
        m_nPort = url.nPort;
        // Done
        return bRet;
    }

    // Get the scheme number
    inline INTERNET_SCHEME GetScheme(void) const { return m_nScheme; }
    // Get the port number
    inline INTERNET_PORT GetPort(void) const { return m_nPort; }
    // Get the scheme name
    inline LPCTSTR GetSchemeName(void) const { return m_strScheme; }
    // Get the host name
    inline LPCTSTR GetHostName(void) const { return m_strHostName; }
    // Get the user name
    inline LPCTSTR GetUserName(void) const { return m_strUserName; }
    // Get the password
    inline LPCTSTR GetPassword(void) const { return m_strPassword; }
    // Get the URL path
    inline LPCTSTR GetURLPath(void) const { return m_strURLPath; }
    // Get the extra info
    inline LPCTSTR GetExtraInfo(void) const { return m_strExtraInfo; }
};

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Rob Caldecott
Architect
United Kingdom
No Biography provided


Comments and Discussions

 
General: Great class (member lekrot, 10-Aug-06 13:57)
Fantastic work. So easy to use. So simple and yet so powerful. Just what I needed. Again, thanks...
General: Re: Great class (member Robert Edward Caldecott, 10-Aug-06 22:32)
Thanks! :)
General: Great Article (member ThatsAlok, 22-Sep-04 19:55)
Really great article. I had been looking for a solution like this for months for my COM application, thanks.
5 globes from me
 
-----------------------------
"I Think It Will Help"
-----------------------------
Alok Gupta
visit me at http://www.thisisalok.tk
General: Re: Great Article (member Robert Edward Caldecott, 22-Sep-04 20:38)
Thank you very much Alok! It's always good to get some feedback.
 

The Rob Blog
General: CUrl in ATL 7.0 atlutil.h (member Norm Almond, 12-Nov-02 9:52)
There's also a URL cracker in ATL 7.0; in fact there's a ton of new functionality in ATL 7.0!

General: Re: CUrl in ATL 7.0 atlutil.h (member Robert Edward Caldecott, 12-Nov-02 21:56)
Indeed - ATL7 rocks - but sadly, not everyone will be using it... :(
 

When I am king, you will be first against the wall.
General: Re: CUrl in ATL 7.0 atlutil.h (member Bjornar Henden, 30-Nov-02 6:03)
You get a link error in the ATL7 version of CUrl when _ATL_MIN_CRT is defined.
General: Extra feature request (member Chris Hambleton, 12-Nov-02 6:27)
Nice class! Can you add the capability for the extra info to be chopped into parts (key=val) so these keys-values can be enumerated? Something like:
 
map KeyValPairs;
srcStr = "select=335280&exp=5&fr=1#xx335280xx"
 
SplitExtraInfo(srcStr, "?&#", KeyValPairs)
 
Calling SplitExtraInfo() strtoks the srcStr with the chars in param #2, and would produce an array like:
select --> 335280
exp --> 5
fr --> 1
--> xx335280xx
 
Just some thoughts.... :)
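
A minimal sketch of what such a helper could look like, using only the standard library. SplitExtraInfo here is a hypothetical free function, not something CSplitURL provides, and the choice of std::string and a vector of pairs (rather than a map, so that order is kept and the keyless fragment can be stored) is an assumption made for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of CSplitURL: splits extra info such as
// "?a=1&b=2#frag" into ordered (key, value) pairs. Any character in `delims`
// separates tokens; a token without '=' is stored with an empty key.
std::vector<std::pair<std::string, std::string>>
SplitExtraInfo(const std::string& src, const std::string& delims = "?&#") {
    std::vector<std::pair<std::string, std::string>> pairs;
    std::size_t start = 0;
    while (start <= src.size()) {
        std::size_t end = src.find_first_of(delims, start);
        if (end == std::string::npos) end = src.size();
        if (end > start) {  // skip empty tokens, much as strtok would
            std::string token = src.substr(start, end - start);
            std::size_t eq = token.find('=');
            if (eq == std::string::npos)
                pairs.emplace_back("", token);
            else
                pairs.emplace_back(token.substr(0, eq), token.substr(eq + 1));
        }
        start = end + 1;  // continue after the delimiter
    }
    return pairs;
}
```

For the string in the request, "select=335280&exp=5&fr=1#xx335280xx", this yields four pairs: select/335280, exp/5, fr/1, and an empty key paired with xx335280xx. No URL unescaping is attempted.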
 


 
"No one goes to hell because of their sin, but because of rejecting God's method of salvation: His Son's life for yours..."

"It does not take a majority to prevail ... but rather an irate, tireless minority, keen on setting brushfires of freedom in the minds of men." --Samuel Adams

General: Re: Extra feature request (member Robert Edward Caldecott, 12-Nov-02 6:38)
Sounds like you've done most of the work already! :-D
 
I'll consider it when I get some time (more articles to prepare...).
 

General: Re: Extra feature request (member Houdini, 13-Nov-02 5:50)
Might I suggest writing a tokenizer iterator class for that, rather than using strtok.
 

- Houdini
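
One possible shape for such a class, as a sketch only: Tokenizer is a hypothetical name, and this is a simple cursor-style interface rather than a full STL iterator. Unlike strtok, it does not modify the input string and keeps no hidden global state, so several tokenizers can be in use at once.

```cpp
#include <string>

// Hypothetical tokenizer, sketched for illustration; not part of the article's code.
class Tokenizer {
public:
    Tokenizer(const std::string& text, const std::string& delims)
        : m_text(text), m_delims(delims), m_pos(0) {}

    // Fetch the next non-empty token; returns false when none remain.
    bool Next(std::string& token) {
        // Skip any run of delimiters before the token.
        std::size_t start = m_text.find_first_not_of(m_delims, m_pos);
        if (start == std::string::npos) return false;
        // The token ends at the next delimiter or at the end of the text.
        std::size_t end = m_text.find_first_of(m_delims, start);
        if (end == std::string::npos) end = m_text.size();
        token = m_text.substr(start, end - start);
        m_pos = end;
        return true;
    }

private:
    std::string m_text;    // private copy, so the caller's string is untouched
    std::string m_delims;  // set of delimiter characters
    std::size_t m_pos;     // where the next search starts
};
```

Typical use is a loop of the form `while (tok.Next(s)) { ... }`, splitting each token on '=' afterwards if key-value pairs are wanted.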
General: Amazing (member Adrian Edmonds, 11-Nov-02 18:37)
Just when I was starting a new WTL test suite and bemoaning the lack of AfxParseUrl, along comes this on the ATL list. Wonderful!
Thanks. :-D
General: Re: Amazing (suss Anonymous, 11-Nov-02 18:44)
Why don't you get the source code for AfxParseUrl from MFC? Don't we have it?
General: Re: Amazing (member Adrian Edmonds, 11-Nov-02 19:36)
I explained badly. That would mean porting away from MFC as all my projects now are ATL/WTL/STL. :-D
General: Re: Amazing (suss Anonymous, 11-Nov-02 20:16)
Adrian Edmonds wrote:
That would mean porting away from MFC
 
Not quite, the AfxParseUrl() implementation is exactly the code shown in this article.

General: Re: Amazing (member Robert Edward Caldecott, 11-Nov-02 22:01)
I just had a look at the source for the MFC7 version of AfxParseUrl and it's quite different. My version is much simpler (not sure if that's good or bad! ;) ).
 

General: Great (suss SRK, 11-Nov-02 15:33)
Exactly the one that I was looking for.
Thanks a lot.
:-D
General: Cool (sitebuilder Uwe Keim, 11-Nov-02 6:13)
Well done. Now go on and make the parameters accessible as a name-value-map :)
 
--
- Free Windows-based Web Content Management System: http://www.zeta-software.de/enu/producer/freeware/download.html
- Scanned MSDN Mag ad with YOUR name: www.magerquark.de/misc/CodeProject.html
- See me: www.magerquark.de
General: Re: Cool (member Robert Edward Caldecott, 11-Nov-02 6:16)
Well done
 
Thanks Uwe. Much appreciated. [Rose]
 

Question: AfxParseUrl? (member twask, 11-Nov-02 3:54)
What about AfxParseUrl? :confused:
Answer: Re: AfxParseUrl? (member Robert Edward Caldecott, 11-Nov-02 4:04)
No good to WTL users like myself! Afx... functions are MFC only.
 

General: Re: AfxParseUrl? (member twask, 11-Nov-02 4:05)
Accepted. :)
Last Updated 11 Nov 2002
Article Copyright 2002 by Rob Caldecott
Everything else Copyright © CodeProject, 1999-2013