String Conversions

22 May 2000
Set of classes enabling UNICODE and ANSI string conversion.
  • Download source files - 1 Kb
    This article describes a set of classes that perform string conversions between the Unicode and ANSI character sets. Why are they needed? There are several reasons:

    1. If you are developing for COM, all strings passed via COM interfaces are wide strings (unless they are part of a VARIANT structure).

    2. If you are developing communication applications, you most probably need to send and receive data over the communication channel in the ANSI character set, whether your application itself is built as Unicode or ANSI.

    3. If you want a portable application and need to use either ANSI or Unicode strings for any reason.

    4. If you are developing a DLL, service, or application that is not MFC based, so you cannot use CString.

    In addition, one should consider the following:

    1. When returning wide strings through COM interfaces (for example as out arguments), memory for these strings must be allocated with CoTaskMemAlloc() and released with CoTaskMemFree().

    2. In any other situation, you may use new and delete.

    Besides ANSI and Unicode strings, there is also the BSTR string type. The Microsoft class _bstr_t encapsulates BSTR strings.

    Summary

    The small class library presented in this article contains the following classes:

    Class name   Description                               Memory (de)allocation
    _tochar      Convert any string to LPSTR or LPCSTR     new/delete
    _towchar     Convert any string to LPWSTR or LPCWSTR   new/delete
    _totchar     Convert any string to LPTSTR or LPCTSTR   new/delete
    _cochar      Convert any string to LPSTR or LPCSTR     CoTaskMemAlloc/CoTaskMemFree
    _cowchar     Convert any string to LPWSTR or LPCWSTR   CoTaskMemAlloc/CoTaskMemFree
    _cotchar     Convert any string to LPTSTR or LPCTSTR   CoTaskMemAlloc/CoTaskMemFree

    where

    Type      Definition        Condition
    LPSTR     char *            Always
    LPCSTR    const char *      Always
    LPWSTR    wchar_t *         Always
    LPCWSTR   const wchar_t *   Always
    LPTSTR    char *            _UNICODE not defined
    LPTSTR    wchar_t *         _UNICODE defined
    LPCTSTR   const char *      _UNICODE not defined
    LPCTSTR   const wchar_t *   _UNICODE defined
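
To make the mapping in the table above concrete, here is a minimal, portable sketch of how the T-types can follow the _UNICODE macro. The names tchar_sketch, lptstr_sketch, lpctstr_sketch, is_unicode_build and tchar_buffer_bytes are hypothetical stand-ins introduced for illustration; on Windows the real typedefs come from the platform headers, not from this code.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for TCHAR, LPTSTR and LPCTSTR (assumption: these
// names are illustrative only and are not part of the Windows headers).
#ifdef _UNICODE
typedef wchar_t tchar_sketch;
#else
typedef char tchar_sketch;
#endif
typedef tchar_sketch*       lptstr_sketch;
typedef const tchar_sketch* lpctstr_sketch;

// True when the build selected wide characters for the T-types.
constexpr bool is_unicode_build() {
    return std::is_same<tchar_sketch, wchar_t>::value;
}

// Bytes needed to hold `chars` characters plus the terminator.
inline std::size_t tchar_buffer_bytes(std::size_t chars) {
    // Guard against overflow in the multiplication below.
    assert(chars < static_cast<std::size_t>(-1) / sizeof(tchar_sketch));
    return (chars + 1) * sizeof(tchar_sketch);
}
```

Compiling the same file with and without -D_UNICODE (or /D_UNICODE) flips every T-type at once, which is why code written against LPTSTR and LPCTSTR can be built both ways.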

    All classes are implemented inline so it is enough to include the file in your project and use the classes.

    Short Description

    All classes are designed to create a new string based on the supplied one. The buffer for the created string is automatically deleted in the class destructor, unless the auto delete flag is set to FALSE in the call to the constructor. That exception matters in COM: according to the COM memory management rules, all OUT arguments must be allocated by the callee and deallocated by the caller using the standard COM memory allocator (CoTaskMemAlloc/CoTaskMemFree).

    All the classes have the same interface. They differ only in the type-casting operators and in the type of the string argument in the constructors.

    For example, the _totchar class declaration looks like the following:

    class _totchar {
    private:
        BOOL   m_bAutoDelete;
        LPTSTR m_tszBuffer;
    public:
        _totchar(LPCSTR szText, BOOL bAutoDelete = TRUE);
        _totchar(LPCWSTR wszText, BOOL bAutoDelete = TRUE);
        ~_totchar();
        operator LPTSTR();
        operator LPCTSTR();
    };

    The first constructor takes a const char * argument while the second constructor takes a const wchar_t * argument. Both constructors convert the string argument to LPTSTR using "new" (the _cotchar class has exactly the same functionality but uses CoTaskMemAlloc for memory allocation).
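
The two-constructor design described above can be sketched in portable C++. This is not the article's implementation: towchar_sketch is a hypothetical class, and it assumes std::mbstowcs (which converts according to the current C locale) as a stand-in for whatever Windows conversion call the real classes use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <cwchar>

// Hypothetical, portable stand-in for a class shaped like _towchar.
// Assumption: std::mbstowcs replaces the platform conversion routine.
class towchar_sketch {
    bool     m_autoDelete;
    wchar_t* m_buffer;

public:
    // Narrow input: convert into a freshly allocated wide buffer.
    towchar_sketch(const char* text, bool autoDelete = true)
        : m_autoDelete(autoDelete), m_buffer(nullptr) {
        assert(text != nullptr);
        const std::size_t len = std::strlen(text);  // upper bound on wide chars
        m_buffer = new wchar_t[len + 1];
        std::size_t n = std::mbstowcs(m_buffer, text, len + 1);
        if (n > len) n = 0;  // also covers the (size_t)-1 error return
        m_buffer[n] = L'\0';
    }

    // Wide input: no conversion needed, just copy.
    towchar_sketch(const wchar_t* text, bool autoDelete = true)
        : m_autoDelete(autoDelete), m_buffer(nullptr) {
        assert(text != nullptr);
        const std::size_t len = std::wcslen(text);
        m_buffer = new wchar_t[len + 1];
        std::wmemcpy(m_buffer, text, len + 1);  // copies the terminator too
    }

    // Copying is disabled so two objects never free the same buffer.
    towchar_sketch(const towchar_sketch&) = delete;
    towchar_sketch& operator=(const towchar_sketch&) = delete;

    ~towchar_sketch() { if (m_autoDelete) delete[] m_buffer; }

    operator wchar_t*()       { return m_buffer; }
    operator const wchar_t*() { return m_buffer; }
};
```

With this shape, towchar_sketch w("COM1"); yields a wide copy regardless of whether the caller started with a narrow or a wide string, which is the property that lets the calling code stay the same in ANSI and Unicode builds.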

    By default, the internal auto delete flag is set to TRUE. This means that the memory for the internally created LPTSTR string is deallocated in the destructor. If you set auto delete to FALSE, the memory allocated for the internal LPTSTR is left intact and you have to release it manually. This is most often used with the _cowchar class because it creates a Unicode string that can be passed via COM.
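
The effect of the auto delete flag can be illustrated with a small portable sketch. The class tochar_sketch and the helper make_detached_copy are hypothetical names, and std::wcstombs is assumed here in place of the platform conversion call; the point is only to show who owns the buffer in each mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <cwchar>

// Hypothetical stand-in for a class shaped like _tochar (wide input only).
class tochar_sketch {
    bool  m_autoDelete;
    char* m_buffer;

public:
    tochar_sketch(const wchar_t* text, bool autoDelete = true)
        : m_autoDelete(autoDelete), m_buffer(nullptr) {
        assert(text != nullptr);
        // Worst case: every wide character needs MB_CUR_MAX bytes.
        const std::size_t cap = std::wcslen(text) * MB_CUR_MAX + 1;
        m_buffer = new char[cap];
        std::size_t n = std::wcstombs(m_buffer, text, cap);
        if (n >= cap) n = 0;  // also covers the (size_t)-1 error return
        m_buffer[n] = '\0';
    }

    tochar_sketch(const tochar_sketch&) = delete;
    tochar_sketch& operator=(const tochar_sketch&) = delete;

    // The buffer is released only when the object still owns it.
    ~tochar_sketch() { if (m_autoDelete) delete[] m_buffer; }

    operator char*()       { return m_buffer; }
    operator const char*() { return m_buffer; }
};

// Returns a heap string that outlives the converter object.
// The caller must release it with delete[].
char* make_detached_copy(const wchar_t* text) {
    tochar_sketch c(text, false);   // destructor will NOT free the buffer
    return static_cast<char*>(c);
}
```

With the flag left at its default, the buffer disappears with the object; with it set to false, the pointer stays valid after the object is gone and freeing it becomes the caller's job, using the allocator that matches the class.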

    To access the internal LPTSTR string, the class supplies two type-casting operators.

    Examples

    The following examples use some of the classes:

    Example 1:

    You are developing a COM server and one interface method receives a string as an argument. This is the code to convert it to a portable string:

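    HRESULT STDMETHODCALLTYPE
    IOPCGroupStateMgt::SetName( /* [string][in] */ LPCWSTR szName)
    {
        // Convert the incoming wide string to the build's TCHAR type.
        _totchar c1(szName);
        // Cast explicitly: an object passed through "..." is not converted.
        _tprintf(_T("%s"), (LPCTSTR)c1);
        return S_OK;
    }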

    Example 2:

    You are developing a COM server and one interface method requires you to return a string as an argument.

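    HRESULT STDMETHODCALLTYPE
    IOPCGroupStateMgt::GetName( /* [string][out] */ LPWSTR *ppName)
    {
        LPCTSTR szName = _T("Test Name");
        // FALSE: keep the CoTaskMemAlloc'ed buffer alive; the caller frees it.
        _cowchar c1(szName, FALSE);
        *ppName = c1;   // write the string through the out pointer
        return S_OK;
    }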

    This example converts a portable string (LPCTSTR) to a Unicode string and does not deallocate the memory, because the converted string is assigned to an out argument of the COM interface method.
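
The callee-allocates, caller-frees rule behind Example 2 can be sketched without the conversion classes. In the sketch below, task_alloc_sketch, task_free_sketch, get_name_sketch and name_length_sketch are hypothetical names; on Windows the wrappers forward to CoTaskMemAlloc and CoTaskMemFree, and elsewhere malloc and free are assumed as stand-ins so the code still compiles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cwchar>

#ifdef _WIN32
#include <objbase.h>   // CoTaskMemAlloc, CoTaskMemFree (link with ole32)
void* task_alloc_sketch(std::size_t bytes) { return CoTaskMemAlloc(bytes); }
void  task_free_sketch(void* p)            { CoTaskMemFree(p); }
#else
// Assumption: malloc/free stand in for the COM task allocator off Windows.
void* task_alloc_sketch(std::size_t bytes) { return std::malloc(bytes); }
void  task_free_sketch(void* p)            { std::free(p); }
#endif

// Hypothetical callee: allocates the result and hands ownership to the caller.
// Returns 0 on success, nonzero on failure (a stand-in for an HRESULT).
int get_name_sketch(wchar_t** ppName) {
    if (ppName == nullptr) return 1;
    *ppName = nullptr;
    const wchar_t* name = L"Test Name";
    const std::size_t count = std::wcslen(name) + 1;   // include terminator
    wchar_t* out =
        static_cast<wchar_t*>(task_alloc_sketch(count * sizeof(wchar_t)));
    if (out == nullptr) return 2;
    std::wmemcpy(out, name, count);
    *ppName = out;   // dereference: store through the out pointer
    return 0;
}

// Hypothetical caller: uses the string, then frees it with the same allocator.
std::size_t name_length_sketch() {
    wchar_t* name = nullptr;
    if (get_name_sketch(&name) != 0) return 0;
    assert(name != nullptr);
    const std::size_t len = std::wcslen(name);
    task_free_sketch(name);
    return len;
}
```

The important detail is symmetry: whichever allocator the callee uses for the out string, the caller must release it with the matching deallocator, which is why the co* classes exist alongside the new/delete ones.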

    Example 3:

    You are developing a portable application and need to send an ANSI string to serial port.

    void TSerialPort::Write(LPCTSTR szText)
    {
        _tochar c1(szText);   // ANSI copy, freed when c1 goes out of scope
        writeString(c1);
    }

    void TSerialPort::writeString(const char *szString)
    {
    ...
    }

    That's all. I hope that you will find it as useful as I have.

    License

    This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
