This article describes a set of classes that are used to perform
string conversions between the Unicode and ANSI character sets. Why
is it needed? There are several reasons:
- If you are developing for COM, then all strings passed via the COM
interfaces are wide strings (unless they are part of the VARIANT structure).
- If you are developing communication applications
then (most probably) you need to send/receive data to/from the
communication channel in ANSI character set (and your application is
either Unicode or ANSI).
- If you want to have a portable application and you
need to use either ANSI or Unicode strings for any reason.
- You are developing a DLL, service or applications
that are not MFC based and you cannot use CString.
In addition, one should consider the following:
- When passing wide strings via the COM interfaces, memory for these
strings must be allocated with CoTaskMemAlloc() and released with
CoTaskMemFree().
- In any other situation, it is allowed to use new
and delete.
Besides ANSI and Unicode strings, there is also the BSTR string type.
Microsoft's _bstr_t class encapsulates BSTR strings.
Summary
The small class library presented in this article contains the
following classes:
Class name | Description | Memory (de)allocation
_tochar | Convert any string to LPSTR or LPCSTR | new/delete
_towchar | Convert any string to LPWSTR or LPCWSTR | new/delete
_totchar | Convert any string to LPTSTR or LPCTSTR | new/delete
_cochar | Convert any string to LPSTR or LPCSTR | CoTaskMemAlloc/CoTaskMemFree
_cowchar | Convert any string to LPWSTR or LPCWSTR | CoTaskMemAlloc/CoTaskMemFree
_cotchar | Convert any string to LPTSTR or LPCTSTR | CoTaskMemAlloc/CoTaskMemFree
where
Type | Definition | Condition
LPSTR | char * | Always
LPCSTR | const char * | Always
LPWSTR | wchar_t * | Always
LPCWSTR | const wchar_t * | Always
LPTSTR | char * | _UNICODE not defined
LPTSTR | wchar_t * | _UNICODE defined
LPCTSTR | const char * | _UNICODE not defined
LPCTSTR | const wchar_t * | _UNICODE defined
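The conditional rows of the table can be illustrated with a small, portable sketch. The names my_TCHAR, my_LPTSTR, my_LPCTSTR, MY_T and my_tcslen are hypothetical stand-ins invented for this example (the real typedefs and the _T macro come from the Windows headers); the sketch only assumes that the mapping is selected by the _UNICODE symbol, as the table states.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the Windows typedefs; names are illustrative only.
#ifdef _UNICODE
typedef wchar_t my_TCHAR;      // Unicode build: the portable character is wide
#define MY_T(x) L##x           // string literals become wide literals
#else
typedef char my_TCHAR;         // ANSI build: the portable character is narrow
#define MY_T(x) x              // string literals stay narrow
#endif

typedef my_TCHAR*       my_LPTSTR;
typedef const my_TCHAR* my_LPCTSTR;

// Length of a portable string in characters (not bytes), whichever build is active.
inline std::size_t my_tcslen(my_LPCTSTR text)
{
    assert(text != nullptr);
    std::size_t n = 0;
    while (text[n] != 0) {
        ++n;
    }
    return n;
}
```

Code written against my_LPCTSTR and MY_T compiles unchanged in both builds, which is the point of the portable LPTSTR/LPCTSTR types.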
All classes are implemented inline so it is enough to include the
file in your project and use the classes.
Short Description
All classes are designed to create a new string based on the supplied
one. The buffer for the created string is automatically deleted in the
class destructor (unless the auto delete flag is set to FALSE in the call
to the constructor). Being able to turn auto delete off is important in
COM. According to COM memory management rules, all OUT arguments must be
allocated by the callee and deallocated by the caller using the standard
COM memory allocator (CoTaskMemAlloc/CoTaskMemFree).
All classes expose the same interface. They differ only in the target
type of the type casting operators (and of the internal buffer) and in
the allocator used, as listed in the Summary table.
For example, the _totchar class declaration looks like the following:
class _totchar {
private:
    BOOL   m_bAutoDelete;
    LPTSTR m_tszBuffer;
public:
    _totchar(LPCSTR szText, BOOL bAutoDelete = TRUE);
    _totchar(LPCWSTR wszText, BOOL bAutoDelete = TRUE);
    ~_totchar();
    operator LPTSTR();
    operator LPCTSTR();
};
The first constructor takes a const char * argument while the second
constructor takes a const wchar_t * argument. Both constructors convert
the string argument to LPTSTR and allocate the result with new (the
_cotchar class has exactly the same functionality but uses
CoTaskMemAlloc for memory allocation).
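One way to picture "same functionality, different allocator" is an allocator policy. The sketch below is not the article's implementation: NewDeletePolicy, ComTaskPolicy and widen are hypothetical names, std::mbstowcs is used in place of whatever conversion call the library makes, and on non-Windows platforms malloc/free stand in for CoTaskMemAlloc/CoTaskMemFree so that the example compiles anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>   // std::mbstowcs, std::malloc, std::free
#include <cstring>   // std::strlen
#include <cwchar>    // std::wcscmp, std::wcslen
#include <new>       // ::operator new, ::operator delete

#ifdef _WIN32
#include <objbase.h> // CoTaskMemAlloc, CoTaskMemFree (link with ole32)
#endif

// Policy used by the new/delete family of classes.
struct NewDeletePolicy {
    static void* allocate(std::size_t bytes) { return ::operator new(bytes); }
    static void  release(void* p)            { ::operator delete(p); }
};

// Policy used by the COM family. Outside Windows, malloc/free are an
// assumed stand-in so the sketch stays portable.
struct ComTaskPolicy {
#ifdef _WIN32
    static void* allocate(std::size_t bytes) { return ::CoTaskMemAlloc(bytes); }
    static void  release(void* p)            { ::CoTaskMemFree(p); }
#else
    static void* allocate(std::size_t bytes) { return std::malloc(bytes); }
    static void  release(void* p)            { std::free(p); }
#endif
};

// Converts a narrow string to a newly allocated wide string using the
// current C locale. Returns nullptr if allocation or conversion fails.
// The caller releases the result with Policy::release.
template <class Policy>
wchar_t* widen(const char* text)
{
    assert(text != nullptr);
    // A narrow string never yields more wide characters than it has bytes.
    const std::size_t capacity = std::strlen(text) + 1;
    wchar_t* out = static_cast<wchar_t*>(Policy::allocate(capacity * sizeof(wchar_t)));
    if (out == nullptr) {
        return nullptr;
    }
    if (std::mbstowcs(out, text, capacity) == static_cast<std::size_t>(-1)) {
        Policy::release(out);   // invalid multibyte sequence
        return nullptr;
    }
    return out;
}
```

The conversion logic is written once, and only the policy argument decides whether the buffer may legally cross a COM interface.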
By default, the internal auto delete flag is set to TRUE. This means
that the memory for the internally created LPTSTR string is
deallocated in the destructor. If auto delete is set to FALSE, the memory
allocated for the internal LPTSTR is left intact and you have to release
it manually. This is most often used with the _cowchar class
because it creates a Unicode string that can be passed via COM.
To access the internal LPTSTR string, the class supplies two type
casting operators.
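The auto delete flag and the two casting operators can be sketched in portable C++ as follows. This is a minimal illustration under stated assumptions, not the library's code: to_char_sketch is a hypothetical analogue of _tochar, and std::wcstombs (which depends on the current C locale) replaces the Windows conversion call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>   // std::wcstombs, MB_CUR_MAX
#include <cstring>   // std::strlen, std::memcpy, std::strcmp
#include <cwchar>    // std::wcslen

// Hypothetical, portable analogue of _tochar.
class to_char_sketch {
    bool  m_autoDelete;
    char* m_buffer;

public:
    // Narrow input: keep a private copy so the caller's string is untouched.
    explicit to_char_sketch(const char* text, bool autoDelete = true)
        : m_autoDelete(autoDelete), m_buffer(nullptr)
    {
        assert(text != nullptr);
        const std::size_t bytes = std::strlen(text) + 1;
        m_buffer = new char[bytes];
        std::memcpy(m_buffer, text, bytes);
    }

    // Wide input: convert to the narrow encoding of the current C locale.
    explicit to_char_sketch(const wchar_t* text, bool autoDelete = true)
        : m_autoDelete(autoDelete), m_buffer(nullptr)
    {
        assert(text != nullptr);
        // Worst case: MB_CUR_MAX bytes per wide character, plus the terminator.
        const std::size_t capacity = std::wcslen(text) * MB_CUR_MAX + 1;
        m_buffer = new char[capacity];
        if (std::wcstombs(m_buffer, text, capacity) == static_cast<std::size_t>(-1)) {
            m_buffer[0] = '\0';   // unconvertible character: fall back to an empty string
        }
    }

    // The buffer is released only while the object still owns it.
    ~to_char_sketch()
    {
        if (m_autoDelete) {
            delete[] m_buffer;
        }
    }

    // Copying would create two owners of one buffer, so it is disabled.
    to_char_sketch(const to_char_sketch&) = delete;
    to_char_sketch& operator=(const to_char_sketch&) = delete;

    // Same pair of casting operators as in the declaration above.
    operator char*()       { return m_buffer; }
    operator const char*() { return m_buffer; }
};
```

With autoDelete set to false the pointer obtained through the cast outlives the object, and whoever receives it must release it with delete[], which mirrors the hand-over the article describes for COM out arguments.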
Examples
The following examples use some of the classes:
Example 1:
You are developing a COM server and one interface method receives a
string as an argument. This is the code to convert it to a portable string:
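HRESULT STDMETHODCALLTYPE
IOPCGroupStateMgt::SetName(LPCWSTR szName)
{
    // Convert the wide string to a portable (TCHAR) string.
    _totchar c1(szName);
    // _tprintf (from <tchar.h>) matches the _T() format string in both builds.
    // The explicit cast is required because variadic arguments are not
    // converted implicitly.
    _tprintf(_T("%s"), (LPCTSTR)c1);
    return S_OK;
}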
Example 2:
You are developing a COM server and one interface method requires you
to return a string through an out argument.
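HRESULT STDMETHODCALLTYPE
IOPCGroupStateMgt::GetName(LPWSTR *ppName)
{
    LPCTSTR szName = _T("Test Name");
    // FALSE: keep the buffer alive after c1 goes out of scope.
    _cowchar c1(szName, FALSE);
    // Store the string in the caller's pointer, not in the local argument.
    *ppName = c1;
    return S_OK;
}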
This example converts a portable string (LPCTSTR) to a Unicode string
and does not deallocate the memory because the converted string is
assigned to an out argument of the COM interface method. The caller
releases it with CoTaskMemFree.
Example 3:
You are developing a portable application and need to send an ANSI
string to a serial port.
void TSerialPort::Write(LPCTSTR szText)
{
    // Convert the portable string to ANSI before sending it.
    _tochar c1(szText);
    writeString(c1);
}

void TSerialPort::writeString(const char *szString)
{
    ...
}
That's all. I hope that you will find it as useful as I have.