Hi,
Japanese text is not read correctly from an ANSI file (i.e. I have Japanese text in an ANSI-encoded file). How can I read the Japanese data from the ANSI file without losing characters? Please let me know.
Comments
[no name] 17-Apr-15 9:31am    
For testing purposes:

CString str;
FILE *fOutStream;
CStdioFile fileOut(_T("C:\\jap.txt"), CFile::modeRead);
// _wfopen_s(&fOutStream, _T("C:\\jap.txt"), _T("rt,ccs=ANSI"))
// CStdioFile fileOut(fOutStream);
fileOut.ReadString(str);

The string read this way does not display the Japanese text correctly.

Please let me know how to specify the code page.
Jochen Arndt 17-Apr-15 10:13am    
I have added an example to my solution. But you still have to know the code page number.
Mohibur Rashid 22-Apr-15 3:18am    
What is the encoding? Shift-JIS? EUC-JP? UTF-8? Something else? Why don't you give us a better picture of what you are trying to do?

You must know the code page used by the file. Then read the file content into a buffer and use MultiByteToWideChar, passing the code page number, to convert it to Unicode.

With Japanese text, the code page number may be 932 (Shift-JIS), 20932 (EUC-JP), 50220 to 50222 (ISO-2022-JP), or 51932 (also EUC-JP). See also the Code Page Identifiers documentation.
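The identifiers listed above could be kept in a small lookup table so that the number passed to MultiByteToWideChar comes from one place. This is only a sketch: the helper name `LookupJapaneseCodePage`, the table, and the lowercase encoding labels are hypothetical and not part of the Windows API. The function reports success separately because 0 is itself a valid code page value (CP_ACP).

```cpp
#include <string>

// Hypothetical table entry: an illustrative encoding label and the
// Windows code page identifier to pass to MultiByteToWideChar.
struct JapaneseCodePage
{
    const char*  name;
    unsigned int codePage;
};

// Assumption: the labels are chosen for this sketch only.
static const JapaneseCodePage kJapaneseCodePages[] = {
    { "shift_jis",   932   },  // Shift-JIS
    { "euc-jp",      20932 },  // EUC-JP (51932 is also listed as EUC-JP)
    { "iso-2022-jp", 50220 },  // ISO-2022-JP (50221 and 50222 are variants)
};

// Looks up a code page by label (case-sensitive).
// Returns false and leaves codePage untouched if the label is unknown.
bool LookupJapaneseCodePage(const std::string& name, unsigned int& codePage)
{
    for (const JapaneseCodePage& entry : kJapaneseCodePages)
    {
        if (name == entry.name)
        {
            codePage = entry.codePage;
            return true;
        }
    }
    return false;
}
```

A caller would check the return value first and only then pass `codePage` on to the conversion function.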

[EDIT]
An untested (but compiling) example:
C++
// Requires the Windows headers (windows.h, or the MFC headers via stdafx.h).
#include <stdio.h>
#include <tchar.h>
#include <io.h>

// Read ANSI file into allocated wide string buffer.
// nCP is the code page of the file content (e.g. 932 for Shift-JIS).
// Returns NULL on failure. Returned buffer must be deleted manually with delete [].
LPWSTR ReadAnsiFile(LPCTSTR lpszFileName, UINT nCP)
{
    WCHAR *lpszWideString = NULL;
    // Open file for reading (binary mode, so no newline translation)
    FILE * f = _tfopen(lpszFileName, _T("rb"));
    if (f)
    {
        // Get file length (-1 on error)
        long nFileLen = _filelength(_fileno(f));
        if (nFileLen > 0)
        {
            // Allocate buffer and read file into it
            char *buf = new char[nFileLen];
            // Use the number of bytes actually read for the conversion
            int nAnsiLen = (int)fread(buf, 1, nFileLen, f);
            // Get number of corresponding Unicode characters
            int nWideLen = 0;
            if (nAnsiLen > 0)
                nWideLen = ::MultiByteToWideChar(nCP, 0, buf, nAnsiLen, NULL, 0);
            if (nWideLen > 0)
            {
                // Allocate Unicode string buffer with space for zero terminator
                lpszWideString = new wchar_t[nWideLen + 1];
                // Convert to Unicode
                ::MultiByteToWideChar(nCP, 0, buf, nAnsiLen, lpszWideString, nWideLen);
                // Terminate string
                lpszWideString[nWideLen] = L'\0';
            }
            // Cleanup
            delete [] buf;
        }
        fclose(f);
    }
    return lpszWideString;
}


When the file was created on the same system by the same user, CP_ACP (the system default ANSI code page) may be used. Otherwise, the code page number must be known.
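If `_filelength` is not available, the "read the file into a buffer" step could also be done with standard C++ streams, which need no length query at all. A minimal sketch follows; the helper name `ReadFileBytes` is hypothetical, and it takes a narrow path for simplicity.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads the whole file as raw, untranslated bytes.
// Returns false if the file cannot be opened or a read error occurs.
bool ReadFileBytes(const std::string& path, std::string& bytes)
{
    // Binary mode keeps multi-byte sequences and line endings untouched.
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in)
        return false;

    // Copy everything up to end of file into the string.
    bytes.assign(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
    return !in.bad();
}
```

The resulting `bytes.data()` and `(int)bytes.size()` can then be passed to MultiByteToWideChar in place of `buf` and `nAnsiLen`.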

[EDIT 2: Added inclusion of io.h to code]
 
_filelength(fileno(f)); gives an error. Please suggest another function.
 
 
Comments
Jochen Arndt 22-Apr-15 3:00am    
You must include the io.h header file. I have updated my solution.

Please don't post such comments as solution. It might be removed.

Use the 'Have a Question or Comment' button instead (like I have done now). Then the poster of the related post gets a mail notification (like you should get for this one).
[no name] 30-Apr-15 2:57am    
What is the code page for Czech, please? For Czech it is not working.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
