See more: C++ Win32 Unicode
I am trying to display the response from a URL (which contains Unicode characters) in a message box.

Garbage characters are displayed instead. I tried several different conversions, but only the alphabetic (ASCII) characters are shown correctly; the remaining Unicode characters appear as ? or as garbage values.

I have been trying this for 7 days. Please help me. I am very new to coding, so please write the complete code.
HINTERNET hInternet = InternetOpen( _T(""), INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0 );
 
HINTERNET hConnect = InternetConnect( hInternet, L"http://xxxxxxxxxx.com", INTERNET_DEFAULT_HTTP_PORT, NULL, NULL, INTERNET_SERVICE_HTTP, 0, 1);
 
 DWORD options = INTERNET_FLAG_NEED_FILE|INTERNET_FLAG_HYPERLINK|INTERNET_FLAG_RESYNCHRONIZE|INTERNET_FLAG_RELOAD;
 
 HINTERNET hRequest  = InternetOpenUrl(hInternet,  L"http://xxxxxxxxxxx.com/c.php?varname=artxt&text=يبتىلمينبىمنيبغعاanand123",  NULL, 0, options, 0); 
 

TCHAR buffer[100];
DWORD bytesRead;
InternetReadFile(hRequest, buffer,100, &bytesRead);
MessageBoxW(NULL,ATL::CA2W(buffer),L"Check",MB_OK);
}
  
    InternetCloseHandle(hRequest);
    InternetCloseHandle(hConnect);

When I added the code below:

HttpQueryInfo (hRequest,HTTP_QUERY_RAW_HEADERS_CRLF, (LPVOID) lpHeadersA,
&dwSize, NULL);
MessageBoxW(NULL,(LPWSTR)lpHeadersA,L"Type",MB_OK);

it gives this output:

HTTP/1.1 200 OK
Date: Mon, 05 Nov 2012 03:55:10 GMT
Server: Apache
Keep-Alive: timeout=5, max=74
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
Posted 1-Nov-12 21:23pm
Edited 4-Nov-12 19:18pm
Comments
Jochen Arndt 2-Nov-12 4:33am
   
You are using CA2W() to convert the received data. This will only work if the data is ANSI encoded using the same code page as your application. But many web pages use UTF-8 encoding nowadays. So you must check which encoding is used by the requested web page (usually part of the HTML header) and convert this to Unicode or the encoding of your application.

To avoid mixing of ANSI/Multi-Byte and Unicode strings in your application, it should be a Unicode application (I think it is already because otherwise you would get no connection when passing wide strings to the ANSI versions of the InternetOpen() functions).
venkat.yva 2-Nov-12 23:52pm
   
This is my URL:
http://convert.wajihah.com/c.php?varname=artxt&text=يبتىلمينبىمنيبغعاanand123
I think the response uses chunked encoding, but I cannot say for certain. Please, can you write the code?
venkat.yva 2-Nov-12 6:59am
   
Can you please tell me which method is used to check the encoding of the requested web page?
Mohibur Rashid 3-Nov-12 0:58am
   
Read the header. The header has to state the encoding type.
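
As a rough illustration of reading the encoding from the header: the charset, when the server sends one, is a parameter of the Content-Type header value (with WinINet that value can be requested through HttpQueryInfo() with HTTP_QUERY_CONTENT_TYPE). The helper below is only a sketch; the name charsetFromContentType is made up for this example and is not part of any Windows API.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns the lower-cased charset named in a Content-Type header value,
// or an empty string when the value carries no charset parameter.
// Hypothetical helper, not part of WinINet.
std::string charsetFromContentType(const std::string& value)
{
    std::string lower(value);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::string::size_type pos = lower.find(key);
    if (pos == std::string::npos)
        return std::string();

    pos += key.size();
    // The value may be quoted, as in charset="utf-8".
    if (pos < lower.size() && lower[pos] == '"')
        ++pos;

    std::string::size_type end = pos;
    while (end < lower.size() && lower[end] != '"' && lower[end] != ';' &&
           !std::isspace(static_cast<unsigned char>(lower[end])))
        ++end;

    return lower.substr(pos, end - pos);
}
```

For "text/html; charset=utf-8" this yields "utf-8". For a bare "text/html" it yields an empty string, in which case the encoding has to be taken from the page itself or assumed.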
venkat.yva 3-Nov-12 1:49am
   
This is my URL:
http://convert.wajihah.com/c.php?varname=artxt&text=يبتىلمينبىمنيبغعاanand123
I found from the headers that it uses chunked encoding, and I am still not able to convert the response and display it in a message box. Please, can you fix the code?

Solution 4

I made some modifications to the code you posted. I assumed that your web page uses UTF-8:

Test web application:

#!python
# -*- coding: utf-8 -*-
from bottle import *
 
content_type = 'text/html; charset=utf-8'
 
@route("/ar/")
def ar():
    response.content_type = content_type
    return "تجربة"
 
@route("/en/")
def en():
    response.content_type = content_type
    return "Test"
 
@route("/zn/")
def zn():
    response.content_type = content_type
    return "测试"
 
run(port=8080)

The code:

#include <windows.h>
#include <wininet.h>

#pragma comment(lib, "wininet.lib")

INT WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nShowCmd)
{
    HINTERNET hInternet = InternetOpen(TEXT("foo"), INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (hInternet == NULL)
        return 1;

    // InternetConnect() expects a host name and a port, not a URL. InternetOpenUrl()
    // below does not need this handle; it is kept only to mirror the question's code.
    HINTERNET hConnect = InternetConnect(hInternet, TEXT("localhost"), 8080, NULL, NULL, INTERNET_SERVICE_HTTP, 0, 1);

    // INTERNET_OPTION_HTTP_DECODING is an InternetSetOption() option, not a request flag, so it is not ORed in here.
    DWORD options = INTERNET_FLAG_NEED_FILE | INTERNET_FLAG_HYPERLINK | INTERNET_FLAG_RESYNCHRONIZE | INTERNET_FLAG_RELOAD;

    HINTERNET hRequest = InternetOpenUrl(hInternet, TEXT("http://localhost:8080/ar/"), NULL, 0, options, 0);
    if (hRequest != NULL)
    {
        char buffer[100] = {0};   // InternetReadFile() delivers raw bytes
        WCHAR szResp[100] = {0};  // MultiByteToWideChar() always writes wide characters
        DWORD bytesRead = 0;

        InternetReadFile(hRequest, buffer, sizeof(buffer) - 1, &bytesRead);

        // Convert only the bytes actually read. The output size is given in characters, not bytes.
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, buffer, (int) bytesRead, szResp, ARRAYSIZE(szResp) - 1);

        MessageBoxW(NULL, szResp, L"Check", MB_OK);

        InternetCloseHandle(hRequest);
    }

    InternetCloseHandle(hConnect);
    InternetCloseHandle(hInternet);

    return 0;
}

[Screenshot: test]

Note: make sure you handle the errors.
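
One error case worth sketching: a fixed-size read can end in the middle of a multi-byte UTF-8 character, and a conversion with MB_ERR_INVALID_CHARS then fails for the whole buffer. The helper below (completeUtf8Prefix is a made-up name, not a Windows API) reports how many leading bytes end on a character boundary, so the remaining bytes can be carried over to the next read. It is a sketch under that assumption and does not validate the data otherwise.

```cpp
#include <cstddef>

// Returns how many leading bytes of data[0..len) form complete UTF-8
// sequences. An incomplete sequence at the end is excluded so that it can
// be prepended to the next chunk that is read.
// Hypothetical helper; it does not validate the bytes otherwise.
std::size_t completeUtf8Prefix(const unsigned char* data, std::size_t len)
{
    // Step back over at most three trailing continuation bytes (10xxxxxx).
    std::size_t i = len;
    std::size_t back = 0;
    while (i > 0 && back < 3 && (data[i - 1] & 0xC0) == 0x80)
    {
        --i;
        ++back;
    }
    if (i == 0)
        return len; // no lead byte found; leave validation to the converter

    const unsigned char lead = data[i - 1];
    std::size_t need = 1;                      // ASCII or invalid lead byte
    if ((lead & 0xE0) == 0xC0) need = 2;       // 110xxxxx
    else if ((lead & 0xF0) == 0xE0) need = 3;  // 1110xxxx
    else if ((lead & 0xF8) == 0xF0) need = 4;  // 11110xxx

    // The last sequence starts at i - 1 and has back + 1 bytes so far.
    return (back + 1 < need) ? i - 1 : len;
}
```

The returned count is what would be passed as the input length to the conversion; the bytes after it stay in the buffer until more data arrives.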
Comments
venkat.yva 4-Nov-12 23:41pm
   
I am not getting the exact output; it shows garbage values.
Please check the above code using my URL.
venkat.yva 4-Nov-12 23:48pm
   
// When I added the code below to my code:

HttpQueryInfo (hRequest,HTTP_QUERY_RAW_HEADERS_CRLF, (LPVOID) lpHeadersA,
&dwSize, NULL);

// it shows this:

HTTP/1.1 200 OK
Date: Mon, 05 Nov 2012 03:55:10 GMT
Server: Apache
Keep-Alive: timeout=5, max=74
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
venkat.yva 4-Nov-12 23:51pm
   
Please note that there is no "charset=utf-8" there.
When I use other URLs, the headers do show charset=utf-8.
Barakat S. 5-Nov-12 7:09am
   
Your page encoding is "ISO-8859-1", you didn't specify any encoding. Check ISO-8859-1 - Latin 1 : http://www.terena.org/activities/multiling/ml-docs/iso-8859.html#ISO-8859-1

To make it UTF-8, add the following at the top of your PHP file:

header("Content-type: text/html; charset=utf-8");

for example:

<?php
header("Content-type: text/html; charset=utf-8");

if( isset($_GET["name"]) ) {
echo "name = " . htmlspecialchars($_GET["name"]);
} else {
echo "name = None";
}

?>

It should work fine.
venkat.yva 14-Nov-12 2:18am
   
I don't have any PHP file to add your code to, so I want the complete code in Win32.
Sergey Chepurin 9-Nov-12 6:46am
   
venkat.yva: Sorry for the late answer, but it took me some time to understand what you really want. I just checked the code from Barakat S. and it works fine with your site. It prints proper text in Arabic. I guess the answer given by Jochen Arndt is also coded correctly (I simply didn't check it). I could add an almost universal C++11 solution, but I don't see any need for that since the solutions provided work fine.
venkat.yva 14-Nov-12 2:16am
   
Barakat S. gave PHP code to do the Arabic conversion. That may work, but I want the complete code in Win32 only.
Sergey Chepurin 14-Nov-12 13:03pm
   
If you hardcode the URL of your site (http://convert.wajihah.com/c.php?varname=artxt&text=يبتىلمينبىمنيبغعاanand123 ) in InternetConnect() and InternetOpenUrl() in the given code instead of localhost, you will get the message "artxt=٣٢١dnanaﺎﻌﻐﺒﻴﻨﻣﻰﺒﻨﻴﻤﻟﻰﺘﺒﻳ&done=2" printed in the message box. The PHP code is a nice (but not necessary) addition from Barakat S. Create a sample Win32 application in VC++ 2010 and add this code in the proper place.
venkat.yva 20-Nov-12 4:58am
   
No, I am not getting the result that you posted. Can you please send me the code that you used?
Sergey Chepurin 20-Nov-12 16:01pm
   
It does not work this way. I checked the solution and it works, and I told you how it can be done, but you should code it yourself.

Solution 2

If the text from the requested web site uses UTF-8 encoding, it must be converted before it can be shown in a message box:

// Use char here. InternetReadFile() reads bytes.
char buffer[100];
DWORD bytesRead = 0;
InternetReadFile(hRequest, buffer, sizeof(buffer), &bytesRead);

// Get the number of wide characters needed for the bytes actually read.
// The count excludes the terminating NUL because the input length is
//  passed explicitly. With UTF-8 we may omit this call and allocate
//  bytesRead + 1 characters, because the resulting string never has
//  more than bytesRead wide characters.
int nSize = ::MultiByteToWideChar(CP_UTF8, 0, buffer, (int)bytesRead, NULL, 0);
if (nSize)
{
    // One extra element for the terminating NUL character.
    LPWSTR lpszText = new WCHAR[nSize + 1];
    // Convert UTF-8 string to wide string
    ::MultiByteToWideChar(CP_UTF8, 0, buffer, (int)bytesRead, lpszText, nSize);
    // Terminate string. buffer is not NUL terminated.
    lpszText[nSize] = L'\0';
    MessageBoxW(NULL, lpszText, L"Check", MB_OK);
    delete [] lpszText;
}
If the page uses another encoding, pass the matching code page identifier instead of CP_UTF8 (e.g. 28596 for ISO-8859-6; see Code Page Identifiers in the MSDN).
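
To make the choice of code page concrete, here is a small lookup sketch. The function name codePageForCharset and the selection of charsets are assumptions made for this example; the numeric values are the identifiers listed in the MSDN code page table.

```cpp
#include <string>

// Maps a lower-case charset name to the Windows code page identifier that
// MultiByteToWideChar() expects. Returns 0 for names not in this small table.
// Hypothetical helper covering only a few common charsets.
unsigned int codePageForCharset(const std::string& charset)
{
    if (charset == "utf-8")        return 65001; // CP_UTF8
    if (charset == "iso-8859-1")   return 28591; // Latin 1
    if (charset == "iso-8859-6")   return 28596; // Arabic (ISO)
    if (charset == "windows-1252") return 1252;  // Western European (Windows)
    if (charset == "windows-1256") return 1256;  // Arabic (Windows)
    return 0;
}
```

Note that 0 is also the value of CP_ACP, so a caller should check for the "unknown" result before passing it on; otherwise the conversion silently falls back to the system ANSI code page.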
Comments
venkat.yva 3-Nov-12 7:57am
   
The identifiers from Code Page Identifiers in the MSDN are not working for displaying the Arabic characters.
Is there another way?
Sergey Chepurin 3-Nov-12 9:00am
   
Do you use VS2010 to compile?
venkat.yva 4-Nov-12 23:09pm
   
yeah, in visual studio 2010.
Jochen Arndt 3-Nov-12 9:34am
   
The link posted in your comments does not contain any headers. It is just some PHP generated output using UTF-8 encoding. So using my code, you should see the content.
venkat.yva 4-Nov-12 23:38pm
   
I am not getting the exact output. When I use the Arabic-related identifiers from Code Page Identifiers in the MSDN, I get different characters (close to Arabic), but not the exact text that I passed in my URL.
Please check the above code using my URL.
Jochen Arndt 5-Nov-12 3:24am
   
You are right. It is not the same. But it is UTF-8. When you save the output to a file and open it with a hex editor, you will see that the codes are in the UTF-8 ranges EF BA xx and EF BB xx (Unicode code points U+FE80 to U+FEFF). These codes are from the 'Arabic Presentation Forms-B' block.

So the PHP script performs some sort of conversion.
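
The byte-to-code-point relation mentioned above can be checked with a few lines. This is only a sketch with made-up function names: it decodes one well-formed three-byte UTF-8 sequence and tests whether the result lies in the Arabic Presentation Forms-B block (U+FE70 to U+FEFF).

```cpp
// Decodes one three-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx) into
// its Unicode code point. The caller must pass a well-formed sequence.
unsigned int decodeUtf8ThreeBytes(unsigned char b0, unsigned char b1, unsigned char b2)
{
    return ((b0 & 0x0Fu) << 12) | ((b1 & 0x3Fu) << 6) | (b2 & 0x3Fu);
}

// True for code points in the Arabic Presentation Forms-B block (U+FE70..U+FEFF).
bool isArabicPresentationFormB(unsigned int codePoint)
{
    return codePoint >= 0xFE70u && codePoint <= 0xFEFFu;
}
```

The bytes EF BA 80 decode to U+FE80 and EF BB BF to U+FEFF, both inside that block, whereas the letters typed into the URL come from the basic Arabic block (U+0600 to U+06FF). That difference is why the returned text does not compare equal to the input.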
venkat.yva 5-Nov-12 6:19am
   
I want to do this using C++ and Win32 only, without PHP. Is that possible? Can you send me a procedure for that?
Jochen Arndt 5-Nov-12 6:39am
   
You can't do anything else on the C++ side.

My code is showing the same text as shown by a web browser. It is the text produced by the PHP script running on the web server.

The web site you are using is a converter: It is intended that the output is not the same as the input.

At least you should make clear what you want to do. But this would be probably a new question.

The original question is answered:
You no longer have garbage displayed but the returned text.

Solution 1

Just add the statement below to the head section of your HTML page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

If you are reading this page from C++, first read the header, i.e. the HTTP header. It tells you which encoding the page uses. Then simply convert the entire page to Unicode (wide characters), which makes the text easy to manipulate. Otherwise you will have to use encoding-specific methods to manipulate it.
Comments
venkat.yva 3-Nov-12 6:05am
   
I am using Visual Studio 2010, and it does not accept the code you have given.
I have just one month of experience in the software field, so I am not able to understand this clearly.
Can you please send me the complete code? I have already found that the URL gives a chunked-encoding response.
Mohibur Rashid 3-Nov-12 6:07am
   
This code is not for Visual Studio. This code is for HTML.

Follow Jochen Arndt's answer.
venkat.yva 3-Nov-12 6:11am
   
I am working in Visual Studio 2010 only, and I am struggling to get the output. Please, I request that you write the code for Visual Studio 2010.
venkat.yva 3-Nov-12 6:14am
   
Now I just want the code to convert the chunked-encoding response to normal form so that I can display it in a message box.
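
On the chunked-encoding point: as far as I know, WinINet removes the chunk framing itself, so the bytes returned by InternetReadFile() are already the plain body and no manual decoding should be needed in the code from this thread. Purely to illustrate what "Transfer-Encoding: chunked" means on the wire, here is a sketch of a decoder for a raw chunked body. The function name decodeChunked is made up for this example.

```cpp
#include <cstdlib>
#include <string>

// Decodes an HTTP/1.1 chunked body: each chunk is "<hex size>[;ext]\r\n<data>\r\n",
// and a zero-size chunk ends the body. Returns false if the input is malformed
// or truncated. Trailer headers after the last chunk are ignored.
// Hypothetical helper for illustration only.
bool decodeChunked(const std::string& raw, std::string& body)
{
    body.clear();
    std::string::size_type pos = 0;
    for (;;)
    {
        const std::string::size_type eol = raw.find("\r\n", pos);
        if (eol == std::string::npos)
            return false;

        // strtoul stops at the first non-hex character, such as ';'.
        const std::string sizeLine = raw.substr(pos, eol - pos);
        char* endPtr = 0;
        const unsigned long size = std::strtoul(sizeLine.c_str(), &endPtr, 16);
        if (endPtr == sizeLine.c_str())
            return false;   // no hex digits at all
        if (size == 0)
            return true;    // last chunk reached

        pos = eol + 2;
        const std::string::size_type remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2)
            return false;   // chunk data or its trailing CRLF is missing
        body.append(raw, pos, size);
        if (raw.compare(pos + size, 2, "\r\n") != 0)
            return false;
        pos += size + 2;
    }
}
```

The decoded body is still a byte sequence in the page's encoding, so the conversion to wide characters shown in the solutions is needed either way.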

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Copyright © CodeProject, 1999-2016
All Rights Reserved. Terms of Service

CodeProject, 503-250 Ferrand Drive Toronto Ontario, M3C 3G8 Canada +1 416-849-8900 x 100