I am trying to display the response from a URL (which contains Unicode characters) in a message box.

But garbage characters are displayed. I tried several different conversions, but only the alphabetic characters display correctly; the remaining Unicode characters show up as ? or garbage values.

I have been trying this for 7 days. Please help me. I am very new to coding, so please write the complete code.
C++
HINTERNET hInternet = InternetOpen( _T(""), INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0 );

HINTERNET hConnect = InternetConnect( hInternet, L"http://xxxxxxxxxx.com", INTERNET_DEFAULT_HTTP_PORT, NULL, NULL, INTERNET_SERVICE_HTTP, 0, 1);

 DWORD options = INTERNET_FLAG_NEED_FILE|INTERNET_FLAG_HYPERLINK|INTERNET_FLAG_RESYNCHRONIZE|INTERNET_FLAG_RELOAD;

 HINTERNET hRequest  = InternetOpenUrl(hInternet,  L"http://xxxxxxxxxxx.com/c.php?varname=artxt&text=يبتىلمينبىمنيبغعاanand123",  NULL, 0, options, 0); 


TCHAR buffer[100];
DWORD bytesRead;
InternetReadFile(hRequest, buffer,100, &bytesRead);
MessageBoxW(NULL,ATL::CA2W(buffer),L"Check",MB_OK);
}
  
    InternetCloseHandle(hRequest);
    InternetCloseHandle(hConnect);


When I added the code below:


C++
HttpQueryInfo (hRequest,HTTP_QUERY_RAW_HEADERS_CRLF, (LPVOID) lpHeadersA,
&dwSize, NULL);
MessageBoxW(NULL,(LPWSTR)lpHeadersA,L"Type",MB_OK);


It gives this output:


HTTP/1.1 200 OK
Date: Mon, 05 Nov 2012 03:55:10 GMT
Server: Apache
Keep-Alive: timeout=5, max=74
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
Updated 4-Nov-12 18:18pm
Comments
Jochen Arndt 2-Nov-12 4:33am    
You are using CA2W() to convert the received data. This will only work if the data is ANSI encoded using the same code page as your application. But many web pages use UTF-8 encoding nowadays. So you must check which encoding is used by the requested web page (usually part of the HTML header) and convert this to Unicode or the encoding of your application.

To avoid mixing of ANSI/Multi-Byte and Unicode strings in your application, it should be a Unicode application (I think it is already because otherwise you would get no connection when passing wide strings to the ANSI versions of the InternetOpen() functions).
venkat.yva 2-Nov-12 23:52pm    
This is my URL
http://convert.wajihah.com/c.php?varname=artxt&text=يبتىلمينبىمنيبغعاanand123
I found that it uses chunked encoding, but I can't say exactly. Can you please write the code?
venkat.yva 2-Nov-12 6:59am    
Can you please tell me which method is used to check which encoding the requested web page uses?
Mohibur Rashid 3-Nov-12 0:58am    
By reading the header. The header has to tell you the encoding type.
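As a rough illustration of "reading the header": the encoding, when the server declares it, is the charset parameter of the Content-Type header. The sketch below extracts it from a header value that is assumed to have been fetched already (with WinINet that could be done through HttpQueryInfo() with HTTP_QUERY_CONTENT_TYPE). CharsetFromContentType is a hypothetical helper name, not part of any library.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not a WinINet function): returns the lower-cased
// charset parameter of a Content-Type value, or "" if none is declared.
std::string CharsetFromContentType(const std::string& contentType)
{
    // Header parameter names are case-insensitive, so compare in lower case.
    std::string lower(contentType);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::string::size_type pos = lower.find(key);
    if (pos == std::string::npos)
        return "";

    pos += key.size();
    // The value may be quoted: charset="utf-8"
    if (pos < lower.size() && lower[pos] == '"')
        ++pos;

    // The value ends at a closing quote, a ';' or whitespace.
    const std::string::size_type end = lower.find_first_of("\"; \t\r\n", pos);
    return lower.substr(pos, end == std::string::npos ? std::string::npos : end - pos);
}
```

An empty result, as with the plain "Content-Type: text/html" header shown in the question, means the server did not declare an encoding, so the caller has to fall back to a guess or to a charset declared inside the HTML itself.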
venkat.yva 3-Nov-12 1:49am    
This is my URL
http://convert.wajihah.com/c.php?varname=artxt&text=يبتىلمينبىمنيبغعاanand123
I found from the headers that it uses chunked encoding, and I still can't convert it and display it in a MessageBox. Can you please fix the code?

If the text from the requested web site uses UTF-8 encoding, it must be converted before it can be shown in a message box:

// Use char here. InternetReadFile() reads bytes.
char buffer[100];
DWORD bytesRead = 0;
// A single read returns at most sizeof(buffer) bytes and may end in the
//  middle of a multi-byte character; longer responses need a read loop.
if (InternetReadFile(hRequest, buffer, sizeof(buffer), &bytesRead) && bytesRead > 0)
{
    // Get the number of wide characters needed. The count excludes the
    //  terminating NULL character because only bytesRead bytes are passed.
    int nSize = ::MultiByteToWideChar(CP_UTF8, 0, buffer, bytesRead, NULL, 0);
    if (nSize)
    {
        LPWSTR lpszText = new WCHAR[nSize + 1];
        // Convert UTF-8 string to wide string
        ::MultiByteToWideChar(CP_UTF8, 0, buffer, bytesRead, lpszText, nSize);
        // Terminate string. buffer is not NULL terminated.
        lpszText[nSize] = L'\0';
        MessageBoxW(NULL, lpszText, L"Check", MB_OK);
        delete [] lpszText;
    }
}
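
One caveat with a fixed-size read like the one above: UTF-8 characters occupy one to four bytes, so a read can stop in the middle of a character, and converting that fragment yields a replacement character or garbage at the end of the text. A minimal sketch of one way to cope, assuming the leftover bytes are kept and prepended to the next read, is to convert only the bytes that form complete sequences. CompleteUtf8Length is a hypothetical helper name.

```cpp
#include <cstddef>

// Hypothetical helper: returns how many leading bytes of data[0..size) can be
// converted without cutting the last UTF-8 sequence in half.
std::size_t CompleteUtf8Length(const char* data, std::size_t size)
{
    // Step back over at most 3 trailing continuation bytes (10xxxxxx)
    // to find the lead byte of the last sequence.
    std::size_t i = size;
    std::size_t back = 0;
    while (i > 0 && back < 3 &&
           (static_cast<unsigned char>(data[i - 1]) & 0xC0) == 0x80)
    {
        --i;
        ++back;
    }
    if (i == 0)
        return size;  // empty input or only continuation bytes: nothing to trim

    // The lead byte tells how long the sequence should be.
    const unsigned char lead = static_cast<unsigned char>(data[i - 1]);
    std::size_t need = 1;
    if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;

    // Bytes actually present: the lead byte plus the continuation bytes after it.
    const std::size_t have = size - (i - 1);
    return have < need ? i - 1 : size;
}
```

The returned length would be passed to MultiByteToWideChar() in place of bytesRead, and the remaining one to three bytes carried over to the front of the next buffer.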

If the page uses another encoding, pass its code page identifier instead of CP_UTF8 (e.g. 28596 for ISO-8859-6; see Code Page Identifiers in the MSDN).
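
To make the code page choice concrete, here is a small sketch that maps a charset name to the identifier expected by MultiByteToWideChar(). The function name CodePageFromCharset and the short list of charsets are assumptions for illustration; the numeric values are the ones listed in the Code Page Identifiers table (65001 is CP_UTF8).

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: looks up the Windows code page identifier for a few
// common charset names. Returns false when the name is not in the list.
bool CodePageFromCharset(std::string charset, unsigned int& codePage)
{
    // Charset names are case-insensitive.
    std::transform(charset.begin(), charset.end(), charset.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (charset == "utf-8")        { codePage = 65001; return true; }  // CP_UTF8
    if (charset == "iso-8859-1")   { codePage = 28591; return true; }  // Latin 1
    if (charset == "iso-8859-6")   { codePage = 28596; return true; }  // Arabic (ISO)
    if (charset == "windows-1256") { codePage = 1256;  return true; }  // Arabic (Windows)
    return false;
}
```

A bool result is used rather than returning 0 for unknown names, because 0 is itself a valid value (CP_ACP, the system ANSI code page) and could be passed on by mistake.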
 
 
Comments
venkat.yva 3-Nov-12 7:57am    
Using the Code Page Identifiers from the MSDN does not work for displaying Arabic characters.
Is there another way?
Sergey Chepurin 3-Nov-12 9:00am    
Do you use VS2010 to compile?
venkat.yva 4-Nov-12 23:09pm    
Yes, in Visual Studio 2010.
Sergey Chepurin 9-Nov-12 6:46am    
venkat.yva: Sorry for the late answer, but it took me some time to understand what you really want. I just checked the code from Baracat.S and it works fine with your site. It prints proper text in Arabic. I guess the answer given by Jochen Arndt is also coded correctly (I simply didn't check it). I could add an almost universal C++11 solution, but I don't see any need for that since the solutions already provided work fine.
Jochen Arndt 3-Nov-12 9:34am    
The link posted in your comments does not contain any headers. It is just some PHP generated output using UTF-8 encoding. So using my code, you should see the content.
I made some modifications to the code you posted, assuming that your web page uses UTF-8:

Test web application:

Python
#!python
# -*- coding: utf-8 -*-
from bottle import *

content_type = 'text/html; charset=utf-8'

@route("/ar/")
def ar():
    response.content_type = content_type
    return "تجربة"

@route("/en/")
def en():
    response.content_type = content_type
    return "Test"

@route("/zn/")
def zn():
    response.content_type = content_type
    return "测试"

run(port=8080)


The code:

C++
#include <windows.h>
#include <wininet.h>

#pragma comment(lib, "wininet.lib")

INT WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nShowCmd)
{
    HINTERNET hInternet = InternetOpen(TEXT("foo"), INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0);
    if (hInternet == NULL)
        return 1;

    // InternetOpenUrl() takes the full URL, so no InternetConnect() call is needed.
    // INTERNET_OPTION_HTTP_DECODING is an InternetSetOption() option, not a request flag.
    DWORD options = INTERNET_FLAG_NEED_FILE | INTERNET_FLAG_HYPERLINK | INTERNET_FLAG_RESYNCHRONIZE | INTERNET_FLAG_RELOAD;

    HINTERNET hRequest = InternetOpenUrl(hInternet, TEXT("http://localhost:8080/ar/"), NULL, 0, options, 0);
    if (hRequest == NULL)
    {
        InternetCloseHandle(hInternet);
        return 1;
    }

    char buffer[100] = {0};
    WCHAR szResp[100] = {0};
    DWORD bytesRead = 0;

    // A single read is enough for this short test page; longer responses need a read loop.
    if (InternetReadFile(hRequest, buffer, sizeof(buffer) - 1, &bytesRead) && bytesRead > 0)
    {
        // Input size is in bytes, output size is in WCHARs (one kept free for the terminator).
        if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, buffer, (int) bytesRead, szResp, ARRAYSIZE(szResp) - 1))
            MessageBoxW(NULL, szResp, L"Check", MB_OK);
    }

    InternetCloseHandle(hRequest);
    InternetCloseHandle(hInternet);

    return 0;
}



Note: make sure you handle the errors.
 
 
Comments
venkat.yva 4-Nov-12 23:41pm    
I am not getting the exact output; it shows garbage values.
Please check the above code using my URL.
venkat.yva 4-Nov-12 23:48pm    
// When I added the code below to my code

HttpQueryInfo (hRequest,HTTP_QUERY_RAW_HEADERS_CRLF, (LPVOID) lpHeadersA,
&dwSize, NULL);

// it showed this:

HTTP/1.1 200 OK
Date: Mon, 05 Nov 2012 03:55:10 GMT
Server: Apache
Keep-Alive: timeout=5, max=74
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
venkat.yva 4-Nov-12 23:51pm    
Please note that there is NO "charset=utf-8" there.
When I use other URLs, the headers show charset=utf-8.
Barakat S. 5-Nov-12 7:09am    
Your page encoding defaults to "ISO-8859-1" because you didn't specify any encoding. Check ISO-8859-1 - Latin 1 : http://www.terena.org/activities/multiling/ml-docs/iso-8859.html#ISO-8859-1

To make it UTF-8, add the following at the top of your PHP file:

header("Content-type: text/html; charset=utf-8");

for example:

<?php
header("Content-type: text/html; charset=utf-8");

if( isset($_GET["name"]) ) {
echo "name = " . htmlspecialchars($_GET["name"]);
} else {
echo "name = None";
}

?>

It should work fine.
venkat.yva 14-Nov-12 2:18am    
I don't have any PHP file to add your code to, so I want the complete code in Win32.
Just add the statement below to your HTML header:

HTML
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


If you are reading this page from C++, first read the HTTP header; it tells you which encoding to use. Then simply convert the entire page to Unicode (wide characters), which makes the text easy to manipulate. Otherwise you will need special handling to manipulate it.
 
 
Comments
venkat.yva 3-Nov-12 6:05am    
I am using Visual Studio 2010, so it does not accept the code you have given.
I have only one month of experience in the software field, so I can't understand it clearly.
Can you please send me the complete code? I have already found that the URL gives a chunked-encoding response.
Mohibur Rashid 3-Nov-12 6:07am    
This code is not for Visual Studio. It is HTML.

Follow Jochen Arndt's answer.
venkat.yva 3-Nov-12 6:11am    
I am working in Visual Studio 2010 only, and I am struggling to get the output. Please write the code for Visual Studio 2010.
venkat.yva 3-Nov-12 6:14am    
Now I just want the code to convert the chunked-encoding response to its normal form so I can display it in a MessageBox.
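
On the chunked-encoding point: "Transfer-Encoding: chunked" only describes how the body is split up for transport, not which character set it uses. As far as I know, WinINet removes the chunk framing itself before InternetReadFile() returns data, so no extra step should be needed there and the garbage is more likely a character-encoding problem. Purely to illustrate what "converting to normal form" means, here is a sketch of a decoder for a raw chunked body; DecodeChunked is a hypothetical helper, and trailers and chunk extensions are ignored.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper for illustration only: decodes an HTTP/1.1 chunked body.
// Returns false if the input is malformed or incomplete.
bool DecodeChunked(const std::string& raw, std::string& out)
{
    out.clear();
    std::string::size_type pos = 0;
    for (;;)
    {
        // Each chunk starts with a line holding its size in hexadecimal.
        const std::string::size_type lineEnd = raw.find("\r\n", pos);
        if (lineEnd == std::string::npos)
            return false;

        const std::string sizeLine = raw.substr(pos, lineEnd - pos);
        char* endPtr = nullptr;
        const unsigned long chunkSize = std::strtoul(sizeLine.c_str(), &endPtr, 16);
        if (endPtr == sizeLine.c_str())
            return false;  // no hex digits found

        pos = lineEnd + 2;
        if (chunkSize == 0)
            return true;   // a zero-sized chunk marks the end of the body

        // The chunk data must be followed by CRLF.
        const std::string::size_type remaining = raw.size() - pos;
        if (chunkSize > remaining || remaining - chunkSize < 2)
            return false;
        out.append(raw, pos, chunkSize);
        pos += chunkSize;
        if (raw.compare(pos, 2, "\r\n") != 0)
            return false;
        pos += 2;
    }
}
```

The decoded bytes are still in the page's character encoding (UTF-8 for the URL discussed here), so they need the same conversion to wide characters before being passed to MessageBoxW().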

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


