Please help me to convert ASCII to hex with extended ASCII characters.

My extended ASCII characters are: "€ ¥ Š"
The expected hex is: "80 a5 A8"

But my code below does not give this answer.

My code:
Java
String testString1 = "€¥Š";
for (int i = 0; i < testString1.length(); i++) {
    System.out.println(String.format("%x", (byte) testString1.charAt(i)));
}



Please help with the code.
Comments
TorstenH. 31-Jul-12 7:02am    
Please, rather than investing in this wild thingy, tell us what you want to do. I have the strange feeling that it's about translation and/or non-ASCII characters.
Kenneth Haugland 31-Jul-12 17:07pm    
I'm terribly sorry about this... but "I'm suffering from this problem" cracked me up completely. Like it was literally painful for you to watch the computer doing this stuff... :)
pasztorpisti 31-Jul-12 17:22pm    
That's really interesting! Being a geek, I still don't have that high a level of empathy toward personal computers! :-)
Kenneth Haugland 31-Jul-12 17:29pm    
Guess it takes practice ;) But I'm still laughing... lol..
pasztorpisti 31-Jul-12 17:33pm    
I've just visualized in my mind what could happen as a result of an access violation or a blue screen of death with the more sensitive me. :-)

A Java char (or element of a string) is 16 bit Unicode, not 8 bit ASCII. So your cast (byte)(...) is truncating in general. Looking at a Unicode code chart, it appears that your third symbol has value 0x8a, not 0xa8. Also, byte is a signed type, whereas char is unsigned. Try casting to int instead of byte and see what you get.
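For instance, a minimal sketch of that change (the values in the trailing comment are the Unicode code points of your three characters):

Java
String testString1 = "€¥Š";
for (int i = 0; i < testString1.length(); i++) {
    // A Java char is an unsigned 16-bit UTF-16 code unit, so casting to int
    // keeps the full value instead of a truncated, sign-extended byte.
    System.out.println(String.format("%x", (int) testString1.charAt(i)));
}
// prints: 20ac, a5, 160 (Unicode code points, not the single-byte codepage values 80 a5 8a)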

Peter
 
 
Comments
pasztorpisti 31-Jul-12 15:36pm    
"16 bit Unicode": This is not clear...
Originally unicode did not have more than 2^16 characters so a 16 bit char was enough to represent a unicode character (see UCS). Currently it is just a unit of an utf-16 encoding where one character might consist of a high and low surrogate pair, so 2 chars in java. Currently the unicode tables have somewhat more than 1 million characters so the only encoding that is not really an encoding but pure unicode is utf-32 where a 32 bit unit is always the unicode character (codepoint) itself.
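A tiny sketch of that (U+1F600 is just an arbitrary code point outside the BMP, picked for illustration):

Java
// A code point above U+FFFF is stored as a surrogate pair, i.e. two Java chars,
// so length() (UTF-16 code units) and codePointCount() (code points) differ.
String s = new String(Character.toChars(0x1F600));
System.out.println(s.length());                       // prints 2
System.out.println(s.codePointCount(0, s.length()));  // prints 1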
Peter_in_2780 31-Jul-12 19:17pm    
Good point. I didn't want to get too deep into encodings other than to show OP that a byte isn't big enough. The point remains that a Java char is 16 bits unsigned, unlike most other languages, where a char is 8 bits of often indeterminate signedness.
Conversion from string to hex: first encode the Unicode string into some binary format; I recommend UTF-8, UTF-16, or UTF-32, which are not lossy. UTF-8 is the best choice if you are working with strings that contain mostly Latin characters. Then convert the byte array to a hex string.
Conversion from hex to string: convert the hex string into a byte array, and then you can easily convert this byte array back to the original string if you know what the encoding is.

HexEncode.java:
Java
import java.io.UnsupportedEncodingException;

public class HexEncode {

	static class BadInputStringException extends Exception {
		public BadInputStringException(String arg0) {
			super(arg0);
		}
	}
	
	private static String ENCODING = "utf-8";

	static private final char[] HEX_DIGITS = new char[] {
		'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
	};
	static private char intToHexDigit(int b) {
		assert b>=0 && b<16;
		return HEX_DIGITS[b];
	}
	static private int hexDigitToInt(char hexDigit) throws BadInputStringException {
		if (hexDigit>='0' && hexDigit<='9')
			return (int)(hexDigit - '0');
		if (hexDigit>='a' && hexDigit<='f')
			return (int)(hexDigit - 'a' + 10);
		if (hexDigit>='A' && hexDigit<='F')
			return (int)(hexDigit - 'A' + 10);
		throw new BadInputStringException("Invalid hex digit: " + hexDigit);
	}

	private String asciiToHex(String ascii) throws UnsupportedEncodingException, BadInputStringException {
		byte[] encoded = ascii.getBytes(ENCODING);
		StringBuilder sb = new StringBuilder();
		for (int i=0; i<encoded.length; i++) {
			byte b = encoded[i];
			// instead of the two lines below you could write: String.format("%02X", b)
			// but that would probably be slower
			sb.append(intToHexDigit((b >> 4) & 0xF));
			sb.append(intToHexDigit(b & 0xF));
		}
		return sb.toString();
	}

	private String hextoAscii(String hex) throws UnsupportedEncodingException, BadInputStringException {
		if (0 != (hex.length() & 1))
			throw new BadInputStringException("The hex string must contain even number of digits!");
		int encoded_len = hex.length() / 2;
		byte[] encoded = new byte[encoded_len];

		for (int i=0; i<encoded_len; i++) {
			encoded[i] = (byte)((hexDigitToInt(hex.charAt(i*2)) << 4) | hexDigitToInt(hex.charAt(i*2+1))); 
		}
		return new String(encoded, ENCODING);
	}
	
	private void run() {
		try {
			String hex = asciiToHex("TRON");
			String ascii = hextoAscii(hex);
			System.out.printf("hex: %s, decoded_hex: %s", hex, ascii);
		} catch (Exception ex) {
			ex.printStackTrace();
		}
	}
	
	public static void main(String[] args) {
		new HexEncode().run();
	}
}
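Running the program above should print something along the lines of: hex: 54524F4E, decoded_hex: TRON (the hex digits are simply the ASCII/UTF-8 bytes of "TRON").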
 
 
hello pasztorpisti,
Your post is valuable for my extended ASCII to hex conversion, but according to your program the hex conversion of € is E282AC, while the actual hex value I need is different: 80...
Can you please elaborate on why there is a difference?

Thanks in advance
 
 
Comments
pasztorpisti 1-Aug-12 15:43pm    
First, please don't post questions as answers. I don't even get a notification about this. You should have posted this as a comment to my answer.

The reason for the problem you encountered is the following: I guess you are using some kind of codepage, while my program uses UTF-8. If you are on Windows then it's probably an ANSI 125x codepage.
Replace this:
private static String ENCODING = "utf-8";
with this (or whatever codepage your Windows uses):
private static String ENCODING = "windows-1250";
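A small sketch of the difference (the class name is just for illustration; windows-1252 is assumed here only because it is a codepage that maps € to 0x80, matching the bytes expected in the question):

Java
import java.nio.charset.Charset;

public class EncodingDiff {
    public static void main(String[] args) {
        String s = "€¥Š";
        // Single-byte codepage: one byte per character.
        for (byte b : s.getBytes(Charset.forName("windows-1252")))
            System.out.printf("%02X ", b);   // prints: 80 A5 8A
        System.out.println();
        // UTF-8: the euro sign alone becomes three bytes.
        for (byte b : "€".getBytes(Charset.forName("UTF-8")))
            System.out.printf("%02X ", b);   // prints: E2 82 AC
        System.out.println();
    }
}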
shoaib bagwan (swabzz) 4-Aug-12 0:08am    
hello,
I'm using Linux, not Windows!
And what is a codepage?
pasztorpisti 9-Aug-12 5:32am    
Then find out what codepage your Linux uses. Linux is usually UTF-8, but if the euro sign must be 0x80 in binary then some codepage must be involved. If you want to know what a character codepage is, search for it on Google.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


