I am trying to detect the encoding of a file: Unicode, UTF-8, UTF-8 with BOM, ANSI, and so on. I can detect every encoding except ANSI (Encoding.Default / Windows-1252): I am not able to differentiate ANSI from UTF-8. I tried several custom classes (Ude, TextFileEncodingDetector, etc.) that guess the encoding, but the guesses are not always right. Is there any way to do it?
Posted 18-Sep-12 3:02am

Solution 1

Unless the document contains bytes >= 0x80, Windows-1252 and UTF-8 are indistinguishable (unless a BOM is present), because both encode the ASCII range identically.
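
To make this concrete, here is a small sketch in Python rather than the .NET API from the question; the helper name `has_utf8_bom` is made up for illustration. It shows that pure ASCII bytes decode to the same text under both encodings, and that a UTF-8 BOM, when present, settles the matter.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

ascii_only = b"plain ASCII text"

# Every byte is below 0x80, so both decoders produce the same string
# and the bytes alone cannot tell you which encoding was intended.
same = ascii_only.decode("utf-8") == ascii_only.decode("cp1252")
```

For a file like `ascii_only`, either answer is equally correct, so a detector that reports one or the other is not really wrong.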

If it does contain bytes >= 0x80, it is a matter of checking the document for tell-tale indicators, see:
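
One such tell-tale indicator is whether the bytes form valid UTF-8: text saved as Windows-1252 with accented characters almost always contains byte sequences that a strict UTF-8 decoder rejects. The following is a minimal sketch of that heuristic in Python, not the method from any of the libraries mentioned in the question; the function name `guess_encoding` and its return labels are assumptions chosen for illustration.

```python
def guess_encoding(data: bytes) -> str:
    """Heuristically label bytes as 'utf-8-sig', 'ascii', 'utf-8' or 'cp1252'."""
    # A UTF-8 BOM (EF BB BF) is an explicit marker.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    # No byte >= 0x80: UTF-8 and Windows-1252 agree, so the question is moot.
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        # bytes.decode is strict by default and raises on malformed UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8; assume the single-byte ANSI code page (an assumption,
        # since any other 8-bit code page would also land here).
        return "cp1252"
```

This remains a guess, which is why detectors are "not exactly right": a Windows-1252 file whose high bytes happen to form valid UTF-8 sequences (for example the two bytes C3 A9, which read as "Ã©" in Windows-1252) will be reported as UTF-8. In .NET, a comparable strict check should be possible with a `UTF8Encoding` constructed with `throwOnInvalidBytes` set to true.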
jebin Cherian 18-Sep-12 10:22am
Thanks for the reply, Yvan. Will that differ according to the language?
Yvan Rodrigues 18-Sep-12 10:30am
This would be true of all languages.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 18 Sep 2012