Find out the file encoding

Posted By:   Anonymous
Posted On:   Thursday, September 8, 2005 10:40 AM

Hi.


I need to upload a file that could be in any encoding (ASCII, UTF-8, ISO-8859-1, UTF-16, etc.). I have to parse the file and put each line into a list.


new BufferedReader(new InputStreamReader(in)) decodes with the platform default charset, which is Cp1252 on my machine. The InputStreamReader constructor accepts an explicit charset, but I don't know how the file was encoded.
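For reference, here is a minimal sketch of reading every line into a list once the charset is known. The class name LineReader and the method readLines are made up for illustration, and it assumes Java 7 or later (try-with-resources, StandardCharsets). It does not solve the detection problem; it only shows where the detected charset would be plugged in.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class LineReader {

    // Decodes the stream with the given charset and returns its lines in order.
    public static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // Passing the charset explicitly avoids relying on the platform default.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for the uploaded file (UTF-8 encoded here).
        byte[] data = "h\u00e9llo\nw\u00f6rld\n".getBytes(StandardCharsets.UTF_8);
        List<String> lines = readLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println(lines.size() + " lines read");
    }
}
```

If the wrong charset is passed, the reader will still return lines, but non-ASCII characters may come out garbled, which is why the charset has to be determined first.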


Is there a way to find out the encoding of a file?


Or can you suggest another approach?


Thanks, and have a nice day.

Re: Find out the file encoding

Posted By:   Stephen_Ostermiller  
Posted On:   Tuesday, September 13, 2005 03:51 PM

You might be interested in jchardet (http://www.i18nfaq.com/chardet.html). It is a Java port of Mozilla's automatic charset detection algorithm.
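To give a rough idea of what heuristic detection means, here is a much simpler sketch that uses only the JDK. It is not jchardet's API or algorithm, just an illustration: it tries a strict UTF-8 decode and, if that fails, falls back to ISO-8859-1. The class name CharsetGuesser and the choice of fallback are assumptions of this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not related to jchardet.
class CharsetGuesser {

    // Returns true if the bytes decode under the charset with no malformed
    // or unmappable input. REPORT makes the decoder throw instead of
    // silently substituting replacement characters.
    public static boolean decodesCleanly(byte[] data, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Guesses UTF-8 when the bytes are valid UTF-8 (pure ASCII also passes),
    // otherwise falls back to ISO-8859-1, under which any byte sequence decodes.
    public static Charset guess(byte[] data) {
        if (decodesCleanly(data, StandardCharsets.UTF_8)) {
            return StandardCharsets.UTF_8;
        }
        return StandardCharsets.ISO_8859_1; // assumed fallback, may be wrong
    }
}
```

This is only a guess: a file that is valid UTF-8 is very likely UTF-8, but the fallback cannot tell ISO-8859-1 from Cp1252 or other single-byte encodings, which is the kind of case a statistical detector is meant to handle.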

Re: Find out the file encoding

Posted By:   Christopher_Koenigsberg  
Posted On:   Saturday, September 10, 2005 04:42 PM

Some files declare their own encoding (e.g. XML), but most don't.
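Another way a file can announce its encoding is a byte order mark (BOM) at the very start. Below is a minimal sketch that checks the first bytes for the UTF-8 and UTF-16 BOMs. The class name BomSniffer is made up for illustration, and the method returns null when no BOM is found, which is the common case for plain text files.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class BomSniffer {

    // Returns the charset indicated by a leading BOM, or null if none is present.
    // Only UTF-8 and UTF-16 are checked; a UTF-32LE BOM (FF FE 00 00) would be
    // reported as UTF-16LE by this simplified version.
    public static Charset fromBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null; // no BOM: the encoding has to be found some other way
    }
}
```

When a BOM is found, remember to skip those bytes before decoding, or the first line may start with a stray U+FEFF character.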