invalid encoding character 2003-09-22 - By Christopher Ebert
Hi,
>> Is there any way I can catch this using Xalan? (I am using the XPath
>> functionality of Xalan, so I just want to know whether I can extend any
>> of Xalan's features for spotting invalid encoding characters.)

By the time the file has been parsed, its content has been converted to Java strings, which no longer carry the encoding of the original file (Java strings are UTF-16 internally). You need to address this in the original file, either by declaring the encoding it really uses (if it contains characters outside ISO-8859-1, then it is not really encoded as ISO-8859-1) or by writing the non-ISO characters as numeric character references (for example, &#x20AC; for the euro sign).
> Alternatively, you should be able to use the getBytes("ISO-8859-1")
> method on a string in Java that will throw an
> UnsupportedEncodingException and catch that.
I don't think this does quite what you expect. UnsupportedEncodingException is thrown only if the requested encoding name is unknown to the JVM. Characters that a known encoding cannot represent are not reported as errors; getBytes silently replaces each of them with "?".
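If the goal is to find such characters after parsing, one option is to ask a CharsetEncoder whether it can encode each character. The following is only a minimal sketch: the class name EncodingCheck, the helper firstUnencodable, and the sample string are made up for illustration, and it has nothing to do with Xalan's own API.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Xalan or the JDK.
class EncodingCheck {

    // Returns the index of the first char that ISO-8859-1 cannot
    // represent, or -1 if the whole string is encodable.
    static int firstUnencodable(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        for (int i = 0; i < text.length(); i++) {
            if (!encoder.canEncode(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        // Sample input (an assumption): U+20AC, the euro sign, is not in ISO-8859-1.
        String text = "price: 10\u20AC";

        // No exception here: the encoding name is known, so the
        // unmappable character is just replaced with '?'.
        byte[] bytes = text.getBytes("ISO-8859-1");
        System.out.println(new String(bytes, "ISO-8859-1"));

        // Explicit check that locates the offending character.
        System.out.println(firstUnencodable(text));
    }
}
```

With the sample string, the first line printed ends in "?" instead of the euro sign, and the second is the index of the character that could not be encoded. This only tells you that the string cannot be written back out as ISO-8859-1; it cannot tell you whether the original file was mislabeled, which is why fixing the source file is still the real answer.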
Chris