invalid encoding character 2003-09-22 - By Andrew Welch
> Hi all,
>
> In my XML document I have some characters that are valid Unicode characters
> but invalid characters for my encoding, which is ISO-8859-1. I am using Xerces
> to parse the XML, but Xerces doesn't throw any exception because it's a valid
> Unicode character.
>
> Is there any way I can catch this using Xalan? (I am using the XPath
> functionality of Xalan, so I just want to know whether I can extend any of
> Xalan's features for spotting invalid encoding characters.)
>
> Regards
> Rohith
If your XML source doesn't have an XML declaration at the top, such as:
<?xml version="1.0" encoding="ISO-8859-1"?>
then the XML parser will default to reading the document as UTF-8 (or UTF-16 if it finds a byte order mark). Make sure that Xerces is actually reading the file in the desired encoding.
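If you can't add the declaration to the file, one way to make sure of the encoding is to tell the parser yourself. The following is only a sketch, assuming you parse through JAXP and have the raw bytes in hand; the class name ExplicitEncodingParse and the method parseAs are made up for illustration. InputSource.setEncoding() is the standard SAX call for supplying an encoding from outside the document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Xerces or Xalan.
final class ExplicitEncodingParse {

    /** Parses xmlBytes, telling the parser which encoding the bytes are in. */
    static Document parseAs(byte[] xmlBytes, String encoding) {
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        // Supplies the encoding externally, for documents with no XML declaration.
        source.setEncoding(encoding);
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input as " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // No XML declaration here; the e-acute is the single byte 0xE9,
        // which is not a valid UTF-8 sequence on its own.
        byte[] latin1 = "<a>caf\u00e9</a>".getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parseAs(latin1, "ISO-8859-1");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Without the setEncoding() call the same bytes would be read as UTF-8 and the parser would most likely reject them as a malformed byte sequence.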
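Alternatively, you can check the text in Java directly. Note that getBytes("ISO-8859-1") on a String won't do it: it throws an UnsupportedEncodingException only when the charset name itself is unknown, and it silently replaces characters it can't map with '?'. Use a java.nio.charset.CharsetEncoder instead. Its canEncode() method tells you whether a character fits the charset, and its encode() method throws a CharacterCodingException on unmappable input that you can catch.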
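A minimal sketch of that kind of check, assuming you already have the text as a Java String (for example a node value you got back from an XPath query). The class name EncodingCheck and the method firstUnencodable are made up for illustration. It uses CharsetEncoder.canEncode() from java.nio.charset, which reports unmappable characters without any exception handling.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of Xerces or Xalan.
final class EncodingCheck {

    /**
     * Returns the index of the first character in text that cannot be
     * represented in the named charset, or -1 if every character can be.
     */
    static int firstUnencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        int i = 0;
        while (i < text.length()) {
            int codePoint = text.codePointAt(i);
            int width = Character.charCount(codePoint);
            // Test the whole code point so surrogate pairs are checked together.
            if (!encoder.canEncode(text.subSequence(i, i + width))) {
                return i;
            }
            i += width;
        }
        return -1;
    }

    public static void main(String[] args) {
        String ok = "caf\u00e9";        // e-acute exists in ISO-8859-1
        String bad = "price: \u20ac5";  // the euro sign does not
        System.out.println(firstUnencodable(ok, "ISO-8859-1"));   // prints -1
        System.out.println(firstUnencodable(bad, "ISO-8859-1"));  // prints 7
    }
}
```

You could run something like this over the string values you pull out with XPath and raise your own error when the result is not -1.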