Hi,

I am facing an issue with XML parsing (SAX parser) on a Unix machine. The same JAR and Java code behave differently on Windows and on Unix. Why?

Windows machine: works fine. I use the SAX parser to load a huge XML file, and all values are read and populated correctly. Charset.defaultCharset() returns windows-1252.

Unix machine: I built the JAR, deployed it to Tomcat on Unix, and executed it.
When loading the same huge XML file, some values are populated empty or incomplete. For example, the country name is populated as "ysia" instead of "Malaysia", and the transaction date as "3 PM" instead of "18/09/2016 03:31:23 PM". Charset.defaultCharset() returns UTF-8.

The issue occurs only on Unix. When I load the same XML on Windows or in my local Eclipse, it works fine and all values are populated correctly.

I also modified my code to set the encoding to UTF-8 on the InputStreamReader, but the values are still not read correctly on the Unix box.

Note: there are no special characters in the XML. I also noticed that when I copy the affected records (the ones whose values were not populated correctly) into a separate XML file and load it on the Unix machine with the same JAR, it works fine. So the issue occurs only when these records are loaded as part of the huge file.

Please suggest what the solution should be.

What I have tried:

I changed the encoding to UTF-8, but the problem is still not resolved.
Updated 24-Sep-16 8:27am
Comments
Richard MacCutchan 24-Sep-16 12:09pm    
Most likely there is a bug in your code. You need to do some debugging.
Member 12655706 24-Sep-16 12:33pm    
public static void parseXmlDocument(String inputFilePath, ImplDefaultHandler handler) {
  logger.info("Start: ImplBEUtils.parseXmlDocument() Method ");
  SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
  try {
    SAXParser saxParser = saxParserFactory.newSAXParser();
    InputStream inputStream = new FileInputStream(inputFilePath);
    Reader reader = new InputStreamReader(inputStream, "UTF-8");
    InputSource is = new InputSource(reader);
    is.setEncoding("UTF-8");
    saxParser.parse(is, (DefaultHandler) handler);
    // saxParser.parse(new File(inputFilePath), (DefaultHandler) handler);
  } catch (ParserConfigurationException e) {
    System.out.println("ParserConfig error");
    e.printStackTrace();
  } catch (SAXException e) {
    System.out.println("SAXException : xml not well formed");
    e.printStackTrace();
  } catch (IOException e) {
    System.out.println("IO error");
    e.printStackTrace();
  }
  logger.info("End: ImplBEUtils.parseXmlDocument() Method ");
}


public void endElement(String s, String s1, String element) throws SAXException {
  if (element.equalsIgnoreCase("transactionDate")) {
    obj.setTransactionDate(tmpValue);
  }
}

public void characters(char[] ac, int i, int j) throws SAXException {
  chars.append(ac, i, j);
  tmpValue = new String(ac, i, j).trim();
}
Member 12655706 24-Sep-16 12:44pm    
I have posted the code. I don't understand: if there were a bug in the code, it should behave the same way on Windows too, but it works fine on Windows. :(
The issue occurs only on the Unix/Docker machine.
Can you please point out whether something is wrong in the code?

1 solution

Java
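After modifying the code as below, it worked fine. A SAX parser may deliver the text of a single element through several characters() calls, and where the splits fall depends on the parser's internal buffering. The original handler overwrote tmpValue on every call, so it kept only the last chunk (for example "ysia"). The version below appends each chunk instead.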

public void startElement(String uri, String localName, String qName, 
    Attributes attributes) throws SAXException {
  if(qName.equalsIgnoreCase("customerName")){ 
    chars.setLength(0); 
  }
  // Reset the accumulated text at the start of every element.
  tmpValue = null;
} 


public void characters(char[] ac, int i, int j) throws SAXException {
  chars.append(ac, i, j);
  // characters() can be called more than once per element, so append
  // each chunk to tmpValue instead of overwriting it.
  if (tmpValue == null) {
    tmpValue = new String(ac, i, j);
  } else {
    tmpValue += new String(ac, i, j);
  }
}

public void endElement(String s, String s1, String element) throws SAXException {
  // Trim only once, after the complete text has been collected.
  if (element.equalsIgnoreCase("transactionDate") && tmpValue != null) {          
    obj.setTransactionDate(tmpValue.trim()); 
  }
}
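
For anyone hitting the same symptom, here is a minimal, self-contained sketch of the same idea using a StringBuilder as the buffer. The class name ChunkSafeHandler, the getValues() helper, and the sample XML are made up for illustration and are not part of the original code. It assumes the values of interest sit in leaf elements, since the buffer is cleared at every start tag.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler; it stands in for the ImplDefaultHandler from the question.
class ChunkSafeHandler extends DefaultHandler {
    // Collects the text of the element currently being read.
    private final StringBuilder text = new StringBuilder();
    // Element name -> complete, trimmed text (illustrative storage only).
    private final Map<String, String> values = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        // Start every element with an empty buffer.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element: append, never overwrite.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The text is complete only here, so trim and store it now.
        values.put(qName, text.toString().trim());
        text.setLength(0);
    }

    Map<String, String> getValues() {
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Sample data invented for this sketch.
        String xml = "<record><country>Malaysia</country>"
                + "<transactionDate>18/09/2016 03:31:23 PM</transactionDate></record>";
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ChunkSafeHandler handler = new ChunkSafeHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        System.out.println(handler.getValues());
    }
}
```

A handler like this can be checked without a huge file by calling characters() twice by hand, once with "Mala" and once with "ysia", and confirming that endElement() stores "Malaysia". That mimics the split the parser produced on the Unix box.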
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


