Performance questions and possible bug 2003-09-08 - By hernando.borda@(protected)
I've been doing some performance testing of Xalan-Java and Xalan-C++ for processing files that range from a few hundred Kbytes to a few hundred Mbytes. For the tests, I used Xalan-J 2.5.1 with JDK 1.4.2_01 and Xalan-C++ 1.5 on a Dual Pentium III PC with 1 GByte of memory running Windows 2K Professional.
I'm a bit surprised by the results: Xalan-C++ run time grows linearly with the XML input size, while Xalan-J run time appears to grow exponentially.
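One way to check whether the growth is really exponential rather than polynomial is to look at the local slope on a log-log scale between consecutive measurements. The sketch below is only an illustration: the class name, the method name and the sample values in main are hypothetical and synthetic, and are not taken from the attached spreadsheet. A slope near 1 suggests linear growth, a roughly constant slope above 1 suggests polynomial growth, and a slope that keeps rising with input size is what exponential growth would look like.

```java
public class GrowthExponent {
    /**
     * Local growth exponent between two measurements (n1, t1) and (n2, t2),
     * i.e. the slope of the line through them on a log-log plot.
     */
    static double localExponent(double n1, double t1, double n2, double t2) {
        return Math.log(t2 / t1) / Math.log(n2 / n1);
    }

    public static void main(String[] args) {
        // Synthetic placeholder values, NOT measurements from the tests:
        // input sizes (element counts) and run times in seconds.
        double[] sizes = {1000, 2000, 4000, 8000};
        double[] times = {1.0, 4.0, 16.0, 64.0};
        for (int i = 1; i < sizes.length; i++) {
            double k = localExponent(sizes[i - 1], times[i - 1], sizes[i], times[i]);
            System.out.println(sizes[i - 1] + " -> " + sizes[i] + ": exponent " + k);
        }
    }
}
```

Feeding in the real (input size, elapsed time) pairs for each processor would show whether the Xalan-J curve is closer to quadratic or truly exponential.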
To give a bit more context, the kind of transformations we're mostly interested in are those that flatten XML into relational structures. The attached ZIP contains three stylesheets that extract data out of the input XML document at different nesting levels, and a few sample documents along with an Excel spreadsheet that details the test results.
The structure of the input documents looks like:

    <?xml version="1.0"?>
    <customers>
      <customer id="0" name="Acme, Inc.">
        <orders>
          <order order_no="0">
            <items>
              <item item_no="12" quantity="260" />
              ...
            </items>
          </order>
          ...
        </orders>
        <addresses>
          <address street="645 Lake Blvd." city="Boston" state="MA" zip="01011" />
          ...
        </addresses>
      </customer>
    </customers>
Some statistics:
- All documents contain 50 customer elements
- The count of order elements ranges from 1000 to 441439
- The count of item elements ranges from 2960 to 1323687
- The number of address elements is almost constant, around 100 instances
The three transformations extract:
- The addresses of a customer
- The orders of a customer
- The items of an order
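For readers without the attachment, a minimal sketch of the "items of an order" style of flattening could look like the following. The stylesheet, the class name FlattenItems, the comma-separated output layout and the tiny sample document are all hypothetical and are not the ones in the ZIP; only the element and attribute names come from the structure shown above. It uses the standard JAXP API (javax.xml.transform), so it runs against whichever XSLT processor the JVM picks up.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class FlattenItems {
    // Hypothetical stylesheet: one output row per item element, carrying the
    // customer id and the order number down from the ancestor elements.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='customers/customer/orders/order/items/item'>"
      + "<xsl:value-of select='../../../../@id'/>,"
      + "<xsl:value-of select='../../@order_no'/>,"
      + "<xsl:value-of select='@item_no'/>,"
      + "<xsl:value-of select='@quantity'/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Tiny hypothetical input with the same shape as the test documents.
    static final String SAMPLE =
        "<customers><customer id='0' name='Acme, Inc.'><orders>"
      + "<order order_no='0'><items>"
      + "<item item_no='12' quantity='260'/>"
      + "<item item_no='13' quantity='5'/>"
      + "</items></order>"
      + "</orders></customer></customers>";

    /** Applies the stylesheet to the given XML and returns the flat rows. */
    static String flatten(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Writes one row per item to standard output.
        System.out.print(flatten(SAMPLE));
    }
}
```

Each row repeats the parent keys (customer id, order number) next to the item attributes, which is what makes the output loadable into a relational table.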
In all three tests (Xalan-Java, XSLTC and Xalan-C++) I'm sending the output to standard output and redirecting the results to a file.
I tested both the interpreted version of the XSLT processor and XSLTC. The results are very similar, although XSLTC performs a little better as the size of the input increases. For the Java runs, I had to increase the maximum Java heap size to 1 GByte (-Xmx option). I also experimented a little with the initial heap size (-Xms option) and got some improvement, but as the size of the input file approached the upper end of the tests, performance degraded dramatically (the results are included in the attached spreadsheet). One interesting detail I got using Java's -Xprof profiling option is that the java.util.Vector.ensureCapacityHelper method seems to be taking most of the execution time (anywhere from 40 to 87%, rising as the size of the file increases).
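To illustrate why a method like ensureCapacityHelper could come to dominate, the sketch below counts how many element copies a growable array performs while appending n elements, under two growth policies: a fixed increment and doubling. This mirrors the two policies java.util.Vector offers (a positive capacityIncrement versus the default doubling). It is a simulation under my own assumptions, with a hypothetical class and method name; whether Xalan-J's internal tables actually grow by a fixed increment is not something I have confirmed.

```java
public class GrowthCost {
    /**
     * Counts the element copies made while appending n elements one at a
     * time to an array that starts at initialCapacity. With increment > 0
     * the capacity grows by that fixed amount; otherwise it doubles.
     */
    static long copies(int n, int initialCapacity, int increment) {
        if (initialCapacity < 1) {
            throw new IllegalArgumentException("initialCapacity must be >= 1");
        }
        long copied = 0;
        long capacity = initialCapacity;
        for (long size = 0; size < n; size++) {
            if (size == capacity) {
                // Growing reallocates and copies every existing element.
                copied += size;
                capacity = (increment > 0) ? capacity + increment : capacity * 2;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 1323687; // largest item count mentioned above
        System.out.println("fixed increment of 1024: " + copies(n, 1024, 1024));
        System.out.println("doubling:                " + copies(n, 1024, 0));
    }
}
```

With a fixed increment the total copy work grows roughly with the square of the element count, while doubling keeps it proportional to the element count, so a fixed-increment policy would be consistent with run time that climbs much faster than the input size.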
I'm interested in getting comments from other people about their experience with performance. Is this behavior typical of the kind of transformation I'm doing?
Additionally, I had a problem using the translet that extracts all item elements. Starting with a document that contains 296380 item elements, the transformation aborted with a "Translet errors:No more DTM IDs are available" error. I looked through the FAQ and the mailing lists and didn't find anything about this, apart from an issue that existed in previous versions of Xalan-J and is no longer present in version 2.5.1.
My environment is as follows:

#---- BEGIN writeEnvironmentReport($Revision: 1.20 $): Useful stuff found: ----
version.DOM.draftlevel=2.0fd
java.class.path=d:/xalan-j_2_5_1/bin/xalan.jar;d:/xalan-j_2_5_1/bin/xml-apis.jar;d:/xalan-j_2_5_1/bin/xercesImpl.jar;.;d:/j2sdk1.4.2_01/lib;d:/j2sdk1.4.2_01/jre/lib
version.JAXP=1.1 or higher
java.ext.dirs=d:\j2sdk1.4.2_01\jre\lib\ext
#---- BEGIN Listing XML-related jars in: foundclasses.sun.boot.class.path ----
xalan.jar-path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xalan.jar
xercesImpl.jar-apparent.version=xercesImpl.jar from xalan-j_2_5_0 from xerces-2_4
xercesImpl.jar-path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xercesImpl.jar
xml-apis.jar-apparent.version=xml-apis.jar present-unknown-version
xml-apis.jar-path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xml-apis.jar
#----- END Listing XML-related jars in: foundclasses.sun.boot.class.path -----
version.xerces2=Xerces-J 2.4.0
version.xerces1=not-present
version.xalan2_2=Xalan Java 2.5.1
version.xalan1=not-present
version.ant=not-present
java.version=1.4.2_01
version.DOM=2.0
version.crimson=present-unknown-version
sun.boot.class.path=d:\j2sdk1.4.2_01\jre\lib\endorsed\xalan.jar;d:\j2sdk1.4.2_01\jre\lib\endorsed\xercesImpl.jar;d:\j2sdk1.4.2_01\jre\lib\endorsed\xml-apis.jar;d:\j2sdk1.4.2_01\jre\lib\rt.jar;d:\j2sdk1.4.2_01\jre\lib\i18n.jar;d:\j2sdk1.4.2_01\jre\lib\sunrsasign.jar;d:\j2sdk1.4.2_01\jre\lib\jsse.jar;d:\j2sdk1.4.2_01\jre\lib\jce.jar;d:\j2sdk1.4.2_01\jre\lib\charsets.jar;d:\j2sdk1.4.2_01\jre\classes
#---- BEGIN Listing XML-related jars in: foundclasses.java.class.path ----
xalan.jar-path=d:\xalan-j_2_5_1\bin\xalan.jar
xml-apis.jar-apparent.version=xml-apis.jar present-unknown-version
xml-apis.jar-path=d:\xalan-j_2_5_1\bin\xml-apis.jar
xercesImpl.jar-apparent.version=xercesImpl.jar from xalan-j_2_5_0 from xerces-2_4
xercesImpl.jar-path=d:\xalan-j_2_5_1\bin\xercesImpl.jar
#----- END Listing XML-related jars in: foundclasses.java.class.path -----
version.SAX=2.0
version.xalan2x=Xalan Java 2.5.1
#----- END writeEnvironmentReport: Useful properties found: -----
# YAHOO! Your environment seems to be OK.
Thanks,
Hernando Borda
Software Developer
Ascential Software Corp.
<<perf.ZIP>>