[Expat-discuss] Expat-discuss Digest, Vol 72, Issue 7
jzhang at ximpleware.com
Fri Mar 10 03:00:10 CET 2006
When the VTD-XML project officially started, the DTD part and external
were already deprecated somewhat... no major vocabularies seem to use the
Buffer reuse was introduced in the latest releases, maybe it should always be
performance improves starting from the second time VTDGen parses XML
We are adding more documentation on this feature right now...
External references notwithstanding, VTD-XML conforms to, and passes, every
test suite, VTD-XML handles the namespace problem a little differently than DOM or
the error checking is delayed until navigation, the prefix-induced
problem is quite unlikely to concern anyone, and is in fact part of the
problems of XML
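
For readers who want to reproduce the prefix case raised in the quoted message (two attributes with different prefixes bound to the same namespace), here is a minimal sketch using Python's standard `xml.parsers.expat` binding to Expat. The sample document, the `urn:example` URI and the helper name `parse_error` are made up for illustration; nothing here uses or describes VTD-XML's API.

```python
from xml.parsers import expat

# Hypothetical test document: prefixes a and b are bound to the same URI,
# so a:x and b:x are the same attribute once namespaces are resolved.
DOC = b'<r xmlns:a="urn:example" xmlns:b="urn:example"><e a:x="1" b:x="2"/></r>'


def parse_error(doc, namespace_aware):
    """Parse doc with Expat; return the error text, or None if it parsed cleanly."""
    separator = " " if namespace_aware else None
    parser = expat.ParserCreate(namespace_separator=separator)
    try:
        parser.Parse(doc, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return None


if __name__ == "__main__":
    print("namespace-aware:", parse_error(DOC, True))
    print("plain XML 1.0:  ", parse_error(DOC, False))
```

Without namespace processing the two names differ lexically, so the document is well-formed XML 1.0; the conflict only appears once prefixes are resolved, which is why a parser can choose to check it at parse time or later.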
The cost of encoding transformation ranges from zero to negligible, most are
One can argue that, to process XML, SAX parsers need to be used at least
first time is to scan the document from start to end, just to check
second pass is to perform the application processing... otherwise, what
if the application performs 10 transactions but then discovers that the last
bracket of the XML file is missing?? Roll back those 10 transactions?? So
reduce the SAX performance by 50% just to be a fair comparison with VTD-XML??
and VTD-XML is still forward only and unpleasant to use...
I don't see any comparison...
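
The two-pass argument above can be sketched with Python's standard `xml.sax` module (which sits on top of Expat). This is only an illustration under stated assumptions: the `<tx>` element, the `TransactionHandler` class and the truncated sample document are hypothetical, and "committing" a transaction is simulated by appending to a list.

```python
import xml.sax


class TransactionHandler(xml.sax.ContentHandler):
    """Hypothetical application handler: 'commits' one transaction per <tx>."""

    def __init__(self):
        super().__init__()
        self.committed = []

    def startElement(self, name, attrs):
        if name == "tx":
            self.committed.append(attrs.get("id"))


def one_pass(data):
    """Process while parsing; work done before an error is already committed."""
    handler = TransactionHandler()
    try:
        xml.sax.parseString(data, handler)
        ok = True
    except xml.sax.SAXParseException:
        ok = False
    return ok, handler.committed


def two_pass(data):
    """First pass only checks well-formedness; second pass does the work."""
    try:
        xml.sax.parseString(data, xml.sax.ContentHandler())  # no-op handler
    except xml.sax.SAXParseException:
        return False, []
    handler = TransactionHandler()
    xml.sax.parseString(data, handler)
    return True, handler.committed


# Sample input whose very last bracket is missing.
BROKEN = b"<batch><tx id='1'/><tx id='2'/><tx id='3'/></batch"

if __name__ == "__main__":
    print("one pass:", one_pass(BROKEN))
    print("two pass:", two_pass(BROKEN))
```

In the one-pass version the handler has already seen every `<tx>` by the time the parser reports the missing bracket, so the application would have to roll that work back; the two-pass version avoids this at the price of parsing the document twice.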
Maybe the world has moved forward... maybe it is time to say goodbye to
> Date: Wed, 08 Mar 2006 18:28:41 -0500
> From: Karl Waclawek <karl at waclawek.net>
> Subject: Re: [Expat-discuss] Server JVM
> To: expat-discuss at libexpat.org
> Message-ID: <440F68A9.2010309 at waclawek.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> Jimmy Zhang wrote:
>> I can help you set up server JVM, just a dll file you need to put in
>> the right place... let me know if you need any help
> I installed the JDK and tested with the server JVM.
> It does indeed increase performance significantly, beating Expat on files
> with lots of markup vs character data (27ms for benchmark_BR vs 49ms for
> Question, why should one not always use the "buffer-reuse" version?
> Expat was still faster on the file with lots of character data (13ms vs
> 17ms for benchmark_BR).
> I recompiled the Expat library with all optimizations for speed.
> Now given that vtd-xml is quite fast, you still have to prove two things:
> 1) It does everything a conforming non-validating parser must do. How many of
> the tests in the XML-Test-Suite does it pass? Expat passes all
> the tests for a non-validating parser except a handful that are optional
> or in doubt.
> Example: vtd-xml failed to detect duplicate attributes when they had
> different prefixes pointing to the same namespace. Completing vtd-xml
> to conform as well as Expat may well add more overhead.
> 2) To what degree does it pay off to delay the work of encoding
> to the point when the data is actually needed, as in a real-world
> If the document is encoded in UTF-8 and your application requires UTF-16,
> then this is already done by Expat, but for vtd-xml this work still has
> to be performed.
> Whether it will be preferable over Expat for documents where memory
> usage is not an issue, will depend on the answers to these questions.
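
The second question, whether deferring encoding work pays off, can be made concrete with a small sketch. It assumes a parser that keeps the original UTF-8 bytes and records only (offset, length) pairs for text nodes, converting to UTF-16 on demand. The regex-based indexer, the function names and the sample document are illustrative stand-ins, not VTD-XML's or Expat's actual implementation.

```python
import re

# Sample UTF-8 document; \u00e9 occupies two bytes in UTF-8.
DOC = "<order><id>42</id><note>caf\u00e9 au lait</note></order>".encode("utf-8")

# Toy indexer: finds text between tags. A real parser would do far more.
TEXT = re.compile(rb">([^<]+)<")


def index_text(doc):
    """Return (byte offset, byte length) for each text node, without decoding."""
    return [(m.start(1), m.end(1) - m.start(1)) for m in TEXT.finditer(doc)]


def text_as_utf16(doc, token):
    """Lazy path: transcode a single token only when the application asks."""
    offset, length = token
    return doc[offset:offset + length].decode("utf-8").encode("utf-16-le")


def eager_utf16(doc):
    """Eager path: transcode the whole document up front."""
    return doc.decode("utf-8").encode("utf-16-le")


if __name__ == "__main__":
    tokens = index_text(DOC)
    print("tokens:", tokens)
    print("bytes transcoded lazily for the first token:", tokens[0][1])
    print("bytes transcoded eagerly:", len(DOC))
```

If the application reads only a few fields, the lazy path transcodes a small fraction of the bytes; if it reads everything, the total work is about the same and is merely moved to a later point.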
> Overall I do like your approach, and I think it is excellent for random
> access with an in-memory document. It may also do very well on SOAP
> processing for smaller messages.