[Expat-discuss] [ANN] VTD-XML Version 1.5 Released
Karl Waclawek
karl at waclawek.net
Mon Mar 6 18:00:20 CET 2006
jzhang at ximpleware.com wrote:
> The VTD-XML project team is proud to announce the
> availability of both C and Java version 1.5 of VTD-XML,
> the next generation open-source XML parser that goes
> beyond DOM and SAX in terms of performance, memory
> usage and ease of use.
>
> The technical highlights of VTD-XML are:
>
> * Performance: the world's fastest XML parser,
> between 5x~10x faster than DOM
> * Memory Usage: 3x to 5x less than DOM, 1.3x~1.5x
> XML document size
> * Random access with built-in XPath support
> * A simple and intuitive API
>
On the surface, this seems intriguing. It appears that what VTD-XML does is
build a kind of "pointer structure" consisting of "VTD-records" that allows
navigating the document structure without copying anything out. This may be
an efficient alternative to DOM.
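To make the "pointer structure" idea concrete, here is a minimal sketch in C.
It is not VTD-XML's actual record layout or API: TokenRecord, index_start_tags
and token_equals are hypothetical names chosen for illustration, and the
scanner is deliberately naive (it assumes well-formed input with no comments,
CDATA sections, processing instructions, or '>' inside attribute values). The
point is only that each record stores an offset and a length into the original
buffer, so nothing is copied out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: refers to a token by position instead of copying it. */
typedef struct {
    size_t offset;  /* start of the element name in the original buffer */
    size_t length;  /* length of the element name in bytes */
    int    depth;   /* nesting depth of the element (root is 0) */
} TokenRecord;

/* Records the name of every start tag found in doc.
 * Returns the number of records written (at most max). */
size_t index_start_tags(const char *doc, size_t len,
                        TokenRecord *out, size_t max)
{
    assert(doc != NULL && out != NULL);
    size_t n = 0;
    int depth = 0;

    for (size_t i = 0; i < len; i++) {
        if (doc[i] != '<')
            continue;
        if (i + 1 < len && doc[i + 1] == '/') {  /* end tag: go up one level */
            depth--;
            continue;
        }
        /* The element name runs up to whitespace, '/' or '>'. */
        size_t start = i + 1, end = start;
        while (end < len && doc[end] != '>' && doc[end] != ' ' && doc[end] != '/')
            end++;
        if (n < max) {
            out[n] = (TokenRecord){ start, end - start, depth };
            n++;
        }
        /* Find the closing '>' and check for the self-closing form "<x/>". */
        size_t close = end;
        while (close < len && doc[close] != '>')
            close++;
        int self_closing = close < len && close > start && doc[close - 1] == '/';
        if (!self_closing)
            depth++;
        i = close;
    }
    return n;
}

/* Compares a recorded token with a C string without extracting it. */
int token_equals(const char *doc, const TokenRecord *rec, const char *name)
{
    size_t n = strlen(name);
    return rec->length == n && memcmp(doc + rec->offset, name, n) == 0;
}
```

Note that the record array itself still has to be allocated somewhere, which
is the memory cost in question when comparing against a streaming parser.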
I do not quite understand why it should be faster than SAX or SAX-like
parsers, especially in the C world (Expat). Yes, maybe the initial creation
of the VTD records is faster, but there is some memory allocation involved
as well, and Expat is already very good at keeping memory allocations down.
Now, consider that the input document is encoded in UTF-16, but your
application needs UTF-8. You still need to convert, which involves
memory allocations and copying, more so in Java than in C, as string objects
in Java are immutable (no re-use of memory).
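The conversion cost can be illustrated with a small C sketch. The function
name utf16_to_utf8 and its signature are assumptions made for this example;
it is not code from Expat or VTD-XML. It takes UTF-16 code units in native
byte order and writes UTF-8 into a caller-supplied buffer, which shows that
even the cheapest conversion needs a second buffer and touches every
character once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts in_len UTF-16 code units to UTF-8 in out (capacity out_cap bytes).
 * Returns the number of bytes written, or (size_t)-1 on an unpaired
 * surrogate or insufficient output space. */
size_t utf16_to_utf8(const uint16_t *in, size_t in_len,
                     unsigned char *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;

    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {          /* high surrogate */
            if (i + 1 >= in_len)
                return (size_t)-1;
            uint32_t lo = in[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            i++;                                     /* consumed the pair */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                       /* unpaired low surrogate */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out_cap - o < need)
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

In C the output buffer can be allocated once and reused for every token; with
immutable string objects, each converted token becomes a new allocation.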
So, for any practical SAX work, I doubt that there are advantages,
even for small documents, unless one can use the original encoding.
Karl