[Expat-discuss] Nested calls to XML_Parse
Derek Snider
derek at bluegenesis.com
Wed Jun 11 19:13:05 EDT 2003
I guess I should add a bit of case history here. Without going into too
much detail, what we have is an Apache/PHP-based website management
system (PHP's XML support uses libexpat) that does interface layouts
using XML template files.
This system has evolved over the past year and a half as a
from-scratch rewrite of a "plain PHP" interface, with the aim of making
things far more dynamic, easier to code, and able to support multiple
languages.
Not being XML wizards, we devised a relatively simple system, but it
makes use of recursive calls to xml_parse to expand certain tags as well
as embedded variables into XML data. (One example is that an embedded
variable such as $GETDISKUSAGE will result in a call to a PHP function
to look up the disk usage for a customer, and expand to XML tags
containing the data in the proper formatting.)
Everything worked fine for the past year and a half until we tried to
upgrade to php-4.3.2. I don't know why it worked (since libexpat does
not support this), but it worked fine up to and including php-4.2.2.
This system is used by over 100,000 customers, so it's not something
we're about to scrap ;)
However, it does appear that we're going to have to make some
fundamental changes in the XML parsing.
I originally reported this as a PHP bug, but further investigation
revealed that the real issue was with our use of xml_parse.
-Derek
> -----Original Message-----
> From: Dan Rosen [mailto:dr at netscape.com]
> Sent: June 11, 2003 5:25 PM
> To: Derek Snider
> Cc: expat-discuss at libexpat.org
> Subject: Re: [Expat-discuss] Nested calls to XML_Parse
>
>
> Hi Derek,
>
> There is an important conceptual distinction here between
> what you want to do and what a parser does. Any parser's role
> is simply to parse serialized data into a programmatic
> representation of that data, regardless of the type of data
> or its representation. What you're trying to do transcends
> parsing, and falls squarely into the domain of data
> transformation and processing.
> In the process of parsing, it is not appropriate (or is, at
> least, very unorthodox) to modify the data stream being parsed.
> Allowing this essentially turns a decidable problem for which
> there exist efficient solutions into an undecidable problem,
> if you remember your theory.
> Allowing expat to do this would be a very bad thing.
> Also, with regard to appending data to the buffer: I don't know
> what expat permits and does not permit in terms of modifying the
> buffer it is currently parsing, within a call stack descending
> from XML_Parse. Even if it does permit this, it seems very sketchy
> to me to provide access from your callbacks to the buffer currently
> being parsed. Regardless, you wouldn't have to make a recursive
> call to XML_Parse in either case.
> Note also that appending data to the buffer is not what the
> "isFinal" parameter is intended to do. You don't want to think of
> it as appending; think of it as a signal to the parser that
> you're done refilling the buffer from whatever your original
> data source is (the file, for example).
> In summary: what you're trying to do is just not what you want to do.
> I don't know what exactly you're trying to do with the dynamic content
> you need to generate, but I'd guess that architecturally you'd rather
> approach this in two steps: first parse, then transform.
> Transformation during parsing can be difficult depending on the
> complexity of the transformation (see XSLT), and again, transformation
> of the source document being parsed is simply the wrong thing.
> Hope this helps,
> dr
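As a rough illustration of the "first parse, then transform" approach
suggested above, here is a minimal sketch using Python's
xml.parsers.expat module (a binding to expat) rather than PHP. It feeds
the template to the parser in chunks, passing a true final flag only once
the input is exhausted, and expands variables in a second pass after
parsing has finished, so the parser is never re-entered from a callback.
The helper names (parse_events, transform, get_disk_usage), the VARIABLES
table, the template markup, and the returned usage fragment are all
assumptions made up for this example; only $GETDISKUSAGE comes from the
thread.

```python
import xml.parsers.expat


def parse_events(chunks):
    """Step 1 (parse only): feed chunks to one parser and record events."""
    events = []

    def on_text(text):
        # Expat may deliver character data in pieces; merge adjacent ones.
        if events and events[-1][0] == "text":
            events[-1] = ("text", events[-1][1] + text)
        else:
            events.append(("text", text))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = (
        lambda name, attrs: events.append(("start", name, attrs)))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = on_text
    for chunk in chunks:
        parser.Parse(chunk, False)  # not final: more input may follow
    parser.Parse("", True)          # final: the data source is exhausted
    return events


def get_disk_usage():
    # Hypothetical stand-in; a real system would look the value up.
    return '<usage unit="MB">42</usage>'


# Hypothetical variable table; only $GETDISKUSAGE is named in the thread.
VARIABLES = {"$GETDISKUSAGE": get_disk_usage}


def transform(events):
    """Step 2 (transform): expand variables after parsing has finished."""
    out = []
    for event in events:
        name = event[1].strip() if event[0] == "text" else None
        if name in VARIABLES:
            # A fresh parser handles the fragment; none is re-entered.
            out.extend(parse_events([VARIABLES[name]()]))
        else:
            out.append(event)
    return out


# The template arrives in two chunks, split in the middle of the variable.
template = ["<page><disk>$GETDISK", "USAGE</disk></page>"]
expanded = transform(parse_events(template))
```

The fragment returned for a variable is parsed by a separate parser
instance once the first parse has completed, which is one way to avoid
the nested call. Whether that fits the real template system is something
only its authors can judge.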