[Expat-discuss] Large data sets (Expat v2.0.0; compiled cygwin)

Ben Keitch bkeitch at googlemail.com
Tue May 15 18:02:52 CEST 2007


Can someone help me with this code? It is trying to convert an XML file of
book data to tab-delimited text. It should be simple, but it seems to mangle
about 200 of the 10000 records I give it. If I supply each record by itself,
it works fine. I don't understand why; not being a C programmer, I dare say
I am mangling pointers, or there is a multithreading issue I don't understand.

Here is a typical error. Given lines 3380-3383 of a 917682-line XML file (it
is well-formed according to xmlwf):

<record>
<ISBN10>0816044384</ISBN10>
<ISBN13>9780816044382</ISBN13>
<EAN>9780816044382</EAN>
...
</record>

The data given to the data handler (and printed to stderr) is:

Data: 9780816   Data: 044382
Error : isbn10: 0816044384      isbn: 382       isbn13: 044382
Data:
Data: 9780816044382

So in this case ISBN10 was correct, but ISBN13 got only the last 6 digits
on the first call and only got all the data on the third call. (The second
call gives a blank line! Why?)

If you give just this XML record to the program, it works fine.
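
One possible explanation, offered as a guess since processfile.c is not shown
here: Expat's documentation says that a single block of contiguous text may be
delivered as a sequence of calls to the character data handler, and that the
string passed in is not NUL-terminated. In a large file, a value that straddles
a read-buffer boundary could arrive as "9780816" and then "044382", and the
"blank" call would be the newline between two elements. The sketch below
appends each chunk to a buffer and only uses the text in the end-element
handler. The handler signatures mirror Expat's, assuming XML_Char is char;
ParseState, FIELD_MAX, on_start, on_chars and on_end are hypothetical names,
not taken from the attached code.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 64  /* hypothetical limit, large enough for an ISBN */

/* Hypothetical per-parse state; not taken from processfile.c. */
typedef struct {
    char   text[FIELD_MAX];    /* text accumulated for the current element */
    size_t used;               /* number of bytes currently in text */
    char   isbn13[FIELD_MAX];  /* last completed ISBN13 value */
} ParseState;

/* Same shape as Expat's XML_StartElementHandler (XML_Char assumed char).
   A new element starts, so discard whatever was collected before it,
   including whitespace between elements. */
static void on_start(void *userData, const char *name, const char **atts)
{
    ParseState *st = userData;
    (void)name;
    (void)atts;
    st->used = 0;
    st->text[0] = '\0';
}

/* Same shape as Expat's XML_CharacterDataHandler. The data in s is not
   NUL-terminated, and one text node may arrive in several calls, so append
   to the buffer instead of overwriting it. Excess text is truncated. */
static void on_chars(void *userData, const char *s, int len)
{
    ParseState *st = userData;
    size_t room, n;

    if (len <= 0)
        return;
    room = FIELD_MAX - 1 - st->used;
    n = (size_t)len < room ? (size_t)len : room;
    memcpy(st->text + st->used, s, n);
    st->used += n;
    st->text[st->used] = '\0';
}

/* Same shape as Expat's XML_EndElementHandler. Only here is the element's
   text known to be complete, so this is where it is copied out. */
static void on_end(void *userData, const char *name)
{
    ParseState *st = userData;

    if (strcmp(name, "ISBN13") == 0)
        strcpy(st->isbn13, st->text);  /* both buffers are FIELD_MAX bytes */
    st->used = 0;
    st->text[0] = '\0';
}
```

With the real library, handlers like these would be registered through
XML_SetUserData, XML_SetElementHandler and XML_SetCharacterDataHandler. Calling
on_start, then on_chars with "9780816" and "044382", then on_end for ISBN13
leaves the full 13 digits in isbn13, however the text was split.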

Any help greatly appreciated
-------------- next part --------------
A non-text attachment was scrubbed...
Name: processfile.c
Type: application/octet-stream
Size: 6765 bytes
Desc: not available
Url : http://mail.libexpat.org/pipermail/expat-discuss/attachments/20070515/e5f90c54/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Makefile
Type: application/octet-stream
Size: 457 bytes
Desc: not available
Url : http://mail.libexpat.org/pipermail/expat-discuss/attachments/20070515/e5f90c54/attachment-0001.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.xml
Type: text/xml
Size: 3084 bytes
Desc: not available
Url : http://mail.libexpat.org/pipermail/expat-discuss/attachments/20070515/e5f90c54/attachment.bin 

