[Expat-discuss] A way to handle malicious XML with Expat / was Re: Handling malicious XML with Expat - what options do I have?

Sebastian Pipping webmaster at hartwork.org
Fri Sep 12 22:29:33 CEST 2008


I've been playing around with the Expat API, feeding a parser
instance "a billion laughs" [1].  The approach I am taking is to
compute the fully expanded length of each entity value manually
inside a custom XML_EntityDeclHandler, and to abort parsing once
that length gets too large.  Demo code is attached; here is an
excerpt of its output:

  BEGIN handleEntityDeclaration
    laugh0 := "ha"
    Length is 2
  END

  BEGIN handleEntityDeclaration
    laugh1 := "&laugh0;&laugh0;"
    Length is 4
  END

  ..

  BEGIN handleEntityDeclaration
    laugh16 := "&laugh15;&laugh15;"
    Length is 131072
  END

  Content considered malicious XML, aborting

As Python's xml.parsers.expat module also exposes this callback
(as the EntityDeclHandler attribute of a parser object), I expect
this approach to work for Python as well.
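
To illustrate, here is a minimal Python sketch of the same idea. It is
not a port of the attached demo_1_0.cpp. The names make_guarded_parser,
MaliciousXmlError and build_laughs_document, as well as the limit of
100000 characters, are assumptions made up for this example. References
to entities that have not been declared yet are simply counted as
length 0, which a real implementation may want to treat more strictly.

```python
import re
import xml.parsers.expat

# Matches general entity references such as "&laugh0;" and leftover
# character references such as "&#60;".
_REFERENCE = re.compile(r"&(#?[^;&\s]+);")

# XML's five predefined entities each expand to a single character.
_PREDEFINED = {"lt": 1, "gt": 1, "amp": 1, "apos": 1, "quot": 1}


class MaliciousXmlError(Exception):
    """Raised when a declared entity would expand beyond the limit."""


def make_guarded_parser(limit=100_000):
    """Return (parser, lengths); lengths maps entity name -> expanded length.

    Hypothetical helper; the limit is an arbitrary choice for this sketch.
    """
    lengths = dict(_PREDEFINED)
    parser = xml.parsers.expat.ParserCreate()

    def handle_entity_decl(name, is_parameter_entity, value, base,
                           system_id, public_id, notation_name):
        if is_parameter_entity or value is None:
            # Parameter, external and unparsed entities are not
            # tracked in this sketch.
            return
        total = 0
        pos = 0
        for match in _REFERENCE.finditer(value):
            total += match.start() - pos  # literal text before the reference
            ref = match.group(1)
            if ref.startswith("#"):
                total += 1  # a character reference yields one character
            else:
                # Entities not declared yet count as 0 in this sketch.
                total += lengths.get(ref, 0)
            pos = match.end()
        total += len(value) - pos  # literal text after the last reference
        lengths.setdefault(name, total)  # the first declaration is binding
        if total > limit:
            raise MaliciousXmlError(
                f"entity {name!r} expands to {total} characters")

    parser.EntityDeclHandler = handle_entity_decl
    return parser, lengths


def build_laughs_document(depth):
    """Build a small "billion laughs"-style document (test input only)."""
    decls = ['<!ENTITY laugh0 "ha">']
    for i in range(1, depth + 1):
        decls.append(f'<!ENTITY laugh{i} "&laugh{i - 1};&laugh{i - 1};">')
    return ("<?xml version='1.0'?>\n<!DOCTYPE laughs [\n"
            + "\n".join(decls)
            + f"\n]>\n<laughs>&laugh{depth};</laughs>\n")


if __name__ == "__main__":
    parser, lengths = make_guarded_parser()
    try:
        parser.Parse(build_laughs_document(16), True)
    except MaliciousXmlError as error:
        print("Content considered malicious XML, aborting:", error)
```

The exception raised inside the handler propagates out of
parser.Parse(), so the document body (and with it the actual entity
expansion) is never reached once a declaration crosses the limit.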

Comments welcome.



Sebastian


[1] http://www.cogsci.ed.ac.uk/~richard/billion-laughs.xml


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: demo_1_0.cpp
URL: <http://mail.libexpat.org/pipermail/expat-discuss/attachments/20080912/c6d64c94/attachment.txt>

