From jean at upfrontsystems.co.za Thu Oct 2 11:33:18 2003
From: jean at upfrontsystems.co.za (Jean Jordaan)
Date: Sun Aug 10 16:54:54 2008
Subject: [Zope-xml] Expat missing the internal subset
Message-ID: <3F7C453E.808@upfrontsystems.co.za>
Hi all
Has anyone seen behaviour like this? We have Python 2.2.3 on two
different boxes, and both report pyexpat.EXPAT_VERSION as
'expat_1.95.6'. On both I do this:
"""
>>> from Products.ParsedXML.ParsedXML import createDOMDocument
>>> from Products.ParsedXML.ExtraDOM import writeStream
>>> xml = u"""\
...
...
...
...
...
...
...
... ]>
... """
>>> dom = createDOMDocument(xml)
>>> dom
>>> writeStream(dom).getvalue()
"""
On the first box, this returns:
"""
u'\n\n \n \n \n \n \n \n]>\n\n'
"""
On the second box, I see only:
"""
u'\n\n\n'
"""
I managed to trace it as far as this point in ParsedXML.DOM.ExpatBuilder:
"""
def _setup_subset(self, buffer):
    """Load the internal subset if there might be one."""
    if self.document.doctype:
        extractor = InternalSubsetExtractor()
        extractor.parseString(buffer)
        subset = extractor.getSubset()
        if subset is not None:
            d = self.document.doctype.__dict__
            d['internalSubset'] = subset
"""
On the first box, when I step into `getSubset`, I see this:
"""
(Pdb) self.subset
[u'[', u'\n ', u'', u'\n ',
u'', u'\n ', u'', u'\n ', u'', u'\n ',
u'', u'\n ', u'', u'\n ', u'', u'\n', u']']
"""
On the second box, when I do that, I don't see the enclosing
square brackets:
"""
(Pdb) self.subset
[u'\n ', u'', u'\n ',
u'', u'\n ', u'', u'\n ', u'', u'\n ',
u'', u'\n ', u'', u'\n ', u'', u'\n']
"""
and so the extracted subset is empty. Any ideas would be most
gratefully received.
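
To compare the two boxes without ParsedXML in the way, something like
the following could be run on each. It is only a rough sketch against
the standard xml.parsers.expat module (written for a current Python,
not 2.2.3): the sample document and the name probe_internal_subset are
made up for illustration, and it is not the code ParsedXML's
InternalSubsetExtractor actually uses. It records the
has_internal_subset flag that expat passes to the doctype handler,
plus every piece the default handler receives while inside the DOCTYPE
declaration.

```python
import xml.parsers.expat as expat

# Hypothetical sample document, for illustration only.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY greeting "hello">
]>
<doc/>
"""


def probe_internal_subset(text):
    """Return (has_internal_subset flag, pieces seen by the default handler).

    The flag is None if the document has no DOCTYPE declaration at all.
    """
    parser = expat.ParserCreate()
    state = {"flag": None, "pieces": [], "in_doctype": False}

    def start_doctype(name, system_id, public_id, has_internal_subset):
        # expat reports here whether an internal subset is present.
        state["flag"] = has_internal_subset
        state["in_doctype"] = True

    def end_doctype():
        state["in_doctype"] = False

    def default(data):
        # Collect only what arrives between the doctype start and end events.
        if state["in_doctype"]:
            state["pieces"].append(data)

    parser.StartDoctypeDeclHandler = start_doctype
    parser.EndDoctypeDeclHandler = end_doctype
    parser.DefaultHandler = default
    parser.Parse(text, True)
    return state["flag"], state["pieces"]


if __name__ == "__main__":
    flag, pieces = probe_internal_subset(SAMPLE)
    print("EXPAT_VERSION:", expat.EXPAT_VERSION)
    print("has_internal_subset:", flag)
    print("pieces:", pieces)
```

If the flag is set on both boxes but only one of them shows the square
brackets among the pieces, that would suggest the two pyexpat builds
differ in what they hand to the default handler, rather than in whether
they notice the internal subset.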
--
Jean Jordaan
http://www.upfrontsystems.co.za