From jean at upfrontsystems.co.za Thu Oct 2 11:33:18 2003
From: jean at upfrontsystems.co.za (Jean Jordaan)
Date: Sun Aug 10 16:54:54 2008
Subject: [Zope-xml] Expat missing the internal subset
Message-ID: <3F7C453E.808@upfrontsystems.co.za>

Hi all

Has anyone seen behaviour like this?

We have python2.2.3 on two different boxes. Both have
pyexpat.EXPAT_VERSION: 'expat_1.95.6'

On both I do this:
"""
>>> from Products.ParsedXML.ParsedXML import createDOMDocument
>>> from Products.ParsedXML.ExtraDOM import writeStream
>>> xml = u"""\
...
...
...
...
...
...
...
... ]>
... """
>>> dom = createDOMDocument(xml)
>>> dom
>>> writeStream(dom).getvalue()
"""

On the first box, this returns:
"""
u'\n\n \n \n \n \n \n \n]>\n\n'
"""

On the second box, I see only:
"""
u'\n\n\n'
"""

I managed to trace up to this point of ParsedXML.DOM.ExpatBuilder:
"""
    def _setup_subset(self, buffer):
        """Load the internal subset if there might be one."""
        if self.document.doctype:
            extractor = InternalSubsetExtractor()
            extractor.parseString(buffer)
            subset = extractor.getSubset()
            if subset is not None:
                d = self.document.doctype.__dict__
                d['internalSubset'] = subset
"""

On the first box, when I step into `getSubset`, I see this:
"""
(Pdb) self.subset
[u'[', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n', u']']
"""

On the second box, when I do that, I don't see the enclosing
square brackets:
"""
(Pdb) self.subset
[u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n ', u'', u'\n']
"""
and so the subset extracted is empty.

Any ideas would be most gratefully considered ..

-- 
Jean Jordaan
http://www.upfrontsystems.co.za
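
P.S. One way to narrow this down might be to take ParsedXML out of the picture and ask pyexpat directly which tokens it hands to the default handler while it is inside the DOCTYPE declaration. The following is only a minimal sketch under stated assumptions: it uses the standard xml.parsers.expat bindings, the sample document and the helper name collect_subset_tokens are hypothetical (they are not part of ParsedXML), and it is written for a current Python rather than 2.2.3.

```python
import xml.parsers.expat

# Hypothetical sample document; the declaration inside the subset is
# illustrative only and is not the document from the report above.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ELEMENT doc (#PCDATA)>
]>
<doc>text</doc>
"""


def collect_subset_tokens(text):
    """Return (has_internal_subset, tokens) as reported by pyexpat.

    tokens holds every string passed to DefaultHandler between the
    start and the end of the DOCTYPE declaration.
    """
    parser = xml.parsers.expat.ParserCreate()
    tokens = []
    state = {"in_doctype": False, "has_subset": None}

    def start_doctype(name, system_id, public_id, has_internal_subset):
        # pyexpat passes the flag as an integer; normalise it to a bool.
        state["has_subset"] = bool(has_internal_subset)
        state["in_doctype"] = True

    def end_doctype():
        state["in_doctype"] = False

    def default(data):
        # Record only what arrives while the DOCTYPE is open.
        if state["in_doctype"]:
            tokens.append(data)

    parser.StartDoctypeDeclHandler = start_doctype
    parser.EndDoctypeDeclHandler = end_doctype
    parser.DefaultHandler = default
    parser.Parse(text, True)
    return state["has_subset"], tokens


has_subset, tokens = collect_subset_tokens(SAMPLE)
print(has_subset, tokens)
```

Running this on both boxes and comparing the printed token lists should show whether the difference (the '[' and ']' tokens reaching the default handler or not) already exists at the pyexpat level, or only appears further up in ParsedXML.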