[Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode

Martijn Faassen faassen at startifact.com
Tue Jan 16 17:23:26 EST 2007


Tres Seaver wrote:
[snip]
>> The "just store the XML" scenario is surprisingly nice. It only needs 
>> attention to encoding and decoding in the always complicated ZPublisher 
>> direct output scenario, and in the edit form scenario.
> 
> As you speculated, this is actually my preference, except that I don't
> see the need in scenario D to recode the data and strip the prolog
> encoding attribute.  Why wouldn't we just use the XML template's own
> declared encoding to encode any data substituted into the template?  I
> mean, if the user has marked up the document to indicate a "preferred"
> encoding, why should we bother storing such an encoding in another location?

Yes, I was thinking along those lines too.
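
A minimal sketch of the idea above, in modern Python: read the encoding
from the template's own XML declaration and use it to encode substituted
data. The names declared_encoding and encode_for_template are
hypothetical, not part of zope.tal, and the regular expression assumes an
ASCII-compatible encoding (it would not see a UTF-16 declaration).

```python
import re

# Hypothetical helper, not a zope.tal API. Matches an XML declaration at
# the very start of the source and captures the optional encoding value.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
    rb'[^?]*\?>'
)


def declared_encoding(source: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, else the default.

    UTF-8 is what XML itself assumes when no declaration or BOM is present.
    """
    match = _XML_DECL.match(source)
    if match and match.group(1):
        return match.group(1).decode("ascii")
    return default


def encode_for_template(value: str, template_source: bytes) -> bytes:
    """Encode substituted text with the template's own declared encoding."""
    return value.encode(declared_encoding(template_source))
```

With this approach the prolog is the single place the encoding is
recorded, so there is no second stored value that can drift out of sync.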

> Then the only time we would need to munge the document would be at
> inclusion time, which is the only time we actually *need* to have
> unicode in hand.  We might even elide the decode-recode stage if the
> target document uses the same encoding!  Such an optimization might
> not be worth the complexity, however.

Yes, one complexity is that trying to do this would break the assumption 
that ZPT templates always return unicode or pure-ascii strings, not 
anything else (such as encoded data). Only at the last phase of the 
publisher will it be encoded into something else. I really appreciate 
keeping this assumption in place. :)
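
To illustrate the assumption being defended here, the following sketch
keeps everything as text from inclusion onward and encodes exactly once,
at the end. All three function names are hypothetical; same_encoding only
shows what the "same encoding" check for the proposed shortcut might look
like, using the standard codecs registry to normalize aliases.

```python
import codecs


def same_encoding(a: str, b: str) -> bool:
    """Compare two encoding names after normalizing aliases."""
    return codecs.lookup(a).name == codecs.lookup(b).name


def include_fragment(stored: bytes, stored_encoding: str) -> str:
    """Decode stored XML at inclusion time, the only point text is needed."""
    return stored.decode(stored_encoding)


def publish(rendered: str, output_encoding: str) -> bytes:
    """Encode once, in the last phase, as the publisher would."""
    return rendered.encode(output_encoding)


# The template result stays text until publish() is called.
page = "<p>" + include_fragment(b"caf\xe9", "latin-1") + "</p>"
body = publish(page, "utf-8")
```

Skipping the decode step when same_encoding is true would mean passing
encoded bytes through the template, which is exactly what breaks the
"templates return unicode" assumption.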

> Note that in the inclusion case (scenario E), we almost certainly
> *should* be stripping the *entire* prolog, which is only valid at the
> start of the merged document. 

If you are including it as a document, yes. If you are including it 
quoted, as for instance the contents of a text area allowing you to edit 
the XML text directly, then no. This suggests we actually have two 
scenarios here.
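
The two cases can be sketched as follows. Both function names are
hypothetical, and the prolog handling is deliberately simplified: it
removes only a leading XML declaration and a simple DOCTYPE, not comments
or processing instructions that may also appear before the root element.

```python
import re
from xml.sax.saxutils import escape

# Simplified patterns (an assumption of this sketch, not a full XML parser).
_XML_DECL = re.compile(r'\A<\?xml\b[^>]*\?>\s*')
_DOCTYPE = re.compile(r'\A<!DOCTYPE[^>\[]*(?:\[[^\]]*\])?\s*>\s*')


def as_structure(xml_text: str) -> str:
    """Included as a document: drop the prolog, keep the markup."""
    body = _XML_DECL.sub('', xml_text, count=1)
    return _DOCTYPE.sub('', body, count=1)


def as_quoted(xml_text: str) -> str:
    """Included quoted (e.g. in a textarea): keep the prolog, escape markup."""
    return escape(xml_text)
```

In the quoted case the prolog is part of what the user edits, so removing
it would lose information when the form is submitted.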

> I guess there is a subscenario, which is
> that the "included" document is actually the 'main_template' supplying
> the prolog:  METAL should probably leave the prolog alone, while
> 'tal:replace' and 'tal:content' (with 'structure') would strip it?

Yay, another scenario. :)

Regards,

Martijn


