[Zope3-dev] Re: Input encoding of a PageTemplateFile

Wed Aug 10 10:14:52 EDT 2005

Fred Drake wrote:
> On 7/22/05, Dmitry Vasiliev <lists at hlabs.spb.ru> wrote:
> 
>>I think about the following generic algorithm:
>>
>>1. Preparation stage. Content type and encoding are determined from the
>><?xml?>/<meta> declarations. For 'text/html' content that is not already
>>Unicode, we decode the content. For 'text/xml' content, the parser takes
>>care of the encoding at the cooking stage. We can do this somewhere inside the
>>PageTemplate.pt_edit()/PageTemplate.write() methods.
> 
> 
> This is probably right; I'll have to look at the code again.
> 
> 
>>2. Cooking stage. Nothing interesting for our case.
> 
> 
> Wrong; this is when the "bytecode" is generated.  At this point, we
> can remove the encoding markers (since we've already used them for
> input).
> 
> 
>>3. Rendering stage. Now we can strip the <?xml?>/<meta> declarations. We can do
>>this somewhere inside the PageTemplate.pt_render()/PageTemplate.__call__() methods.
> 
> 
> Rendering is the most costly stage, so we want to reduce the work done
> here.  Avoiding it entirely is best.  By removing the encoding markers
> at compilation time, we manage to have nothing else to do at this
> stage.

OK. I now think that all of this can be done somewhere inside zope.tal. I need to
write a proposal...
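
To make the idea concrete, here is a minimal sketch of the preparation step (sniff the encoding from the <?xml?>/<meta> declaration and decode) and of stripping the markers once they have been used. It is written for current Python, where bytes and str are distinct types. All names in it (sniff_encoding, prepare_html, strip_markers, the regular expressions, the utf-8 fallback) are hypothetical and are not part of zope.tal or zope.pagetemplate; a real implementation would more likely hook into the parser than rely on regular expressions.

```python
import re

# Hypothetical helpers for illustration only; not the zope.tal API.

# Encoding named in an XML declaration at the very start of the input.
XML_DECL = re.compile(
    rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\'][^>]*\?>')
# Encoding named in a <meta ... charset=...> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE)

# Text-mode patterns used to drop the markers after decoding.
XML_DECL_TEXT = re.compile(r'^\s*<\?xml[^>]*\?>\s*')
META_CHARSET_TEXT = re.compile(r'<meta[^>]+charset\s*=[^>]*>\s*', re.IGNORECASE)


def sniff_encoding(data, default="utf-8"):
    """Return the encoding declared in the raw bytes, or the assumed default."""
    for pattern in (XML_DECL, META_CHARSET):
        match = pattern.search(data)
        if match:
            return match.group(1).decode("ascii")
    return default


def prepare_html(data):
    """Preparation stage for 'text/html': decode unless already Unicode."""
    if isinstance(data, str):
        return data
    return data.decode(sniff_encoding(data))


def strip_markers(text):
    """Remove the <?xml?> declaration and the charset <meta> tag, once each."""
    text = XML_DECL_TEXT.sub("", text, count=1)
    return META_CHARSET_TEXT.sub("", text, count=1)
```

Calling strip_markers() once when the template is compiled, rather than on every render, matches Fred's point that the rendering stage should have nothing left to do.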

>>BTW, just curious: why do we need to read HTML files in text mode (see
>>PageTemplateFile._read_file())?
> 
> 
> I don't remember, but it seemed important at the time.  It likely has
> something to do with newline normalization; the XML parser handles
> that for us since the XML specification requires it to, but the HTML
> parser doesn't bother.
> 
> I doubt this is important in practice, but it may be relied on in the tests.

Maybe we can use "universal newlines" mode instead?
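
As a rough illustration of what that would give us: in current Python, opening a file in text mode with newline=None enables universal newlines, so "\r\n" and a bare "\r" are both translated to "\n" on input (in Python 2 the "U" mode flag played this role). The helper name read_normalized and the utf-8 default below are assumptions for the sketch, not what PageTemplateFile._read_file() does.

```python
def read_normalized(path, encoding="utf-8"):
    """Read a template file with newline normalization.

    newline=None selects universal newlines mode: every "\r\n" and
    lone "\r" in the file is returned as "\n".
    """
    with open(path, "r", encoding=encoding, newline=None) as f:
        return f.read()
```

With this, a file containing the bytes "a\r\nb\rc\n" is read back as "a\nb\nc\n" regardless of the platform, which is the normalization the XML parser already performs for 'text/xml' input.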

-- 
Dmitry Vasiliev (dima at hlabs.spb.ru)
     http://hlabs.spb.ru