[Zope3-dev] MessageID as rocks?

16 Apr 2003 11:21:04 -0400

So I checked in a bunch of updates to the Zope3 i18n support yesterday,
and updated the jobboardi18n product as well.  I had one issue come up
that I'm not sure I implemented correctly.

In Zope3 we have a class called MessageID (in zope.i18n.messageid), which
is a subclass of the built-in unicode type.  MessageID has three
additional attributes: domain, default, and mapping.  domain is the
application domain, default is the default text to use if the msgid
isn't found in the catalog, and mapping is a dictionary to use for
interpolation purposes.
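
A minimal sketch of the shape described above, not the actual
zope.i18n.messageid code.  It is written for Python 3, so `str` stands in
for `unicode`.  The `MessageIDFactory` helper, the 'jobboard' domain name,
and the fallback of `default` to the msgid itself are assumptions made for
illustration.

```python
# Python 3 stand-in: `str` plays the role of Python 2's `unicode`.
class MessageID(str):
    """Illustrative only; not the actual zope.i18n.messageid code."""

    def __new__(cls, text, domain=None, default=None):
        self = super().__new__(cls, text)
        self.domain = domain
        # Assumption: fall back to the msgid itself when no default is given.
        self.default = text if default is None else default
        self.mapping = {}
        return self


class MessageIDFactory:
    """Hypothetical factory that produces the _() wrapper for one domain."""

    def __init__(self, domain):
        self.domain = domain

    def __call__(self, text, default=None):
        return MessageID(text, self.domain, default)


_ = MessageIDFactory('jobboard')  # the domain name is made up

error = _('Bad date string: $date')
error.mapping['date'] = 'now'

# Still a string, but one that carries its translation context along.
print(isinstance(error, str), error.domain, error.mapping)

# Coercing to the plain string type drops the extra attributes.
print(hasattr(str(error), 'mapping'))
```

The last line is the crux of the problem discussed further down: once
something coerces the object to the plain string type, the domain and
mapping are gone and the message can no longer be translated properly.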

The one example I have of this is in the job board.  When entering a new
job, you enter a start date, which must be parseable in the locale's
date format.  Say you enter "now" as the start date.  This will raise an
exception which the JobCreateView.py code catches:

        fmt = self.request.locale.getDateFormatter('short')
        try:
            date = fmt.parse(startdate.strip())
        except DateTimeParseError:
            self.error = _('Bad date string: $date')
            self.error.mapping['date'] = startdate
            return self.preview()

And in the preview.pt file I have:

<p tal:condition="view/error"
   tal:content="view/error">Error messages go here</p>

So all this works because the talinterpreter now knows about MessageIDs:
when it's about to insert some text and that text is a MessageID, it
calls on the engine to translate it and uses the returned text.

This means tal is now dependent on zope.i18n, but I don't see a way
around that.

The real issue is that I had to add MessageID to
zope/security/checker.py's BasicTypes so that MessageIDs wouldn't get
proxied.  I found that if they were proxied, then MessageIDs would get
"magically" transformed into unicode objects, stripping them of the
attributes necessary to properly translate them.

This was happening in zope.tales.tales.Context.evaluateText(), which
examines the text returned from evaluating an expression (in this case
"view/error").  evaluateText() does an isinstance() check on the object,
and if it isn't one of the StringTypes, it unicode() coerces the text
before returning it.  Of course, this is the built-in isinstance(), and
in Python 2.2.2 it isn't "proxy aware".  In Python 2.3, isinstance()
will consult __class__ if that isn't the same as an object's type, so
for Python 2.3, the isinstance() check against StringTypes would pass
even for a proxied MessageID.

But in Python 2.2.2, this fails because __class__ isn't consulted.  By
not proxying MessageIDs, self.evaluate() would return a real MessageID
that would pass the isinstance test and not get unicode()'ified -- it's
already a(n instance of a subclass of) unicode.
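
The difference can be reproduced in a small sketch.  This is Python 3,
where the built-in isinstance() does fall back to __class__, so the
Python 2.2.2 situation is simulated with a wrapper that does not expose
the wrapped object's class.  `OpaqueProxy`, `ClassAwareProxy`, and
`evaluate_text` are hypothetical stand-ins, not the zope.security or
zope.tales code.

```python
class MessageID(str):
    """Cut-down stand-in: a string subclass carrying a mapping."""

    def __new__(cls, text):
        self = super().__new__(cls, text)
        self.mapping = {}
        return self


class OpaqueProxy:
    """Wrapper whose class tells isinstance() nothing about the wrapped object."""

    def __init__(self, obj):
        self._obj = obj

    def __str__(self):
        return str(self._obj)


class ClassAwareProxy(OpaqueProxy):
    """Wrapper that reports the wrapped object's class via __class__."""

    @property
    def __class__(self):
        return type(self._obj)


def evaluate_text(value):
    # Coerce anything that fails the string type check, roughly as
    # evaluateText() is described above.
    if not isinstance(value, str):
        value = str(value)
    return value


msg = MessageID('Bad date string: $date')
msg.mapping['date'] = 'now'

coerced = evaluate_text(OpaqueProxy(msg))    # fails the check, gets coerced
kept = evaluate_text(ClassAwareProxy(msg))   # passes via __class__
plain = evaluate_text(msg)                   # unproxied, passes directly

print(type(coerced).__name__, hasattr(coerced, 'mapping'))
print(kept.__class__.__name__, plain is msg)
```

The opaque wrapper comes back as a plain string with no mapping, which
matches the "magic" transformation described above; the other two cases
come back untouched.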

Maybe there's a better way to go about all this.  On the one hand, we
could add a proxy-aware isinstance-like function to tales and use that
wherever a type test is used.  I started to go down that path until it
got too ugly.  OTOH, maybe MessageIDs really should be rocks, i.e. basic
types that never get proxied.  The question then is whether a MessageID
could be exploited for some nefarious purpose or whether it is secure.
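
For reference, the first option (a proxy-aware isinstance-like function)
might look roughly like the sketch below.  The `Proxy` class and the
`proxy_isinstance` name are hypothetical; a real security proxy would not
hand its wrapped object to arbitrary callers, so a helper like this would
have to live in trusted code.

```python
class Proxy:
    """Hypothetical wrapper; real security proxies are more involved."""

    def __init__(self, wrapped):
        self._wrapped = wrapped


def proxy_isinstance(obj, types):
    """Like isinstance(), but looks through any layers of Proxy first."""
    # type() is used for the proxy test so the result does not depend on
    # whatever the object reports as its __class__.
    while type(obj) is Proxy:
        obj = obj._wrapped
    return isinstance(obj, types)


print(isinstance(Proxy('text'), str))        # the plain check fails
print(proxy_isinstance(Proxy('text'), str))  # the proxy-aware check passes
```

Every type test in tales would need to go through such a helper, which
is the part that gets ugly.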

The possible hole could be the mapping, which is intended to be
populated in trusted code, i.e. the code that contained the _() wrapper
in the first place.  What I don't know is whether untrusted code can get
at a MessageID's mapping attribute and stick evil stuff in it.  I'm not
sure it can.  The interpolation is done with a fairly straightforward
textual substitution, but it may be possible to stick some object there
with an __str__() that does something privileged.  I'm not sure how you
would get such an object or how you would stick it in the mapping
though.
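
To make the concern concrete, here is a sketch of a simple $name
substitution that calls str() on whatever it finds in the mapping.  The
`interpolate` function and the `Sneaky` class are hypothetical and only
illustrate the mechanism; they are not the Zope3 interpolation code, and
the "privileged" action here is just a counter.

```python
import re

# Matches $name placeholders such as $date.
_VAR = re.compile(r'\$([A-Za-z_][A-Za-z0-9_]*)')


def interpolate(text, mapping):
    """Replace each $name with str(mapping[name]); leave unknown names alone."""
    def replace(match):
        name = match.group(1)
        if name in mapping:
            # str() runs whatever __str__ the mapped object defines.
            return str(mapping[name])
        return match.group(0)
    return _VAR.sub(replace, text)


class Sneaky:
    """Object whose __str__ has a side effect."""
    calls = 0

    def __str__(self):
        Sneaky.calls += 1  # stand-in for "does something privileged"
        return 'looks harmless'


safe = interpolate('Bad date string: $date', {'date': 'now'})
risky = interpolate('Bad date string: $date', {'date': Sneaky()})
print(safe)
print(risky, Sneaky.calls)
```

Whether this matters depends on the open question above: whether
untrusted code can put an object of its own into the mapping at all.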

Comments?
-Barry