[Zope] Two issues for Z2.2: XHTML & malicious tags

Evan Simpson evan@4-am.com
Fri, 2 Jun 2000 13:50:36 -0400


----- Original Message -----
From: Alexander Limi <alexander@limi.net>
> 2. Malicious HTML tags - is anything being done here? Filtering of these is
> one of the features Zope 2.2 really shouldn't go without. Most Zope sites
> have user interaction in some way, and the concept of a post containing a
> stray </html>, or even worse - script-tags, destroying a page is totally
> unacceptable IMHO. I'd just like to query what the status is on this, as I
> think it is one of the most overlooked areas that are lacking in Zope.
>
> I know Evan Simpson (malicious tags) and Christopher Petrilli (HTML quality
> of zope) have been talking about this earlier, any comments?

I've got a rather crude module going which parses an input string for
HTML-ish tags.  It allows only tags from an explicit list, and ensures that
non-empty tags are closed (either by complaining or adding closing tags).
If 'script' is not one of the allowed tags, it also disallows all "On*"
attributes and "javascript:*" attribute values in any tag.
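
To make the rules above concrete, here is a minimal sketch of that kind of
filter. It is not the SafeHTML module itself: it assumes a modern Python and
swaps in the standard library's html.parser for sgmllib.py, and the names
ALLOWED_TAGS, EMPTY_TAGS, TagFilter and safe_html are hypothetical. It takes
the "adding closing tags" route rather than complaining, and it should not be
taken as a vetted sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist and empty-tag set, chosen only for illustration.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br"}
EMPTY_TAGS = {"br", "hr", "img"}


class TagFilter(HTMLParser):
    """Pass through allowed tags only and close whatever is left open."""

    def __init__(self, allowed=ALLOWED_TAGS):
        super().__init__(convert_charrefs=True)
        self.allowed = set(allowed)
        self.allow_script = "script" in self.allowed
        self.out = []        # pieces of filtered output
        self.open_tags = []  # stack of allowed, non-empty tags still open

    def _attr_ok(self, name, value):
        # Reject oddly formed attribute names outright.
        if not name.replace("-", "").isalnum():
            return False
        if self.allow_script:
            return True
        # Without 'script' allowed, drop on* handlers and javascript: values.
        if name.startswith("on"):
            return False
        return not (value or "").strip().lower().startswith("javascript:")

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed:
            return  # tag is dropped; any text inside it is still escaped
        rendered = "".join(
            " " + n if v is None else ' %s="%s"' % (n, escape(v, quote=True))
            for n, v in attrs
            if self._attr_ok(n, v)
        )
        self.out.append("<%s%s>" % (tag, rendered))
        if tag not in EMPTY_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray or disallowed closing tag, such as </html>
        # Close everything opened after this tag, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        # Add closing tags for anything the input left open.
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def safe_html(text, allowed=ALLOWED_TAGS):
    parser = TagFilter(allowed)
    parser.feed(text)
    parser.close()
    return parser.result()
```

With this sketch, a call such as safe_html('<p>open') comes back with the
missing </p> appended, and a stray </html> in the input is silently dropped.
The "javascript:" check here is a plain prefix match, so obfuscated values
would need more care.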

Unfortunately, it isn't very efficient (it's based on sgmllib.py).  I had
wanted to make it use SAX for the parsing, so that sgmlop or another
high-performance parser could be plugged in, but never got there.  It also
has no DTML-level interface; you'd have to wrap it in an External Method to
use it from DTML.

I've gone ahead and put it up at http://www.zope.org/Members/4am/SafeHTML in
case anyone can make something of it.

Cheers,

Evan @ digicool & 4-am