[ZCM] [ZC] 813/38 Resolve "host header as a vector for cross site scripting"

Mon Jan 19 18:46:10 EST 2004

Issue #813 Update (Resolve) "host header as a vector for cross site scripting"
 ** Security Related ** (Public)
 Status Resolved, Zope/bug critical
To followup, visit:
  http://collector.zope.org/Zope/813

==============================================================
= Resolve - Entry #38 by efge on Jan 19, 2004 6:46 pm

 Status: Accepted => Resolved

Resolved per poster's request.
________________________________________
= Comment - Entry #37 by leper on Jan 19, 2004 6:15 pm

ok, this bug can be resolved now and we can continue the
core issue at http://collector.zope.org/Zope/1191
________________________________________
= Accept - Entry #36 by Brian on Jan 19, 2004 5:31 pm

 Status: Resolved => Accepted

 Supporters added: Brian

Sorry - didn't mean to totally close it 
out (though maybe we should close it and 
make another issue with less to read 
through - this one is getting a little 
unwieldy...)

-BRL
________________________________________
= Comment - Entry #35 by leper on Jan 19, 2004 4:46 pm

yikes - this bug isn't resolvable yet. pls repen

To reitterate the core nature of this bug is that the Host header from the client request is trusted when it shouldn't be.  The XSS issues were merely one way to exploit this bug.
Other exploit techniques include cache poisoning, log tampering, etc.

What needs to happen is that Zope needs to know the domains its responsible for ahead of time, and when it receives requests for resources outside of those domains it needs to do something sensible.
________________________________________
= Resolve - Entry #34 by Brian on Jan 19, 2004 2:59 pm

 Status: Pending => Resolved

applied / ported to 2.6 branch, 2.7 branch and head.

-BL
________________________________________
= Comment - Entry #33 by leper on Jan 18, 2004 5:12 am

The 2.6.3/2.7.0b4 releases incorporated a great deal of the work I've done on the XSS issues mentioned in this bug report, but there continue to be some areas which have remained unpatched.  My megapatch is still maintained against CVS HEAD, and can still be found at http://audible.transient.net/zope/813.diff

The issues it continues to address at this time are:
- bad HTML in the FindSupport DTML
- XSS in FindSupport (bug #407)
- bad DTML logic in FindSupport (bug #706)
- horribly inefficient HTTPResponse.quoteHTML() method
- other various XSS fallout from the trusted Host header which didn't make it into the 2.6.3/2.7.0b4 update

(The 2.6.3/2.7.0b4 update doesn't address the misplaced trust in the Host header in any way, which is the *core* issue behind this bug report, despite the amount of effort thats gone into fixing related XSS holes.)
________________________________________
= Comment - Entry #32 by leper on Jun 2, 2003 4:08 am

bah.

^cleaning^cleanly

^406^407

If I could spell, I'd be more dangerous than this issue.
________________________________________
= Comment - Entry #31 by leper on Jun 2, 2003 4:03 am

I've combined all previous patches and updated them to patch cleaning against CVS HEAD as of 20030601.  This überdiff may be retrieved from http://audible.transient.net/zope/813.diff

As a side effect, this patch fixes the bugs reported in issues 406, 706, and 734 as well.
________________________________________
= Comment - Entry #30 by htrd on Mar 14, 2003 8:29 am

Wow. Thankyou both for the correction.
________________________________________
= Comment - Entry #29 by tseaver on Mar 14, 2003 6:37 am

> = Comment - Entry #28 by leper on Mar 14, 2003 4:37 am
> 
> > But your patches also use cgi.quote in contexts that render
> > into other contexts where cgi.escape does the wrong thing. For
> > example, <img src="X"> and <a href="X">. A URL containing a &
> > character should not be converted into &amp; in that context.
> 
> Ah yes, this is a common misunderstanding.  I'm afraid you're wrong.
> 
> See section C.12 of the XHTML1.0 spec.
> http://www.w3.org/TR/2000/REC-xhtml1-20000126/#guidelines

More to the point is the "Basic HTML Datatypes" portion
of the HTML4 spec:

  http://www.w3.org/TR/html401/types.html#type-uri

Section B.2.2 states:

  B.2.2 Ampersands in URI attribute values

  The URI that is constructed when a form is submitted may be
  used as an anchor-style link (e.g., the href attribute for the
  A element). Unfortunately, the use of the "&" character to
  separate form fields interacts with its use in SGML attribute
  values to delimit character entity references. For example, to
  use the URI "http://host/?x=1&y=2" as a linking URI, it must
  be written <A href="http://host/?x=1&#38;y=2"> or
  <A href="http://host/?x=1&amp;y=2">.

  We recommend that HTTP server implementors, and in particular,
  CGI implementors support the use of ";" in place of "&" to
  save authors the trouble of escaping "&" characters in this
  manner.
________________________________________
= Comment - Entry #28 by leper on Mar 14, 2003 4:37 am

> But your patches also use cgi.quote in contexts that render
> into other contexts where cgi.escape does the wrong thing. For
> example, <img src="X"> and <a href="X">. A URL containing a &
> character should not be converted into &amp; in that context.

Ah yes, this is a common misunderstanding.  I'm afraid you're wrong.

See section C.12 of the XHTML1.0 spec.
http://www.w3.org/TR/2000/REC-xhtml1-20000126/#guidelines

Anyway, I'm happy to discuss this stuff but lets do it on zope-dev, the collector makes a lousy forum.
________________________________________
= Comment - Entry #27 by htrd on Mar 14, 2003 4:19 am

> Correct, because *URL*, BASE*, etc. can be subverted and need to be quoted.

If we consider these to be arbitrary string properties then yes, I agree. 
However we know they are URLs, and rfc 2396 defines a set of characters
that MUST be escaped in a URL. This includes all the characters that are
relevant to this security issue, < > and ".

The CGI spec states that all our url properties must be url encoded.

Note that this is not magic. rfc 2396 section 2.4.3 states that the
rationale for this requirement on URLs is to allow simplifying
assumptions in applications like Zope.

I am proposing two options:

Either ZPublisher checks for unquoted characters that violate the url
quoting requirements, and rejects the request. 

Alternatively, ZPublisher *normalises* all URL properties following the
definition of in rfc 2396. This means adding any necessary quoting. (This
option feels a little more like magic, but may be necessary to fulfil our
"generous in what you accept" obligations)

(Either way I think we should also remove unnecessary quoting too. A bug
like this was recently fixed in squid.)

>> Has consideration been given to the option of getting ZPublisher to
>> validate the host header, and quote all urls before passing them to
>> the application?

Yes. You can't do that. That would be making the assumption that you know
the context the variable will be rendered in

No, I am assuming knowledge of the data type, and taking advantage of
restrictions on the format of a URL.

DTML that renders URL1 (etc) into html or xml without any quoting are
also taking advantage of knowledge about the data type. A properly
quoted URL is can not be used to subvert html, so this dtml should be
safe. Today is unsafe to assume that URL1 (etc) is properly quoted,
but only because of poor input validation.

DTML that renders URL1 into a different (non-html) markup may need to
apply its own quoting, or course. Thats a seperate issue - I dont think
we have any examples of that here. (btw, I have same code that renders
urls into csv that needs attention)

>> Also, I think many of those calls to cgi.quote should really be
>> urllib.quote. right?

> Not really. This, again, is an issue of context. urllib.quote can be
> used in a variety of non-presentation contexts when you know exactly what
> you're quoting (path segments, query string variables, etc.).

Yes, I agree.

> you're able to use contextual quoting to ensure special characters don't
> funge anything. In the case of HTML thats cgi.escape.

Yes, I agree cgi.quote can be used in contexts that render into html text.
(I argued above that it should be unnecessary, except from a
belts-and-braces point of view. I probably agree it is worth doing even
if unnecessary.)

But your patches also use cgi.quote in contexts that render into other
contexts where cgi.escape does the wrong thing. For example, 
<img src="X"> and <a href="X">. A URL containing a & character should not
be converted into &amp; in that context. (badapplication.diff is an 
example of a bad patch in this category)

If belts-and-braces is desirable then I think we need a *new* function.
Im not yet sure what.... Possibly something that validates rather than
transforms.

________________________________________
= Comment - Entry #26 by leper on Mar 13, 2003 4:59 pm

> This proposed suite of patches isnt the fix that I was expecting to
> see to this problem. These patches work under the assumption that
> URL1 etc can be subverted by a malicious user, and need to be quoted
> everywhere that it is used.

Correct, because *URL*, BASE*, etc. can be subverted and need to be quoted.

> Has consideration been given to the option of getting ZPublisher to
> validate the host header, and quote all urls before passing them to
> the application?

Yes. You can't do that. That would be making the assumption that you know the context the variable will be rendered in, and thats not a valid assumption. It is, at first, the most obvious solution, sadly its not an acceptable one.

> Also, I think many of those calls to cgi.quote should really be
> urllib.quote. right?

Not really.  This, again, is an issue of context.  urllib.quote can be used in a variety of non-presentation contexts when you know exactly what you're quoting (path segments, query string variables, etc.).  It can also be used in presentation contexts, but only if what your quoting hasn't already been quoted, and only if you're selective about the parts of the URI you quote.
By the time you've reached presentation context its common that you know longer now the exact nature of the URI (it may have already undergone a degree of urllib.quote()ing, which makes it unsafe to use urllib.quote.  cgi.escape OTOH is of a different nature.  You can't use cgi.escape in non-presentation contexts because it doesn't make any sense to do so. (places where it is used anyway are generally bugs)  By the time your variables have trickled down to the presentation layer (Woah Reaganomics!) you're able to use contextual quoting to ensure special characters don't funge anything.  In the case of HTML thats cgi.escape. (Although, IMO, "cgi" isn't the correct module, an HTML module would be better, but then HTML modules tend to carry more baggage.)
________________________________________
= Comment - Entry #25 by efge on Mar 13, 2003 12:15 pm

FWIW I agree with the sample of patches I've read (it's a bit of a
PITA to read them all) and the approach taken.

I don't like "magic" quoting in ZPublisher.

________________________________________
= Unrestrict_pending - Entry #24 by htrd on Mar 13, 2003 11:44 am

This proposed suite of patches isnt the fix that I was expecting to see to this problem. These patches work under the assumption that URL1 etc can be subverted by a malicious user, and need to be quoted everywhere that it is used.

Has consideration been given to the option of getting ZPublisher to validate the host header, and quote all urls before passing them to the application? Applications could continue using BASEPATH1 (etc) in the traditional manner.

(Note I am not arguing that this would be better. I havent thought about it enough to reach a conclusion. Im just stating that this approach was unexpected)

Also, I think many of those calls to cgi.quote should really be urllib.quote. right?

________________________________________
= Comment - Entry #23 by leper on Mar 6, 2003 5:00 am

Uploaded:  "badaqueduct2.diff"
 - http://collector.zope.org/Zope/813/badaqueduct2.diff/view
Finished the grep audit for PATH_INFO, HTTP_HOST, PATH_TRANSLATED, HTTP_REFERER, SERVER_NAME, and QUERY_STRING.  Only found one instance of HTTP_REFERER in aqueduct that got past me the first time.  This is the last grep audit I'm conducting for this issue.

Given the results of this restricted mini-audit, I think its fair to say Zope could use a serious full-scale audit with respect to contextual escaping.

A while back I wrote a long message to the Zope list about how to use a VHM to protect against this vulnerability, and what the scope of the problem is.  You can find that at http://marc.theaimsgroup.com/?l=zope&m=104639584701163&w=2 should there be any doubts as to why this bug was filed.
________________________________________
= Comment - Entry #22 by leper on Mar 5, 2003 9:08 pm

Uploaded:  "sillyzcatvocab.diff"
 - http://collector.zope.org/Zope/813/sillyzcatvocab.diff/view
*URL* fallout final

patches prefixed with "silly" don't address security problems, just needless dtml-var usage when entity quoting is technically required

It has now been 3 weeks since I opened this bug and I haven't heard a thing in response to it.  Why?
________________________________________
= Comment - Entry #21 by leper on Mar 5, 2003 9:05 pm

Uploaded:  "sillyzcatview.diff"
 - http://collector.zope.org/Zope/813/sillyzcatview.diff/view
*URL* fallout
________________________________________
= Comment - Entry #20 by leper on Mar 5, 2003 9:04 pm

Uploaded:  "sillyscripttry.diff"
 - http://collector.zope.org/Zope/813/sillyscripttry.diff/view
*URL* fallout

heh, just when you think you an entity quoted context is what you want...
________________________________________
= Comment - Entry #19 by leper on Mar 5, 2003 9:01 pm

Uploaded:  "sillypivocab.diff"
 - http://collector.zope.org/Zope/813/sillypivocab.diff/view
*URL* fallout
________________________________________
= Comment - Entry #18 by leper on Mar 5, 2003 8:59 pm

Uploaded:  "badvhosting.diff"
 - http://collector.zope.org/Zope/813/badvhosting.diff/view
*URL* fallout
________________________________________
= Comment - Entry #17 by leper on Mar 5, 2003 8:59 pm

Uploaded:  "badtuttopic.diff"
 - http://collector.zope.org/Zope/813/badtuttopic.diff/view
*URL* fallout
________________________________________
= Comment - Entry #16 by leper on Mar 5, 2003 8:57 pm

Uploaded:  "badfindresult.diff"
 - http://collector.zope.org/Zope/813/badfindresult.diff/view
*URL* fallout
________________________________________
= Comment - Entry #15 by leper on Mar 5, 2003 8:55 pm

Uploaded:  "baddtui.diff"
 - http://collector.zope.org/Zope/813/baddtui.diff/view
*URL* fallout

This file also uses  in the midst of a textarea.
I figure thats bad, but exactly what __str__ was in this context was a little hard to discern, and I'm not even sure it works.  Anyway, it probably needs to be switched entity syntax too... but my patch doesn't make that change.
________________________________________
= Comment - Entry #14 by leper on Mar 5, 2003 8:52 pm

Uploaded:  "baddtin.diff"
 - http://collector.zope.org/Zope/813/baddtin.diff/view
*URL* fallout
________________________________________
= Comment - Entry #13 by leper on Mar 5, 2003 8:51 pm

Uploaded:  "badcdefrep.diff"
 - http://collector.zope.org/Zope/813/badcdefrep.diff/view
*URL* fallout
________________________________________
= Comment - Entry #12 by leper on Mar 5, 2003 8:49 pm

Uploaded:  "badaqueduct.diff"
 - http://collector.zope.org/Zope/813/badaqueduct.diff/view
*URL* fallout

this whole file is begging for review/reworking - I only fixed the dtml misuse, the string interpolation bits are still a complete mess
________________________________________
= Comment - Entry #11 by leper on Mar 5, 2003 8:45 pm

Uploaded:  "badappmgr.diff"
 - http://collector.zope.org/Zope/813/badappmgr.diff/view
OK, grep audit for *URL* is done.  First some general comentary, then the patches.

lib/python/HelpSys/dtml/button.dtml - this jams insufficently untainted data into the midst of some javascript, I didn't want to deal with it, because the easiest way to fix it just remove all javascript, but I figure I'll face opposition if I submit a patch that does so.

lib/python/Shared/DC/ZRDB/dtml/customDefaultZPTReport.dtml - this is a page template, yet its named .dtml, silly

lib/python/TreeDisplay/TreeTag.py - the anchors this makes (around line 346) don't really undergo correct contextual quoting, but I'm not sure if things have come to rely on that or not, thus, no patch.

I skipped the webdav library entirely, for reasons stated earlier.  There are multiple areas where urls are thrown into the protocol output with no attempt at escaping, and a few where an attempt is made.  That should be fixed.
________________________________________
= Comment - Entry #10 by leper on Mar 3, 2003 8:53 pm

Uploaded:  "badmessagedialog.diff"
 - http://collector.zope.org/Zope/813/badmessagedialog.diff/view
Tangentially related to all this is the misuse of the MessageDialog class.  Before we get back to the meat of the issue, here is a bonus patch that addresses said misuse.
________________________________________
= Comment - Entry #9 by leper on Mar 2, 2003 3:12 am

Uploaded:  "badsimptree.diff"
 - http://collector.zope.org/Zope/813/badsimptree.diff/view
BASE* fallout final
________________________________________
= Comment - Entry #8 by leper on Mar 2, 2003 3:11 am

Uploaded:  "badapplication.diff"
 - http://collector.zope.org/Zope/813/badapplication.diff/view
BASE* fallout
________________________________________
= Comment - Entry #7 by leper on Mar 2, 2003 3:10 am

Uploaded:  "badmanagement.diff"
 - http://collector.zope.org/Zope/813/badmanagement.diff/view
Finished the "grep audit" for BASE* and DestinationURL.
DestinationURL came up clean, patches for BASE* attached.
________________________________________
= Comment - Entry #6 by leper on Mar 1, 2003 6:47 am

Uploaded:  "badsearch.diff"
 - http://collector.zope.org/Zope/813/badsearch.diff/view
absolute_url fallout final

________________________________________
= Comment - Entry #5 by leper on Mar 1, 2003 6:45 am

Uploaded:  "badramcachemgr.diff"
 - http://collector.zope.org/Zope/813/badramcachemgr.diff/view
absolute_url fallout
________________________________________
= Comment - Entry #4 by leper on Mar 1, 2003 6:44 am

Uploaded:  "badhelpsys.diff"
 - http://collector.zope.org/Zope/813/badhelpsys.diff/view
absolute_url fallout
________________________________________
= Comment - Entry #3 by leper on Mar 1, 2003 6:42 am

Uploaded:  "badaccelcachemgr.diff"
 - http://collector.zope.org/Zope/813/badaccelcachemgr.diff/view
I've finished a mini-"grep audit" for absolute_url abuse.

This is, of course, only one of the *many* places tainted host information (and more) can come from.  There's still URL*, BASE*, BASEPATH*, plus secondary functions like DestinationURL--the list is seemingly endless.  woo.  Nevertheless, this kind of thing has to be done sooner or later.  Even after some form of hostname canonicalization mechanism is inplace (which I believe is inevitable) raw URIs being placed into contexts that require certain quoting is bad and must be fixed.

There are certain absolute_url abuses I'm not going to issue patches for, and those are:

lib/python/OFS/Image.py in Image.tag() - Image.py is mess, I refuse to touch it, sorry.

lib/python/DocumentTemplate/DT_Var.py in render() - this code is hard to follow, I think its safe, but I'm not 100% sure.

lib/python/OFS/PropertySheets.py in dav__source() - definately unsafe, but the XML context of a WEBDAV transaction is a bit outside of my familiarity - I'm pretty sure a simple entity quoting like:
-            url=urlbase(vself.absolute_url())
+            url=escape(urlbase(vself.absolute_url()))
using escape from cgi will be adequate but a second set of eyes on that would be good.

lib/python/Products/PythonScripts/www/default_py - this is the default text of a new script (python) instance, and its broken, it can't decide what context it wants to return, its either markup or not, but just quoting script.title doesn't cut it, either quote all the variables or none of them.  Anyway, pick one and stick with it.

The rest of the problems I found I'm attaching patches for.  Patches are made against CVS, but apply to 2.6.1 without problems.
________________________________________
= Comment - Entry #2 by leper on Feb 26, 2003 12:00 am

Uploaded:  "badbase.diff"
 - http://collector.zope.org/Zope/813/badbase.diff/view
Here's the simple fix for the base href.  As base injection doesn't get cached by sites using RAM caches they were immune to the xss, however fronting proxy caches that index on URI and don't include the Host header in the key data would be vulnerable (in theory).

Sloppy use of URLx variables and such are still problematic, but thats not the zope machinery's fault.  What remains zope's fault are things like Image.py and lord knows what other modules, that are allowing untainted data to be injected into any part of a page which *IS* cached by RAM caches.

Again, please make this bug visible to all, while all the code hasn't been fixed yet, Zope users need to be aware of the work-around techniques they can use to protect their installations.
________________________________________
= Request - Entry #1 by leper on Feb 18, 2003 4:38 am

Please leave this issue public, its come up on the zope lists already, hiding it now is pointless. (subject: "Zope & SSL & IE6 redirection bug" on zope list)

HTTP host headers may be used as an injection point for cross site scripting code.  This is a critical problem for sites utilizing caches.

Information used for generating absolute URLs is obtained from the host header, in at least one case (that of the base href tag frequently injected into published html documents by ZPublisher) this information does not undergo the requisite contextual quoting, its possible there are more.  For sites which rely on caches to improve performance this means its possible to poison a cache with a document containing malicious markup.

Workarounds:
Using a VirtualHostMonster with a proxy server, hardcode the host information into the URI used for the proxy request. The VHM will override the Host header with the hardcoded data. (tested with Zope 2.6.1 and Apache+mod_rewrite+mod_proxy)

==============================================================