[ZWeb] DISCUSS: Monitoring Zope.org

Paul Everitt Paul@digicool.com
Fri, 1 Sep 2000 07:16:36 -0400


As many of you know, Zope.org has been sluggish and unresponsive at
times this week.  For the most part, this is related to our decision to
use it for quickly baking important software like ZEO.  As an aside, the
irony is that ZEO, once baked, will make Zope.org more immune to
downtime.  Go figure.

Anyway, we should have a discussion related to this question:

"How can the community find out the health of Zope.org when things are
flaky?"

I think it would be pretty useful if the community could help itself
answer the "what's wrong" question.  Imagine this exchange:

a. Some Zope newcomer posts to the zope mailing list saying Zope.org is
down.

b. One of the ten people on this list who know the correct procedure
replies with, "oh, it is a DNS problem" or whatever the explanation is.

This certainly helps Digital Creations, as we don't have to answer all
the time.  It helps the community as people get accurate answers, and
get them quickly.

With that in mind, I have a specific two-part proposal to help:

1) Zope.org sits behind Apache using mod_proxy for integration.  We
should find out what the timeout for proxy connections is.  If it is
configurable, we should dial it _way_ down (e.g. 20 seconds).  If Zope
doesn't respond, Apache should say so in a reasonable period of time.

2) Next, we should hack the Apache error page to be meaningful.  For
instance:

  a. It should have some of the Zope.org look and feel, so people feel
     like professionals are involved. :^)

  b. It should explain to them that, if they see this page, this likely
     means that DNS is OK, routing is OK, and Apache is OK, but Zope.org
     isn't responding.

  c. The error should give them a link telling them how to report the
     problem.

  d. Finally, the page could give them a link to a 'mod_status' page, if
     we decide that page doesn't have any sensitive information on it.
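
To make the "what's wrong" procedure concrete, here is a rough sketch in
Python of the layered check described in (b): DNS first, then the network
path to Apache, then whether Zope answers behind the proxy.  The function
names, the result labels, and the 20-second default are my own
assumptions for illustration, and treating 502/503/504 as "Apache is up
but the backend is not" is a guess about how the proxy reports a dead
backend, not something we have confirmed on our setup.

```python
import http.client
import socket


def classify_status(status):
    """Map an HTTP status code to a rough diagnosis label (hypothetical labels)."""
    # Assumption: a front-end proxy signals a dead or slow backend with a
    # gateway-style error (502, 503 or 504).
    if status in (502, 503, 504):
        return "backend-down"
    if status >= 500:
        return "server-error"
    return "ok"


def check_site(host, port=80, timeout=20.0):
    """Probe host layer by layer and return a short diagnosis label."""
    # Step 1: DNS. If the name does not resolve, nothing else matters.
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        return "dns"

    # Step 2: TCP connect. Failure here points at routing or the web
    # server itself, not at the application behind it.
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.connect()
    except OSError:
        return "unreachable"

    # Step 3: HTTP request. The front end accepted the connection, so
    # whatever comes back tells us about the backend.
    try:
        conn.request("GET", "/")
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return "no-response"
    finally:
        conn.close()

    return classify_status(status)
```

Someone answering a "Zope.org is down" post could run
check_site("www.zope.org") and reply with the label it returns.  The
short timeout mirrors the idea in (1): better to get a clear answer
quickly than to hang.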

--Paul