[Zope] System performance threads/proccesses & random crashes (SIGPIPE)

Doyon, Jean-Francois Jean-Francois.Doyon@CCRS.NRCan.gc.ca
Fri, 22 Mar 2002 16:17:06 -0500


Chris,

Thanks for the great help.

After doing some investigation, I am now pretty sure the behavior mentioned
in the FastCGI docs is what is happening here: a request isn't given time to
finish.  In my case this can be replicated by users clicking VERY fast on
the web pages, sending a new request to the server before the first one has
finished (I have some slow processes, drawing dynamically generated maps).
I can replicate the SIGPIPE almost 100% of the time using this method.

Anyway, I took a look at the code you mentioned, and that will hopefully
help, although I'm venturing into extremely unknown territory for me here!

A couple more things:

This, so far as I can tell, is a bug in the FastCGI implementation (not
handling SIGPIPE as suggested).  Should I report it somewhere?

Also, I was reading the signal handling stuff for Python and came upon this:

Python installs a small number of signal handlers by default: SIGPIPE is
ignored (so write errors on pipes and sockets can be reported as ordinary
Python exceptions)

Now I'm confused.  If Python ignores SIGPIPE by default, why is Zope
complaining? This would mean that there is already a SIGPIPE handler
defined somewhere overriding the default, in which case that is probably
where the change should be made, instead of adding yet another handler.
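
One way to check whether something has overridden Python's default is to ask
the interpreter what is currently installed for SIGPIPE. This is only a
minimal sketch for a POSIX system; describe_sigpipe is a made-up helper name,
not anything that exists in Zope or z2.py.

```python
import signal

def describe_sigpipe():
    """Return a short description of the current SIGPIPE disposition."""
    handler = signal.getsignal(signal.SIGPIPE)
    if handler == signal.SIG_IGN:
        return "ignored"
    if handler == signal.SIG_DFL:
        return "default"            # the OS default action: terminate the process
    if handler is None:
        return "installed outside Python"
    return "python handler: %r" % (handler,)

print(describe_sigpipe())
```

If this reports "default" from inside the running server process, some code
has reset the disposition after interpreter startup.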

And finally, how do I "ignore" a signal? I guess just writing a "pass" will
work? I'll try it out. I guess that on reception of a signal, only one
handler is called, once?
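
On the "only one handler" question: in Python's signal module each signal has
a single slot, so installing a handler replaces the previous one, and
signal.signal() returns what was there before. A rough sketch (POSIX only;
the handler names are invented for the demo, and a handler whose body is just
"pass" still catches the signal, which is not quite the same as ignoring it
with signal.SIG_IGN):

```python
import os
import signal
import time

received = []

def noisy_handler(signum, frame):
    received.append("noisy")

def quiet_handler(signum, frame):
    # Effectively a "pass" handler; recording the call only makes the demo observable.
    received.append("quiet")

original = signal.signal(signal.SIGPIPE, noisy_handler)
replaced = signal.signal(signal.SIGPIPE, quiet_handler)  # returns noisy_handler

os.kill(os.getpid(), signal.SIGPIPE)   # deliver one SIGPIPE to ourselves

# Give the interpreter a moment to run the Python-level handler.
deadline = time.time() + 1.0
while not received and time.time() < deadline:
    time.sleep(0.01)

# Put back whatever was installed before the demo.
signal.signal(signal.SIGPIPE, original if original is not None else signal.SIG_IGN)

print(replaced is noisy_handler, received)
```

Only quiet_handler runs, and it runs once for the one signal sent.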

Thanks again and again :)
J.F.

-----Original Message-----
From: Chris McDonough [mailto:chrism@zope.com]
Sent: Friday, March 22, 2002 3:05 PM
To: Doyon, Jean-Francois; zope@zope.org; matt@zope.com
Subject: Re: [Zope] System performance threads/proccesses & random
crashes (SIGPIPE)


You could register a SIGPIPE handler for Zope that just ignores the signal.
See the chrism-logrotate-branch in CVS at
http://cvs.zope.org/?only_with_tag=chrism_logrotate_branch and take a look
at z2.py's "installsighandlers" function... maybe use this branch but add a
SIGPIPE handler to the function that mimics the others except that it uses
signal.SIG_IGN as the callback instead of the current signal handler
function.
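
For illustration only, here is roughly what ignoring SIGPIPE buys you. This is
not the code from z2.py or the branch; write_to_broken_pipe is a hypothetical
helper, and the sketch assumes a POSIX system. With SIG_IGN installed, a write
to a broken pipe fails with an ordinary exception (errno EPIPE) instead of
killing the process.

```python
import errno
import os
import signal

def write_to_broken_pipe():
    """Write to a pipe whose read end is closed; return the errno raised, if any."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)               # nobody can read any more: the pipe is "broken"
    try:
        os.write(write_fd, b"x")
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None

# Ignore SIGPIPE so the failed write is reported as an exception.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

print(write_to_broken_pipe() == errno.EPIPE)
```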

----- Original Message -----
From: "Doyon, Jean-Francois" <Jean-Francois.Doyon@CCRS.NRCan.gc.ca>
To: "'Chris McDonough'" <chrism@zope.com>; <zope@zope.org>; <matt@zope.com>
Sent: Friday, March 22, 2002 2:45 PM
Subject: RE: [Zope] System performance threads/proccesses & random crashes
(SIGPIPE)


Hello,

Thanks for the help!

Well, I've determined it most likely isn't PostgreSQL, since I switched the
connections from Unix-socket based to TCP based, and the problem still occurs.

So, I turn my attention to FastCGI ...

I just read this on the FastCGI Website:

If an http client aborts a request before it completes, mod_fastcgi does too
- this results in a SIGPIPE to the FastCGI application. At a minimum,
SIGPIPE should be ignored (applications spawned by mod_fastcgi have this
setup automatically). Ideally, it should result in an early abort of the
request handling within your application and a return to the top of the
FastCGI accept() loop.
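
As I understand that paragraph, the pattern would look something like the
sketch below. It is not FastCGI or Zope code: serve() is a made-up stand-in
for the accept loop, and socket pairs stand in for client connections. With
SIGPIPE ignored, an aborted client shows up as an exception, the request is
dropped, and the loop moves on to the next connection.

```python
import signal
import socket

# With SIGPIPE ignored, a send to a vanished peer raises instead of killing us.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

def serve(connections, payload=b"response"):
    """Hypothetical accept loop: answer each connection, skip aborted ones."""
    completed, aborted = 0, 0
    for conn in connections:
        try:
            conn.sendall(payload)
            completed += 1
        except (BrokenPipeError, ConnectionResetError):
            aborted += 1            # client went away: abandon this request only
        finally:
            conn.close()
    return completed, aborted

live_a, live_b = socket.socketpair()
dead_a, dead_b = socket.socketpair()
dead_b.close()                      # simulates a client that clicked away early

result = serve([dead_a, live_a])
live_b.close()
print(result)
```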

I guess Zope isn't handling the SIGPIPE the way it is suggested here?
Anyway, this seems to be the most likely cause of the problems I'm having.
That AND possibly the problem Matt describes.  Matt, where can I find more
information on this, and possible solutions?

For now, I'm guessing switching to TCP instead of Unix sockets for FastCGI
connections might help solve the problem? I am getting *A LOT* of these
errors, every 5 to 10 minutes!!! And it *IS* traffic related ... when the
business day dies down, the errors stop occurring (the normal usage pattern
at that time supports the theory that the errors are directly related to
the amount of usage).

I'm also thinking of playing with the -restart-delay option of the
FastCgiServer directive ...

Help!!!

Thank you,
J.F.

-----Original Message-----
From: Chris McDonough [mailto:chrism@zope.com]
Sent: Thursday, March 21, 2002 11:08 AM
To: Doyon, Jean-Francois; zope@zope.org
Subject: Re: [Zope] System performance threads/proccesses & random
crashes (SIGPIPE)


SIGPIPE is sent by the OS when the application writes to a UNIX pipe (or
socket) whose other end has gone away.  UNIX takes this condition seriously,
which is why it sends the signal to the process telling it "you've got a
broken pipe".

Since you say it started happening when you began using the database adapter,
it may be that some piece of the database adapter opens a pipe that is later
broken (for whatever reason, that's the $10,000 question ;-), causing the OS
to send Zope a SIGPIPE.

It may be possible to install a signal handler for SIGPIPE to get rid of the
problem, but I'm not exactly sure what it should/would do during this
failure state, and it would be more useful to try to pin down the pipe that
is getting broken by making the problem replicable.

The ZODB pool_size parameter is controlled via the pool_size argument to
ZODB.DB.DB's constructor.  It signifies how many database connections it's
willing to place in the pool.  When Zope starts up, each Zope thread needs
to use its own database connection.  So you should likely never have a
pool_size smaller than the number of threads (the -t parameter to z2.py).
Adjusting these values up and down may improve performance, but to this day
there have not been any empirical studies of how performance is impacted
when you do. It's probably something you need to try out in a load testing
environment.  If you find something interesting, let us know! ;-)

----- Original Message -----
From: "Doyon, Jean-Francois" <Jean-Francois.Doyon@CCRS.NRCan.gc.ca>
To: <zope@zope.org>
Sent: Thursday, March 21, 2002 9:57 AM
Subject: [Zope] System performance threads/proccesses & random crashes
(SIGPIPE)


Hello,

I'm running into random crashes of my zope processes, but I'm not finding
any reference anywhere in the mailing list archives or on the site about
this specific one:

I'm getting:

2002-03-21T14:48:52 ERROR(200) zdaemon zdaemon: Thu Mar 21 09:48:52 2002:
Aiieee! 20070 exited with error code: 13

Every now and then, for no apparent reason.  Signal 13 is SIGPIPE ...
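
For what it's worth, here is how I read that number. This is a small sketch
assuming the value zdaemon logs is the raw status returned by os.waitpid()
(I have not verified that in the zdaemon source) and that this is a platform
such as Linux, where SIGPIPE is signal 13. describe_wait_status is just a
name made up for the example.

```python
import os
import signal

def describe_wait_status(status):
    """Decode a raw wait status as returned by os.waitpid()."""
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "other status %d" % status

print(describe_wait_status(13))        # raw status 13: terminated by a signal
print(describe_wait_status(13 << 8))   # a normal exit(13) would look different
print(int(signal.SIGPIPE))
```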

This is Zope 2.5.0 with CMF 1.2 on a severely upgraded/updated/patched RH6.2
... with a Python 2.1.2 built with defaults. It runs with FastCGI to Apache
1.3.2x ...

Usually I just wait a couple of seconds, hit refresh in my browser and
things come back to normal, but it's still annoying, and doesn't look good
to the public.  Note that when this happens, it usually seems to happen to
ALL processes.  It looks to me like the pipes between the master Zope
process and its children die, and they all have to restart for some
reason. Could this be? And if so, why?

Note that I started noticing this when I for the first time started using
Psycopg to create RDBMS connections to my PostgreSQL ... Could there be a
relation somehow?

On a slightly similar topic, how do I manage performance? I plan on using
Zope for a fairly high demand web site ... I noticed I can control how many
processes/threads start, but then I also read something about the ZODB
pool_size ... What is the relation between the two exactly?

Thank you,

Jean-François Doyon
Internet Service Development and Systems Support
GeoAccess Division
Canadian Center for Remote Sensing
Natural Resources Canada
http://atlas.gc.ca
Phone: (613) 992-4902
Fax: (613) 947-2410


_______________________________________________
Zope maillist  -  Zope@zope.org
http://lists.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )
