[Zope] ZServer stops responding !? Help !?

Jean-Francois.Doyon at CCRS.NRCan.gc.ca Jean-Francois.Doyon at CCRS.NRCan.gc.ca
Sat Apr 24 16:42:18 EDT 2004


Dennis,

I'm using a 4-CPU machine, PIIIs at 700 MHz each, with 2 GB of RAM (Dell
PE6400).

I've got the check interval at 200 (Pystones/50, as I once saw somewhere).

It runs very fast the rest of the time though :)

As time goes by I'm compiling some data on what the server is doing, and
I'm focusing on one particular content type.

This type does a variety of things depending on the condition.  The
"simplest" is that it does nothing in particular, just uses its attributes
and methods.

In one case, it could run a local system command using os.popen() ...

And in the very worst case, it starts an FTP connection to a server that is
local to our shop, but that I don't control.

I'm wondering if there are problems with the FTP connection or the system
call, maybe in some circumstances or for specific instances. Maybe some
"bad data" gets returned ?
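
One way to keep a hung system command from holding a Zope thread forever is to bound how long the call may run. This is only a minimal sketch, and it assumes a much newer Python (3.5 or later, for subprocess.run() with a timeout) than the 2.3 discussed in this thread; run_with_timeout is a hypothetical helper name, not something Zope or the content type provides.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout=5.0):
    """Run argv; return (returncode, output), or (None, None) on timeout."""
    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into the captured output
            timeout=timeout,           # child is killed if it runs too long
        )
    except subprocess.TimeoutExpired:
        return None, None
    return done.returncode, done.stdout


if __name__ == "__main__":
    # A command that finishes quickly, then one that would otherwise hang.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"],
                           timeout=0.5))
```

Logging the (None, None) case along with the object that triggered it would show whether the system call is ever the piece that stalls.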

Personally I would tend to look more towards the network I/O as a potential
source of "blocking" ...
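
To illustrate the kind of blocking suspected here: a socket with no timeout waits indefinitely on a peer that accepts the connection but never answers. The sketch below uses only socket.settimeout(), which exists from Python 2.3 on. The names read_once and demo_silent_peer are made up for the illustration, and the silent listener merely stands in for a stuck FTP or HTTP backend; it is not a diagnosis of what actually happens on this server.

```python
import socket


def read_once(host, port, timeout=2.0, nbytes=1024):
    """Connect and read once; return None if nothing arrives within `timeout`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # applies to connect() and recv() on this socket
    try:
        sock.connect((host, port))
        try:
            return sock.recv(nbytes)
        except socket.timeout:
            return None  # peer accepted the connection but never answered
    finally:
        sock.close()


def demo_silent_peer(timeout=0.5):
    """Hypothetical stand-in for a hung server: listens but never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    try:
        host, port = server.getsockname()
        return read_once(host, port, timeout=timeout)
    finally:
        server.close()
```

Without the settimeout() call, the recv() in read_once would never return against such a peer, and the thread making the call would stay tied up. socket.setdefaulttimeout() is the process-wide variant; whether it is safe to set inside a Zope process is not something this sketch settles.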

Does anybody know the intricacies of how network I/O is handled in Python
2.3 and/or Zope 2.7 ? Any differences from previous versions ?

The hunt continues ...

Much thanks to all who are helping !

J.F.

-----Original Message-----
From: Dennis Allison [mailto:allison at sumeru.stanford.EDU]
Sent: April 24, 2004 4:31 PM
To: Jean-Francois.Doyon at CCRS.NRCan.gc.ca
Cc: chrism at plope.com; zope at zope.org
Subject: RE: [Zope] ZServer stops responding !? Help !?


Jean-Francois,

What processor are you using?  How many CPUs?  

	-d

On Sat, 24 Apr 2004 Jean-Francois.Doyon at CCRS.NRCan.gc.ca wrote:

> Chris,
> 
> Thanks for the tips.
> 
> Here's what I've tried:
> 
> When the site looks like it's no longer responding, I connect with
> the monitor.
> 
> Running Zope.app() is very slow, sometimes so slow I give up and try
> again.
> 
> Once I get that set up, I run:
> 
> app.Control_Panel.DebugInfo.dbconnections()
> 
> From what I gather, not all connections are taken up.  I have at least 2
> that are free:
> 
> [{'info': ' (1391)', 'version': '', 'opened': 'Sat Apr 24 15:38:59 2004
> (31.63s)'}, {'info': ' (7290)', 'version': '', 'opened': 'Sat Apr 24
> 15:30:36 2004 (534.16s)'}, {'info': "({'HTTP_ACCEPT': ...
> 
> I try again a couple of minutes later and I see:
> 
> [{'info': ' (1391)', 'version': '', 'opened': 'Sat Apr 24 15:38:59 2004
> (181.71s)'}, {'info': ' (7290)', 'version': '', 'opened': 'Sat Apr 24
> 15:30:36 2004 (684.24s)'}, {'info': "({'HTTP_ACCEPT': ...
> 
> The 4 other requests (I have 6 threads) are the same; they haven't
> changed.  So I think we can exclude running out of threads/db connections
> as a source of the problem.  The other 4 requests are for various content
> types. The content types themselves work fine, so I'm going to take note
> of which ones they are and see if one keeps recurring or something.
> Actually, 2 of those do backend HTTP calls; could there be some
> socket/timeout issue ? The call is to a CGI on the very same server
> though, so I'm confident it's running fine.
> 
> At this point the monitor stops responding, in the middle of outputting
> that second list. I hit enter and I get:
> 
> error: uncaptured python exception, closing channel <__main__.monitor_client
> connected at 0x400d5c0c> (socket.error:(9, 'Bad file descriptor')
> [/usr/local/lib/python2.3/asynchat.py|initiate_send|218]
> [/usr/local/lib/python2.3/asyncore.py|send|337])
> 
> I go look at my trace log, and stuff is still appearing in there ...
> Though nothing gets returned.
> 
> CPU usage right now is not particularly high ...
> 
> [zope at tincup log]$ ps auxw | grep Zope
> zope     31498  0.0  0.2  6484 4576 ?        S    15:26   0:00
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/zdaemon/
> zope     31499  0.9  8.5 192848 176496 ?     S    15:26   0:13
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> zope     31500  0.0  8.5 192848 176496 ?     S    15:26   0:00
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> zope     31501  7.0  8.5 192848 176496 ?     S    15:26   1:37
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> zope     31502  5.8  8.5 192848 176496 ?     S    15:26   1:21
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> zope     31503  4.2  8.5 192848 176496 ?     S    15:26   0:58
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> zope     31504  0.5  8.5 192848 176496 ?     S    15:26   0:07
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> zope     31505  5.1  8.5 192848 176496 ?     S    15:26   1:11
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> zope     31506  1.6  8.5 192848 176496 ?     S    15:26   0:22
> /usr/local/bin/python2.3 /usr/local/Zope-2.7-Core/lib/python/Zope/Sta
> 
> I'm going to go run requestprofiler now and see what comes out ...
> 
> Thanks again,
> J.F.
> 
> 
> -----Original Message-----
> From: Chris McDonough [mailto:chrism at plope.com]
> Sent: April 24, 2004 2:15 PM
> To: Jean-Francois.Doyon at CCRS.NRCan.gc.ca
> Cc: zope at zope.org
> Subject: Re: [Zope] ZServer stops responding !? Help !?
> 
> 
> On Sat, 2004-04-24 at 13:53, Jean-Francois.Doyon at CCRS.NRCan.gc.ca wrote:
> > G'day,
> > 
> > I've got a rather bizarre but catastrophic problem.
> > 
> > ZServer seems to stop responding.  Sometimes it does so after days of
> > running, sometimes after a few seconds or minutes of uptime.
> > 
> > I know it's ZServer because I can talk to the monitoring port without
> > problem.
> 
> That may be a bit of flawed logic, because ZServer also runs the monitor
> port.
> 
> > Also, the apache processes just pile up up to the limit allowed,
> > suggesting the proxying is not getting replies from the downstream
> > server.
> > 
> > The strange thing is the cause seems to be occasional, or vary.  For
> > hours on end I can sit there and restart it, and within minutes it stops
> > responding ... Then suddenly the problem "disappears", I restart it, and
> > I wait .... and nothing happens, it just keeps running.  Nothing else
> > abnormal is going on the server so far as I can tell, there is very
> > little memory swapped, and the CPU usage is not abnormally high.
> 
> It sounds as if Zope is doing something which blocks, consuming all
> database threads.
> 
> > I used to have this problem very very rarely in the past, but since I
> > upgraded to Zope 2.7, it seems to have gotten much worse :(
> > 
> > I tried accessing the DebugPanel from the monitor, but can't seem to get
> > it to do anything useful ... I don't know where else to look to find the
> > cause of this.
> > 
> > This causes serious uptime problems on our main, high traffic site, which
> > is Very Bad.
> > 
> > I'm on RedHat 7.3 (fully patched)
> > Python 2.3.3 (custom compiled)
> > Zope 2.7
> > CMF 1.4.x (I forget ... the latest!)
> > Psycopg (Latest also)
> > And a variety of other products.
> 
> I'd suggest using the "big M" or "trace" logging features along with the
> requestprofiler script to find out where the problem might be.
> 
> - C
> 
> _______________________________________________
> Zope maillist  -  Zope at zope.org
> http://mail.zope.org/mailman/listinfo/zope
> **   No cross posts or HTML encoding!  **
> (Related lists - 
>  http://mail.zope.org/mailman/listinfo/zope-announce
>  http://mail.zope.org/mailman/listinfo/zope-dev )
> 