[ZODB-Dev] Very Weird Behaviour :-(

Chris Withers chrisw at nipltd.com
Tue Jun 10 15:39:26 EDT 2003


Toby Dickenson wrote:
> On Monday 09 June 2003 10:22, Chris Withers wrote:
> 
>>I awoke to a customer complaint that their site wasn't responding. On going
>>there, it was indeed not responding. ssh'ed to the box to find one process
>>using about 1GB of memory.
> 
> I'm sure you know this already, but you could have stopped this early using 
> resource limits (to protect the rest of the machine from this rogue process) 
> plus something like autolance (to restart the process before it hits the hard 
> limit).

I haven't needed to play with those yet, but thanks for the hints.
The machine was fine; it was just Zope that was screwed.
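
A minimal sketch of the resource-limit idea, assuming a Unix host and Python's standard resource module. The helper names and the 1 GB figure mentioned below are illustrative assumptions, not settings from this machine, and the restart-before-the-hard-limit part is not shown.

```python
import resource


def clamp_to_hard_limit(requested, hard):
    """Return the soft limit to request, never exceeding the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def cap_address_space(max_bytes):
    """Set the soft address-space limit for this process.

    Child processes started afterwards inherit the limit.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    soft = clamp_to_hard_limit(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
    return soft
```

Calling something like cap_address_space(1024 ** 3) early in startup should make a runaway allocation fail with MemoryError inside the process rather than letting it grow without bound.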

>>Turned out to be one of the threads on the
>>web-serving ZEO client.
> 
> Using any CPU time, or stalled? If it was spinning, then I would approach the 
> problem by attaching strace to it.

Loads of CPU time, so spinning?

Where can I find out about strace?

>>Zope's stop script didn't work, so I had to kill -9 one of the worker
>>threads for the web client to die. (This on its own seems to be quite a
>>common pattern, why is that?)
> 
> Zope 2.6 handled shutdown signals by raising an exception. Anything that 
> swallows python exceptions can block a shutdown.

And I'm guessing a normal kill just raises an exception which got swallowed 
by??? Hmmm... what could be swallowing those exceptions?
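
To make the "swallowed exception" point concrete, here is a minimal sketch. It is not Zope's code: ShutdownRequested, raise_on_signal and sloppy_worker are hypothetical names, and the handler is called directly rather than through a real signal so the example stays deterministic.

```python
import signal


class ShutdownRequested(Exception):
    """Hypothetical exception a signal handler might raise to stop a server."""


def raise_on_signal(signum, frame):
    # A handler of this style turns the signal into a Python exception.
    raise ShutdownRequested("got signal %d" % signum)


def interrupted_job():
    # Call the handler directly to keep the sketch deterministic; with a real
    # signal, Python would run the handler in the main thread mid-job.
    raise_on_signal(signal.SIGTERM, None)


def sloppy_worker(jobs):
    """Run jobs, swallowing every exception, including the shutdown request."""
    completed = 0
    for job in jobs:
        try:
            job()
        except:  # bare except: hides ShutdownRequested along with real errors
            pass
        completed += 1
    return completed
```

Any code shaped like sloppy_worker keeps running after the "shutdown" exception is raised, which is one way a normal kill could appear to do nothing.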

> 2.7 will do this differently: the signal handler sets a global, which is 
> checked in the main medusa loop. I'm not sure if this would have helped in 
> this case.

Be interesting to see when 2.7 is out...
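
A rough sketch of the flag-based approach described above, under stated assumptions: the Server class is a hypothetical stand-in for the main loop (not Zope 2.7 or medusa code), and an instance attribute plays the role of the global.

```python
import signal


class Server:
    """Hypothetical stand-in for a server main loop; not Zope or medusa code."""

    def __init__(self):
        self.shutdown_requested = False  # plays the role of the global flag

    def request_shutdown(self, signum, frame):
        # The handler only records the request; it raises nothing.
        self.shutdown_requested = True

    def install(self):
        # Register the handler for SIGTERM (must be called from the main thread).
        signal.signal(signal.SIGTERM, self.request_shutdown)

    def main_loop(self, handle_one_event, max_iterations=1000):
        iterations = 0
        while not self.shutdown_requested and iterations < max_iterations:
            try:
                handle_one_event()
            except Exception:
                pass  # even a sloppy handler cannot hide the flag
            iterations += 1
        return iterations
```

The flag is only checked between events, so an event handler that never returns would still prevent the loop from noticing the request.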

>>Now, I don't know if this was a symptom of the memory bloat or the cause,
>>but these errors started at pretty much the same time as the server was
>>first reported as being unresponsive...
> 
> Do you have a proxy in front of this zope? 

Yep...

> anything in its logs? 

Lots of Google, nothing obviously out of the ordinary...

> Jamie Heilman 
> posted a very effective memory-eating DOS script to zope-dev over the 
> weekend.

What would requests from that look like?

>>Interesting to note that the storage server appears to have survived
>>unscathed throughout this.
> 
> nothing in the storage server logs? 

Lots of:

2003-06-09T08:13:55 INFO(0) zrpc:6217 zeoLoad() raised exception: 0000000000121d92

...which corresponds nicely.

> that oid is loadable now?

It would appear so, since I haven't had any similar errors since.
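
For checking an oid by hand, a small sketch that pulls the hex string out of a log line like the one above and packs it into the 8-byte big-endian form ZODB uses for oids. It assumes the hex value in the zeoLoad() message is the oid itself; the function and pattern names are hypothetical.

```python
import re
import struct

LOG_LINE = (
    "2003-06-09T08:13:55 INFO(0) zrpc:6217 "
    "zeoLoad() raised exception: 0000000000121d92"
)

# Assumption: the 16 hex digits after the message are the object id.
OID_PATTERN = re.compile(r"zeoLoad\(\) raised exception: ([0-9a-fA-F]{16})")


def oid_from_log_line(line):
    """Return the oid as 8 packed bytes, or None if the line has no oid."""
    match = OID_PATTERN.search(line)
    if match is None:
        return None
    return struct.pack(">Q", int(match.group(1), 16))
```

The packed value is what would be handed to the storage when trying to load the object again.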

>>So, I'm left rather nervous and wondering:
>>
>>1. What caused the memory bloat, which apparently arrived out of the blue?
>>
>>2. What those POSKeyErrors mean, and should I be worried about them?

And I'm still wondering...

Chris



