[ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

Dario Lopez-Kästen dario at ita.chalmers.se
Thu Apr 14 03:38:39 EDT 2005


Chris Withers wrote:
> Dario Lopez-Kästen wrote:
> 
>> I am in need of some help. We are using Zope 2.6.2, DBTab on the 
>> clients (4 of them on 2 servers) and DirectoryStorage on the ZEO side.
> 
> 
> Out of interest, why are you using DirectoryStorage?

I chose it for several reasons:

1) we are storing large amounts of binary files (PDF, Word, Matlab, Zip, 
tar-balls, etc) in this particular application (it's a student portal, 
course admin portal and an LMS). While we are not yet in the 
multigigabyte realm, we are storing archive copies of all the previous 
year's materials, which will eventually grow to be a lot of stuff.

2) There is the issue of huge Data.fs files and making daily backups. We 
need to have incremental backups.

3) HA - while DirStor is not an HA tool per se, it provides the necessary 
tools for building something that provides some aspects of HA, i.e. the 
replication features, etc.

4) Maintenance. While I have not yet dared to pack the DB, the mere size 
of the database will make packing a non-trivial operation memory-wise in 
FileStorage. DirStor does not have the same memory requirements when 
packing.

5) POSKeyErrors. We were getting quite a few of those, and that scared 
me. With DirStor, I do not see them as much as before.

> Well, ZEO storage server is single threaded, so I guess something could 
> lock there causing the other clients to wait infinitely for the storage 
> server. Never heard of it though. You sure you're using the latest 
> DirectoryStorage? Can you reproduce this using just plain FileStorage?

No, I am not using the latest, because of my *n*l sysadmin (actually 
there are two of them, but only one whines :-). I'll try beating him in 
the head a few times, er, I mean, "discuss the issue with him" to make 
him change his mind.

For practical reasons, there is no way I can replicate this behaviour 
with FileStorage. That would entail taking the system down for a few 
days and then bringing it back up again after we've moved all the 
contents to FileStorage.
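
As an aside, the scenario Chris describes above (one stuck request making 
every other client wait) can be pictured with a toy model. This is only a 
sketch, not ZEO's actual code: it assumes all requests arrive at the same 
moment and are handled strictly one after another, and the function name 
and the client labels are made up for illustration.

```python
def serve_sequentially(requests):
    """Toy single-threaded server: handles one request at a time.

    `requests` is a list of (client_name, seconds_of_work) tuples, all
    assumed to arrive at time 0. Returns {client_name: completion_time}.
    This is a hypothetical model, not how ZEO is implemented.
    """
    clock = 0.0
    finished = {}
    for client, work in requests:
        # Nothing else runs while this request is being handled.
        clock += work
        finished[client] = clock
    return finished


# Hypothetical workload: client-2 issues one very slow request.
requests = [
    ("client-1", 0.1),
    ("client-2", 60.0),
    ("client-3", 0.1),
    ("client-4", 0.1),
]
finished = serve_sequentially(requests)
for client, done_at in finished.items():
    print(f"{client} answered after {done_at:.1f}s")
```

In this model client-3 and client-4 each need only a tenth of a second of 
work, yet both wait more than a minute because they are queued behind 
client-2. If the slow request never returned, they would wait forever.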

Today I discovered that the problem may not be in the Zope layer but in 
the Oracle layer, not because DCOracle or Oracle suck, but because we 
share our instance with another app that is known for its bad 
programming style.

And our app is not the most brilliant piece of code either. Several 
parts of it have not been touched since before the introduction of 
Script(Python) in Zope, and back then we were all Zope newbies using 
DTML for everything.

So there may be DB congestion issues at the core of it all. The reason I 
sent the mail is that this started happening all of a sudden about three 
weeks ago. I'll report back when I get a full report from the DBA guys 
and see if there is any change in usage pattern at the DB level.

thanks,

/dario

-- 
-- -------------------------------------------------------------------
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
"...and click? damn, I need to kill -9 Word again..." - b using macosx

