[ZODB-Dev] ClientDisconnected error on windows
JohnD.Heintz
JohnD.Heintz
Wed, 8 Aug 2001 18:50:07 -0500
On Wednesday 08 August 2001 18:01, Jeremy Hylton wrote:
> >>>>> "JDH" == John D Heintz <jheintz@isogen.com> writes:
>
> JDH> Thanks, hopefully with the ZCF 0.6 code you should be able to
> JDH> replicate this on your own boxes as well.
>
> Will you give me a recipe for running it to generate high load?
Sure, multiThreadTest.py is the ticket.  Right now it is pretty limited:
it must be run in the foreground and multiple times to hit the CORBA Server
with multiple client threads.
In the directory with the ZCF files run the following in one shell:
python startZeo.py &
python startServer.py zeo
This will start the ZEO and CORBA servers.  This also writes out the
SampleServer.ior file used by clients.
Now, execute in multiple other shells:
python multiThreadTest.py
This doesn't like being run in the background right now, sorry.  It has a
raw_input() call, so just hit enter to kill them.
We can run more of these multiThreadTest.py processes against Linux than we
can against Windows.
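The contents of multiThreadTest.py aren't shown in this thread, but a minimal multi-threaded load client along these lines could look like the sketch below.  The `hammer`/`run_load` names, and the choice to collect exceptions rather than raise them, are illustrative assumptions, not the real script:

```python
import threading

def hammer(client_call, iterations=100):
    """Invoke a client call repeatedly, collecting any exceptions
    (e.g. ClientDisconnected under load) instead of raising them."""
    errors = []
    for _ in range(iterations):
        try:
            client_call()
        except Exception as exc:
            errors.append(exc)
    return errors

def run_load(client_call, n_threads=8, iterations=100):
    """Run `hammer` in several threads at once and return all errors seen."""
    errors = []
    lock = threading.Lock()

    def worker():
        errs = hammer(client_call, iterations)
        with lock:
            errors.extend(errs)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Running several such processes (or raising `n_threads`) is what drives the server hard enough to provoke the disconnects described above.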
>
> >> The tracebacks you show seem like part of the documented behavior
> >> of ZEO. You can get a ClientDisconnected error when you commit a
> >> transaction. You can also get a POSException, a thread.error, or
> >> a socket.error. [I'll treat your POSException suggestion
> >> separately.]
>
> JDH> Where is this documented? My ideal would be something along
> JDH> the lines of BTrees.Interfaces with all exceptions declared.
> JDH> My suggestion about POSException was based primarily on the
> JDH> belief that *POSException* was the documentation.
>
> Sigh. I don't know if it's documented, but Jim's answer the last time
> it came up was something like "Zope catches all exceptions and retries
> the transaction."  In other words, we haven't documented the
> exceptions carefully, nor have we been careful to keep track of what
> exceptions should be raised.
>
> I agree that this is a problem, but fixing it is a post 1.0 issue.
Hmm.  I don't like the "catches all exceptions and retries the transaction"
bit.  I would much rather have a category of Exception that signals a retry,
with all others explicitly raised.  This might be me just prematurely
optimizing, but I don't like wasting server resources retrying transactions
that are always failures.
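As a sketch of that idea (the class names here are hypothetical, not ZODB's actual exception hierarchy): a marker base class signals "safe to retry", and everything else propagates immediately instead of burning server time on doomed retries:

```python
class Retryable(Exception):
    """Marker base class: a transaction failing with this may be retried."""

class ConflictError(Retryable):
    """Transient failure, e.g. a concurrent-write conflict."""

class ValidationError(Exception):
    """Permanent failure; retrying would always fail again."""

def run_transaction(body, max_retries=3):
    """Run `body`, retrying only on Retryable errors.

    Non-Retryable exceptions propagate on the first attempt."""
    for attempt in range(max_retries):
        try:
            return body()
        except Retryable:
            if attempt == max_retries - 1:
                raise  # give up after the final retry
```

The point is simply that the retry policy keys off the exception's type rather than catching everything indiscriminately.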
>
> >> It may take a moment for a client to reconnect, so under heavy
> >> load I can imagine getting this error several times before the
> >> client reconnects. When you get this error, do you retry the
> >> transaction? If so, does it succeed (eventually)? Or is this
> >> client disconnected permanently?
>
> JDH> Umm, I think on Linux we are getting reconnected, but I'm not
> JDH> so sure about on Win2k.
>
> How can we find out what happens one way or another on Win2k? :-)
Would looking at the log file for the ZEO Server or ZEO Client be appropriate
to easily find that out?  I can dump logging info wherever we need it; I've
avoided it so far because I'm already feeling a little information overload.
Let me know where and what logging you want and we can get it, though.
>
> JDH> Regarding the reconnect time: The scenario we have is: [ZEO
> JDH> Server] <-> [ZEO Client / CORBA Server] <==> [many CORBA
> JDH> Clients]
>
> JDH> With this setup why would there be any waiting to reconnect?
> JDH> The ZEO Server is only serving one ZEO Client and therefore
> JDH> should be able to respond immediately to a reconnect request.
> JDH> Right?
>
> Sounds right to me.
>
> JDH> Right. We can run all the processes on a single box and still
> JDH> experience the problem. We have no network failures that I'm
> JDH> aware of and I'm not doing anything but throw data at the CORBA
> JDH> Server (single ZEO Client) fast. I would expect either
> JDH> asyncore or zrpc to handle the problem more gracefully.
>
> Agreed. This may be a bug, but I'm not sure who to blame yet.
We're not into blame here, just getting to the right solution. ;-)
>
> JDH> When the socket runs out of room asyncore should block further
> JDH> pushes until more room is available. Are there tunable
> JDH> parameters in zrpc / asyncore / the OS to specify how much data
> JDH> should be cached for a socket?
>
> I think the OS has some tunable parameters, but I don't think that
> should enter into it. The zrpc mechanism should queue things up until
> asyncore (really the OS via select/poll) says the socket is ready. It
> may be that asyncore and the OS aren't agreeing on what the various
> error returns from socket calls are supposed to mean.
>
> Jeremy
>
>
> _______________________________________________
> For more information about ZODB, see the ZODB Wiki:
> http://www.zope.org/Wikis/ZODB/
>
> ZODB-Dev mailing list - ZODB-Dev@zope.org
> http://lists.zope.org/mailman/listinfo/zodb-dev
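The zrpc and asyncore internals aren't reproduced in this thread, but the queuing behavior Jeremy describes, buffering outgoing data and writing only when the OS reports the socket writable, can be sketched with plain sockets and select().  `BufferedSender` is an illustrative name, not a real zrpc class:

```python
import select
import socket

class BufferedSender:
    """Queue outgoing bytes and flush them only when select() says the
    socket is writable, instead of failing when the OS buffer is full."""

    def __init__(self, sock):
        self.sock = sock
        self.sock.setblocking(False)
        self.queue = b""

    def push(self, data):
        """Append data to the queue and attempt an immediate flush."""
        self.queue += data
        self.flush()

    def flush(self, timeout=0):
        """Send as much queued data as the socket will accept right now."""
        while self.queue:
            _, writable, _ = select.select([], [self.sock], [], timeout)
            if not writable:
                return  # OS buffer full; keep the data queued for later
            try:
                sent = self.sock.send(self.queue)
            except BlockingIOError:
                return  # lost the race; try again on the next flush
            self.queue = self.queue[sent:]
```

Under this scheme a full OS socket buffer just means data stays queued in the application, rather than surfacing as a socket.error; disagreements between asyncore and the OS about what a send error means are exactly where such a design can go wrong.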
-- 
. . . . . . . . . . . . . . . . . . . . . . .
John D. Heintz | Senior Engineer
1016 La Posada Dr. | Suite 240 | Austin TX 78752
T 512.633.1198 | jheintz@isogen.com
w w w . d a t a c h a n n e l . c o m