[Zope] xmlrpc slowness

Fri, 18 May 2001 23:07:57 +1000

[Shilad]
The new release is up on sourceforge.  It should be compatible with the
Zope client/server it was tested against.  It is at:

http://www.sourceforge.net/projects/py-xmlrpc

Let me know how it goes.  I'm curious to see what kind of speed increase
you see.  My guess is that the implementation at the other xmlrpc end will
be the bottle-neck pretty soon.

[Albert]
Thanks *very* much for this!!!

1000 xml-rpc calls per second with commodity hardware sounds pretty
attractive!

I've had to just pass it on to a friend to check out at the moment, but
will try to get back to you soon with speed results, as I'm sure the
other 3 CCs will. I've added the Zope list back to the addresses as
there may be others there interested in trying it out now that Phil Harris
has confirmed the new version interworks correctly with the Zope
implementation. Also added [Zope-dev] list as Zope developers should
be more interested and Matt Hamilton as issues below may be closely
related to his posting "[Zope-dev] Asyncore in an external method".

http://lists.zope.org/pipermail/zope-dev/2001-May/011274.html

In a previous email CC you said:

"I'm really excited that zope people may be using this.  Let me know if you
have any questions/concerns/requests."

As you've also done binary releases for Windows and the various unixen that
Zope does releases for, it should be suitable for actually integrating
into Zope, rather than being just an add-on "Product" or staying in its
present form as a separate python process that talks to Zope without
implementing Zope's xml-rpc itself.

So here goes with some questions/concerns/requests...

In the README you mention:

"Some time I should document everything better, especially the nonblocking
aspects of the server and client."

and

"
* Non-Blocking IO:  This becomes important when you have critical
applications (especially servers) that CANNOT be held up because a client
is trickling a request in or is reading a response at 1 byte per second.

* Event based:  This goes hand in hand with non-blocking IO.  It lets you
use an event based model for your client/server application, where the
main body is basically the select loop on the file descriptors you are
interested in.  The work is done by registering callbacks for different
events you received on file descriptors.  This can be very nice.  There is
some work to be done here, but all the hooks are in place.  See TO DO for
a list of the features that have yet to be incorporated."

(BTW there is currently no "TO DO" file.)
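
The event-based model quoted above can be sketched in a few lines. This is only an illustration using Python's standard selectors module, not py-xmlrpc's own API (which the README does not document here); the names run_once and on_readable are made up for the example, and a local socket pair stands in for a real client connection.

```python
import selectors
import socket

def run_once(selector, timeout=1.0):
    """One pass of the select loop: call the callback registered for each ready fd."""
    for key, _mask in selector.select(timeout):
        callback = key.data          # the callback was stored at register() time
        callback(key.fileobj)

received = []

def on_readable(sock):
    # Only called when data is already waiting, so recv() does not block.
    received.append(sock.recv(1024))

selector = selectors.DefaultSelector()
left, right = socket.socketpair()    # stand-in for a client/server connection
right.setblocking(False)
selector.register(right, selectors.EVENT_READ, on_readable)

left.sendall(b"ping")                # the "event"
run_once(selector)

selector.unregister(right)
left.close()
right.close()
selector.close()
```

The main body does nothing but select and dispatch; all the real work lives in the callbacks, which is the shape the README describes.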

These aspects are a very big advantage. For Zope as a client, I suspect
that the trickling issue may also be very important since it
could be blocking an expensive Zope thread while waiting for a
response from a slow remote server to pass on some information in
response to a relayed request (i.e. http/xmlrpc relay mode or even straight
xmlrpc/xmlrpc proxy mode).

As you also mentioned, "the implementation at the other xmlrpc
end will be the bottle-neck pretty soon".

Sorry I'm not very familiar with how to do this stuff myself, so I have
3 questions which maybe you could answer when you get around to doing
those docs (or even better by also providing implementations in the
next release ;-) or perhaps someone else on CC list knows the answers
or is planning to do something about it?

1) I'm not sure if I've got it right, but my understanding is that despite
being based on a Medusa Reactor design that can handle many web hits
with a single thread, Zope also maintains a small pool of threads to
allow for (brief) blocking on calls to the file system for ZODB and
for (also brief) external connections. I suspect these threads are
"expensive" because they each end up keeping separate copies of
cached web pages etc (to avoid slow python thread switching). So
simply increasing the number of such threads is not recommended for
improving Zope performance - performance relies on the non-blocking
async Medusa Reactor design of ZServer, not on the threading, which
just seems to be a sort of extra workaround.

If that is correct, then a few concurrent external calls to slow external
xmlrpc servers (eg for credit card authorization taking 30 seconds
or more) could easily tie up a lot of Zope resources. The non-blocking
py-xmlrpc client could presumably surrender its turn in the main event
loop for its thread until a response is received and then be woken up
by the response, thus improving things greatly.
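
To make the idea concrete, here is a rough sketch of one thread holding several outstanding calls at once and only being woken when a reply arrives. It uses Python's standard selectors module, not py-xmlrpc or Zope code. Everything in it is an assumption made for illustration: fake_remote pretends to be a slow external server by replying from a timer, local socket pairs stand in for real connections, and DELAY_S is shortened from the 30 seconds discussed above.

```python
import selectors
import socket
import threading
import time

DELAY_S = 0.2   # stand-in for the slow external server's response time
CALLS = 5       # number of concurrent outstanding calls

def fake_remote(server_sock, call_id):
    """Pretend external server: send a reply DELAY_S from now."""
    def reply():
        server_sock.sendall(b"ok %d" % call_id)
    threading.Timer(DELAY_S, reply).start()

selector = selectors.DefaultSelector()
pairs = []
for call_id in range(CALLS):
    client, server = socket.socketpair()
    client.setblocking(False)
    # Register interest in the reply, then move on without waiting for it.
    selector.register(client, selectors.EVENT_READ, call_id)
    fake_remote(server, call_id)
    pairs.append((client, server))

replies = {}
start = time.monotonic()
deadline = start + 5.0               # safety net so the loop cannot hang
while len(replies) < CALLS and time.monotonic() < deadline:
    for key, _mask in selector.select(timeout=0.5):
        replies[key.data] = key.fileobj.recv(1024)
        selector.unregister(key.fileobj)
elapsed = time.monotonic() - start

for client, server in pairs:
    client.close()
    server.close()
selector.close()
```

Since all the waits overlap, elapsed should come out close to DELAY_S rather than CALLS times DELAY_S, which is the gain a non-blocking client is hoped to give. Whether this can be fitted into Zope's own loop and threads is exactly the open question.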

Unfortunately I have no idea how to do this - whether it would just happen
automatically or there are built in facilities for doing that sort of thing
easily in Zope already, or whether it is difficult to do.

I am just guessing that there would be some special
tricks needed to wake up a channel when a response comes back (eg using
the stuff in ZServer/medusa/select_trigger.py and ZServer/PubCore/,
which I don't fully understand, but which looks relevant).
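
One general way to wake a select loop from another thread is to have that thread write a byte to a descriptor the loop is watching. The sketch below shows that trick with the standard library only. It assumes, without having checked, that select_trigger.py serves roughly this purpose; the Trigger class and its pull method here are hypothetical and are not copied from Medusa.

```python
import selectors
import socket
import threading
import time

class Trigger:
    """Hypothetical helper: lets any thread wake a select loop."""

    def __init__(self, selector, callback):
        self._reader, self._writer = socket.socketpair()
        self._reader.setblocking(False)
        self._callback = callback
        self._selector = selector
        selector.register(self._reader, selectors.EVENT_READ, self._on_wake)

    def pull(self):
        # Called from the worker thread: one byte makes the reader readable.
        self._writer.send(b"x")

    def _on_wake(self):
        self._reader.recv(1024)      # drain the wake-up byte(s)
        self._callback()             # runs in the loop's own thread

    def close(self):
        self._selector.unregister(self._reader)
        self._reader.close()
        self._writer.close()

results = []
selector = selectors.DefaultSelector()
trigger = Trigger(selector, lambda: results.append("response ready"))

def slow_external_call():
    time.sleep(0.05)                 # stand-in for a blocking external call
    trigger.pull()

worker = threading.Thread(target=slow_external_call)
worker.start()

deadline = time.monotonic() + 5.0    # safety net so the loop cannot hang
while not results and time.monotonic() < deadline:
    for key, _mask in selector.select(timeout=0.5):
        key.data()                   # the registered _on_wake method

worker.join()
trigger.close()
selector.close()
```

The loop thread sleeps in select() the whole time and only runs the callback once the worker signals that the response is in.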

Maybe I have misunderstood, but it looks to me like existing
use of xmlrpc clients *from* Zope to external servers may not take this
into account, but just block a thread. So even though the *server* side of
xmlrpc used to "publish" Zope objects is non-blocking with ZServer/Medusa,
it may be just blocking a Zope thread when either an xmlrpc or http web
"hit" results in a blocking xmlrpc call to an external server *from* Zope
as a client. Some difficulty handling a separate set of socket selects
for client calls to external servers may be the reason for this, and may
relate to Matt Hamilton's problem.

That might not show up in simple testing of the server side, but could
become a serious bottleneck with a heavy load of such external calls
arising from multiple web hits concurrently when Zope is acting as a
proxy/relay to a slow external server rather than just responding itself.

eg with 10 Zope threads and 30 second external delays you might end up
with every thread blocked when a hit arrives every 3 seconds if only the
server side of Zope is non-blocking but the client side blocks. That
would be a shame if the server alone could handle 1000 calls per second,
which might translate to 500 relayed or proxied calls per second,
(regardless of the external delay of 30 seconds) if both sides were
non-blocking (ok 30 x 500 = 15,000 micro-thread "channels" is pushing
it even though they are just dormant waiting for a response most of the
time - but references from the Medusa site indicate that sort of thing
is at least *theoretically* possible ;-)

http://www.kegel.com/c10k.html

At any rate, a *lot* more than 1 every 3 seconds should be feasible,
without wasting ridiculous amounts of RAM for per thread Zope caches.
(It would still be less than 4 per second instead of 1000 per second
even if the external delay was only 3 seconds - the critical thing
is whether a non-blocking client can be integrated with Zope, taking
into account the multi-threading as well as async Reactor design
of Zope).
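
The back-of-envelope numbers above can be checked with a few lines of Python. The function names are made up for the example; the inputs (10 threads, 30 and 3 second delays, 500 relayed calls per second) are the ones used in the text.

```python
def blocking_rate(threads, delay_s):
    """Max sustained calls per second when each call ties up a thread for delay_s."""
    return threads / delay_s

def channels_in_flight(rate_per_s, delay_s):
    """Dormant channels waiting at any moment for a given call rate and delay."""
    return rate_per_s * delay_s

rate_30s = blocking_rate(10, 30)          # one call every 3 seconds
rate_3s = blocking_rate(10, 3)            # still under 4 per second
waiting = channels_in_flight(500, 30)     # 15,000 channels

print(f"blocking client, 30 s delay: {rate_30s:.2f} calls/s")
print(f"blocking client,  3 s delay: {rate_3s:.2f} calls/s")
print(f"non-blocking at 500 calls/s, 30 s delay: {waiting} channels waiting")
```

With a blocking client the ceiling is set by threads divided by delay, so it drops as the remote server gets slower; with a non-blocking client the delay only changes how many channels sit idle.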

Any comments?

2) Medusa has an async non-blocking "pure python" xmlrpc handler
using asyncore - based on "xmlrpcserver.py" by Fredrik Lundh
(www.pythonware.com).

http://www.nightmare.com/medusa/

medusa-20010416.tgz//xmlrpc_handler.py

As far as I can make out this seems to be for the server side
only, although there are also both an rpc_server.py *and* an
rpc_client.py (not using xmlrpc), so I imagine it would
be easy enough to implement a client and a proxy relay
for Medusa as well.

The examples in the tutorial include an http client and an
http proxy relay with both server and client:

http://www.nightmare.com/medusa/programming.html

I imagine it is quite straightforward for anyone who knows
what they are doing to somehow tie the faster py-xmlrpc C
version into the Medusa main event loop even though it has
its own set of sockets to select from. Unfortunately I
don't know how to do this. It might be generally useful to
contribute the necessary wrapper and/or installation
instructions to Medusa as well as add them to py-xmlrpc itself
to use it for both server and client channels with Medusa, to
encourage faster takeup, and as starting point for stuff below
without having to get into Zope internals.

That would also provide a base reference point for speed tests
of what Zope should be capable of: py-xmlrpc on both the client
and server side of plain Medusa with such proxied calls, compared
with tests doing nothing special on the Zope client side (with
one or more Zope threads and various delays at the remote
server).

Any chance of this? When? ;-)

3) As far as I can see Zope comes with the following from
the top of the src tree:

lib/python/xmlrpclib.py

lib/python/ZPublisher/xmlrpc.py

But there is nothing in ZServer, even though that contains and is
based on Medusa. So it seems to be extending the Medusa implementation
of xmlrpclib rather than just layering above it.

Is anyone planning to provide a replacement based on
py-xmlrpc? If so, when - and will it include a non-blocking
client side as described above?

I've tried to track down some documentation on how
XML-RPC fits into Zope. Here's the list I came up
with in case it helps or prompts somebody to point
out docs or threads with other info.

The start of some documentation on how Zope publishes
with xml-rpc server is:

http://www.zope.org/Members/teyc/howtoXMLRPC2

Others include:

http://www.zope.org/Members/Amos/XML-RPC

http://www.zope.org/Members/teyc/pipermailXMLRPCWoes

http://www.zope.org/Members/teyc/howxmlrpc

There is also an XMLRPCProxy:

http://www.zope.org/Members/dshaw/XMLRPCProxy

And quite a few other references from the search
button on www.zope.org for both "xmlrpc" and "xml-rpc".

Anyway, thanks again for 1000 calls per second!