[ZODB-Dev] ZEO and time.sleep
jim at zope.com
Wed Mar 28 18:34:12 EDT 2007
On Mar 28, 2007, at 6:28 PM, Benji York wrote:
> Last week I spent a very enjoyable day (no kidding) debugging a
> very, very slow cold-start situation (more than 15 minutes to
> return from the first request). When making the first request to
> the app (Zope 3 based), the app server and storage server would
> show virtually no CPU utilization, and there would be about a
> megabit of network traffic (on a gigabit link). There was no
> obvious bottleneck.
> After liberal application of strace, tcpdump, wireshark (aka
> ethereal), and the Python profiler we discovered that while waiting
> for an outstanding request for an object to load, ZEO calls an
> instance's wait method with a timeout. When given a timeout, that
> method enters a wait loop that calls time.sleep to sleep for a
> while and then checks whether the condition has been met.
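
For readers who have not seen this kind of loop, here is a minimal sketch of a
timed wait built on polling. It is not ZEO's code or the threading module's;
polling_wait and its delay parameter are hypothetical names, and it is written
for a current Python 3 rather than the interpreter in use at the time.

```python
import threading
import time


def polling_wait(is_done, timeout, delay=0.001):
    """Poll is_done() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while not is_done():
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Every pass through the loop costs at least one time.sleep call,
        # so the loop cannot react faster than the platform's sleep
        # granularity, however small `delay` is.
        time.sleep(min(delay, remaining))
    return True


# Usage: another thread signals completion about 10 ms from now.
done = threading.Event()
threading.Timer(0.01, done.set).start()
result = polling_wait(done.is_set, timeout=2.0)
timed_out = polling_wait(lambda: False, timeout=0.02)
```

Even when the result arrives almost immediately, the caller still pays for at
least one full sleep before it notices.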
> We found that time.sleep on that box had a minimum granularity of
> 10ms (when passed a non-zero value), thus causing each object load
> to take approximately that long. As you can imagine, that somewhat
> slowed down the retrieval of the several thousand objects required
> to satisfy the initial request(s) (until the ZEO cache was
> sufficiently warm).
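
A rough way to check this on a given box is to time a batch of very short
sleeps and multiply the result by the number of sequential loads. This is only
a sketch: measure_sleep and cold_start_estimate are hypothetical helpers, and
the 5,000-object count is an illustrative stand-in for "several thousand", not
a figure from the report above.

```python
import time


def measure_sleep(requested=0.0001, samples=20):
    """Return the average wall-clock duration of time.sleep(requested)."""
    start = time.perf_counter()
    for _ in range(samples):
        time.sleep(requested)
    return (time.perf_counter() - start) / samples


def cold_start_estimate(num_objects, per_load_seconds):
    """Estimate the total time when objects are loaded one after another."""
    return num_objects * per_load_seconds


actual = measure_sleep()
print("requested 0.1 ms, measured %.3f ms per sleep" % (actual * 1000))

# Illustrative only: 5,000 sequential loads at 10 ms apiece.
estimate = cold_start_estimate(5000, 0.010)
print("estimated sleep overhead: %.0f seconds" % estimate)
```

If the measured value is far above the requested one, the timer granularity,
not the network or the CPU, is what bounds each load.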
> The fix? Short-term: bump the operating system's timer interrupt
> on that box to 1000Hz from 100Hz, improving time.sleep's
> granularity to 1ms (this was on Linux; Windows' time.sleep appears
> to have a much higher resolution).
> Long-term: Jim has found that the timeout call in the wait-for-
> result code can be avoided, side-stepping the call to time.sleep.
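
To illustrate the general idea of waiting without a timeout (not the actual
ZEO change, which is not shown in this thread), here is a minimal sketch. The
PendingResult class and its method names are hypothetical.

```python
import threading


class PendingResult:
    """Holds the reply to one outstanding request (hypothetical helper)."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def set(self, value):
        # Called by the thread that receives the reply.
        self._value = value
        self._ready.set()

    def get(self):
        # No timeout is passed, so the caller blocks until set() runs
        # instead of waking up periodically to re-check.
        self._ready.wait()
        return self._value


# Usage: a helper thread delivers the reply shortly after the request.
pending = PendingResult()
threading.Timer(0.01, pending.set, args=("object state",)).start()
value = pending.get()
```

The trade-off is that a wait with no timeout never gives up on its own, so
connection loss has to be reported to the waiter by some other path.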
What was the time to return the first request after the fix?
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org