[ZODB-Dev] ZEO and time.sleep

Jim Fulton jim at zope.com
Wed Mar 28 18:34:12 EDT 2007

On Mar 28, 2007, at 6:28 PM, Benji York wrote:

> Last week I spent a very enjoyable day (no kidding) debugging a  
> very, very slow cold-start situation (more than 15 minutes to  
> return from the first request).  When making the first request to  
> the app (Zope 3 based), the app server and storage server would  
> show virtually no CPU utilization, and there would be about a  
> megabit of network traffic (on a gigabit link).  There was no  
> obvious bottleneck.
> After liberal application of strace, tcpdump, wireshark (aka  
> ethereal), and the Python profiler we discovered that while waiting  
> for an outstanding request for an object to load, ZEO calls a  
> threading.Connection

threading.Condition :)

> instance's wait method with a timeout.  When given a timeout that  
> method enters a wait loop with a time.sleep to sleep for a while  
> and then see if the condition has been met.
> We found that time.sleep on that box had a minimum granularity of  
> 10ms (when passed a non-zero value), thus causing each object load  
> to take approximately that long.  As you can imagine, that somewhat  
> slowed down the retrieval of the several thousand objects required  
> to satisfy the initial request(s) (until the ZEO cache was  
> sufficiently warm).
> The fix?  Short-term: bump the operating system's timer interrupt  
> on that box to 1000Hz from 100Hz, increasing time.sleep's  
> granularity to 1ms (this was on Linux, Window's time.sleep appears  
> have a much higher resolution).
> Long-term: Jim has found that the timeout call in the wait-for- 
> result code can be avoided, side-stepping the call to time.sleep  
> altogether.

What was the time to return the first request after the fix?


Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org

More information about the ZODB-Dev mailing list