[ZODB-Dev] Use of fsync in FileStorage

Tim Peters tim at zope.com
Wed Jul 28 17:47:21 EDT 2004


OK, I timed different fsync() strategies on WinXP Pro.  Remember that
Windows doesn't actually have fsync(); Python maps its os.fsync() to the
Win32 FlushFileBuffers(), indirectly via MS C's _commit() function.
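
For anyone who hasn't looked at the two calls side by side, here's a minimal
sketch of the flush-then-fsync pattern.  It is not the FileStorage code:
write_durably is a made-up helper name, and the getattr guard is only my
assumption about one way to end up with an fsync name that may be None.

```python
import os

# Assumption: fall back to None where os.fsync is missing, so callers can
# test "fsync is not None" before using it.
fsync = getattr(os, "fsync", None)

def write_durably(f, data):
    """Write data, then push it through both buffering layers."""
    f.write(data)
    f.flush()                  # program-level buffer -> operating system
    if fsync is not None:
        fsync(f.fileno())      # ask the OS to write its cached data to the device
```

flush() alone only hands the bytes to the operating system; the fsync() call
is the part that waits on the disk, which is why it dominates the timings.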

The box here is a beefy laptop, WinXP Pro SP1, 3.2GHz P4 hyper-threaded, 1GB
RAM, 80GB IDE disk w/ 8MB cache.

Using current (Zope 2.7 branch HEAD) FileStorage code:

C:\Code\ZODB3.2>timefsync.py
Doing 10000 transactions, timed with time.clock()
323.649 seconds, 30.8977 txn/sec

The process never got above 1% CPU usage, and the disk was busy the whole
time, so this was clearly I/O-bound.

After adding a second fsync, in tpc_vote():

                self._file.flush()   # existing line
                if fsync is not None: fsync(self._file.fileno()) # new

C:\Code\ZODB3.2>timefsync.py
Doing 10000 transactions, timed with time.clock()
666.169 seconds, 15.0112 txn/sec

So, contrary to hopes, adding a second fsync() cut the txn rate in half.

Finally, commenting out both fsync()'s in FileStorage.py:

C:\Code\ZODB3.2>timefsync.py
Doing 10000 transactions, timed with time.clock()
3.49118 seconds, 2864.36 txn/sec

Yikes!  That's nearly a factor of 100 (about 93x) higher than the one-fsync
case.  With both fsyncs removed, the run appeared CPU-bound.
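
To put the three runs on one scale, here's a small sketch that turns the
elapsed times into milliseconds per transaction and an implied cost per
fsync.  The only inputs are the numbers printed above; treating the no-fsync
run as the baseline, and splitting the remainder evenly across the fsyncs, is
my simplifying assumption.

```python
N = 10000  # transactions per run, as reported in the output above

# (label, number of fsyncs per transaction, elapsed seconds) for each run.
runs = [
    ("no fsync", 0, 3.49118),
    ("one fsync", 1, 323.649),
    ("two fsyncs", 2, 666.169),
]

def ms_per_txn(elapsed):
    """Average wall-clock milliseconds spent per transaction."""
    return elapsed / N * 1000.0

# Assumption: the no-fsync run is the cost of everything except fsync.
baseline = ms_per_txn(runs[0][2])

for label, nsync, elapsed in runs:
    per_txn = ms_per_txn(elapsed)
    line = "%-10s %8.3f ms/txn" % (label, per_txn)
    if nsync:
        line += ", about %.1f ms per fsync" % ((per_txn - baseline) / nsync)
    print(line)
```

By this reckoning each fsync costs somewhere around 32 to 33 ms on this box,
against about a third of a millisecond for the rest of the commit.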

Here's the driver.  Before running it each time, I deleted all Data.* files:

"""
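import sys
# On Windows, time.clock() has far better resolution than time.time();
# elsewhere time.time() is the better wall-clock timer.
if sys.platform == "win32":
    from time import clock as now
else:
    from time import time as now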

import ZODB
from ZODB.FileStorage import FileStorage

from BTrees.IIBTree import IIBTree

N = 10000

st = FileStorage('Data.fs')
db = ZODB.DB(st)
cn = db.open()
rt = cn.root()

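# Create the tree and commit it once before the timed loop starts.
rt['tree'] = t = IIBTree()
get_transaction().commit()

print "Doing %d transactions, timed with time.%s()" % (N, now.__name__)
start = now()
# Each pass adds one key to the tree and commits it as its own transaction.
for i in xrange(N):
    t[i] = i
    get_transaction().commit()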
finish = now()

elapsed = finish - start
print "%g seconds, %g txn/sec" % (elapsed, N / elapsed)
"""

This commits more-or-less "average size" transactions, but on the high side
(about 2KB per transaction record).

Any use of os.fsync is clearly a txn-rate disaster on my WinXP box.  It
would be great if readers here tried it on their boxes and reported results:
various Linux flavors with various filesystems, Solaris variants, Windows
boxes with "serious" disk systems, whatever matters to you in practice.
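
If you don't have ZODB handy, a rough ZODB-free probe of raw fsync cost might
look like the sketch below.  It is not the driver above and won't reproduce
FileStorage's numbers: time_appends and its parameters are names I made up,
the 2KB record size just echoes the transaction size mentioned above, and it
is written for current Python rather than the Python the driver uses.

```python
import os
import tempfile
import time

def time_appends(n, record_size=2048, use_fsync=True, directory=None):
    """Append n records to a scratch file.

    Returns (elapsed_seconds, appends_per_second).
    """
    record = b"x" * record_size
    # directory=None means the default temp directory; pass a path on the
    # disk you actually care about, since temp dirs may be RAM-backed.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.perf_counter()
            for _ in range(n):
                f.write(record)
                f.flush()                 # hand the bytes to the OS
                if use_fsync:
                    os.fsync(f.fileno())  # wait for the OS to write them out
            elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return elapsed, n / elapsed

if __name__ == "__main__":
    for use_fsync in (True, False):
        elapsed, rate = time_appends(200, use_fsync=use_fsync)
        print("fsync=%s: %g seconds, %g appends/sec" % (use_fsync, elapsed, rate))
```

The gap between the two printed rates gives a quick feel for how expensive
fsync is on a given disk and filesystem, but the driver above is still the
measurement that matters for FileStorage.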


