[ZODB-Dev] Use of fsync in FileStorage

Paul Roe paul at artifact-imaging.co.nz
Thu Aug 5 07:27:45 EDT 2004


On Tue, 2004-08-03 at 00:00 -0400, Tim Peters wrote: 
> [Paul Roe]
> > Some more stats 10000 transactions per run both boxes are single Opteron
> > 246
> >
> > Debian Unstable Zope2.7 Filestorage.py Rev 1.135.6.5
> > 4x36G WD360GD SATA 10K RAID 10 Array (Intel SRCS14L)
> >
> > 0 fsync 2.90725 seconds, 3439.68  txn/sec
> 
> Congratulations!  That's the only box so far to beat my Gateway Windows
> laptop on this measure <wink>.
> 
It seems the ultimate speed is CPU- and memory-bandwidth limited.

I've not got local access to the RAID 10 box (and it's a production box),
but I ran some tests on the dev box with RAID and disk caches disabled.

RAID Cache off
RAID Delay Write off
Individual Disk Write Caches off
10000 transactions
(cpu usage was observed over 100000 transactions for 0 fsyncs)

0  fsync CPU 99%
2.94156 seconds, 3399.55 txn/sec
2.79107 seconds, 3582.86 txn/sec
1  fsync  CPU <2%
Doing 10000 transactions, timed with time.time()
102.854 seconds, 97.2253 txn/sec
101.747 seconds, 98.2829 txn/sec
2  fsync CPU <2%
Doing 10000 transactions, timed with time.time()
162.46 seconds, 61.5535 txn/sec
163.419 seconds, 61.1922 txn/sec

With 1 and 2 fsyncs and all caching disabled, the disks sound very
busy ;-). It's probably also worth stating that we're running reiserfs
on top of LVM on these machines for all tests.
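For anyone wanting to reproduce the numbers above, the timing loop is
essentially this kind of thing. This is a stdlib-only sketch, not the actual
FileStorage benchmark: each "transaction" is just a small appended record
followed by 0, 1 or 2 fsync calls, and the function name and payload size are
made up. (The posted runs used time.time(); perf_counter is used here only
because its resolution is safer on all platforms.)

```python
import os
import tempfile
import time

def bench(ntxns, fsyncs_per_txn, payload=b"x" * 128):
    """Time ntxns append-style 'transactions', each followed by
    fsyncs_per_txn (0, 1 or 2) calls to os.fsync."""
    fd, path = tempfile.mkstemp()
    try:
        f = os.fdopen(fd, "wb")
        start = time.perf_counter()
        for _ in range(ntxns):
            f.write(payload)   # the transaction's data record
            f.flush()          # push it out of the stdio buffer
            for _ in range(fsyncs_per_txn):
                # ask the OS (and, write caches permitting, the disk)
                # to make the record durable
                os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
        f.close()
        return elapsed, ntxns / elapsed
    finally:
        os.remove(path)

if __name__ == "__main__":
    for n in (0, 1, 2):
        secs, rate = bench(1000, n)
        print("%d fsync: %g seconds, %g txn/sec" % (n, secs, rate))
```

With write caching enabled the 1- and 2-fsync rates from a loop like this can
look deceptively close to the 0-fsync rate, which is exactly the caching
effect discussed below.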


Other Tests.

I also ran some apache bench tests on the dev box (Debian Pure64), and
in the few I ran there was no significant difference between 0, 1, and 2
fsyncs.

Probably need to come up with something that is more update-intensive!

So I've now run some tests with a real standalone ZODB app that does a
reasonably sized import. It reads a compressed file, then creates, stores
(IOBTree), and catalogues (ZCatalog) a list of objects (a simple class
with a few attributes).
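The shape of that import loop is roughly the following. This sketch runs
without ZODB installed, so a plain dict stands in for the IOBTree, the
ZCatalog and per-object commit steps are left as comments, and the Record
class and function names are made up for illustration:

```python
class Record:
    """Stand-in for the simple class with a few attributes."""
    def __init__(self, key, payload):
        self.key = key
        self.payload = payload

def import_records(lines):
    """Sketch of the import loop: parse each input line into an object
    and store it under an integer key.  In the real app the dict is a
    BTrees.IOBTree.IOBTree rooted in the ZODB."""
    tree = {}
    for i, line in enumerate(lines):
        obj = Record(i, line.strip())
        tree[i] = obj
        # catalog.catalog_object(obj)  # ZCatalog step (the expensive part)
        # transaction.commit()         # one txn per object in the real app
    return tree
```

The commented-out catalogue call is what drags the real app down to the
~200 txn/sec mentioned below; without it the loop is mostly BTree inserts
and commits.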

No noticeable difference for 0, 1, or 2 fsyncs. We are CPU-limited anyhow,
and we're only managing about 200 txn/sec here for cataloguing.

Commenting out the catalogue step: nothing significant between 0, 1, and
2 fsyncs. Getting about 500 txn/sec.

So unless we're adding something pretty simple, we're unlikely to get
near the territory where it actually makes any difference.

I didn't test the above with caching disabled, but in retrospect it
would have been worthwhile, as we're definitely doing more transactions
than could be managed with a simple IIBTree with no caching and 1 fsync.


> > Debian Pure64  Zope2.7  Filestorage.py Rev 1.135.6.5
> > 2x70G WD740GD SATA 10K RAID 1 Array (Intel SRCS14L)
> >
> > 0 fsync 2.92782 seconds, 3415.52 txn/sec
> > 1 fsync 8.1819  seconds, 1222.21 txn/sec
> > 2 fsync 9.50206 seconds, 1052.4  txn/sec
> 
> So 1 fsync slows by a factor ~3.5, and 2 is minor marginal loss beyond that.
> Given what you know so far, would you rather run in production with 0, 1, or
> 2 of these puppies?  You're not going to throw 1000 txn/sec at ZODB, so even
> your slowest rates here are "more than enough".  OTOH, do you believe the
> fsyncs buy you something worth having?
> 
Since disabling the caching substantially slows things down, it's
apparent that the fsyncs don't really guarantee anything (as expected)
when caching is enabled.

There presumably are speed gains to be made on some hardware with 0
fsyncs, but it doesn't seem to matter for the disk subsystem we're using
with the transactions we typically throw at it. It still seems worthwhile
to have the option.

> At this point it seems the choice has at least as much to do with perception
> as with technical merit, so I'm also keen to know what perceptions are.
> 
I'd vote for it being configurable, but I don't think it matters much,
as it doesn't seem to guarantee anything.

With regard to the comments about corrupted Data.fs files, we've not had any
trouble over the years. We've been running Zope in a production environment
since early 2000. We've deployed ZODB-backed Windows apps in the past and
had no trouble with corruption on these either.



Paul