[ZODB-Dev] ZEO 2.0a1 release

Toby Dickenson tdickenson@geminidataloggers.com
Mon, 3 Jun 2002 15:44:53 +0100


>   TD> 2. I am surprised that you have a DelayedCommitStrategy thing
>   TD>    inside ZEO,
>   TD> rather than extending the storage interface to allow concurrent
>   TD> commits.  (that would be a more obvious way to do it, but maybe
>   TD> not better). Thinking aloud, maybe the storage object could be
>   TD> allowed to provide his own DelayedCommitStrategy object?
>
> I had considered this but found two difficulties with it:
>
> 1. It means changing the storage API and updating other storages to
>    take advantage of it.  I don't want ZEO2 to require changes to the
>    storage API,

I was thinking about an API extension, like when transactional undo was
added.

>    and I want to get the advantages of sending data for
>    multiple transactions regardless of whether the underlying storage
>    supports the feature.

I am sure this is a good choice, having studied the ZEO2a1 implementation in a
little more detail. If concurrent commit code were added to *every* storage, I
guess it would be rarely exercised, and would be a good place for bugs to
hide.

>    But FileStorage's tempfile format is deeply connected with the
>    FileStorage log format.  It needs to include serialnos for each of
>    the data records, and the serialno depends on the transaction
>    timestamp.  So there's no easy way to figure out the serialno for a
>    transaction that is delayed.

Any concurrent commit scheme would need to be intimate with the Storage, so I
don't think serial numbers would be a problem. The Storage could use the
timestamp at which it started spooling the transaction (which is probably a
few seconds earlier than the timestamp at which it started replaying the
spooled transaction into a _real_ transaction). This works as long as spooled
transactions are committed in FIFO order, which is true in ZEO2a1.
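
To make the FIFO argument concrete, here is a minimal sketch in Python. All
of the names (SpoolingStorage, begin_spooling, replay_next) are made up for
illustration and are not part of ZEO or the storage API, and a simple counter
stands in for the timestamp-derived serialno. The point is only that a serial
fixed when spooling starts still reaches the log in increasing order, provided
replay is FIFO.

```python
import itertools
from collections import deque


class SpoolingStorage:
    """Toy storage that spools transactions and replays them oldest-first.

    Hypothetical illustration only; not the ZEO or ZODB implementation.
    """

    def __init__(self):
        self._clock = itertools.count(1)  # stand-in for a timestamp source
        self._spool = deque()             # spooled transactions, oldest first
        self.log = []                     # committed (serial, oid, data) records
        self.last_serial = 0

    def begin_spooling(self):
        # The serial is fixed here, when spooling starts, not at replay time.
        txn = {"serial": next(self._clock), "records": []}
        self._spool.append(txn)
        return txn

    def store(self, txn, oid, data):
        txn["records"].append((oid, data))

    def replay_next(self):
        # FIFO replay: the oldest spooled transaction is committed first,
        # so serials are appended to the log in increasing order.
        txn = self._spool.popleft()
        if txn["serial"] <= self.last_serial:
            raise ValueError("serial would go backwards")
        for oid, data in txn["records"]:
            self.log.append((txn["serial"], oid, data))
        self.last_serial = txn["serial"]
        return txn["serial"]


storage = SpoolingStorage()
first = storage.begin_spooling()
second = storage.begin_spooling()
storage.store(second, "b", "second")  # data may arrive in any order
storage.store(first, "a", "first")
storage.replay_next()
storage.replay_next()
```

If replay picked transactions in any other order, the check in replay_next
would fail, which is the case a real scheme would have to rule out.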

For FileStorage I suspect a bigger problem is the seek positions embedded
throughout the file.


I can think of other storages where pipelined transactions would be a big win.
I did some work on a ReplicatedFileStorage last year, where the transaction
would not commit before data was synced to a disk in two different machines.
This necessarily leads to transaction latency. At the time, a bigger
problem was the consequential decrease in transaction throughput.
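
A rough way to see why latency turns into lost throughput, and why pipelining
helps, is the usual two-stage pipeline arithmetic. This is a back-of-envelope
model, not a measurement: the function names and the split into a "local
write" stage and a "remote sync" stage are assumptions for illustration, and
the times are arbitrary units.

```python
def serialized_total(n, local_write, remote_sync):
    # Each transaction finishes both stages before the next one starts.
    return n * (local_write + remote_sync)


def pipelined_total(n, local_write, remote_sync):
    # The next transaction's local write overlaps the previous one's
    # remote sync; each stage still handles one transaction at a time.
    if n == 0:
        return 0
    return local_write + remote_sync + (n - 1) * max(local_write, remote_sync)


# Ten transactions, with equal local write and remote sync times (assumed).
print(serialized_total(10, 3, 3))  # 60
print(pipelined_total(10, 3, 3))   # 33
```

In this model the latency of a single transaction is unchanged, but the total
time for a batch approaches that of the slower stage alone.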

I suspect DirectoryStorage may benefit too. Maybe not much on some operating
systems, but pipelining commits would be easier on Linux's IO scheduler. That
might apply to other storages too.

> I think it's fair to revisit the issue for ZODB 4 / Zope 3, but I
> don't know when I'll have time for that.

That sounds reasonable.

I might have time for some experiments to see whether it is worth pursuing in
ZODB4/Zope3.... I'll let you know what happens.