[ZODB-Dev] Looking for sponsors to finish ZEORaid 1.0

Christian Theune ct at gocept.com
Mon Sep 10 04:56:08 EDT 2007


Hi,

we at gocept started an open source project (licensed under the ZPL) at the
beginning of this year to make a free fail-over/replication mechanism for
ZODB/ZEO available.

We envisioned a solution that applies RAID techniques to ZEO servers; the
project is therefore called "ZEORaid".

We have a working system that I showed off at EuroPython this year and
mentioned in a few other places. Several people and parties have expressed
interest in seeing it finished.

The current state of ZEORaid is almost alpha quality: two features are
missing, a few edge cases are known, and there are still lots of bugs.

We've set up a road map for what we propose to have available in ZEORaid 1.0
(see attached ROADMAP file).

In recent weeks we were contacted by a sponsor who wants to help out with
this. We estimate the effort to finish ZEORaid 1.0 according to the ROADMAP
at 12k EUR.

Our plan involves three participating sponsors who cover 4k EUR each. One of
those has already been found; we need two more.

Our project plan calls for having all sponsors on board by 30 September.
Work will start in mid-October and be finished in mid-November.

For details I'd be happy to answer your questions (in private or on the
list).

Christian

PS: If there is a feature you'd like to see in ZEORaid that isn't planned
for 1.0, we'd be happy to do another round of funded open source work later.
-------------- next part --------------
====
TODO
====

1.0
===

Stabilization
-------------

 - Check edge cases for locking on all methods so that degrading a storage
   works under all circumstances.
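
   A hypothetical sketch of the pattern (RAIDStorageSketch and its
   attributes are assumptions for illustration, not ZEORaid's actual
   API): every storage method holds one RAID-wide lock while it talks
   to backends, so a failing backend can be degraded consistently::

       import threading

       class RAIDStorageSketch:

           def __init__(self, backends):
               self._backends = list(backends)   # currently healthy
               self._degraded = []               # taken out of service
               self._lock = threading.RLock()    # guards backend lists

           def _degrade(self, backend):
               # Callers must hold self._lock.
               self._backends.remove(backend)
               self._degraded.append(backend)

           def load(self, oid, version=''):
               with self._lock:
                   for backend in list(self._backends):
                       try:
                           return backend.load(oid, version)
                       except Exception:
                           # Degrade the failing backend instead of
                           # failing the client call, as long as some
                           # backend is left.
                           self._degrade(backend)
                   raise RuntimeError('all backends are degraded')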

 - The second pass of the recovery isn't thread safe. Ensure that only one
   recovery can run at a time. (This is probably a good idea anyway because
   of I/O load.)
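
   A minimal sketch of the guard, assuming a hypothetical recover()
   entry point: a non-blocking acquire makes a second, concurrent
   recovery fail fast instead of piling more I/O onto the backends::

       import threading

       recovery_lock = threading.Lock()

       def recover(raid_storage, backend):
           # acquire(False) does not block; it returns False if some
           # other thread is already running a recovery.
           if not recovery_lock.acquire(False):
               raise RuntimeError('another recovery is already running')
           try:
               pass  # ... first and second recovery pass go here ...
           finally:
               recovery_lock.release()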

 - Make sure that opening a ZEO client doesn't block forever (e.g. by using
   a custom opener that sets 'wait' to True and the timeout to 10 seconds).

   Workaround: use "wait off" or set the timeout in the RAID server config.
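
   A sketch of the custom-opener approach; 'wait' and 'wait_timeout'
   are existing ClientStorage arguments, while the address is a
   placeholder::

       from ZEO.ClientStorage import ClientStorage

       def open_client(address):
           # wait=True blocks until connected, but wait_timeout caps
           # the wait; ZEO raises ClientDisconnected when it expires.
           return ClientStorage(address, storage='1',
                                wait=True, wait_timeout=10)

       client = open_client(('backend1.example.com', 8100))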

 - Run some manual tests for weird situations, high load, ...

Feature-completeness
--------------------

 - Rebuild a storage using the copy mechanism in ZODB so that all historic
   records are copied completely. (Only complete rebuilds, not incremental
   ones.)
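
   The copy mechanism itself is ZODB's copyTransactionsFrom(). A sketch
   using FileStorage (the paths are placeholders; in a ZEORaid setup
   the source would be a healthy backend)::

       from ZODB.FileStorage import FileStorage

       source = FileStorage('good-backend/Data.fs', read_only=True)
       target = FileStorage('rebuilt-backend/Data.fs')

       # Replays every transaction in `source`, including historic
       # (non-current) records, into `target`.
       target.copyTransactionsFrom(source)
       target.close()
       source.close()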

 - Create a limit for the transaction rate when recovering so that the
   recovery doesn't clog up the live servers.
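
   A minimal throttling sketch, not ZEORaid code; replay() and max_tps
   are assumptions for illustration::

       import time

       def replay_throttled(transactions, replay, max_tps=50):
           # Cap the number of transactions replayed per second so the
           # recovery leaves I/O bandwidth for the live backends.
           min_interval = 1.0 / max_tps
           for txn in transactions:
               started = time.time()
               replay(txn)   # write one transaction to the new backend
               spent = time.time() - started
               if spent < min_interval:
                   time.sleep(min_interval - spent)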


Cleanup
-------

 - Remove print statements and provide logging.
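
   The cleanup boils down to using the stdlib logging module; the
   logger name and function are illustrative::

       import logging

       logger = logging.getLogger('gocept.zeoraid')

       def degrade_storage(name, reason):
           # was: print 'degrading %s: %s' % (name, reason)
           logger.warning('degrading storage %r: %s', name, reason)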

 - Make a manager script that works like zopectl and can talk to a specific
   RAID server, along with a buildout recipe for it.


2.0
===

- Support packing?

- Windows support

- Make read requests go to different backends to optimize caching and
  distribute I/O load.
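
  A hypothetical sketch of the idea: cycle reads over the healthy
  backends so each backend's cache stays warm for its share of the
  objects::

      import itertools

      class ReadBalancer:

          def __init__(self, backends):
              # A fixed round-robin; a real implementation would also
              # react to backends being degraded or added.
              self._cycle = itertools.cycle(backends)

          def load(self, oid, version=''):
              return next(self._cycle).load(oid, version)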

- Allow adding and removing backend servers while running.

