[ZODB-Dev] HA Setup and zodb index file of FileStorage

Patrick Gerken do3ccqrv at googlemail.com
Fri Sep 22 04:17:57 EDT 2006


On 9/22/06, Christian Theune <ct at gocept.com> wrote:
> Hi,
>
> Patrick Gerken wrote:
> > Hello,
> >
> > it's funny: like Garth in May this year, I am looking into building an
> > HA system with ZEO for an ERP5 deployment. In my case I don't need to
> > care about data replication; everything is stored on a SAN that the
> > customer already considers HA.
> >
> > So my Data.fs, its index and all that stuff will already be available
> > on my backup server.
> > The idea is that the backup server will watch the real ZEO server and
> > start all services if it goes down. It seems quite safe to me to take
> > over the files from the filesystem and restart everything. (While
> > writing this I realise I should be really, really sure the other ZEO
> > is down, but that is out of scope for this mail.)
> >
> > The thing which scared the hell out of me was the "rumour" that the
> > index can get corrupted, in which case everything has to be indexed
> > again. With a large number of objects this will be slow, either
> > because of millions of seeks or because of the many blocks written to
> > memory. So I wanted to look into the FileStorage implementation to see
> > whether this could be optimised. But I did not find a single place
> > where killing the server would result in a corrupt index file (simply
> > assuming that we have journaling filesystems). Also, if the index file
> > is not up to date, restarting the ZEO server only means updating the
> > index starting from the latest transaction already written to it. That
> > can take up to 5 minutes (assuming 10000 objects between index writes,
> > multiplied by 20 ms per object for processing and seeking, which is
> > about 200 seconds, rounded up to the next prime).
> >
> > Given that writing a file and renaming a file can be considered
> > atomic, and that no solar winds or similar things can screw up my
> > filesystem, how can I screw up my index file?
>
> One of the last things I remember is that indexes cannot be rebuilt
> partially but are rebuilt completely. I had a request on my plate to
> modify the index code to take old indexes into account as well. I'm not
> sure if anybody else has already done that (but that would only be
> available on trunk anyway).
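
(For reference, the write-and-rename pattern asked about in the quoted
question above is, as far as I understand it, roughly the following; the
path, the helper name and the pickle payload are only illustrative, this
is not a claim about how FileStorage itself writes Data.fs.index:)

  # Minimal sketch of an atomic "write, fsync, rename" index save.
  import os
  import pickle
  import tempfile

  def save_index_atomically(index, path='Data.fs.index'):
      # Write to a temporary file in the same directory, so that the
      # final rename stays on one filesystem (only then is it atomic).
      dirname = os.path.dirname(os.path.abspath(path))
      fd, tmp = tempfile.mkstemp(dir=dirname)
      try:
          with os.fdopen(fd, 'wb') as f:
              pickle.dump(index, f)
              f.flush()
              os.fsync(f.fileno())   # make sure the bytes hit the disk
          os.rename(tmp, path)       # atomic replace on POSIX filesystems
      except Exception:
          os.unlink(tmp)
          raise

Either the old complete index or the new complete index is visible at
any point in time; a crash in the middle at worst leaves a stray
temporary file behind.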

Even the first checkin of FileStorage makes a
self.read_index(... start=start ...)
call which updates the index starting from the latest tid it already
knows about. I can only imagine the index getting thrown away if
self._check_sanity() returns 0, saying the index is not valid. But from
my reading through the web SVN this should not be the case in the
scenarios I can imagine (journaled filesystem, no solar winds...).
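
(Paraphrasing that startup path as I read it; the names _restore_index,
_check_sanity and read_index are from FileStorage, but the simplified
body and the exact signatures below are my own sketch, not the actual
source:)

  # Rough sketch of "resume from the saved index instead of rescanning
  # the whole file"; this is my reading of the code, not a copy of it.
  def build_index_on_open(storage):
      r = storage._restore_index()         # try to load Data.fs.index
      if r is not None:
          index, vindex, start, ltid = r
          # _check_sanity() has already compared the saved index against
          # the file; if it says the index is bogus, _restore_index()
          # returns None and we fall through to the full scan below.
          # Otherwise only the transactions after position `start` (the
          # part the index does not cover yet) are read.
          return storage.read_index(start=start, ltid=ltid)
      return storage.read_index()          # no usable index: full scan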

Is there a way to reproduce this behaviour easily?
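
(The closest I get myself is forcing a stale index rather than a corrupt
one, roughly like the sketch below; ZODB, FileStorage and transaction
are the public API, while the path, the helper and the hard exit that
stands in for a crash are just illustration:)

  # Commit, close cleanly so Data.fs.index is written, then commit more
  # and "crash" without closing.  On the next start FileStorage should
  # pick up the saved index and only replay the newer transactions.
  import os
  import transaction
  from ZODB import DB
  from ZODB.FileStorage import FileStorage

  def commit_some(db, n):
      conn = db.open()
      root = conn.root()
      for i in range(n):
          root['key-%d' % i] = i
          transaction.commit()
      conn.close()

  db = DB(FileStorage('Data.fs'))
  commit_some(db, 1000)
  db.close()                     # clean close writes Data.fs.index

  db = DB(FileStorage('Data.fs'))
  commit_some(db, 1000)
  os._exit(0)                    # simulate a crash: the index stays stale
  # Reopening Data.fs afterwards should show the index being brought up
  # to date from its last known transaction rather than rebuilt from
  # scratch.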

best regards,
            Patrick Gerken

