[Zope-CMF] backup strategies

sean.upton@uniontrib.com
Fri, 07 Feb 2003 10:59:42 -0800

Cliff's notes version of ZSS HA clustering, assuming you have some sort of
DirectoryStorage replication in place:

re: ZSS clustering.  You could use mon as well as heartbeat.  Heartbeat
would be set up on a 2-node cluster of ZSS machines.  If the primary
seriously died, then the backup would take over its IP address via
gratuitous ARP.  

Heartbeat also manages resources with init-like scripts: when the takeover
starts, it would start up a ZSS process on a replicated DirectoryStorage
after taking over the IP.  For safety, you would likely want to kill the
primary server to keep it from replicating to the backup after the takeover.
You could do this by using a power device (STONITH: Shoot The Other Node In
The Head).

The takeover that heartbeat performs could be initiated under 2 conditions:

- Totally dead primary ZSS box.  Heartbeat does this when the primary stops
sending heartbeats that the secondary can see.
- Half-dead box: lack of ZSS service on tcp socket.  Use a mon alert script
to trigger heartbeat takeover.
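
A rough sketch of the half-dead check, in Python: it only tests whether
anything accepts a TCP connection on the ZSS port.  The function names, the
host/port arguments and the timeout are assumptions for illustration, and the
sketch assumes the usual convention that a monitor reports failure through a
nonzero exit status.  It does not trigger the heartbeat takeover itself.

```python
import socket


def zss_is_listening(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        # create_connection raises OSError on refusal, timeout, or
        # an unreachable host.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def monitor_exit_code(host, port, timeout=5.0):
    """Map the probe result to an exit status: 0 = service up, 1 = down."""
    return 0 if zss_is_listening(host, port, timeout) else 1
```

A wrapper script would call sys.exit(monitor_exit_code(host, port)) with
whatever address the ZSS really listens on.  Note that this only proves the
socket accepts connections, not that the storage behind it is healthy.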

ZEO clients would retry their connection once the severed TCP connection
to the ZSS times out, and would then see that the service is back up.


-----Original Message-----
From: Paul Winkler [mailto:pw_lists@slinkp.com]
Sent: Friday, February 07, 2003 10:13 AM
To: zope-cmf@zope.org
Subject: Re: [Zope-CMF] backup strategies

On Fri, Feb 07, 2003 at 04:09:17PM +0000, Sally Owens wrote:
> Is there any way for example to *test* Data.fs when you back it up (to be 
> sure that you are not backing up corrupt data)?

there's a utility in utilities/ZODBTools/fstest.py that checks
for errors. Run a cron job that runs this tool and mails you the output.

There is also another utility, fscheck.py that gives more extensive
reports, and IIRC is new to Zope 2.6.
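
A minimal sketch of such a cron-driven check, in Python.  The helper names,
the addresses, and the use of a local SMTP server are assumptions for
illustration; the sketch also assumes that a nonzero exit status from the
checker means a problem was found, which is worth confirming against the
version of the tool you run.

```python
import smtplib
import subprocess
from email.message import EmailMessage


def run_check(cmd):
    """Run the checker command; return (exit status, combined output)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold error messages into the report
        text=True,
    )
    return proc.returncode, proc.stdout


def build_report(returncode, output, sender, recipient):
    """Build a mail message summarising one check run."""
    status = "OK" if returncode == 0 else "FAILED"
    msg = EmailMessage()
    msg["Subject"] = "Data.fs check: %s" % status
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(output or "(no output)")
    return msg


def send_report(msg, smtp_host="localhost"):
    """Hand the message to an SMTP server (assumed to run on smtp_host)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

From cron you would call run_check with something like
["python", "/path/to/fstest.py", "/path/to/Data.fs"] (both paths are
placeholders), then pass the result through build_report and send_report.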

> What backup strategies do other organisations employ to minimize the risk 
> of your Zope/CMF sites falling over?

be sure not to overlook the advantage of running zope
behind a proxy, especially if your system will ever be 
exposed to the internet.  ZServer is not designed to handle
DoS attacks, or very high traffic in general.  A proxy
such as Squid can both mitigate this, and with proper
caching greatly decrease the load on your Zope system.

As for protecting our data, currently much the same as you're suggesting.
We make regular backups of Data.fs (using cp), and regularly
run fstest.py.
Strongly considering a move to DirectoryStorage on reiserfs.

This would enable, among other things, incremental backups and 
semi-live replication using a custom shell script (and soon
there will be a feature that supports live replication.)

The cool thing about live replication is that you could in
theory eliminate the single point of failure with Zope / ZEO: the ZEO
server.  By running monitoring software such as mon, you could
set up a system that fails over from one ZEO cluster to another.
Not trivial to set up, but I think it should work.


Paul Winkler
Look! Up in the sky! It's ANDREA SNOWBALL!
(random hero from isometric.spaceninja.com)
