[ZODB-Dev] Backing up Data.fs and blob directory

Tres Seaver tseaver at palladion.com
Thu Sep 4 11:54:49 EDT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christian Theune wrote:
> Hi Laurence,
> 
> On Wed, 2008-09-03 at 08:06 -0700, Laurence Rowe wrote:
>> Backing up a ZODB has always been fairly easy in the past, but with the
>> introduction of blobs things have got a little more complex.
>>
>> How should I create a consistent backup of my Data.fs and blob directory?
>>
>> My initial guess would be to take a copy of the Data.fs, then take a copy of
>> the blob directory to ensure I have all blobs referenced in the Data.fs.
>> Would I be able to restore from such a backup safely? (it may contain blobs
>> from transactions that were newer than the backed up Data.fs).
>>
>> This should be safe because committed blobs are immutable and any dangling
>> blobfiles would not interfere with the creation of blobs from new
>> transactions in the restored zodb, as transaction ids would not overlap.
>>
>> I would be grateful if anyone could point out holes in my reasoning or has
>> experience of this.
> 
> Snapshotting a blob directory after taking a copy of your Data.fs should
> be safe, as long as you don't pack in between.
> 
> Note that at the design stage we imagined that blob directories might
> become really large making backups unfeasible. For those situations we
> handwaved a "very reliable storage" for this directory, like a
> self-contained SAN/NAS solution that keeps your data safe.

Assuming we can avoid the race condition induced by packing during the
backup (see below), it should be possible to write a script which
combines 'repozo' and 'rsync' in such a way as to get a "pure" copy
of the blob directory which corresponds to the repozo dataset.
Something like:

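 #!/bin/sh
 # Sketch: back up Data.fs with repozo, then copy only the blob files
 # that were not modified after the backup started.
 STAMP="$PWD/.repozo_start"
 DEST="$PWD/backups/blobs/"
 touch "$STAMP"
 /path/to/repozo -B -f var/Data.fs -r backups/
 # --files-from wants paths relative to the source dir, so run find there.
 (cd /path/to/blobs && find . ! -newer "$STAMP") |\
   rsync -av --files-from=- /path/to/blobs/ "$DEST"
 rm "$STAMP"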
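
To make the selection step concrete, here is a small self-contained
illustration of what 'find ... ! -newer' picks up.  Everything in it is
made up for the demonstration: the temporary directory, the file names,
and the explicit timestamps (set with 'touch -t' so the result does not
depend on how fast the commands run).  It is only a sketch of the
timestamp test, not part of any real backup.

```shell
#!/bin/sh
# Illustration only: all paths live in a throwaway temp directory.
work=$(mktemp -d)
mkdir "$work/blobs"

# Hypothetical blob files with fixed modification times.
touch -t 200809010000 "$work/blobs/old.blob"   # written before the backup
touch -t 200809040000 "$work/stamp"            # stands in for .repozo_start
touch -t 200809050000 "$work/blobs/new.blob"   # written after the backup began

# List regular files whose mtime is not newer than the stamp file.
selected=$(cd "$work/blobs" && find . -type f ! -newer "$work/stamp" | sort)
echo "$selected"

rm -rf "$work"
```

Only the file whose modification time is at or before the stamp is
listed; the later one is left for the next backup run.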

Packing *during* backup creates problems for repozo + blobs, because a
pack may cause "old" blob files to be unlinked.  Furthermore, it is not a
use case I think we should support.  However, I don't think that
FileStorage supports the idea of a "pack lock" which would be acquirable
by repozo, so I don't know how to prevent the race.
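
Short of such a lock inside FileStorage, one workaround might be a purely
advisory lock at the script level.  The sketch below is an assumption on
my part, not anything FileStorage or repozo provides: the lock path and
the function names are hypothetical, and it only helps if *every* backup
and *every* pack is started through the same wrapper.  It relies on
'mkdir' failing atomically when the directory already exists.

```shell
#!/bin/sh
# Hypothetical advisory lock shared by the backup and pack scripts.
# Neither FileStorage nor repozo knows about it.
LOCKDIR="${LOCKDIR:-/tmp/zodb-maintenance.lock}"

acquire_lock() {
    # mkdir is atomic: it fails if the lock directory already exists.
    mkdir "$LOCKDIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCKDIR"
}

with_lock() {
    # Run the given command only if no other maintenance job holds the lock.
    if acquire_lock; then
        "$@"
        status=$?
        release_lock
        return $status
    fi
    echo "another backup or pack is running; skipping" >&2
    return 1
}
```

The backup script and whatever triggers the pack would then both be run
as 'with_lock <command>'.  A pack started some other way (for example
from the ZMI) bypasses the lock entirely, so this narrows the race rather
than eliminating it.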



Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIwATJ+gerLs4ltQ4RAm0WAKC/jDUPqBnMTpwkDpBX0mKidGTxvwCghjnM
QYnz9dRolzOdvZX2t9fxM3k=
=/h4K
-----END PGP SIGNATURE-----


