[ZODB-Dev] Backing up Data.fs and blob directory

Sidnei da Silva sidnei at enfoldsystems.com
Thu Sep 4 12:17:48 EDT 2008


Keep in mind that rsync is not, erm, trivial to get going on Windows.
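
For what it's worth, the find/rsync step in Tres's script quoted below could
be approximated in plain Python where rsync is not available. This is only a
rough sketch, not a tested backup tool: the function name and arguments are
made up for illustration, it assumes the destination lies outside the blob
directory, and it leans on the point made in the thread that committed blobs
are immutable, so a file already present in the backup is not copied again.

```python
import os
import shutil


def copy_blobs_not_newer(blob_dir, dest_dir, marker):
    """Copy files under blob_dir whose mtime is not newer than marker's.

    Hypothetical helper: mirrors `find ... ! -newer marker` followed by a
    copy, without needing rsync. Returns the relative paths it copied.
    """
    cutoff = os.stat(marker).st_mtime
    copied = []
    for root, _dirs, files in os.walk(blob_dir):
        for name in files:
            src = os.path.join(root, name)
            if os.stat(src).st_mtime > cutoff:
                continue  # newer than the marker file: leave it out
            rel = os.path.relpath(src, blob_dir)
            dst = os.path.join(dest_dir, rel)
            if os.path.exists(dst):
                continue  # assumes committed blobs never change once written
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves the mtime
            copied.append(rel)
    return sorted(copied)
```

It would be called after repozo finishes, with the marker file touched just
before repozo starts, e.g. copy_blobs_not_newer("/path/to/blobs",
"backups/blobs", ".repozo_start"). The pack race discussed below still
applies.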

On Thu, Sep 4, 2008 at 1:54 PM, Tres Seaver <tseaver at palladion.com> wrote:
> Christian Theune wrote:
>> Hi Laurence,
>>
>> On Wed, 2008-09-03 at 08:06 -0700, Laurence Rowe wrote:
>>> Backing up a ZODB has always been fairly easy in the past, but with the
>>> introduction of blobs things have got a little more complex.
>>>
>>> How should I create a consistent backup of my Data.fs and blob directory?
>>>
>>> My initial guess would be to take a copy of the Data.fs, then take a copy of
>>> the blob directory to ensure I have all blobs referenced in the Data.fs.
>>> Would I be able to restore from such a backup safely? (it may contain blobs
>>> from transactions that were newer than the backed up Data.fs).
>>>
>>> This should be safe because committed blobs are immutable and any dangling
>>> blobfiles would not interfere with the creation of blobs from new
>>> transactions in the restored zodb, as transaction ids would not overlap.
>>>
>>> I would be grateful if anyone could point out holes in my reasoning or has
>>> experience with this.
>>
>> Snapshotting a blob directory after taking a copy of your Data.fs should
>> be safe, as long as you don't pack in between.
>>
>> Note that at the design stage we imagined that blob directories might
>> become really large, making backups unfeasible. For those situations we
>> handwaved a "very reliable storage" for this directory, like a
>> self-contained SAN/NAS solution that keeps your data safe.
>
> Assuming we can avoid the race condition induced by packing during the
> backup (see below), it should be possible to write a script which
> combines 'repozo' and 'rsync' in such a way as to get a "pure" copy
> of the blob directory which corresponds to the repozo dataset.
> Something like:
>
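>  #!/bin/sh
>  # Marker file: blobs newer than this postdate the start of the backup.
>  touch .repozo_start
>  stamp="$PWD/.repozo_start"
>  /path/to/repozo -B -f var/Data.fs -r backups/
>  # --files-from takes paths relative to the source directory;
>  # --include-from without an exclude rule would still copy everything.
>  (cd /path/to/blobs && find . -type f ! -newer "$stamp") |\
>   rsync -av --files-from=- /path/to/blobs/ backups/blobs/
>  rm .repozo_start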
>
> Packing *during* backup creates problems for repozo + blobs, because a
> pack may cause "old" blob files to be unlinked.  Furthermore, it is not a
> use case I think we should support.  However, I don't think that
> FileStorage supports the idea of a "pack lock" which would be acquirable
>  by repozo, so I don't know how to prevent the race.
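
Short of a real pack lock in FileStorage, one workaround would be a purely
cooperative lock file that both the backup job and the pack job agree to
take before doing any work. The sketch below is an assumption on my part,
not a ZODB feature: the function name and the lock path are invented, it
only helps if every pack and every backup goes through it, and a crashed
process would leave a stale lock behind that has to be removed by hand.

```python
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Advisory lock: atomically create `path`, remove it when done.

    Hypothetical helper. Raises FileExistsError if the lock file already
    exists, i.e. if the other job (pack or backup) is still running.
    """
    # O_CREAT | O_EXCL makes creation fail if the file is already there.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))  # note the holder
        yield path
    finally:
        os.close(fd)
        os.remove(path)
```

The backup script would wrap the repozo run and the blob copy in
"with exclusive_lock('var/backup-or-pack.lock'):", and the pack script would
wrap its pack call the same way, so whichever starts second fails fast
instead of racing.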
>
>
>
> Tres.
> --
> ===================================================================
> Tres Seaver          +1 540-429-0999          tseaver at palladion.com
> Palladion Software   "Excellence by Design"    http://palladion.com



-- 
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214

