[ZODB-Dev] Reliability of 'repozo --quick' option

Tim Peters tim.peters at gmail.com
Sun Oct 19 20:03:14 EDT 2008


[Tres Seaver]
> Does anybody have evidence or belief that the "probabilistic" part of
> the '--quick' option (as of ZODB 3.2.8, if it matters) is likely to
> guess wrong on a setup where incremental backups are run frequently?

Last time I did a repozo bug hunt (years ago), I wrote a stress test
that exhaustively checked results across thousands of high-rate
backups, with and without -Q.  There were no problems with -Q in those
tests, but that's not production use so who really knows.

In the absence of HW errors, it's hard to think of realistic cases in
which -Q would do a wrong thing.  For example, here's one:  you patch
your Data.fs manually, by overwriting some bytes in old transactions
via a binary editor.  Then -Q is likely to miss that Data.fs has been
changed.  But that's not "realistic" for most people (I hope ;-)).

> The installation in question has a moderately large filestorage (40 GB
> or so) and would like to put the backup target on a SAN, but the cost to
> figure out whether to do an incremental or not is higher than doing the
> full backup, due to the extra I/O overhead of the standard, "slow" method.

Which is why the standard method would /not/ do a wrong thing in the
case described above:  it reads every byte of Data.fs, comparing
checksums incrementally saved in the .dat file.  -Q only checks the
single slice of Data.fs last saved by an incremental backup (so misses
any change made to Data.fs occurring before that slice).
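
To make the difference concrete, here is a rough Python sketch.  It is
not repozo's code:  the (start, end, md5) record layout and the names
md5_of_range, unchanged and demo are assumptions made up for
illustration.  It only shows why checking the last slice alone cannot
see an edit made earlier in the file.

```python
import hashlib
import os
import tempfile


def md5_of_range(path, start, end):
    """Return the hex MD5 of bytes [start, end) of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        f.seek(start)
        remaining = end - start
        while remaining > 0:
            chunk = f.read(min(65536, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def unchanged(path, slices, quick=False):
    """Report whether the file still matches the recorded checksums.

    slices is a hypothetical list of (start, end, md5hex) records, one per
    earlier backup.  With quick=True only the most recent record is checked.
    """
    to_check = slices[-1:] if quick else slices
    return all(md5_of_range(path, s, e) == h for s, e, h in to_check)


def demo():
    """Overwrite bytes in an old slice; return (quick_result, full_result)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"A" * 100 + b"B" * 50)
        # Pretend two backups recorded these slices.
        slices = [(0, 100, md5_of_range(path, 0, 100)),
                  (100, 150, md5_of_range(path, 100, 150))]
        # "Binary editor" patch inside the first (old) slice.
        with open(path, "r+b") as f:
            f.seek(10)
            f.write(b"XXXX")
        return unchanged(path, slices, quick=True), unchanged(path, slices)
    finally:
        os.remove(path)
```

In demo() the quick check reports the file as unchanged while the full
check notices the patch, at the cost of reading every byte.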

> Before I hack the backup script up to do the incrementals against a
> local directory, I'd like to know that the '-Q' option would or wouldn't
> be a viable choice.

I expect (but don't know) -Q is fine for routine use by non-cowboys
;-).  Anyone out there use it routinely?

