[ZODB-Dev] extreme clock skew

Tue Sep 20 19:44:35 EDT 2005

[Lalo Martins]
...
> When upgrading hardware in one of our servers, our admins accidentally
> set the clock to the right hour, minute, day and month, but 2015.
>
> We took a few days to notice that our zope timestamps were not showing up
> (or sorting) quite as expected. "Oops." They quickly fixed the clock.
>
> Of course, the esteemed ZODB hackers know that this is when the problem
> actually *started*.
>
> A few days later, we realized that pack wasn't working anymore

What does "wasn't working" mean?  Examples of things it might mean:  you get
exceptions when you try to pack (and if so, what?); or you don't get
exceptions, but Data.fs stays the same size.

> (when one of the sites maxed out the server's hard disk.  Yes, there
> is some bad code in there.  Never mind that part.)
>
> For the first time, I introduced myself to the internals of FileStorage
> and fspack, and after much careful testing, figured out how to make pack
> work again.  Good.
>
> However, I wasn't aware of the fact that timestamps and transaction IDs
> are one and the same thing, and never go backwards.

Which version of ZODB are you using?  ZODB always intended that tids be
monotonically increasing, but some older versions have known bugs where that
can be violated.  Under any current ZODB release, you should see a
CRITICAL-level msg logged whenever FileStorage detects a tid more than 30
minutes in the future.

> So now (a few weeks later), we suddenly realized we have our whole
> database with neatly uniform timestamps, all nicely contained within
> the same one-second interval nearly 10 years in the future.

So you've ignored critical-level log messages for a few weeks?  I'm not
trying to pick a fight <wink>, I'm trying to understand how you got into
this.  A critical-level log msg is the most severe thing we can do without
plain refusing to run at all.

Note that the ZODB timestamp format uses 32 bits to store the seconds within
a minute, so there are 2^32/60 ~= 72 million distinct timestamps possible in
each 1-second interval.  When a new tid is needed, and the current time is
less than the largest tid known so far, the new tid is obtained by adding 1
to the largest tid known so far, in the last of those 32 bits.  So, alas,
you can probably do tens of millions more transactions and _still_ be
getting timestamps in the same one-second interval.
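
To make that arithmetic concrete, here is a minimal sketch in plain Python.
It does not use ZODB at all:  it assumes only that a tid is an 8-byte
big-endian string whose low 32 bits spread one minute across 2^32 values,
and the names (TICKS_PER_SECOND, next_tid, seconds_within_minute) are made
up for illustration.

```python
import struct

# The low 32 bits of a tid cover one minute, so each second gets 2**32/60 values.
TICKS_PER_MINUTE = 2 ** 32
TICKS_PER_SECOND = TICKS_PER_MINUTE / 60   # roughly 71.6 million


def next_tid(last_tid: bytes) -> bytes:
    """Return the smallest tid greater than last_tid (bump the lowest bit)."""
    value = struct.unpack(">Q", last_tid)[0]
    return struct.pack(">Q", value + 1)


def seconds_within_minute(tid: bytes) -> float:
    """Decode the low 32 bits of a tid as seconds within the minute."""
    low = struct.unpack(">I", tid[4:])[0]
    return low * 60 / TICKS_PER_MINUTE
```

Under those assumptions, ten million calls to next_tid() move
seconds_within_minute() forward by only about 0.14 seconds.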

> Needless to say, anything that tries to use bobobase_modification_time
> for anything - except maybe sorting - gets nonsense results.  Then
> sorting doesn't work either, due to a funny interaction with caching that
> I won't explain here to save space.
>
> Finally, in light of tid = timestamp, I'm no longer sure my packing
> fix even makes sense... it still *seems* to me like it should do the
> right thing (only store transactions that are reachable), but I can't be
> completely sure, and I can swear the Data.fs is much bigger than it
> should be.

You didn't tell us anything about what your packing fix did, so there's no
way we can guess.

> Soooo... is there any way to recover this data?

"Recover" seems an odd choice of word here.  Is your data lost or corrupted?
I didn't get the impression above that it was.  I'll continue to assume that
it isn't lost or corrupt.

> I don't mind losing all past timestamps (maybe replacing them with
> current timestamps at recovery time), as long as zodb goes back to
> normal from recovery on.
>
> I do, actually, know how to write a script that does this;  what I'm
> asking here is if there is already something out there, if anyone
> encountered some similar situation before, and if anyone has some words
> of wisdom to share in case I actually have to do it myself.

Look at the code for BaseStorage.copyTransactionsFrom().  You should be able
to fiddle that easily to reset tids to anything you like, while making a
copy of your Data.fs.  The old Data.fs won't be changed.  The new Data.fs
will have the tids you force it to have.

In fact, while I haven't tried this, I _suspect_ that changing

            self.tpc_begin(transaction, tid, transaction.status)
                                        ^^^

to

            self.tpc_begin(transaction, None, transaction.status)
                                        ^^^^

may be all that's required.  So try making that change, create a new Data.fs
and open it as `new`, open your old Data.fs as `old`, then do

    new.copyTransactionsFrom(old)

and close all your files.  If it works, the new Data.fs will have tids
ranging across the clock time consumed while running copyTransactionsFrom().
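
Spelled out, the whole procedure might look like the sketch below.  It is
as untested as the suggestion above, and it assumes the tpc_begin() change
has already been made in your copy of ZODB.  The helper names copy_storage
and main, and the path "Data-new.fs", are placeholders invented for this
example.

```python
def copy_storage(old, new):
    """Copy every transaction from `old` into `new`, then close both."""
    try:
        new.copyTransactionsFrom(old)
    finally:
        # Close both storages even if the copy fails partway through.
        new.close()
        old.close()


def main(old_path="Data.fs", new_path="Data-new.fs"):
    # Imported here so copy_storage() can be tried out without ZODB installed.
    from ZODB.FileStorage import FileStorage

    old = FileStorage(old_path, read_only=True)  # the original is only read
    new = FileStorage(new_path)  # pick a path that does not exist yet
    copy_storage(old, new)
```

Run main() against a copy of the old Data.fs first, and check the tids in
the new file before putting it into service.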