[ZODB-Dev] Pack complete: 80% of a 425GB database packed

Shane Hathaway shane at hathawaymix.org
Mon Feb 28 16:12:33 EST 2011


On 02/28/2011 07:45 AM, Martijn Pieters wrote:
> Early this morning, after packing through the weekend, our somewhat
> overweight Oracle RelStorage ZODB pack was completed. I am still
> waiting for the final size from the customer DBAs, but before the pack
> this beast was occupying 425GB. The pack removed 80% of the
> object_state rows, so hopefully this is now reduced to a more
> manageable 85GB or so.

Is 425GB a new record? ;-)  Good work!

> I think we can remove holding the commit lock during the pack cleanup
> altogether. For starters, the object_ref and object_refs_added tables
> are only ever used during packing, so the packing lock is more than
> enough to protect these.

That's correct.

> And unless I am missing something, we don't
> need to worry about the transactions table either. The rest of
> RelStorage ignores transactions where the packed flag has been set to
> TRUE, so deleting the packed and empty transactions from the table
> will never lead to deadlocks, right?

It might lead to deadlocks, actually.  Deadlocks occur when concurrent 
clients try to acquire locks in different orders.  SQL databases handle 
deadlocks of that type by returning an error to all but one of the 
deadlocking clients, similar to the way ZODB deals with conflict errors.

Also, in SQL, even reading a table usually acquires a lock, so it's 
really hard to prevent deadlocks without something like a commit lock.

> Or am I missing something about
> how RelStorage implements polling and caching? In any case, the
> history-free version doesn't need to lock at all, because it only ever
> touches the pack-specific tables in the pack cleanup.

True.

> If the
> transaction table needs no lock protection either, we can get rid of
> the lock during cleanup altogether. I'd like to hear confirmation on
> this though.

I wouldn't do that.  Too risky.  (Many database architects say that 
deadlocks are normal and should be handled in application code, but in 
my experience, deadlocks kill applications under load and I prefer to 
put in the effort to avoid them entirely if possible.)

> On a side note: I see that the history-preserving object_ref and
> object_refs_added deletion SQL statements have been optimized for
> MySQL, but not for the history-free version. Wouldn't those statements
> not also benefit from using a JOIN?

Perhaps, but I prefer not to optimize until I see what happens in 
practice.  Mis-optimized MySQL statements frequently turn out to have 
something like O(n^3) performance.

Shane


More information about the ZODB-Dev mailing list