[ZODB-Dev] FileStorage.pack: why do we need the commit_lock (so long)

Thu Jun 12 23:03:45 EDT 2003

On Fri, 2003-06-06 at 13:23, Dieter Maurer wrote:
> The next question: why do we need the commit lock at all?
> 
>   I see the necessity that we must protect our check of
>   the terminiation condition. If we see that we have
>   copied the complete content to the new file, we must
>   keep the lock until the files are renamed and the new
>   file copied.
>   The storage lock seems to be used for this purpose.
> 
>   However, I do not see why we must held the commit lock.
>   The fact, that we release it temporarily (during the copying
>   of a single transaction)  indicates
>   that it does not need to be held during the copying.
>   Thus, it seems to be somehow relevant for the check
>   of the termination condition. But for this, we
>   need the storage lock anyway.
>   Thus, what's to modivation for tampering with both locks?

There are two locks in any storage that extends BaseStorage.  I'll call
one the regular lock and the other the commit lock.  In FileStorage, the
regular lock guards access to the file pointer.  All methods that read,
write, or seek (operations that change the file position) acquire the
lock.  The commit lock guards access to the two-phase commit logic; a
thread can't commit a transaction unless it holds that lock.

One goal of pack is to minimize the amount of disruption it causes for
other activities.  No client can read the database while the regular
lock is helding, so we never want to hold that if we can avoid it.

During the copying phase, we need to hold the commit lock to prevent
another thread from committing a transaction when the pack finishes. 
When pack finishes, it swaps in the new file and index.  If another
thread was in the middle of a two-phase commit, it would have written
data with bogus locations to the temp file; the locations would refer to
positions in the old file.  We could hold the commit lock during the
entire copying phase, but that would disable writes for a long time. 
Instead, we release the lock periodically.

Another alternative, I think, would be to acquire the commit lock only
when pack is finishing.  Once the lock is acquired, we'd need to check
and make sure there isn't any more data to copy.  This solution would
probably be better.

I also agree that the code could use more comments.  Your suggestions
about specific comments are helpful.

Does this explanation make sense to you?  I'm happy to elaborate, as we
may discover some more issues that deserve comments.

Jeremy