[ZODB-Dev] possible race condition in ZODB/lock_file.py?

Jim Fulton jim at zope.com
Sat Dec 22 10:59:44 EST 2007


On Dec 22, 2007, at 9:38 AM, Chris Bainbridge wrote:

> I have a number of processes running on hosts with a common NFS /home.
> I was using a file on this shared NFS as a ZODB database. I had
> thought that this wouldn't be a problem, since it would be impossible
> for any process to open the zodb file while another process has it
> locked, but the zodb file kept getting corrupted anyway.

NFS locking is notoriously fragile or broken.  I would never store a  
database on an NFS file system.

> I removed the
> transactions, in effect using the zodb as read only, and then got this
> error:
>
> ERROR:root:Traceback (most recent call last):
>  File "go.py", line 116, in ?
>    db.close()
>  File "/exports/home/s9734229/phd/src/db.py", line 47, in close
>    conn.db().close()
>  File "/exports/home/s9734229/lib/python/ZODB3-3.7.2-py2.4-linux-x86_64.egg/ZODB/DB.py", line 444, in close
>    self._storage.close()
>  File "/exports/home/s9734229/lib/python/ZODB3-3.7.2-py2.4-linux-x86_64.egg/ZODB/FileStorage/FileStorage.py", line 400, in close
>    self._lock_file.close()
>  File "/exports/home/s9734229/lib/python/ZODB3-3.7.2-py2.4-linux-x86_64.egg/ZODB/lock_file.py", line 74, in close
>    os.unlink(self._path)
> OSError: [Errno 2] No such file or directory: '/exports/home/s9734229/gozeo/datastore.fs.lock'
>
> Looking at the code, it does:
>
>    def close(self):
>        if self._fp is not None:
>            unlock_file(self._fp)
>            self._fp.close()
>            os.unlink(self._path)
>            self._fp = None
>
> So the lock is released before being unlinked. Shouldn't this be the
> other way around? As far as I can see, releasing the lock allows a
> second process to acquire the lock, start using the zodb, then the
> first process will unlink the lock, allowing a third process to
> acquire it and also open the zodb, resulting in parallel writing and
> corruption.
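
The interleaving described above can be replayed deterministically in a
single process.  This is only an illustrative sketch, not ZODB code: it
assumes a POSIX system and fcntl.flock (whose locks belong to each open
file, so three separate opens can stand in for three processes), and the
helper names try_lock and replay_race are made up for the example.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Open (creating if needed) and exclusively lock path, without blocking.

    Raises OSError if some other open file already holds the lock.
    """
    fp = open(path, "a+")
    try:
        fcntl.flock(fp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        fp.close()
        raise
    return fp


def replay_race():
    """Replay unlock -> (B locks) -> unlink -> (C locks) in one process."""
    path = os.path.join(tempfile.mkdtemp(), "datastore.fs.lock")

    a = try_lock(path)                      # "process A" holds the lock

    # A starts closing: it unlocks and closes, but has not unlinked yet.
    fcntl.flock(a.fileno(), fcntl.LOCK_UN)
    a.close()

    b = try_lock(path)                      # "process B" locks the same file

    os.unlink(path)                         # A finishes: removes the name

    # "Process C" finds no file, creates a new one and locks that instead.
    c = try_lock(path)

    # B and C both hold an exclusive lock, but on two different files.
    distinct = os.fstat(b.fileno()).st_ino != os.fstat(c.fileno()).st_ino
    b.close()
    c.close()
    return distinct
```

If replay_race() returns True, B and C each held an exclusive lock at the
same time, because the unlink left them locking different files under the
same name.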


You are right that there is a race condition in the locking code.
This is fixed in ZODB 3.8.  I really should have backported this to
3.7. :(  The new code doesn't remove the file at all.  (I don't
remember the details, but there was a race condition to which the
file removal contributed.)
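
A minimal sketch of a lock file that is never removed might look like the
following.  It is not the ZODB 3.8 source; it assumes a POSIX system with
fcntl.flock, and the names SketchLockFile and LockError are defined here
only for the illustration.

```python
import fcntl
import os


class LockError(Exception):
    """Raised when the lock is already held through another open file."""


class SketchLockFile:
    """Hold an exclusive lock on path; the file is left in place on close."""

    def __init__(self, path):
        self._path = path
        self._fp = None
        # "a+" creates the file if it is missing and never truncates on open.
        fp = open(path, "a+")
        try:
            fcntl.flock(fp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            fp.close()
            raise LockError("could not lock %s" % path)
        self._fp = fp
        # Record the holder's pid, purely as a debugging aid.
        fp.seek(0)
        fp.truncate()
        fp.write("%d\n" % os.getpid())
        fp.flush()

    def close(self):
        if self._fp is not None:
            fcntl.flock(self._fp.fileno(), fcntl.LOCK_UN)
            self._fp.close()
            self._fp = None
            # Deliberately no os.unlink(): every opener keeps locking
            # the same file, so a late unlink cannot split the lock.
```

Because the file name always refers to the same file, a process that opens
it after a release competes for the same lock as everyone else.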

It might be good to post this as a bug against 3.7.  If someone backports
the fix I'd be happy to make a new release. (I was planning to make a
bug-fix release of 3.7 anyway to include another recent bug fix.)

Jim

--
Jim Fulton
Zope Corporation
