[ZODB-Dev] Errors running a Stress Test

Tim Peters tim at zope.com
Tue Mar 23 14:59:37 EST 2004


More on Brandon's stress test:  a few of these appear to be expected
consequences of using pack in an unnatural way, starting with all the
repetitions of this kind of traceback:

> 2004-03-16T13:15:20 ERROR(200) Conflict Resolution Unexpected error
> Traceback (most recent call last):
>  File "C:\Python23\Lib\site-packages\ZODB\ConflictResolution.py", line 120, in tryToResolveConflict
>    old = state(self, oid, oldSerial, prfactory)
>  File "C:\Python23\Lib\site-packages\ZODB\ConflictResolution.py", line 51, in state
>    p = p or self.loadSerial(oid, serial)
>  File "C:\Python23\Lib\site-packages\ZODB\FileStorage\FileStorage.py", line 580, in loadSerial
>    raise POSKeyError(oid)
> POSKeyError: 0000000000000001

The test has many threads and clients all slamming on a single OOBucket (oid
1, which always occurs in these tracebacks), and every few minutes each
client does a pack to current time.  What happens:

    transaction T1 begins; the OOBucket is at revision R1
    transaction T2 begins; the OOBucket is still at revision R1
    transaction T2 commits; the current revision becomes R2
    a pack to current time occurs, and removes revision R1 (it's
        "ancient" history by now)
    transaction T1 tries to commit revision R3
    conflict resolution tries to read revision R1 (that's oldSerial
        in the traceback above), but it no longer exists
    POSKeyError results
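
To make that sequence concrete, here's a toy model of it.  This is only a sketch:  ToyStorage, its methods, and the POSKeyError class below are stand-ins invented for illustration, not ZODB's real classes, and the "merge" is far cruder than real conflict resolution.  It only shows why resolution needs the revision T1 started from, and what happens once pack has thrown that revision away.

```python
class POSKeyError(KeyError):
    """Stand-in for ZODB's POSKeyError (toy version for this sketch)."""


class ToyStorage:
    """Keeps every revision of a single object until pack() drops old ones."""

    def __init__(self, initial):
        self.revisions = {1: initial}   # serial -> state
        self.current = 1                # serial of the current revision

    def load_serial(self, serial):
        try:
            return self.revisions[serial]
        except KeyError:
            raise POSKeyError(serial)

    def commit(self, state):
        self.current += 1
        self.revisions[self.current] = state
        return self.current

    def pack(self):
        # Pack to "now":  every non-current revision is destroyed.
        self.revisions = {self.current: self.revisions[self.current]}

    def resolve(self, old_serial, new_state):
        # Resolution needs the state the transaction started from ...
        old = self.load_serial(old_serial)
        # ... and the state committed in the meantime.
        merged = dict(self.load_serial(self.current))
        # Apply only the keys this transaction actually changed.
        merged.update({k: v for k, v in new_state.items() if old.get(k) != v})
        return self.commit(merged)


def run_scenario(pack_between=True):
    storage = ToyStorage({"a": 1})
    t1_start = storage.current          # T1 begins at R1 (T2 does too)
    storage.commit({"a": 1, "b": 2})    # T2 commits; R2 is now current
    if pack_between:
        storage.pack()                  # R1 is destroyed
    try:
        storage.resolve(t1_start, {"a": 1, "c": 3})   # T1 tries to commit
    except POSKeyError as exc:
        return "POSKeyError: %s" % exc.args[0]
    return "resolved"
```

With pack_between=False the same two transactions merge without trouble, which is the normal case.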

This is an unexpected event in conflict resolution because it shouldn't
normally happen when pack is used -- pack() is usually used once a week or
month, and you usually pack to a day or so in the past.  It's extremely
unlikely then that a transaction currently in progress began before the pack
time, so it's also extremely unlikely then that any transaction in progress
will need to access a non-current revision before the pack time (non-current
revisions before the pack time are the ones destroyed by a pack).
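
As a rough illustration of that reasoning, the helpers below compute a pack time a day in the past and check whether a transaction began before it.  The function names are made up for this sketch.  I believe the resulting time can be handed to DB.pack() as its first argument (the second traceback below shows pack being called with a time t), but treat the commented-out call as an assumption about your setup rather than something tested here.

```python
import time

SECONDS_PER_DAY = 86400


def pack_cutoff(days_back=1.0, now=None):
    """Return a pack time days_back days before now (seconds since epoch)."""
    if now is None:
        now = time.time()
    return now - days_back * SECONDS_PER_DAY


def transaction_at_risk(txn_start, cutoff):
    """True if a transaction that began at txn_start predates the pack time.

    Only such a transaction can have started from a revision that was
    already non-current at the pack time, and so was destroyed by the pack.
    """
    return txn_start < cutoff


# Hypothetical usage, assuming db is an open ZODB DB object:
#     db.pack(pack_cutoff(days_back=1))
```

A transaction that has been open for a few seconds or minutes is nowhere near a cutoff a day old, which is why the error is so rare outside of a stress test.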

The kind of POSKeyError traceback above can't be prevented if a pack does
destroy non-current revisions eventually needed by transactions currently in
progress.  In real life, these don't occur, but when they do <wink> they're
not harmful -- the error gets reported back to the client as a ConflictError,
and the client then retries the transaction.  On the retry it starts from
the then-current revision, which pack didn't destroy.
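
The client-side retry can be sketched like this.  The ConflictError class here is a local stand-in so the example runs on its own (the real one lives in ZODB.POSException), and run_with_retries and work are hypothetical names:  work is assumed to perform one whole transaction, from reading the current state through commit.

```python
class ConflictError(Exception):
    """Stand-in for ZODB.POSException.ConflictError."""


def run_with_retries(work, max_attempts=3):
    """Call work() until it succeeds or max_attempts is used up.

    Each call to work() must start a fresh transaction, so that a retry
    reads the then-current revision instead of the stale one.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except ConflictError:
            if attempt == max_attempts:
                raise   # give up and let the caller see the conflict
```

Anything other than ConflictError propagates at once, so real problems aren't hidden by the retry loop.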

> This one which is loosely related to the other two pack exceptions
> and could be argued as a client side bug.
>
> Traceback (most recent call last):
>   File "stress.py", line 78, in ?
>     db.pack()
>   File "C:\Python23\Lib\site-packages\ZODB\DB.py", line 581, in pack
>     self._storage.pack(t, referencesf)
>   File "C:\Python23\Lib\site-packages\ZEO\ClientStorage.py", line 853, in pack
>     return self._server.pack(t, wait)
>   File "C:\Python23\Lib\site-packages\ZEO\ServerStub.py", line 161, in pack
>     self.rpc.call('pack', t, wait)
>   File "C:\Python23\Lib\site-packages\ZEO\zrpc\connection.py", line 374, in call
>     raise inst # error raised by server
> FileStorageError: Already packing

That's also an artifact of the test driver, and won't change.  Because pack
in real life is invoked rarely, and can take a very long time (hours, even a
day for a huge .fs), it's unreasonable for the implementation to "stack up"
pack requests and just sit there waiting for them to become runnable.  It's
also unreasonable to ignore a pack request -- if a pack ain't gonna happen,
an exception is appropriate.  Of course the test driver could avoid this by
invoking pack from only one client at a time (or from only one client
period).
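
One way the driver might do that is sketched below.  Everything named here is hypothetical:  pack is whatever callable the driver uses to pack (db.pack in stress.py), client_id and packer_id are numbers the driver would have to assign itself, and AlreadyPacking is a local stand-in for the FileStorageError in the traceback above.

```python
class AlreadyPacking(Exception):
    """Stand-in for the FileStorageError('Already packing') shown above."""


def pack_from_one_client(pack, client_id, packer_id=0,
                         busy_error=AlreadyPacking):
    """Pack only from the designated client.

    Returns "skipped" if this client isn't the packer, "packed" if the
    pack ran, and "busy" if the server reported a pack already running.
    """
    if client_id != packer_id:
        return "skipped"
    try:
        pack()
    except busy_error:
        # A pack is already in progress; there's nothing useful to do
        # but note it and try again at the next scheduled pack.
        return "busy"
    return "packed"
```

With a single designated packer the "busy" branch should rarely trigger; it's there in case a previous pack is still running when the next one comes due.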

I suspect that the other tracebacks (those not talked about here or repaired
previously) point to "real" problems.



