[Zope] zope operations atomic?

Chris McDonough chrism@zope.com
Thu, 01 Nov 2001 17:01:01 -0500


Clark OBrien wrote:
 > It is what happens with this retrying that interests me. As the example
 > given below shows, this retry may never succeed. Thus long running
 > operations may never complete and are effectively starved by faster
 > running operations.
 >
 >
 >
 > Assume in Zope I have a folder structure like this.
 >
 > Folder-1
 >   -Folder-2
 >       -Folder-3
 >          ....
 >             ..
 >               Folder-1000
 >
 >
 > Suppose I have a script traverseFolder(root) that starts at a given root
 > and traverses sub-folders adding the attribute foo.
 >
 > If I continuously call traverseFolder(Folder-1000) how could
 > traverseFolder(Folder-1) ever complete. I mean, by the time
 > traverseFolder(Folder-1) could complete traverseFolder(Folder-1000) would
 > have already committed several times. What would happen with the request
 > that started traverseFolder(Folder-1), would it keep being retried ad
 > infinitum.
 > It is basically starved out of contention by a faster running operations.

It does get interesting when you consider that the longer-running 
transaction might always tend to lose on read conflicts because:

a) read conflicts can't be resolved.
b) the longer-running transaction *might* "ghost out" Folder-1000
    to conserve RAM during the traversal after the first (aborted)
    commit.

If b) is true in your example for every run of traverseFolder(Folder-1), 
your contention that it will never commit might prove correct.  It'd be 
slightly interesting to try it to see what the behavior actually is. 
Would you be willing to do so?

In any case, we have an answer to this problem if you're willing to lose 
a bit of consistency.  CoreSessionTracking's LowConflictConnection class 
comes in handy here.  If your application is very write-intensive and
you've carefully coded a massive hotspot a la your example into it, you
can turn off read conflicts by using this class at the expense of some
consistency.

With read conflicts turned off, I'm positive your example will
eventually resolve itself.  Maybe it'll take a few minutes of retries,
but it will eventually finish.  I say that because the longer-running of
the two scripts will eventually be able to do the commit: it's
statistically as likely to "win" a commit as the shorter-running script
when there's a write conflict; it just has fewer opportunities to do so.
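
To make the "fewer opportunities" argument concrete, here is a toy
simulation.  It is not ZODB code, and everything in it is an assumption
made for illustration: the name attempts_until_commit, the idea that
each retry wins independently, and the 5% win chance are all made up.
It only shows that a retry loop with any fixed nonzero chance of winning
per attempt finishes, given enough attempts.

```python
import random


def attempts_until_commit(win_probability, rng, max_attempts=10000):
    """Count the attempts a simulated transaction needs before it commits.

    Each attempt is modelled as an independent coin flip (an assumption,
    not a description of how ZODB schedules commits).  Returns None if
    the transaction never wins within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        if rng.random() < win_probability:
            return attempt
    return None


# Seeded generator so repeated runs give the same numbers.
rng = random.Random(2001)

# Hypothetical long-running script that wins 1 attempt in 20 on average.
runs = [attempts_until_commit(0.05, rng) for _ in range(1000)]
finished = [r for r in runs if r is not None]
average_attempts = sum(finished) / len(finished)

print("finished %d of %d runs" % (len(finished), len(runs)))
print("average attempts per run: %.1f" % average_attempts)
```

In this model the expected number of attempts is 1 / win_probability, so
a lower win chance means more retries, not starvation.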

And actually, if you used a special FooFolder instead of a Folder for
this demonstration, and if the only thing that changed was foo, and the
value of foo was often the same in both connections on the object upon
which the transaction conflicted, you could resolve most of the
conflicts that could potentially occur here (save for the read
conflicts) by giving it a _p_resolveConflict that looked something like
this:

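class FooFolder:
    def _p_resolveConflict(self, old, saved, new):
        # old, saved and new are state dictionaries: the state both
        # transactions started from, the state already committed, and
        # the state this transaction wants to commit.
        marker = []  # unique sentinel meaning "key is missing"
        # Every key in the committed state must match the new state...
        for k, v in saved.items():
            if new.get(k, marker) != v:
                return None
        # ...and every key in the new state must match the committed one.
        for k, v in new.items():
            if saved.get(k, marker) != v:
                return None
        # Both states are identical, so either one will do.
        return new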
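
To see what that comparison does, here is a small standalone sketch that
calls the method with plain dictionaries.  It doesn't involve ZODB at
all: this FooFolder is an ordinary class, and the "title" and "foo"
values are invented for the example.  It exercises only the comparison
logic; it says nothing about how ZODB itself treats the return value.

```python
class FooFolder:
    """Plain-Python stand-in for the sketch above (no persistence)."""

    def _p_resolveConflict(self, old, saved, new):
        marker = []  # unique sentinel meaning "key is missing"
        for k, v in saved.items():
            if new.get(k, marker) != v:
                return None
        for k, v in new.items():
            if saved.get(k, marker) != v:
                return None
        return new


folder = FooFolder()

# State both transactions started from (made-up values).
old = {"title": "Folder-1000"}

# Both transactions set foo to the same value: the states agree.
same = folder._p_resolveConflict(
    old,
    {"title": "Folder-1000", "foo": 1},
    {"title": "Folder-1000", "foo": 1},
)

# The transactions set foo to different values: the states disagree.
differ = folder._p_resolveConflict(
    old,
    {"title": "Folder-1000", "foo": 1},
    {"title": "Folder-1000", "foo": 2},
)

print("same value of foo:      %r" % (same,))
print("different value of foo: %r" % (differ,))
```

Note that old is never consulted; the method only checks whether the
committed state and the new state are equal.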


However, the real answer is:  Don't design your application like this if
you can help it.  This is not a good pattern.  It's best to avoid
hotspots like Folder-1000 in your example.  It's no different than 
continually beating the snot out of a field in a row in a relational 
database table with writes from multiple threads, where one of the 
threads is running an overnight transaction and the others are just 
incrementing the field every second.  You have the same consistency, 
contention, and timeliness issues there, AFAICT, except that it's 
expressed in terms of pages, locks, and dirty reads.  (I'm sure there's 
an Oracle person waiting around to "WRONG!" me to death, however.  ;-)

- C