[ZODB-Dev] Re: design issue: job queue is concurrency bottleneck

Sat Mar 27 01:28:34 EST 2004

Steve Alexander wrote:
> 
>> Perhaps you could create a layered structure of BTrees to lessen the
>> chance of conflict. For example you could have a top-level btree whose
>> keys are "time slices" each one containing all of the jobs scheduled
>> during that slice. If the slices were fixed size then this could be an
>> IOBTree perhaps.
> 
> 
> Why not make a layered structure of only BTree Buckets.  If you make it 
> so that you only ever add new buckets most of the time, then you won't 
> get write conflicts.  A separate single thread can remove defunct 
> buckets when there is no chance of them being used again.
> 

I'm doing roughly the same thing as John, inserting jobs in an OOBTree 
for some workers (separate clients) to grab open jobs.  I ran into a 
similar issue last year with conflicts.  Because I usually have the 
same couple thousand jobs running over and over again, I decided to 
optimize on the assumption that this behavior would remain constant.

My solution was to index the tree by a hash of the job operation, so 
that the same job running a couple hours later used the same key.  
Instead of adding and removing the job from the BTree, I marked it as 
inactive when it was done.  The next time I wanted to run that job, I 
needed only to reset it to active and update any out-of-date 
parameters in the job specification.  Every weekend, I had a process 
that went through and removed all the inactive jobs so that the 
database didn't get too bloated.
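
A rough sketch of the scheme described above, not the original code. 
The names Job, job_key, schedule, finish and purge_inactive are 
hypothetical, and using zlib.crc32 as the integer hash is an 
assumption (the original hash function is not stated).  If ZODB is not 
installed, plain Python types stand in so the sketch still runs.

```python
import zlib

try:
    from BTrees.OOBTree import OOBTree
    from persistent import Persistent
except ImportError:  # fall back to plain types so the sketch still runs
    OOBTree = dict
    Persistent = object


class Job(Persistent):
    """Hypothetical job record; 'active' replaces insert/delete churn."""

    def __init__(self, operation, params):
        self.operation = operation
        self.params = params
        self.active = True


def job_key(operation):
    # Assumption: a stable integer hash of the operation string, so the
    # same job maps to the same key on every run.
    return zlib.crc32(operation.encode("utf-8"))


def schedule(jobs, operation, params):
    """Reactivate an existing job, or insert it the first time only."""
    key = job_key(operation)
    job = jobs.get(key)
    if job is None:
        jobs[key] = Job(operation, params)  # structural change to the tree
    else:
        job.params = params                 # only the job object changes
        job.active = True
    return key


def finish(jobs, key):
    """Mark a job as done without removing it from the tree."""
    jobs[key].active = False


def purge_inactive(jobs):
    """Periodic cleanup: drop every inactive job, return how many."""
    stale = [key for key, job in jobs.items() if not job.active]
    for key in stale:
        del jobs[key]
    return len(stale)
```

The point of the design is that a repeat run only modifies the 
existing job object, so the tree itself is written to only on first 
insertion and during the periodic cleanup.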

For the clients to look for new jobs, I had a separate 
PersistentList-based object that held only the hashes used to look 
into the job BTree.  This was a pretty decent win because if I got a 
ConflictError accessing this object, it was relatively cheap to retry 
the operation: you are only pushing one integer onto the tail or 
popping one off the head.
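
A minimal sketch of the retry idea, again not the original code.  The 
helpers push_job, pop_job and retry_on_conflict are hypothetical, and 
the commit and abort callables are passed in as parameters; with a 
current ZODB they would presumably be transaction.commit and 
transaction.abort.  Plain Python stand-ins are used if ZODB is not 
installed.

```python
try:
    from ZODB.POSException import ConflictError
    from persistent.list import PersistentList
except ImportError:  # stand-ins so the sketch runs without ZODB
    class ConflictError(Exception):
        pass
    PersistentList = list


def push_job(queue, key):
    """Append one job hash to the tail of the queue."""
    queue.append(key)


def pop_job(queue):
    """Remove and return the hash at the head, or None if empty."""
    if not queue:
        return None
    return queue.pop(0)


def retry_on_conflict(operation, commit, abort, attempts=5):
    """Run operation and commit; on ConflictError, abort and try again.

    The operation is cheap (one integer pushed or popped), so redoing
    it after a conflict costs very little.
    """
    for attempt in range(attempts):
        try:
            result = operation()
            commit()
            return result
        except ConflictError:
            abort()  # discard this attempt's changes before retrying
            if attempt == attempts - 1:
                raise  # give up after the last attempt
```

A worker would then call something like 
retry_on_conflict(lambda: pop_job(queue), commit, abort) and use the 
returned hash to look up the job in the BTree.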

I'm currently working on a refactoring of this system and this is one of 
the areas that I would like to revisit.  The Bucket method sounds 
interesting and I'll give that a shot and see what kind of results I get.