[ZODB-Dev] ZEO and replication of BTree based objects

Casey Duncan casey at zope.com
Fri Jun 6 00:01:57 EDT 2003


On Thursday 05 June 2003 02:02 pm, Jeremy Hylton wrote:
> On Thu, 2003-06-05 at 13:53, Shane Hathaway wrote:
> > Jeremy Hylton wrote:
[snip]
> 
> > What isn't clear is how you fill the OID queue.  How can you know the 
> > OIDs of the objects that are important enough to prefetch?
> 
> I agree that this is the harder problem.  The best bet is probably
> heuristics based on references within an object.  If you access object
> A, and it has references to B and C, it increases the likelihood that
> you'll access B or C.
> 
> BTW, I blogged a little about this subject in April:
> http://www.python.org/~jeremy/weblog/030418.html

The problem is a bit like instruction prefetching in CPUs to keep the pipeline 
full at all times. There will be times when this is more expensive than simply 
letting the bandwidth go to waste, since backtracking when you make a mistake 
is expensive, but the hope is that more often than not you can guess which 
instructions will come next while previous instructions are actually 
executing.

The obvious choice would be a sequence. If the first item is accessed, then it 
would make sense for the client to prefetch a few more to try to keep the 
latency down.
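
To make the sequence case concrete, here is a minimal read-ahead sketch. 
Nothing in it is ZODB or ZEO API: the ReadAheadSequence class, the load 
callable (standing in for a storage round trip) and the window parameter are 
all made up for illustration, and the extra fetches are done synchronously 
rather than in the background.

```python
class ReadAheadSequence:
    """Hypothetical wrapper: a miss on item i also pulls in the next few items."""

    def __init__(self, load, length, window=3):
        self._load = load        # callable(index) -> object; stands in for a storage fetch
        self._length = length
        self._window = window    # how many successors to fetch on a miss
        self._cache = {}
        self.loads = []          # indexes actually fetched, kept only for illustration

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        if index not in self._cache:
            # Miss: fetch the requested item plus up to `window` successors.
            stop = min(index + 1 + self._window, self._length)
            for i in range(index, stop):
                if i not in self._cache:
                    self._cache[i] = self._load(i)
                    self.loads.append(i)
        return self._cache[index]


seq = ReadAheadSequence(lambda i: i * 10, length=10, window=3)
first = seq[0]    # fetches indexes 0..3
third = seq[2]    # already cached, no new fetch
```

A real client would want to batch the extra fetches into one request or 
issue them asynchronously; the sketch only shows the bookkeeping.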

Another thought would be mapping access. What if the application could tell a 
BTree, for example, that it was about to (or likely to) load all of the keys 
in a given set?

Perhaps the application could similarly signal that it would probably load all 
of the subobjects referenced from a given object sometime soon.

Perhaps this should be an application-level API that the storage provides to 
allow the application to explicitly pass hints like the above. The default 
implementation would do nothing with the request. Other storages could begin 
an asynchronous fetch. This would need to be interruptible, though, so that 
the application could still go about its normal business without too much 
delay when it needed to load another arbitrary object.
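
A rough sketch of what such a hint API might look like follows. All of the 
names here (BaseStorage, PrefetchingStorage, prefetch_hint) are hypothetical 
and are not part of ZODB or ZEO; the in-memory dict stands in for the server, 
and a plain thread stands in for whatever asynchronous machinery a real 
storage would use.

```python
import threading


class BaseStorage:
    """Hypothetical storage; a dict stands in for the remote server."""

    def __init__(self, data):
        self._data = dict(data)

    def load(self, oid):
        return self._data[oid]

    def prefetch_hint(self, oids):
        # Default implementation: accept the hint and do nothing with it.
        return None


class PrefetchingStorage(BaseStorage):
    """Hypothetical storage that acts on hints with an interruptible fetch."""

    def __init__(self, data):
        super().__init__(data)
        self._cache = {}
        self._lock = threading.Lock()
        self._cancel = threading.Event()
        self._worker = None

    def prefetch_hint(self, oids):
        self._stop_worker()                  # a new hint replaces the old one
        cancel = threading.Event()
        worker = threading.Thread(
            target=self._fetch, args=(list(oids), cancel), daemon=True)
        self._cancel, self._worker = cancel, worker
        worker.start()
        return worker

    def _fetch(self, oids, cancel):
        for oid in oids:
            if cancel.is_set():              # interrupted by a real load
                return
            try:
                value = BaseStorage.load(self, oid)
            except KeyError:
                continue                     # a bad guess is simply skipped
            with self._lock:
                self._cache.setdefault(oid, value)

    def _stop_worker(self):
        if self._worker is not None:
            self._cancel.set()
            self._worker.join()
            self._worker = None

    def load(self, oid):
        # An explicit load takes priority over any prefetch in progress.
        self._stop_worker()
        with self._lock:
            if oid in self._cache:
                return self._cache[oid]
        return BaseStorage.load(self, oid)


storage = PrefetchingStorage({"a": 1, "b": 2, "c": 3})
worker = storage.prefetch_hint(["a", "b", "missing"])
worker.join()                                # wait here only to make the demo deterministic
value_c = storage.load("c")                  # not hinted, loaded the normal way
```

The application code is the same against either class, which is the point of 
having the default ignore the hint. How cheaply a real storage could abandon 
a fetch that is already on the wire is an open question this sketch does not 
answer.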

I can think of other "pie-in-the-sky" approaches, but honestly it would 
probably be really hard to outperform the current "wait and see" approach, 
since the machine would guess wrong a lot, wasting time and therefore 
upsetting the delicate economic balance we find ourselves in; time, as we 
all know, is actually money. But I digress...

-Casey


