[ZODB-Dev] Increasing MAX_BUCKET_SIZE for IISet, etc

Matt Hamilton matth at netsight.co.uk
Thu Jan 27 03:00:32 EST 2011


Jim Fulton <jim <at> zope.com> writes:

> 
> On Wed, Jan 26, 2011 at 3:15 PM, Matt Hamilton <matth <at>
> netsight.co.uk> wrote:

> > So, with up to 300,000 items in some of these IISets, iterating
> > over the entire set (during a Catalog query) means loading 5,000
> > objects over ZEO from the ZODB, which adds up to quite a bit of
> > latency. With quite a number of these data structures about, we
> > can end up with on the order of 50,000 objects in the ZODB cache
> > *just* for these IISets!
> 
> Hopefully, you're not iterating over the entire tree, but still. :)

Alas we are. Or rather, alas, ZCatalog does ;) It would be great if it
didn't, but it's just the way it is. If I have 300,000 items in my
site, and every one of them is visible to someone with the 'Reader'
role, then the allowedRolesAndUsers index will have an IITreeSet
with 300,000 elements in it. Yes, we could try and optimize out that
specific case, but there are others like it too. If none of my
items has an effective or expires date, then the same happens with
the effective range index (the DateRangeIndex 'always' set).
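
To put numbers on that: buckets split when they hit MAX_BUCKET_SIZE
(120 for the II flavour), leaving each roughly half full, so a
300,000-key IITreeSet built by ascending inserts ends up with around
5,000 buckets of ~60 keys each. A rough sketch of the arithmetic (my
numbers are approximate; bucket fill depends on insertion order):

  from BTrees.IIBTree import IITreeSet

  # Ascending inserts fill a bucket to 120, split it into two of 60,
  # then only ever append to the rightmost bucket, so most buckets
  # settle at ~60 keys.
  s = IITreeSet(range(300000))

  # Iterating touches every bucket; over ZEO each bucket missing from
  # the local cache is a separate object load (one network round trip).
  n = sum(1 for _ in s)
  print("%d keys in roughly %d buckets" % (n, n // 60))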
 
> > So... has anyone tried increasing the size of MAX_BUCKET_SIZE in
> > real life?
> 
> We have, mainly to reduce the number of conflicts.
> 
> > I understand that this will increase the potential for conflicts
> > if the bucket/set size is larger (however in reality this probably
> > can't get worse than it is, as currently the value inserted is 99%
> > of the time greater than the current max value stored -- it is a
> > timestamp -- so you always hit the last bucket/set in the tree).
> 
> Actually, it reduces the number of unresolveable conflicts.
> Most conflicting bucket changes can be resolved, but bucket
> splits can't be and bigger buckets means fewer splits.
> 
> The main tradeoff is record size.

Ahh interesting, that is good to know. I've not actually checked the 
conflict resolution code, but do bucket change conflicts actually get
resolved in some sane way, or does the transaction have to be 
retried?

Actually... that is a good point, and something I never thought
of... when you get a ConflictError in the logs (that was
resolved), does that mean that _p_resolveConflict was called and
succeeded, or does it mean that the transactions were retried
and that resolved the conflict?
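
For context, the resolution hook itself is just a method on the
persistent class that gets the three states and must return a merge.
The canonical example is BTrees' Length counter, whose state is a
single integer (shown here roughly as it appears in BTrees.Length; a
simplification, not the bucket C code):

  import persistent

  class Length(persistent.Persistent):
      """Counter whose concurrent changes merge at commit time."""

      def __init__(self, v=0):
          self.value = v

      def __getstate__(self):
          return self.value

      def __setstate__(self, v):
          self.value = v

      def change(self, delta):
          self.value += delta

      def _p_resolveConflict(self, old, committed, new):
          # Each argument is an unpickled state (here just an int);
          # apply both transactions' deltas to the shared base state.
          return committed + new - old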

> > I was going to experiment with increasing the MAX_BUCKET_SIZE on an
> > IISet from 120 to 1200. Doing a quick test, a pickle of an IISet of
> > 60 items is around 336 bytes, and one of 600 items is 1580 bytes...
> > so still very much in the realms of a single disk read / network
> > packet.
> 
> And imagine if you use zc.zlibstorage to compress records! :)

This is Plone 3 (Zope 2.10.11); does zc.zlibstorage work on that, or
does it need a newer ZODB? Also, unless I can sort out that large
number of small pickles being loaded, I'd imagine compression would
actually slow things down.
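
For reference, those pickle sizes are easy to reproduce outside ZODB
(a rough check: I'm assuming timestamp-sized keys and pickle protocol
1, which the ZODB of that era used; real database records add a small
header on top):

  import pickle
  from BTrees.IIBTree import IISet

  base = 1296086400  # keys the size of Unix timestamps (assumption)
  for n in (60, 600):
      blob = pickle.dumps(IISet(range(base, base + n)), 1)
      print("%d items -> %d bytes" % (n, len(blob)))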

> > I'm not sure how the current MAX_BUCKET_SIZE values were determined,
> > but it looks like they have been the same since the dawn of time,
> > and I'm guessing they might be due a tune?
> 
> Probably.
> 
> > It looks like I can change that constant and recompile the BTree
> > package, and it will work fine with existing IISets and just take
> > effect on new sets created (ie clear and rebuild the catalog index).
> >
> > Anyone played with this before, or see any major flaws in my
> > cunning plan?
> 
> We have.  My long term goal is to arrange things so that you can
> specify/change limits by sub-classing the BTree classes.
> Unfortunately, that's been a long-term priority for too long.
> This could be a great narrow project for someone who's willing
> to grok the Python C APIs.

I remember you introduced me to the C API for things like this waaaay
back in Reading at the first non-US Zope 3 sprint... I was trying to
create compressed list data structures for catalogs... I never could
quite get rid of the memory leaks! ;) Maybe I'll be brave and take
another look.
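
For what it's worth, here is a hypothetical sketch of what that
subclass-based tuning might look like from the application side (the
attribute names are invented for illustration, not an existing API):

  from BTrees.IIBTree import IITreeSet

  class WideIITreeSet(IITreeSet):
      # Hypothetical knobs the C code would consult on the class
      # instead of the compiled-in constants.
      max_leaf_size = 1200      # in place of MAX_BUCKET_SIZE = 120
      max_internal_size = 500   # in place of MAX_BTREE_SIZE = 500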

> Changing the default sizes for the II and LL BTrees is pretty
> straightforward. We were more interested in LO (and similar) BTrees.
> For those, it's much harder to guess sizes because you generally
> don't know how big the objects will be, which is why I'd like to
> make it tunable at the application level.

Yeah, I guess that is the issue. I wonder if it would be easy for the
code to work out the total size of the bucket in bytes and then 
split based upon that. Or something like 120 items, or 500kB, 
whichever comes first.
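
As a policy it is simple enough to state as a predicate (a sketch of
the rule only; the actual check would have to live in the C
bucket-insert path, where the pickle size isn't known exactly):

  MAX_ITEMS = 120
  MAX_BYTES = 500 * 1024

  def should_split(item_count, approx_record_bytes):
      # Split on whichever limit is hit first: item count (cheap to
      # track) or estimated record size (what actually costs I/O).
      return item_count >= MAX_ITEMS or approx_record_bytes >= MAX_BYTES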

Just looking at the cache on the site at the moment, we have a total
of 978,355 objects in cache, of which:

312,523  IOBucket
274,025  IISet
116,136  OOBucket
114,626  IIBucket

So 83% of my cache is just those four object types.
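
Figures like these can be pulled from a running instance with ZODB's
cacheDetail(), which counts cached objects by class (assuming 'app'
is the Zope application root, so app._p_jar is its connection):

  # Per-class object counts across all connections' pickle caches.
  detail = app._p_jar.db().cacheDetail()
  total = sum(count for _, count in detail)
  for name, count in sorted(detail, key=lambda d: -d[1])[:4]:
      print("%9d  %s" % (count, name))
  print("total: %d objects in cache" % total)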

-Matt



