[ZODB-Dev] ZODB-level indexing

Christian Robottom Reis kiko at async.com.br
Wed Nov 5 07:46:21 EST 2003


On Wed, Nov 05, 2003 at 10:27:42AM +0200, Roché Compaan wrote:
> they actually changed but legitimate changes to properties still cause
> a lot of invalidations to catalog indexes.

We have the same problem, and I've considered many times having the
indexing moved behind the ZEO -- I actually exchanged email on a related
subject with Shane a while back. Of course, I'm talking about
IndexedCatalog and not ZCatalog, but the same principle applies.

> This can surely not only be bad with a slow connection. I can imagine
> that a ZEO cluster might take a performance hit in apps where ZCatalog is
> integral to the app. The idea I am toying with at the moment is to have
> indexing happen at a ZODB storage level similar to what you will get
> when using a RDBMS. I don't know if this is even feasible because the
> storage only sees pickles and you'd have to unpickle before indexing
> becomes possible. Or maybe index just before an object is pickled.

This is more or less the idea I had: do the unindex/index step just
before an object is pickled. It's actually what IC does, except it is
done upon setattr(). 
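
As a rough illustration of the index-on-setattr() idea (this is not
IndexedCatalog's actual code; FieldIndex, Indexed and Person are made-up
names, and a plain dict stands in for a BTree), something along these
lines:

```python
class FieldIndex:
    """Toy in-memory index: attribute value -> set of object ids.

    Hypothetical stand-in for a real (BTree-backed) index.
    Indexed values must be hashable.
    """

    def __init__(self):
        self._by_value = {}

    def index(self, value, oid):
        self._by_value.setdefault(value, set()).add(oid)

    def unindex(self, value, oid):
        oids = self._by_value.get(value)
        if oids is not None:
            oids.discard(oid)
            if not oids:
                # Drop empty buckets so the index does not grow forever.
                del self._by_value[value]

    def query(self, value):
        return set(self._by_value.get(value, ()))


class Indexed:
    """Re-indexes itself whenever an indexed attribute is assigned."""

    _indexes = {}  # attribute name -> FieldIndex (assumed registry)

    def __init__(self, oid):
        self.oid = oid

    def __setattr__(self, name, value):
        index = self._indexes.get(name)
        if index is not None:
            # Unindex the old value (if any), then index the new one.
            if name in self.__dict__:
                index.unindex(self.__dict__[name], self.oid)
            index.index(value, self.oid)
        object.__setattr__(self, name, value)


class Person(Indexed):
    _indexes = {"city": FieldIndex()}


p = Person(1)
p.city = "Sao Carlos"
p.city = "Sao Paulo"   # triggers unindex of the old value, index of the new
old_hits = Person._indexes["city"].query("Sao Carlos")
new_hits = Person._indexes["city"].query("Sao Paulo")
```

Doing the same work just before pickling would move the unindex/index
pair out of __setattr__ and into whatever hook runs at commit time.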

This points to one of the fundamental issues with this approach, however
-- the indexes will only be updated at commit() time, which is when the
ZEO server actually receives the change notifications. If your
application does updates and queries without commit()ing, the query will
not return the updated objects. 
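
To make the constraint concrete, here is a toy model (no ZODB or ZEO
involved; CommitTimeCatalog and its methods are invented for the
example) in which changes are buffered and the index is only touched
on commit():

```python
class CommitTimeCatalog:
    """Toy catalog whose index is only updated at commit time.

    Hypothetical model of "indexing behind the server": pending
    changes are invisible to queries until commit() is called.
    """

    def __init__(self):
        self._index = {}    # value -> set of oids (the committed index)
        self._current = {}  # oid -> value currently in the index
        self._pending = {}  # oid -> new value, not yet committed

    def change(self, oid, value):
        # Record the change locally; the index is not touched yet.
        self._pending[oid] = value

    def commit(self):
        for oid, value in self._pending.items():
            if oid in self._current:
                # Unindex the previously committed value.
                self._index[self._current[oid]].discard(oid)
            self._index.setdefault(value, set()).add(oid)
            self._current[oid] = value
        self._pending.clear()

    def abort(self):
        self._pending.clear()

    def query(self, value):
        return set(self._index.get(value, ()))


catalog = CommitTimeCatalog()
catalog.change(1, "open")
catalog.commit()

catalog.change(1, "closed")
before_commit = catalog.query("closed")  # update not visible yet
stale = catalog.query("open")            # old value still matches
catalog.commit()
after_commit = catalog.query("closed")
```

Between change() and commit(), the query for "closed" comes back empty
and the query for "open" still returns the object.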

You'll have to reason whether this constraint is acceptable to you.

> Another way would be to put the Catalog on a separate mount point and
> have a hook that indexes invalidated objects as soon as they are
> updated. I don't know and still need to explore all options - at the
> moment I am just curious if this is a concern for anybody else?

Sidnei and I ended up boosting performance by storing (IndexedCatalog)
IDs instead of object references as our index values -- the changes were
confined to IndexedCatalog.Indexes, and the size of the indexes went
down significantly. It also sped up BTrees operations, because the
objects are not unghosted until the last second.
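
A small sketch of the IDs-instead-of-references idea, using only plain
Python (Record, Store and IdIndex are hypothetical names; the dict
stands in for a BTree and Store.load() for unghosting an object):

```python
class Record:
    def __init__(self, oid, name):
        self.oid = oid
        self.name = name


class Store:
    """Stand-in for the object database; counts how often objects load."""

    def __init__(self, records):
        self._records = {r.oid: r for r in records}
        self.loads = 0

    def load(self, oid):
        self.loads += 1
        return self._records[oid]


class IdIndex:
    """Index mapping a value to integer ids rather than to objects."""

    def __init__(self):
        self._ids = {}  # value -> set of int ids

    def add(self, value, oid):
        self._ids.setdefault(value, set()).add(oid)

    def ids(self, value):
        # Answering a query only touches small integers.
        return sorted(self._ids.get(value, ()))

    def objects(self, value, store):
        # Objects are loaded lazily, one at a time, as they are consumed.
        for oid in self.ids(value):
            yield store.load(oid)


store = Store([Record(1, "ana"), Record(2, "bob"), Record(3, "ana")])
index = IdIndex()
for oid, name in [(1, "ana"), (2, "bob"), (3, "ana")]:
    index.add(name, oid)

matching = index.ids("ana")
loads_after_query = store.loads          # nothing loaded so far
first = next(index.objects("ana", store))
loads_after_first = store.loads          # only the consumed object loaded
```

The query itself never touches the stored objects; only the results
that are actually consumed get loaded.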

I'd still consider moving indexing behind the ZODB for an extra
performance enhancement, but I haven't yet thought about how my
application would cope with the fact that indexes are only updated
post-commit.

Take care,
--
Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331


