[ZODB-Dev] How to check for setting the same values on persistent objects?

Hanno Schlichting hanno at hannosch.eu
Wed May 4 07:29:17 EDT 2011


On Wed, May 4, 2011 at 1:09 PM, Laurence Rowe <l at lrowe.co.uk> wrote:
> Persistent objects are also used as a cache and in that case code
> relies on an object being invalidated to ensure its _v_ attributes are
> cleared. Comparing at the pickle level would break these caches.

So you would expect someone to store _v_ attributes on objects as
caches, where that cached data depends on more than the data of
the object itself? Do you know of any examples of this? I would expect
_v_ attributes to only be used if the cached data depends on the
object's own state, i.e. if that doesn't change, then the cached data
doesn't have to change either.
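To illustrate the pattern I mean, here's a plain-Python sketch (the class
and attribute names are made up; a real implementation would subclass
persistent.Persistent, whose _v_ attributes get dropped whenever the
object is invalidated or ghosted):

```python
class Document:
    """Hypothetical object caching a value derived only from its own state."""

    def __init__(self, text):
        self.text = text           # persistent state
        self._v_word_count = None  # volatile cache, derived from self.text

    @property
    def word_count(self):
        # Recompute only when the cache is empty. This is safe exactly
        # because the cached value depends solely on this object's own
        # state: if the state didn't change, the cache stays valid.
        if getattr(self, '_v_word_count', None) is None:
            self._v_word_count = len(self.text.split())
        return self._v_word_count
```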

> I suspect that this is only really a problem for the catalogue.
> Content objects will always change on the pickle level when they are
> invalidated as they will have their modification date updated. I
> imagine you also see archetypes doing bad things as it tends to store
> one persistent object per field, but that is just bad practise.

When editing a content object in Plone, more than 20 different
persistent objects get set to _p_changed = True. A couple of them
should legitimately change along with the modification date, but a
whole lot of them don't need to change at all. These include: the
container, the container's position map, the persistent mapping
containing the workflow history, all base units, the at_references
folder, the annotations storage BTree, the OOBuckets inside that
BTree, and a lot more.

We can add code to deal with all of these, but that means touching a
lot of places. Essentially any place that does
"persistentobject.attribute = value" should do the check - and there's
a whole lot of them. Maybe this is the best we can do, and we should
document it as a best practice for ZODB development - I was just
trying to see if there's a better way.
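The check I have in mind could be factored into a small helper (the
function name is mine, and the sentinel is there so that a stored None
isn't confused with a missing attribute):

```python
_marker = object()  # sentinel: distinguishes "attribute absent" from a stored None

def set_if_changed(obj, name, value):
    """Assign obj.name = value only when the value actually differs.

    On a persistent object this avoids a needless _p_changed = True,
    i.e. it keeps the object out of the transaction's write set.
    """
    if getattr(obj, name, _marker) != value:
        setattr(obj, name, value)
        return True   # a write happened
    return False      # object left untouched
```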

> It would be interesting to see the performance impact of adding
> newvalue != oldvalue checks on the catalogue data structures. This
> would also prevent the unindex logic being called unnecessarily.

The catalog isn't a problem; it already has these checks in all the
relevant places. It is less efficient than it could be, though, as it
needs to do:

old = btree.get(key, None)
if old != new:
    btree[key] = new

So it ends up traversing the btree to the right bucket twice. The
int/float based buckets can do the check inside their __setitem__, so
they avoid the extra traversal. It could be interesting to allow
buckets with object values to do the check inside __setitem__ via some
additional flag, so the extra traversal could be avoided.
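For completeness, here's the check-then-set spelled out against a
plain dict standing in for a BTree (helper name and sentinel are mine;
the sentinel avoids treating a stored None as a missing key). With a
real BTree this still pays two traversals - one for the get, one for
the set - which is exactly what pushing the comparison down into the
bucket's __setitem__ would save:

```python
_absent = object()  # sentinel: distinguishes a missing key from a stored None

def update_if_changed(mapping, key, new):
    """Store mapping[key] = new only if the value actually differs.

    Works on any dict-like mapping, including BTrees; on a BTree both
    the get and the set traverse down to the bucket holding the key.
    """
    old = mapping.get(key, _absent)
    if old is _absent or old != new:
        mapping[key] = new
        return True
    return False
```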

Hanno
