[Grok-dev] Re: Keeping indexes up to date

Mon Aug 13 14:12:30 EDT 2007

On 13 Aug 2007, at 13:59 , Martijn Faassen wrote:
> On 8/12/07, Philipp von Weitershausen <philipp at weitershausen.de> wrote:
>> Martijn Faassen wrote:
>>> Philipp von Weitershausen wrote:
> [snip]
>>>> This data may be persisted by the ZODB and the ZODB may have ways to
>>>> find out which attributes changed, but this is rarely enough
>>>> information for an IObjectModifiedEvent (remember, this event contains
>>>> information about which fields of which schema were changed).
>>>
>>> IObjectModifiedEvent doesn't have to contain this information, right?
>>
>> This is a matter of interpretation. I think it should always contain
>> this information, especially if you can actually easily determine what
>> changed exactly. zope.formlib and zope.app.form unfortunately don't do
>> this. I've filed a bug report, but haven't done anything about it yet.
>> At least I've made sure grok.formlib sends all the information.
>
> Why do you think it should always send this information? In a later
> response you say the indexing story doesn't need the information. In
> fact, if I read the code correctly, this information is completely
> ignored by the catalog.

That doesn't make it better...

>>> The minimum case is just to say the object changed, and that's it.
>>
>> Yes, but there are lots of ways an object can change. Indexing can
>> likely be an expensive operation (e.g. indexing lots of text in a
>> unicode attribute), so it would make sense to have this extra
>> information so that expensive stuff can be skipped.
>
> But nothing inside of Zope that we know of actually makes any use of
> this information, right?

I've been wanting to write code that made use of this information but  
couldn't because very few places actually do send the information.  
One of the use cases I had was to avoid expensive indexing, the other  
one was updating an AJAX UI only if really necessary.

> So taking all this care is currently actually
> useless in practice, and in my mind, even potentially harmful as you
> can introduce buggy code, think it works, and not have any feedback
> that you're wrong.

That doesn't change the fact that you introduced buggy code (as you  
say yourself). It just means we'll have to make it easier to detect  
that fact.

> *If* anything then later on starts making use of
> this information, your code is very likely to be broken anyway.

I'm afraid I can't follow that one.

Either way, if there's no information contained in the event, you
have to assume the worst: that everything changed. If there's
information in the event, I think it's fair to assume that it's
complete.
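
A minimal sketch of that rule, for illustration only. The classes
below are hypothetical stand-ins written for this example, not the
real Zope event API, and `fields_to_reindex` is an invented helper
name. It shows how a subscriber could skip expensive indexing when the
event says that no indexed field changed:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attributes:
    """Hypothetical stand-in: attribute names changed under one schema."""
    schema: str
    names: tuple


@dataclass(frozen=True)
class ModifiedEvent:
    """Hypothetical stand-in for an object-modified event."""
    obj: object
    descriptions: tuple = ()


def fields_to_reindex(event, indexed_fields):
    """Return the subset of indexed fields that must be recomputed."""
    if not event.descriptions:
        # No information in the event: assume the worst, everything changed.
        return set(indexed_fields)
    # Information present: treat it as complete.
    changed = set()
    for description in event.descriptions:
        changed.update(description.names)
    return changed & set(indexed_fields)
```

An event without descriptions forces a full reindex, while an event
that only names an unindexed attribute yields an empty set, so the
expensive work can be skipped.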

> [snip]
>>> I think asking "why don't ObjectModifiedEvents get sent by the
>>> persistence layer where possible?" is a good question and we should
>>> ponder it a little bit more.
>>
>> We should certainly document it clearly that ZODB persistence gives you
>> a lot of transparency, but not that kind of transparency with the event
>> framework and the indexing machinery.
>
> So, why is it a bad idea for ZODB persistence to send
> ObjectModifiedEvents? (if it sends only one per transaction if the
> object changed)
>
> You have given some potential answers:
>
> * the ObjectModifiedEvent can't then include which attributes changed.
> Since this doesn't appear to be tracked anywhere, I don't see this
> as a current practical blocker.
>
> * the ZODB is deliberately dumb about this and this would break this
> philosophy. This is a philosophical argument. If it's very practical
> to break this philosophy, why not do it? Any pragmatic objections
> against breaking this philosophy?

A pragmatic reason might be that the ZODB probably doesn't have the
right kind of hooks for it (please correct me if I'm wrong). A
conceptual one would be orthogonality (between persistence and
triggered events). Then again, RDBMSes don't give you that choice
either...

> * the ObjectModifiedEvent will need to be sent in some other cases
> (such as for annotations or other relations between objects that
> affect index state but not object state). This is the strongest
> argument I've heard so far - people might be misled into thinking they
> don't need to send them, so perhaps it should never be sent. That
> said, people are currently also misled into thinking they don't need
> to send them - the origin of this thread. Why can't the ZODB send this
> event if the form framework can?

Because the form framework is the one that changes objects.

If anything, I think we need to make people aware of the
orthogonality of our frameworks more than we do now. For example,
people don't code their forms themselves but prefer to use an
automated solution. Yet they simply assume that it is the persistence
layer, not the automated solution they're using, that takes care of
the events, without actually knowing either way. I suppose they just
guess based on prior experience with RDBMSes...
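
To illustrate the division of labour, here is a rough sketch with
invented names (`subscribers`, `notify_modified`, `apply_changes` and
`Document` are all hypothetical, not formlib or ZODB API). The
form-like helper is the code that changes the object, so it is also
the code that knows what changed and sends the event. A plain
attribute assignment, which is all the persistence layer ever sees,
sends nothing:

```python
# Hypothetical subscriber registry: callables taking (obj, changed_names).
subscribers = []


def notify_modified(obj, changed_names):
    """Call every registered subscriber, one after the other."""
    for subscriber in subscribers:
        subscriber(obj, changed_names)


class Document:
    """Hypothetical content object with two fields."""
    title = ""
    body = ""


def apply_changes(obj, data):
    """Set the attributes that differ, then send one event naming them."""
    changed = []
    for name, value in data.items():
        if getattr(obj, name, None) != value:
            setattr(obj, name, value)
            changed.append(name)
    if changed:
        notify_modified(obj, tuple(changed))
    return changed
```

Code that bypasses the helper and sets `doc.title` directly has to
call `notify_modified` itself if the indexes are supposed to follow.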

> I can think of some more arguments:
>
> * in some cases, the object state changes but the index state doesn't.
> The application designer can know this, and then not send the
> ObjectModifiedEvent to avoid performance issues. I'm not sure how
> strong this argument is - how often does this situation occur in
> practice? Isn't it misleading to have an object being modified without
> an event being sent?
>
> * automatically sending the modified event means the framework makes
> some choices for you that the developer can't opt out of anymore. Zope
> 3 is all about not forcing people into choices. True, but Grok on the
> other hand is about making people's life easier by taking some choices
> out of their hands.

True, though the choice has to be reasonable. I need to ponder
this for a while to make up my mind. Because of the
synchronous nature of events in Zope 3, my gut reaction would be
"make it explicit"...