[Zope] Non-bloating counter/access logger solution

Chris McDonough chrism@zope.com
12 Jun 2003 11:46:33 -0400


On Thu, 2003-06-12 at 11:09, Wankyu Choi wrote:
> > Use a BTrees.Length.Length object.  See Products.ZCatalog.Catalog for an
> > example of its usage.
> 
> Guess that would work for the counter problem since the counter property
> holds an integer value.
> 
> But what if I wanted to implement a complete access logger which requires
> string properties for IP addresses or usernames, etc?

Writing to the ZODB on every request is probably not a good idea. 
Better to write to a logfile.

> I took a look at the source code of the Length object and the _p_independent
> method looks interesting.
> 
> If I take the Length source code and make it work for other types than
> integer, say strings, lists, dictionaries, would that work too?


Note that the "meat" of the conflict resolution code is in the
_p_resolveConflict method of the class, which is written for
BTrees.Length.Length objects as:

    def _p_resolveConflict(self, old, s1, s2):
        return s1 + s2 - old

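'old' is the state of the object that both threads started from (for
example, 5).

's1' is the state that thread1 thinks the state should be (for
example, 4); this is the value the other transaction has already
committed.

's2' is the state that thread2 thinks the state should be (for example,
6); this is the value the conflicting transaction is trying to commit.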

For integers, the conflict resolution is simple: we just apply the
additions/subtractions implied by the value in each thread to the
starting state ('old').  To do this, we add s1 and s2 (4 + 6) and
subtract old (5).  We come up with 5, which is (5 - 1 + 1).
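
To make the arithmetic concrete, here is the same calculation as a plain
Python function.  This is only a sketch of the formula; the function name
is made up for illustration, and it does not import or subclass the real
Length class.

```python
def resolve_length_conflict(old, s1, s2):
    """Apply both threads' deltas to the starting value.

    Equivalent to old + (s1 - old) + (s2 - old).
    """
    return s1 + s2 - old


old = 5  # the starting value ('old' in the text above)
s1 = 4   # one thread subtracted 1
s2 = 6   # the other thread added 1

merged = resolve_length_conflict(old, s1, s2)
print(merged)
```

Both changes survive in the result: the -1 and the +1 are each applied
once to the starting value.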

The existence of the _p_independent method indicates that read access to
this object by a connection should not raise a read conflict under any
circumstance.  This is mostly appropriate for this kind of object
because it's an "independent" object (its state is not relied on by
other objects; except of course that it is relied upon in the case of
the Catalog, but we'll ignore that for now ;-).  This, however, is a
dangerous declaration for most objects.  For example, in the sessioning
machinery up until Zope 2.6.2, I effectively turned off read conflicts.
As I now know, that was a very bad idea: I had deep dependencies between
objects, so ignoring a read conflict was fatal and caused consistency
problems and synchronization errors under high load.

So in short, you'll definitely be able to come up with some conflict
resolution strategy that works for you, but it is impossible to come up
with something that works in the general case for more complicated
objects, because there is no "right answer".  For example, there is no
generally applicable conflict resolution strategy for the state of a
string.
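
As an illustration of a strategy that works only for one narrow case,
here is a sketch of a resolver for values that are only ever appended to.
The function name is hypothetical, and raising ValueError is my stand-in
for "give up and let the conflict happen"; this is not code from the ZODB
or from Length.

```python
def resolve_append_only(old, s1, s2):
    """Merge two values that each extend `old` by appending.

    Works for lists or strings.  If either side did anything other than
    append (edited or truncated the old content), there is no obviously
    right answer, so we refuse to guess.
    """
    n = len(old)
    if s1[:n] != old or s2[:n] != old:
        raise ValueError("not append-only; cannot merge automatically")
    # Keep the shared prefix, then both sets of additions.
    # The relative order of the two additions is an arbitrary choice.
    return old + s1[n:] + s2[n:]


print(resolve_append_only(["a"], ["a", "b"], ["a", "c"]))
```

Even here the ordering of the merged additions is a judgment call, which
is exactly why this doesn't generalize.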

A better strategy might be to write this data to somewhere else, like a
logfile or relational database.
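
For the logfile route, a minimal sketch using the standard library
logging module might look like the following.  The logger name, the
function names, and the line format are all assumptions for illustration;
how you get the IP address and username out of the request is up to your
application.

```python
import logging


def make_access_logger(path):
    """Return a logger that appends one line per access to `path`."""
    logger = logging.getLogger("access_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger


def log_access(logger, ip, username, url):
    """Write a single access record; nothing touches the ZODB."""
    logger.info("%s %s %s", ip, username, url)
```

Because nothing is written to the object database, there is nothing to
conflict on.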

If you don't want to do this (perhaps because you don't want to require
a relational database for your app), you're brave, and you feel the need
to write to a ZODB on every request, I'd suggest using the sessioning
machinery (which does indeed use the ZODB).  You could use a strategy
which writes a particular user's URL visits to session data and then, at
the end of the session (which is hookable), dump the session state into
another object in a batch.  This would "bunch up" main database writes
so (in theory) you'd receive fewer conflicts.
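
The shape of that batching idea, stripped of anything Zope-specific, is
roughly the sketch below.  The class and method names are invented for
illustration, the dictionary stands in for session data, and the plain
list stands in for the object in the main database; none of this is the
actual sessioning API.

```python
class VisitBatcher:
    """Collect per-session visits and write them out once per session."""

    def __init__(self, store):
        self.store = store   # stand-in for the main database object
        self.pending = {}    # session id -> list of visited URLs

    def record(self, session_id, url):
        # Cheap per-request step: only the session's own data changes.
        self.pending.setdefault(session_id, []).append(url)

    def end_session(self, session_id):
        # Called from whatever end-of-session hook you have available.
        visits = self.pending.pop(session_id, [])
        if visits:
            # One write to the main store for the whole session.
            self.store.extend((session_id, url) for url in visits)
        return len(visits)
```

The main store is touched once per session instead of once per request,
which is where the reduction in conflicts would come from.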

The sessioning machinery makes it deceptively easy to do this but faces
all of the same problems as app code you might write.  For example,
there will be conflict errors that happen as a result of session
writes.  However, the sessioning machinery takes pains to
reduce the number of conflicts by using a very specific conflict
resolution strategy, so it's likely you'll have fewer conflicts than if
you were to just try to write to a database on every request.

That said, it still may fail under very high load.  But if it doesn't
work for you for some reason, at least you'd stand a shot at getting
some help from the larger Zope community in the form of improvements to
the sessioning code.  Localizing conflict resolution strategies to
sessions also gives us a choke point and reduces the burden on you as an
application programmer to care much about them.

- C