[ZODB-Dev] Weak References

Casey Duncan c.duncan@nlada.org
Fri, 11 Jan 2002 15:31:17 -0500



Sorry, I had a biblical email describing my thought process here that I didn't
send, to spare you all. I attached it, though, for the sake of completeness.

My terminology was also incorrect; circular refs aren't the problem. Actually,
I'm not completely sure I need weak refs. What I am trying to do is make it
possible for an unwrapped object to identify where it is mounted in Zope, so
that a wrapper can be constructed around the object that is equivalent to
the wrapper you would get through acquisition by name (but without needing to
know the name or mounted location of the object).

The reason is simple. I want to reference a persistent object that is mounted 
in some arbitrary location in Zope from another persistent object. The 
mounted location of the original object should be able to change without 
breaking the reference.

However I want to be able to access the referenced object so that it is 
wrapped based on where it is mounted, not relative to the object containing 
the reference.

Clear as mud? 8^)

On Friday 11 January 2002 02:53 pm, Jim Fulton allegedly wrote:
> Casey Duncan wrote:
> > I've been going up one side and down the other trying to figure out a better
> > way for Zope objects to robustly refer to one another. I am constantly
> > battling either Acquisition or Circular references (which the former is
> > meant to alleviate).
> >
> > Acquisition is all swell and quite convenient, but sometimes you need to
> > store a persistent direct reference to an object. This in and of itself
> > is easy. The hard part is doing that and having the ability to use the
> > object in its original containment environment (for lack of better
> > terminology on my part).
>
> What prevents you from using it in its original containment environment?
>
> > In other words you can have the cake (the object), but you can't eat
> > it too (use it wrapped with its original container).
>
> This doesn't actually clarify anything, since I really don't know
> anything about the cake you want. ;)
>
> > I have come to the conclusion that there is no general application-level
> > way to make this work. So, that leads me here. Maybe there is something
> > that can be done at the database level to help. I have a general
> > solution, but it creates a large amount of circular references, so I
> > think that a system level weak reference implementation would be very
> > helpful.
>
> Your problem description is too unclear to offer any help.
>
> > I see that there is a PEP (205) to get weak refs directly in Python,
> > which is encouraging.
>
> Python has weak references.
>
> > I also saw several other implementations. I was wondering though
> > how hard it would be for me to implement such a thing leveraging the ZODB
> > storage cycle as a trigger.
> >
> > I figure that in memory, the object references could be the same as
> > regular references. The application would not know better. However, when
> > the object was removed from memory, the ZODB would drop references marked
> > as weak and eliminate those objects no longer referenced. When the object
> > was reloaded the __setstate__ or what have you would reestablish the
> > references to those objects that still exist. Any weak references made to
> > since garbage collected objects would no longer be valid and would raise
> > an exception when accessed.
>
> The ZODB already does something like this. In fact, circular references
> among persistent objects don't cause memory leaks.
>
> > So I guess my questions are:
> >
> > Am I right that an application level solution is not practical and if so
> > what could be done at the DB level to facilitate this?
>
> I can't tell what problem you are trying to solve.
>
> > Should I shut up and just wait for Python 2.x to support weak references?
> > If so, will ZODB4 support them?
>
> Python already has weak references. In fact, in the Zope3 branch,
> I'm using them in the ZODB cache, although I think I'll eventually
> switch to a ZODB-specific approach that uses less memory.
>
> Jim
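
For reference, here is a minimal sketch of the in-memory weak references Jim
mentions, using the standard `weakref` module (added in Python 2.1 via PEP 205).
The `Document` class is a hypothetical stand-in. This only covers ordinary
Python objects in memory, not references stored persistently in the ZODB.

```python
import gc
import weakref


class Document:
    """Stand-in for any ordinary Python object (hypothetical name)."""

    def __init__(self, title):
        self.title = title


doc = Document("report")
ref = weakref.ref(doc)             # does not keep `doc` alive

alive_before = ref() is doc        # calling the ref returns the live target

del doc                            # drop the only strong reference
gc.collect()                       # harmless on CPython, helps on other runtimes

alive_after = ref() is not None    # a dead weak reference returns None
```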

-- 
/---------------------------------------------------\
  Casey Duncan, Sr. Web Developer
  National Legal Aid and Defender Association
  c.duncan@nlada.org
\---------------------------------------------------/

[Attachment: References.txt (My Thought Process)]

Acquisition is a powerful tool, as we know, for allowing objects to discover and
 access one another. But it depends on the hierarchy of instances and careful
 naming to be successful. Unfortunately for a product developer, the end user
 has far more control over such things in their Zope application than you do,
 and seemingly innocent changes by the user can wreak havoc with your careful
 design.

An oft-requested and elusive Zope feature is a way to "symlink" or soft
 reference one object to another without depending on shaky associations based
 on path or, worse, on name alone. It is common in Zope to just throw a getattr
 request out there and hope the right thing comes back from the vacuum of the
 namespace. Luckily, many times it does, but I don't like to rely on luck too
 much in my run-time environment. I find myself reading Zope code and finding
 "waaas" and "yikes" and so forth in comments whenever an object must acquire
 another by name.

In my mind I was playing with several solutions to this problem. One that came
 to me was to create a flat-file registry of all objects in the ZODB and assign
 a GUID to each of them, then create a class that could be instantiated to
 represent a referenced object. When called, it would look up the object in the
 registry. Only the GUID would be stored in the referencing object itself. In
 many systems this would be fine, but there are terrible problems with this
 approach in Zope and the ZODB and with acquisition that I won't get into. I
 quickly realized it wouldn't work.

Another idea was to forget the registry, still use GUIDs, but only assign them
 directly to the objects. When a reference was made, the physical path to the
 object would be stored along with its GUID. When the reference was used, the
 path would be traversed. If the traversal failed or the GUID didn't match, then
 a brute force search for the matching object would ensue. This would shift the
 burden of the system, but I hate the traversal part and although brute force
 searches would be rare, they would be very expensive. So, ixnay that one.
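
A minimal sketch of this second idea, assuming a toy tree of `Node` objects in
place of real Zope objects. All names here (`Node`, `traverse`, `search`,
`PathReference`) are hypothetical and only illustrate the path-first,
search-as-fallback logic described above.

```python
import uuid


class Node:
    """Toy stand-in for a Zope object: a GUID plus named children."""

    def __init__(self):
        self.guid = uuid.uuid4().hex
        self.children = {}


def traverse(root, path):
    """Follow `path` (a tuple of names) from root; return None on failure."""
    node = root
    for name in path:
        node = node.children.get(name)
        if node is None:
            return None
    return node


def search(root, guid):
    """Brute-force depth-first search; returns (path, node) or None."""
    stack = [((), root)]
    while stack:
        path, node = stack.pop()
        if node.guid == guid:
            return path, node
        for name, child in node.children.items():
            stack.append((path + (name,), child))
    return None


class PathReference:
    """Stores the target's GUID and its last known path."""

    def __init__(self, root, path):
        self.path = tuple(path)
        self.guid = traverse(root, self.path).guid

    def resolve(self, root):
        node = traverse(root, self.path)
        if node is not None and node.guid == self.guid:
            return node                    # fast path: nothing moved
        found = search(root, self.guid)    # slow path: scan the whole tree
        if found is None:
            raise LookupError("target no longer exists")
        self.path, node = found            # remember the new location
        return node


# Usage: the target is moved and renamed after the reference is made.
root = Node()
root.children["a"] = Node()
root.children["b"] = Node()
target = root.children["a"].children["doc"] = Node()
ref = PathReference(root, ("a", "doc"))
root.children["b"].children["renamed"] = root.children["a"].children.pop("doc")
resolved = ref.resolve(root)
```

The search visits every object in the worst case, which is the expense that
rules this approach out.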

My third idea, which seemed brilliant at the time, was to somehow fool Zope into
 keeping the old acquisition wrapper around and store a reference to this
 wrapper. This was actually surprisingly easy to do by putting the wrapper into
 another data structure such as a tuple. I started developing a Reference class
 to investigate this further, and quickly found that although I could
 persistently store the wrapper, it went stale and the REQUEST in it wasn't
 valid anymore when the object was retrieved. Steeerike three!

This last voyage into the land of acquisition magic made me realize something,
 though. I was trying to save all of the path information about a referenced
 object in the referring object, which is a fundamental flaw in all of the
 above ideas. The problem is that the referenced object itself knows nothing
 about where it is in the object hierarchy. It can discover this information
 through its acquisition wrapper, but that is external to the object. If I
 don't have the right wrapper (because I am referring to the object from
 elsewhere), the object has no way to know where it is "really" stored from the
 user's point of view.

So, my idea is to create a mix-in, or extend an existing one, that stores the
 object's physical path inside itself. This would probably consist of storing
 the container parent chain as a sequence of unwrapped object references. It
 would then have a method to reconstruct the correct acquisition chain based on
 these parents. This would allow you to store a reference to an object the
 normal Pythonic way via setattr and yet retrieve and use the object in its
 original containment environment (via a method of this mix-in), just as though
 you had acquired it from the namespace at run-time.
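
A rough sketch of such a mix-in follows. It does not use the real Acquisition
package: `Wrapper` is a toy stand-in that only forwards attribute lookups up
the chain, and `Locatable`, `_container`, `in_context` and `Folder` are
hypothetical names. As a simplification, each object stores only its immediate
unwrapped parent, and the full chain is recovered by walking upward.

```python
class Wrapper:
    """Toy stand-in for an acquisition wrapper (not the real Acquisition API)."""

    def __init__(self, obj, parent):
        self._obj = obj
        self._parent = parent  # wrapped container, or None at the root

    def __getattr__(self, name):
        # Reached only when normal lookup on the Wrapper itself fails.
        try:
            return getattr(self._obj, name)
        except AttributeError:
            if self._parent is None:
                raise
            return getattr(self._parent, name)  # "acquire" from the container


class Locatable:
    """Mix-in: remembers its container so it can rebuild its own context."""

    _container = None  # unwrapped parent; maintained by the container

    def in_context(self):
        parent = self._container
        wrapped_parent = parent.in_context() if parent is not None else None
        return Wrapper(self, wrapped_parent)


class Folder(Locatable):
    """Toy object whose attributes come from keyword arguments."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)


# Usage: `holder` lives elsewhere and keeps a plain reference to `doc`.
root = Folder(site_title="Example Site")
section = Folder()
section._container = root
doc = Folder(title="Doc")
doc._container = section
holder = Folder(target=doc)

wrapped = holder.target.in_context()
acquired_title = wrapped.site_title  # found on `root` via the rebuilt chain
```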

Since almost all Zope "user" objects are stored in ObjectManagers, some changes
 would be made there so that the path info is updated when objects are added to
 or removed from a folderish object. This would mean that references could
 survive rename and move operations.
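
The container side of this could look roughly like the following sketch. The
`Item` and `Folder` classes and their `add`/`remove` methods are hypothetical
and only loosely modelled on a folderish object. They are not the actual
ObjectManager API.

```python
class Item:
    """Toy content object that records where it is contained."""

    def __init__(self):
        self.container = None  # unwrapped parent
        self.name = None       # id within that parent

    def physical_path(self):
        """Rebuild the path by walking up the recorded containers."""
        path = []
        node = self
        while node.container is not None:
            path.append(node.name)
            node = node.container
        return tuple(reversed(path))


class Folder(Item):
    """Toy folderish object that keeps its children's location up to date."""

    def __init__(self):
        super().__init__()
        self.objects = {}

    def add(self, name, obj):
        self.objects[name] = obj
        obj.container, obj.name = self, name  # record location on add

    def remove(self, name):
        obj = self.objects.pop(name)
        obj.container = obj.name = None       # clear location on remove
        return obj


# Usage: a direct reference to `doc` survives a move plus rename.
root, a, b = Folder(), Folder(), Folder()
root.add("a", a)
root.add("b", b)
doc = Item()
a.add("doc", doc)
direct_ref = doc
path_before = direct_ref.physical_path()
b.add("renamed", a.remove("doc"))
path_after = direct_ref.physical_path()
```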

Now one caveat with this simplistic referencing approach is that additional
 Python references to the object are made, so that deleting referenced objects
 from the ObjectManager doesn't really get rid of them, but the users can't see
 them anymore. This might be considered a feature, but I doubt it 8^).

A solution would be to again introduce a Reference class, which would be a thin
 wrapper around the referenced object. When the reference instance was called,
 it would check to see if the object still "exists". This would rely on OMs to
 indicate this by removing the container parent chain when the object was
 "deleted". If the referenced object had been deleted then its internal
 reference would be removed and an AttributeError would be raised. Obviously
 referenced objects would still hang around a bit longer, but at least you
 wouldn't have "phantom" objects that the user can't see or get rid of. An issue
 with this solution though is that it involves a commit-on-read.
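
A minimal sketch of such a Reference class, assuming the container clears a
`container` attribute on the object when it is deleted. `Content` and the
attribute names are hypothetical. The assignment inside `__call__` is the
write that would cause the commit-on-read if the Reference were persistent.

```python
class Content:
    """Toy referenced object; `container` is cleared when it is deleted."""

    def __init__(self, container=None):
        self.container = container


class Reference:
    """Thin wrapper that drops its target once the target has been deleted."""

    def __init__(self, target):
        self._target = target

    def __call__(self):
        target = self._target
        if target is not None and target.container is None:
            # Target was deleted: let go of it. On a persistent object this
            # assignment is a write, hence the commit-on-read concern.
            self._target = target = None
        if target is None:
            raise AttributeError("referenced object has been deleted")
        return target
```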

Another possible (and I think reasonable) solution would be to make it so that you
 could not delete an object that was referenced by others, until those
 references were broken. Some more bookkeeping would be involved, but I think it
 could be solved. A problem with this though is the parent chain references,
 which would complicate things a good bit.
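
The bookkeeping could be sketched as a simple count of incoming references
that the container checks before deleting. Everything here (`ReferencedError`,
`ref_count`, `release`, `delete`) is hypothetical, and the sketch ignores the
parent chain references mentioned above.

```python
class ReferencedError(Exception):
    """Raised when deleting an object that is still referenced."""


class Content:
    """Toy object that counts the references pointing at it."""

    def __init__(self):
        self.ref_count = 0


class Reference:
    """Registers itself with the target on creation and on release."""

    def __init__(self, target):
        self.target = target
        target.ref_count += 1

    def release(self):
        if self.target is not None:
            self.target.ref_count -= 1
            self.target = None


class Folder:
    """Toy container that refuses to delete referenced objects."""

    def __init__(self):
        self.objects = {}

    def add(self, name, obj):
        self.objects[name] = obj

    def delete(self, name):
        if self.objects[name].ref_count:
            raise ReferencedError(name)
        del self.objects[name]
```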

Well I am finally at a place where the theory sits well with me, and I have no
 strong misgivings about it. So what have I missed?
