[ZODB-Dev] Weak References
Casey Duncan
c.duncan@nlada.org
Fri, 11 Jan 2002 15:31:17 -0500
Sorry, I had a biblical email describing my thought process here that I didn't
send to spare you all. I attached it though for the purposes of completeness.
My terminology was also incorrect; circular refs aren't the problem. Actually
I'm not completely sure I need weak refs. What I am trying to do is make it
so that an unwrapped object can identify where it is mounted in Zope, so that
a wrapper can be constructed around the object that would be equivalent to
the wrapper you would get through acquisition by name (but without needing to
know the name or mounted location of the object).
The reason is simple. I want to reference a persistent object that is mounted
in some arbitrary location in Zope from another persistent object. The
mounted location of the original object should be able to change without
breaking the reference.
However I want to be able to access the referenced object so that it is
wrapped based on where it is mounted, not relative to the object containing
the reference.
Clear as mud? 8^)
On Friday 11 January 2002 02:53 pm, Jim Fulton allegedly wrote:
> Casey Duncan wrote:
> > I've been going up one side and down the other trying to figure out a better
> > way for Zope objects to robustly refer to one another. I am constantly
> > battling either Acquisition or Circular references (which the former is
> > meant to alleviate).
> >
> > Acquisition is all swell and quite convenient, but sometimes you need to
> > store a persistent direct reference to an object. This in and of itself
> > is easy. The hard part is doing that and having the ability to use the
> > object in its original containment environment (for lack of better
> > terminology on my part).
>
> What prevents you from using it in its original containment environment?
>
> > In other words you can have the cake (the object), but you can't eat
> > it too (use it wrapped with its original container).
>
> This doesn't actually clarify anything, since I really don't know
> anything about the cake you want. ;)
>
> > I have come to the conclusion that there is no general application-level
> > way to make this work. So, that leads me here. Maybe there is something
> > that can be done at the database level to help. I have a general
> > solution, but it creates a large amount of circular references, so I
> > think that a system level weak reference implementation would be very
> > helpful.
>
> Your problem description is too unclear to offer any help.
>
> > I see that there is a PEP (205) to get weak refs directly in Python,
> > which is encouraging.
>
> Python has weak references.
>
> > I also saw several other implementations. I was wondering though
> > how hard it would be for me to implement such a thing leveraging the ZODB
> > storage cycle as a trigger.
> >
> > I figure that in memory, the object references could be the same as
> > regular references. The application would not know better. However, when
> > the object was removed from memory, the ZODB would drop references marked
> > as weak and eliminate those objects no longer referenced. When the object
> > was reloaded the __setstate__ or what have you would reestablish the
> > references to those objects that still exist. Any weak references made to
> > since garbage collected objects would no longer be valid and would raise
> > an exception when accessed.
>
> The ZODB already does something like this. In fact, circular references
> among persistent objects don't cause memory leaks.
>
> > So I guess my questions are:
> >
> > Am I right that an application level solution is not practical and if so
> > what could be done at the DB level to facilitate this?
>
> I can't tell what problem you are trying to solve.
>
> > Should I shut up and just wait for Python 2.x to support weak references?
> > If so, will ZODB4 support them?
>
> Python already has weak references. In fact, in the Zope3 branch,
> I'm using them in the ZODB cache, although I think I'll eventually
> switch to a ZODB-specific approach that uses less memory.
>
> Jim
--
/---------------------------------------------------\
Casey Duncan, Sr. Web Developer
National Legal Aid and Defender Association
c.duncan@nlada.org
\---------------------------------------------------/
Attachment: References.txt (My Thought Process)
Acquisition is a powerful tool as we know for allowing objects to discover and
access one another. But, it depends on the hierarchy of instances and careful
naming to be successful. Unfortunately for a product developer, the end-user
has far more control over such things in their Zope application than you do.
And seemingly innocent changes by the user can wreak havoc with your careful
design.
An oft-requested and elusive Zope feature is a way to "symlink" or soft
reference one object to another without depending on shaky associations based
on path or worse by name alone. It is common in Zope to just throw a getattr
request out there and hope the right thing comes back from the vacuums of the
namespace. Luckily, many times it does, but I don't like to rely on luck too
much in my run-time environment. I find myself reading Zope code and finding
"waaas" and "yikes" and so forth in comments whenever an object must acquire
another by name.
In my mind I was playing with several solutions to this problem. One that came
to me was to create a flat-file registry of all objects in the ZODB and assign
a GUID to them. Then create a class that could be instantiated to represent a
referenced object. When called it would look up the object in the registry. Only
the GUID would be stored in the referencing object itself. In many systems this
would be fine, but there are terrible problems with this approach in Zope and
the ZODB and with acquisition that I won't get into. I quickly realized it
wouldn't work.
Another idea was to forget the registry, still use GUIDs, but only assign them
directly to the objects. When a reference was made, the physical path to the
object would be stored along with its GUID. When the reference was used, the
path would be traversed. If the traversal failed or the GUID didn't match, then
a brute force search for the matching object would ensue. This would shift the
burden of the system, but I hate the traversal part and although brute force
searches would be rare, they would be very expensive. So, ixnay that one.
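For what it's worth, the GUID-plus-path idea can be sketched in plain Python
without any Zope machinery. Everything here is hypothetical and only for
illustration: Node stands in for a Zope object, and traverse, brute_force
and PathReference are made-up names, not anything that exists in Zope or the
ZODB.

```python
import uuid


class Node:
    """Hypothetical stand-in for a Zope object: a named node with children."""

    def __init__(self, name):
        self.name = name
        self.guid = uuid.uuid4().hex  # GUID assigned directly to the object
        self.children = {}

    def add(self, child):
        self.children[child.name] = child
        return child


def traverse(root, path):
    """Follow a tuple of names down from root; None if any step fails."""
    node = root
    for name in path:
        node = node.children.get(name)
        if node is None:
            return None
    return node


def brute_force(root, guid):
    """Walk the whole tree looking for a GUID; (path, node) or None."""
    stack = [((), root)]
    while stack:
        path, node = stack.pop()
        if node.guid == guid:
            return path, node
        for name, child in node.children.items():
            stack.append((path + (name,), child))
    return None


class PathReference:
    """Stores only a GUID and the last known path, never the object."""

    def __init__(self, root, path):
        target = traverse(root, path)
        if target is None:
            raise KeyError(path)
        self.path = tuple(path)
        self.guid = target.guid

    def resolve(self, root):
        node = traverse(root, self.path)
        if node is not None and node.guid == self.guid:
            return node  # cheap case: the stored path is still good
        found = brute_force(root, self.guid)  # rare but expensive fallback
        if found is None:
            raise LookupError("referenced object no longer exists")
        self.path, node = found  # remember the new location
        return node


root = Node("root")
a = root.add(Node("a"))
b = root.add(Node("b"))
doc = a.add(Node("doc"))
ref = PathReference(root, ("a", "doc"))

# Move doc from a to b; the stored path is now stale.
del a.children["doc"]
b.add(doc)
moved = ref.resolve(root)  # falls back to the search, then updates ref.path
```

The cost shows up in resolve: every use pays for a traversal, and a stale
path pays for a walk of the whole tree.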
My third idea which seemed brilliant at the time was to somehow fool Zope into
keeping the old acquisition wrapper around and store a reference to this
wrapper. This was actually surprisingly easy to do by putting the wrapper into
another data structure such as a tuple. I started developing a Reference class
to investigate this further, and quickly found that although I could
persistently store the wrapper, it went stale and the REQUEST in it wasn't
valid anymore when the object was retrieved. Steeerike three!
This last voyage into the land of acquisition magic made me realize something
though. I was trying to save all of the path information about a referenced
object in the referring object, which is a fundamental flaw with all of the
above ideas. The problem is that the referenced object itself knows nothing
about where it is in the object hierarchy. Now it can discover this information
through its acquisition wrapper, but that is external to the object. If I
don't have the right wrapper (because I am referring to the object from
elsewhere), the object has no way to know where it is "really" stored from the
user's point of view.
So, my idea is to create a Mix-in or extend an existing one that stores the
object's physical path inside itself. This would probably consist of storing
the container parent chain as a sequence of unwrapped object references. It
would then have a method to reconstruct the correct acquisition chain based on
these parents. This would allow you to store a reference to an object the
normal pythonic way via setattr and yet retrieve and use the object in its
original containment environment (via a method of this mix-in), just as though
you had acquired it from the namespace at run-time.
Since most all Zope "user" objects are stored in ObjectManagers, some changes
would be made there so that the path info is updated when objects are added or
removed from a folderish object. This would mean that references could survive
rename and move operations.
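A rough sketch of the mix-in and the container changes, again in plain Python
rather than real Zope code. The names are all hypothetical: Wrapper is a toy
stand-in for an acquisition wrapper (it is not the Acquisition package),
Locatable is the imagined mix-in, and Folder plays the part of an
ObjectManager. As a simplification, each object stores only its immediate
container, and the full parent chain is recovered by following those links.

```python
class Wrapper:
    """Toy stand-in for an acquisition wrapper: an object plus its context."""

    def __init__(self, obj, parent):
        self._obj = obj
        self._parent = parent  # another Wrapper, or None at the root

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so _obj/_parent are safe.
        try:
            return getattr(self._obj, name)
        except AttributeError:
            if self._parent is None:
                raise
            return getattr(self._parent, name)  # "acquire" from the container


class Locatable:
    """Hypothetical mix-in: the object itself remembers its container."""

    _container = None  # unwrapped parent, maintained by Folder

    def physical_path(self):
        names, node = [], self
        while node is not None:
            names.append(node.id)
            node = node._container
        return tuple(reversed(names))

    def in_context(self):
        """Rebuild the wrapper chain from the stored parents."""
        parent = self._container
        outer = parent.in_context() if parent is not None else None
        return Wrapper(self, outer)


class Item(Locatable):
    def __init__(self, id):
        self.id = id


class Folder(Item):
    def __init__(self, id):
        super().__init__(id)
        self._objects = {}

    def add(self, obj):
        self._objects[obj.id] = obj
        obj._container = self  # location updated on add
        return obj

    def remove(self, id):
        obj = self._objects.pop(id)
        obj._container = None  # location forgotten on remove
        return obj


root = Folder("root")
root.title = "Site"
a = root.add(Folder("a"))
b = root.add(Folder("b"))
b.skin = "blue"
doc = a.add(Item("doc"))

holder = Item("holder")
holder.target = doc  # ordinary Python reference via setattr

b.add(a.remove("doc"))  # move doc from a to b
wrapped = holder.target.in_context()  # wrapped by where doc lives now
```

After the move, wrapped picks up skin from b and title from root, even though
holder knows nothing about either folder.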
Now one caveat with this simplistic referencing approach is that additional
Python references to the object are made, so that deleting referenced objects
from the ObjectManager doesn't really get rid of them, but the users can't see
them anymore. This might be considered a feature, but I doubt it 8^).
A solution would be to again introduce a Reference class, which would be a thin
wrapper around the referenced object. When the reference instance was called,
it would check to see if the object still "exists". This would rely on OMs to
indicate this by removing the container parent chain when the object was
"deleted". If the referenced object had been deleted then its internal
reference would be removed and an AttributeError would be raised. Obviously
referenced objects would still hang around a bit longer, but at least you
wouldn't have "phantom" objects that the user can't see or get rid of. An issue
with this solution though is that it involves a commit-on-read.
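Something like the following, as a self-contained sketch with hypothetical
names (Item, Folder and Reference are illustrative only). It assumes the
container clears a _container attribute on delete, which is what "removing
the container parent chain" amounts to here. Note that an object that was
never placed in a folder looks "deleted" in this toy version.

```python
class Item:
    def __init__(self, id):
        self.id = id
        self._container = None  # cleared by the folder on delete


class Folder(Item):
    def __init__(self, id):
        super().__init__(id)
        self._objects = {}

    def add(self, obj):
        self._objects[obj.id] = obj
        obj._container = self
        return obj

    def delete(self, id):
        # The folder signals deletion by dropping the parent link.
        self._objects.pop(id)._container = None


class Reference:
    """Thin wrapper that checks whether its target still "exists"."""

    def __init__(self, target):
        self._target = target

    def __call__(self):
        target = self._target
        if target is not None and target._container is None:
            # Target was deleted: let go of it. If Reference were a
            # persistent object, this assignment would be a write that
            # happens during a read.
            self._target = target = None
        if target is None:
            raise AttributeError("referenced object has been deleted")
        return target


root = Folder("root")
doc = root.add(Item("doc"))
ref = Reference(doc)
before = ref()  # returns doc while it is still in the folder

root.delete("doc")
try:
    ref()
    raised = False
except AttributeError:
    raised = True  # and ref no longer holds on to doc
```

The assignment inside __call__ is where the commit-on-read comes from: merely
looking at a dead reference changes the referencing object.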
Another possible (and I think reasonable) solution would be to make it so that you
could not delete an object that was referenced by others, until those
references were broken. Some more bookkeeping would be involved, but I think it
could be solved. A problem with this though is the parent chain references,
which would complicate things a good bit.
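The bookkeeping could be as simple as a counter on the referenced object. This
is only a sketch under that assumption; StillReferenced, _referrers and
release are names I made up, and the parent chain references mentioned above
are left out entirely.

```python
class StillReferenced(Exception):
    """Raised when deleting an object that others still refer to."""


class Item:
    def __init__(self, id):
        self.id = id
        self._referrers = 0  # hypothetical bookkeeping counter


class Reference:
    def __init__(self, target):
        self._target = target
        target._referrers += 1

    def __call__(self):
        if self._target is None:
            raise AttributeError("reference has been broken")
        return self._target

    def release(self):
        """Break the reference explicitly so the target can be deleted."""
        if self._target is not None:
            self._target._referrers -= 1
            self._target = None


class Folder(Item):
    def __init__(self, id):
        super().__init__(id)
        self._objects = {}

    def add(self, obj):
        self._objects[obj.id] = obj
        return obj

    def delete(self, id):
        if self._objects[id]._referrers:
            raise StillReferenced(id)  # refuse until references are broken
        del self._objects[id]


root = Folder("root")
doc = root.add(Item("doc"))
ref = Reference(doc)

try:
    root.delete("doc")
    refused = False
except StillReferenced:
    refused = True  # doc is still in the folder

ref.release()
root.delete("doc")  # succeeds now that nothing refers to doc
```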
Well I am finally at a place where the theory sits well with me, and I have no
strong misgivings about it. So what have I missed?