[Zope-CMF] A modest proposal to add a Unique ID to all content/folder objects in CMF

sean.upton@uniontrib.com sean.upton@uniontrib.com
Wed, 21 Aug 2002 11:16:08 -0700


A few thoughts:

When storing a relation to content by storing an ID, one should be flexible
about accepting IDs generated from any of the following sources:

- Slug (manually named unique ID string)
	- Example: '20020821-jury-verdict-in-at-11AM.1'
	- Likely to be unique no matter what path it is in, unless a paste
	  or a workflow move leaves multiple versions of the same object
- Path to object (not great, but should be supported)
	- Both inside Zope (e.g. '/cmfsite/myPortalFolder/foo-bar-123.jpg')
	  and outside Zope (e.g. 'c:\My Documents\foo-bar-123.jpg')
	- Brittle, but common; should be supported in case an alternate ID
	  generator is not available or appropriate
- Hash or digest (unique for all practical purposes)
	- If no slug exists, and we are within a system that can manage
	  translation of a hash to an object reference, this is good
	- Likely doesn't move beyond Zope

My current thinking, from a customized CMF system implementation that I am
working on: Identifier() should output the path, but a method called, say,
getUniqueId() should output one of these; which one doesn't matter.  A hash
should only be generated at content creation if a slug doesn't exist, or at
content copy if there is another item on the system with the same slug.
Alternately, the unique ID should be editable, so that if a hash is
generated but the content author doesn't like it (they want to use a more
descriptive, but likely still unique, slug), they can change it.  This
presents the challenge of finding and fixing references to the old ID held
in other objects, but that could be handled by a mechanism that finds and
fixes them at the time the ID changes.  A mixin class for content should
provide:
- An attribute to store a unique ID string
- A getUniqueId() method to get it
- A setUniqueId() method to set it and call a tool to fix references in
  referring objects
- convertReferencesTo(oldid,newid)
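
To make the mixin idea concrete, here is a rough sketch in plain Python with
no Zope dependencies.  The method names come from the list above; the
ReferenceFixer class, the "references" attribute, the optional tool argument,
and the Doc class are assumptions invented only for illustration, not a
proposed implementation.

```python
class ReferenceFixer:
    """Hypothetical stand-in for the tool that repairs references."""

    def __init__(self):
        self.objects = []  # every object the tool knows about

    def convertAllReferences(self, oldid, newid):
        # Ask each known object to rewrite its own references.
        for obj in self.objects:
            obj.convertReferencesTo(oldid, newid)


class UniqueIdMixin:
    """Sketch of the content mixin: stores, reads, and changes a unique ID."""

    _unique_id = None

    def getUniqueId(self):
        return self._unique_id

    def setUniqueId(self, newid, tool=None):
        oldid = self._unique_id
        self._unique_id = newid
        # Only bother the tool when an existing ID actually changed.
        if tool is not None and oldid is not None and oldid != newid:
            tool.convertAllReferences(oldid, newid)

    def convertReferencesTo(self, oldid, newid):
        # Assumes references live in a plain list attribute.
        refs = getattr(self, 'references', [])
        self.references = [newid if ref == oldid else ref for ref in refs]


class Doc(UniqueIdMixin):
    """Toy content class used only to exercise the mixin."""

    def __init__(self, uid, references=()):
        self._unique_id = uid
        self.references = list(references)


tool = ReferenceFixer()
photo = Doc('abc123')
story = Doc('story-1', ['abc123', 'other'])
tool.objects.extend([photo, story])
photo.setUniqueId('jury-verdict', tool)  # story.references is rewritten too
```

Passing the tool in explicitly keeps the sketch self-contained; in a real CMF
site the tool would more likely be looked up from the portal.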

A lookup mechanism/tool should determine whether an ID is a path, hash, or
slug, and broker an object reference when passed that ID.  This tool should
also provide hooks for a content object to ask it to assist in changing
other objects' references to it:
- convertAllReferences(oldid,newid)
	"""
	query to find all objects, then call
	obj.convertReferencesTo(oldid,newid) for each object
	"""
- getObjectById(id)
	"""get obj ref and return, passed a slug/hash/path"""
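
A minimal sketch of how such a lookup tool might tell the three kinds of ID
apart, again in plain Python.  The rules here are assumptions: a hash is
taken to be a bare 40-character lowercase SHA-1 hex digest, anything
containing a slash or backslash is a path, and everything else is a slug.
The classify_id() and register() names are hypothetical.

```python
import re

# Assumption: hashes are bare 40-character lowercase SHA-1 hex digests.
_HEX_DIGEST = re.compile(r'^[0-9a-f]{40}$')


def classify_id(identifier):
    """Guess whether an identifier is a 'hash', a 'path', or a 'slug'."""
    if _HEX_DIGEST.match(identifier):
        return 'hash'
    if '/' in identifier or '\\' in identifier:
        return 'path'
    return 'slug'


class IdLookup:
    """Toy broker that maps any kind of ID to an object reference."""

    def __init__(self):
        self._by_kind = {'hash': {}, 'path': {}, 'slug': {}}

    def register(self, identifier, obj):
        self._by_kind[classify_id(identifier)][identifier] = obj

    def getObjectById(self, identifier):
        # Returns None when nothing is registered under this ID.
        return self._by_kind[classify_id(identifier)].get(identifier)


lookup = IdLookup()
lookup.register('/cmfsite/myPortalFolder/foo-bar-123.jpg', 'the image')
lookup.register('20020821-jury-verdict-in-at-11AM.1', 'the story')
```

A prefixed hash would not match the pattern above and would be treated as a
slug, so a real tool would need a rule for prefixes as well.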

My thoughts on relations are that some content objects should be able to
store their own relations, but only in one direction (i.e. a DCMES
refinement sense of "references" vs. the indirect "is referenced by").
Also, relations should be able to be externally stored in the system instead
of in the content item; perhaps a tool that manages unique ids should also
assist with relations.
- getIndirectReferencesTo(id)
	"""get a list of all objects referring to the one passed here; uses
catalog"""
- addRelation(id1,id2,'Text Label For Relation','Relation Model (in the
DCMES relation element refinement sense)')
- delRelation(id1,id2,'Text Label For Relation')
- queryManagedRelations(id)
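
As a rough illustration of externally stored relations, the sketch below
keeps them in an in-memory set rather than in a catalog, which is a
simplification of what is described above.  The method names follow the
list; the RelationTool class name, the default model value, and the choice
that queryManagedRelations() returns relations whose source is the given ID
are all assumptions.

```python
class RelationTool:
    """Toy store for one-directional relations between unique IDs."""

    def __init__(self):
        # Each relation is a (source id, target id, label, model) tuple.
        self._relations = set()

    def addRelation(self, id1, id2, label, model='references'):
        self._relations.add((id1, id2, label, model))

    def delRelation(self, id1, id2, label):
        self._relations = {
            rel for rel in self._relations if rel[:3] != (id1, id2, label)
        }

    def queryManagedRelations(self, id):
        # Assumption: "managed relations" are those whose source is `id`.
        return sorted(rel for rel in self._relations if rel[0] == id)

    def getIndirectReferencesTo(self, id):
        # The "is referenced by" direction: who points at `id`?
        return sorted({rel[0] for rel in self._relations if rel[1] == id})


relations = RelationTool()
relations.addRelation('story-1', 'photo-9', 'Lead photo', 'references')
relations.addRelation('story-2', 'photo-9', 'Sidebar photo', 'references')
```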

Sorry if my ideas are all over the place here... Thoughts?

Sean

-----Original Message-----
From: Tim Hoffman [mailto:timhoffman@cams.wa.gov.au]
Sent: Tuesday, August 20, 2002 7:32 PM
To: Zope-CMF@zope.org
Subject: [Zope-CMF] A modest proposal to add a Unique ID to all
content/folder objects in CMF


Hi

I would like to elicit some discussion on the possibility of adding some
new functionality to CMFCore.

I would like to call it a UniqueZid for want of a better name. Basically
it would be a new mixin class (see below) which would be added to
CMFCore.PortalFolder and CMFCore.PortalContent. It would require all
classes that subclass PortalContent to call PortalContent.__init__(self)
in their __init__ method to initialize the unique ID. In addition we would
need to change the __init__ method in PortalContent (and add an __init__
method to PortalFolder) to call UniqueZid.__init__(self).

This would give all content objects (folders, etc.) a unique ID that
would be at a minimum unique within a CMF site and possibly unique
across sites, and that would be guaranteed to remain the same for the life
of the object. If the object is cloned then a new unique ID would be
generated for the new object.

There are some real advantages to this. Many discussions on the Zope-CMF
list have talked about how to use (or not use) data_record_id_, paths, etc.
as unique identifiers; however, all of these are transitory in nature and
can't be relied upon: data_record_id_ changes all the time, and the path
of an object will change the minute you move it, though it is the same
object.

By putting a new index and metadata column in the portal_catalog you
can retrieve any specific object without needing to know its location,
or having to worry that its location might change.

I have used this capability to perform a similar function to the new
CMFWorkspaces package (which is basically like a collection of
favourites); however, links/relationships created by UniqueZid will still
be valid if the object moves (not the case with CMFWorkspaces or
favourites).

In addition, if you create a property on an object such as
"related_objects" that contains a list of UniqueZids, you can also do
fairly efficient reverse lookups, i.e. if you add related_objects to the
portal_catalog, you can then easily find out "what objects relate/point
to this object".
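
The reverse lookup amounts to inverting the related_objects lists.  The
sketch below does that with a plain dictionary instead of a portal_catalog
index, so it only illustrates the idea; build_reverse_index() and the sample
ZIDs are hypothetical.

```python
from collections import defaultdict


def build_reverse_index(related_objects):
    """Invert {zid: [related zids]} into {zid: set of zids pointing at it}."""
    reverse = defaultdict(set)
    for zid, targets in related_objects.items():
        for target in targets:
            reverse[target].add(zid)
    return dict(reverse)


# Each key is an object's ZID; each value is its related_objects property.
related = {
    'zid-a': ['zid-c'],
    'zid-b': ['zid-c', 'zid-a'],
    'zid-c': [],
}
pointing_at = build_reverse_index(related)
```

Looking up pointing_at['zid-c'] then answers "what objects point to zid-c"
without scanning every object.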

I am probably missing something major, but I have been using this
approach extensively on a couple of live sites to really good effect.

My approach before however was to monkey patch DefaultDublinCoreImpl
but I see a lot of value in this being added to the core of CMF.

What do people think?

Regards

Tim 

P.S. Below is a first cut at the UniqueZid mixin class, plus a simple
Python script to retrieve an object by its UniqueZid

UniqueZid.py

from Globals import InitializeClass,Persistent
import sha
from time import asctime,gmtime,clock
from Acquisition import aq_base

class UniqueZid(Persistent):
    """
        Mix-in class which provides a unique id for the object,
        and will remain Unique if this object is cloned
        if you are concerned about how unique the hash digest
        will be, add some additional information by 
        way of the hash_string argument. If you want to ensure
        Uniqueness across sites include a prefix (maybe)
        The prefix is preserved
    """
    
    def __init__( self,hash_string='',prefix='' ):
        self._zid = self._generateId(hash_string,prefix)
        self._prefix = prefix

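    def _generateId(self,hash_string,prefix):
        # Seed with the prefix, the caller-supplied string, the current
        # time and the processor time so successive calls differ.
        seed_string = (prefix + hash_string + asctime(gmtime()) +
                       str(clock()))
        return prefix+sha.new(seed_string).hexdigest()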
        

    def getZid(self):
        ''' return unique id '''
        return self._zid

    def ZID(self):
        ''' return unique id named nicely for 
            portal_catalog index names
        '''
        return self.getZid()

    def manage_afterClone(self, item):
        self._zid = self._generateId(self.ZID(),self._prefix)
        for object in item.objectValues():
            if hasattr(object, 'manage_afterClone'):
                object.manage_afterClone(object)
                
InitializeClass(UniqueZid)
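
For anyone trying this on Python 3, where the sha module and time.clock()
no longer exist, the ID generation step could be sketched roughly as follows.
The generate_zid() name is hypothetical, time.perf_counter() is swapped in
for clock(), and the os.urandom() bytes are an addition not present in the
class above.

```python
import hashlib
import os
import time


def generate_zid(hash_string='', prefix=''):
    """Return prefix + a 40-character SHA-1 hex digest of a changing seed."""
    seed = (
        prefix
        + hash_string
        + time.asctime(time.gmtime())
        + str(time.perf_counter())   # stands in for time.clock()
        + os.urandom(8).hex()        # extra entropy; not in the original
    )
    return prefix + hashlib.sha1(seed.encode('utf-8')).hexdigest()


first = generate_zid('doc', 'site1-')
second = generate_zid('doc', 'site1-')
```

The two calls use the same arguments but give different IDs, which is the
behaviour a clone needs.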




getObjectByZid  python script

#parameter = zid

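result=context.portal_catalog(ZID=zid)
if len(result):
    if len(result) > 1:
        raise LookupError("More than one object has the same ZID!")
    brain = result[0]
    # getObject() resolves the catalog record to the object itself;
    # the record id does not need to be passed in
    object = brain.getObject()
    # returns the object's rendered view rather than the object
    return object.view()
else:
    return None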





