[Zope-CMF] A modest proposal to add a Unique ID to all content/folder objects in CMF

Tim Hoffman timhoffman@cams.wa.gov.au
22 Aug 2002 09:46:19 +0800


Hi Sean


On Thu, 2002-08-22 at 02:16, sean.upton@uniontrib.com wrote:
> A few thoughts:
> 
> When storing a relation to content by storing an ID, one should be flexible
> in storing IDs generated from the following sources:

Yeah, I think you're right. I do like the ability to pass an externally
generated UniqueID, but what happens if the object is cloned?

If, for instance, we have created an object with the externally provided
Id '20020821-jury-verdict-in-at-11AM.1' and I copy the resulting object,
what happens to the id? It will either

no longer be unique

or bear little or no resemblance to the originally passed id.

I would suggest that if the uniqueid is supplied rather than generated,
you would then have to pass as an argument a method or class to be
called on clone; otherwise we raise an exception and do not allow the
object to be copied.
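
To make that concrete, here is a rough sketch in plain Python (no Zope
imports) of the two options. The names UniqueIdHolder, on_clone and
CloneNotAllowed are made up for illustration and are not part of CMF or
of my mixin; the hashing is only a stand-in for whatever generator is
used.

```python
import hashlib
import time


class CloneNotAllowed(Exception):
    """Raised when an object with a supplied id has no clone policy."""


class UniqueIdHolder:
    """Hypothetical stand-in for a content object with a unique id."""

    def __init__(self, supplied_id=None, on_clone=None):
        # on_clone: optional callable(old_id) -> new_id; only consulted
        # when the id was supplied from outside rather than generated.
        self._supplied = supplied_id is not None
        self._on_clone = on_clone
        self._uid = supplied_id if self._supplied else self._generate('')

    def _generate(self, seed):
        # Mix the seed with the time and this object's identity.
        raw = '%s%r%r' % (seed, time.time(), id(self))
        return hashlib.sha1(raw.encode('utf-8')).hexdigest()

    def getUniqueId(self):
        return self._uid

    def clone(self):
        if self._supplied and self._on_clone is None:
            # No policy was given, so refuse to copy.
            raise CloneNotAllowed(self._uid)
        copy = UniqueIdHolder(on_clone=self._on_clone)
        if self._supplied:
            # Let the caller-provided policy derive the new id.
            copy._supplied = True
            copy._uid = self._on_clone(self._uid)
        else:
            # Generated ids are simply regenerated, seeded with the old one.
            copy._uid = copy._generate(self._uid)
        return copy
```

Whoever supplies the id is then also responsible for saying what a copy
should be called, which keeps that decision out of the core.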

> 
> - Slug (Manually named unique ID string)
> 	- Example: '20020821-jury-verdict-in-at-11AM.1'
> 	- Likely to be unique no matter what path it is in, unless multiple
> versions of same object from paste and/or workflow move
> - Path To Object (not great, but should be supported)
> 	- Both inside Zope (i.e. '/cmfsite/myPortalFolder/foo-bar-123.jpg')
> and outside Zope (i.e. 'c:\My Documents\foo-bar-123.jpg')
> 	- Brittle, but common; should be supported in case alternate id
> generator is not available/appropriate

What's the real difference between a slug and a path to an object? I view
them both as instances of an externally supplied Unique Id.

> - Hash or Digest (likely guaranteed to be unique)
> 	- If no slug exists, and we are within a system that can manage
> translation of a hash to an object reference, this is good
> 	- likely doesn't move beyond Zope
> 

Agreed

> My current thinking in a customized CMF system implementation that I am
> currently working on: Identifier() should output the path, but a method
> called, say, getUniqueId() should output one of these; which doesn't matter;

I am not sure I am keen on this. My own goal was to make sure I had a
guaranteed unique id. If you want to store the path to some external file
I would add another property to the object. If you were to rely on the
unique id being a path pointer, I feel you introduce all sorts of
dependencies on synchronisation with the outside world, which aren't
part of uniquely identifying a content object inside Zope.
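
Roughly what I mean, as a plain-Python sketch. ContentItem and
external_path are hypothetical names, and uuid is used here only as a
convenient stand-in for the hash-based generator in my mixin.

```python
import uuid


class ContentItem:
    """Hypothetical content object; not a CMF class."""

    def __init__(self, external_path=None):
        # Generated once and never edited, so it needs no syncing.
        self._uid = uuid.uuid4().hex
        # Ordinary property; it may go stale without affecting identity.
        self.external_path = external_path

    def getUniqueId(self):
        return self._uid
```

The path can be edited or left dangling and the unique id is untouched.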


> a hash should only be generated at content creation if a slug doesn't exist,
> or at content copy if there is another item on system with the same slug;
> alternately, the unique ID should be editable, so that if a hash is
> generated, but the content author doesn't like it (they want to use a more
> descriptive, but likely still unique, slug), they can change it.  This
> presents the challenge of finding and fixing references to the old id
> referenced in other objects, but this could be done by simply having a
> mechanism to find and fix this at the time of a change of id.

This is what I would like to avoid. For my own part I want the solution
to be as simple as possible and not to require any housekeeping, data
cleansing, or system integrity checking.

My proposal doesn't actually include references. I would like to keep
them out of it, as they may or may not be used, and in many different
ways that we can't foresee. For instance, they could be used in the body
text of a wiki (how about a new syntax for the wiki:
"text displayed":{someobjects uniqueid}). This gives you a site-wide link
to an object which can move, but the reference is in body text.
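
A small sketch of how such a link might be resolved, in plain Python.
The function name render_links and the 'label <path>' output are just
assumptions for illustration, and the dict stands in for a
portal_catalog query on a unique-id index.

```python
import re

# Matches the proposed syntax:  "text displayed":{uniqueid}
LINK_RE = re.compile(r'"([^"]+)":\{([^}]+)\}')


def render_links(text, catalog):
    """Replace each link with 'label <path>'; leave unknown ids untouched.

    catalog is a plain dict mapping unique id -> current path.
    """
    def replace(match):
        label, uid = match.group(1), match.group(2)
        path = catalog.get(uid)
        if path is None:
            # Unknown id: keep the original markup so nothing is lost.
            return match.group(0)
        return '%s <%s>' % (label, path)

    return LINK_RE.sub(replace, text)
```

If the object moves, only the catalog entry changes; the body text with
the unique id in it stays exactly as it was written.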

> A mixin class for content should
> provide:
> - An attribute to store a unique id string
> - A getUniqueId() method to get it
> - a setUniqueId() method to set it, and call a tool to fix references in
> referring objects
> - convertReferencesTo(oldid,newid)
> 

convertReferences is exactly what I would like to avoid at this point.
KISS (keep it simple) is my own view.

> A lookup mechanism/tool should determine if the id is a path, hash, or slug,
> and broker an object reference for an object passed an id.  This tool should
> also provide hooks for a content object to ask it to assist in changing
> other objects' references to it:
> - convertAllReferences(oldid,newid)
> 	"""
> 	query to find all objects, then call
> 	obj.convertReferencesTo(oldid,newid) for each object
> 	"""
> - getObjectById(id)
> 	"""get obj ref and return, passed a slug/hash/path"""
> 
> My thoughts on relations are that some content objects should be able to
> store their own relations, but only in one direction (i.e. a DCMES
> refinement sense of "references" vs. the indirect "is referenced by").
> Also, relations should be able to be externally stored in the system instead
> of in the content item; perhaps a tool that manages unique ids should also
> assist with relations.
> - getIndirectReferencesTo(id)
> 	"""get a list of all objects referring to the one passed here; uses
> catalog"""
> - addRelation(id1,id2,'Text Label For Relation','Relation Model (in the
> DCMES relation element refinement sense)')
> - delRelation(id1,id2,'Text Label For Relation')
> - queryManagedRelations(id)
> 
> Sorry if my ideas are all over the place here... Thoughts?
> 

I quite like many of the things you propose in theory, but I would like
to see those sorts of things in Zope 3. I would actually just like to
see UniqueIDs in core CMF, and soon. My code does work (it still needs
some unit tests ;-) and could be in the next version of CMF with
little or no impact, and with no big new complex tools or complexities
added to CMF.

Regards

Tim


> Sean
> 
> -----Original Message-----
> From: Tim Hoffman [mailto:timhoffman@cams.wa.gov.au]
> Sent: Tuesday, August 20, 2002 7:32 PM
> To: Zope-CMF@zope.org
> Subject: [Zope-CMF] A modest proposal to add a Unique ID to all
> content/folder objects in CMF
> 
> 
> Hi
> 
> I would like to elicit some discussion on the possibility of adding some
> new functionality to CMFCore.
> 
> I would like to call it a UniqueZid for want of a better name. Basically
> it would be a new mixin class (see below) which would be added to
> CMFCore.PortalFolder and CMFCore.PortalContent. It would require all
> classes that subclass PortalContent to call PortalContent.__init__(self)
> in their init method to initialize the UniqueID. In addition we would
> need to extend the __init__ method in PortalContent (and add an __init__
> method to PortalFolder) to call UniqueZid.__init__(self).
> 
> This would give all content objects (folders, etc...) a unique id that
> would be at a minimum unique within a CMF site and possibly unique
> across sites, and that would be guaranteed to remain the same for the life
> of the object. If the object is cloned then a new UniqueId would be
> generated for the new object.
> 
> There are some real advantages to this (many discussions on the cmf-zope
> list have talked about how to/not to use data_record_id_, paths etc. as
> unique identifiers). However, all of these are transitory in nature and
> can't be relied upon: data_record_id_ changes all the time, and the path
> of an object will change the minute you move it, though it is the same
> object.
> 
> By putting a new index and metadata column in the portal_catalog you
> can retrieve any specific object without needing to know its location,
> or having to worry that its location might change.
> 
> I have used this capability to perform a similar function to the new
> CMFWorkspaces package (which is basically like a collection of
> favourites); however, links/relationships created by UniqueZid will still
> be valid if the object moves (not the case with CMFWorkspaces or
> favourites).
> 
> In addition, if you create a property on an object such as
> "related_objects" and it contains a list of UniqueZids, you can also do
> fairly efficient reverse lookups, i.e. if you add related_objects to the
> portal_catalog, you can then easily find out "what objects relate/point
> to this object".
> 
> I am probably missing something major, but I have been using this
> approach extensively on a couple of live sites to really good effect.
> 
> My approach before however was to monkey patch DefaultDublinCoreImpl
> but I see a lot of value in this being added to the core of CMF.
> 
> What do people think?
> 
> Regards
> 
> Tim 
> 
> P.S. Below is a first cut at the UniqueZid mixin class, plus a simple
> Python script to retrieve an object by its UniqueZid.
> 
> UniqueZid.py
> 
> from Globals import InitializeClass,Persistent
> import sha
> from time import asctime,gmtime,clock
> from Acquisition import aq_base
> 
> class UniqueZid(Persistent):
>     """
>         Mix-in class which provides a unique id for the object,
>         and will remain Unique if this object is cloned
>         if you are concerned about how unique the hash digest
>         will be, add some additional information by 
>         way of the hash_string argument. If you want to ensure
>         Uniqueness across sites include a prefix (maybe)
>         The prefix is preserved
>     """
>     
>     def __init__( self,hash_string='',prefix='' ):
>         self._zid = self._generateId(hash_string,prefix)
>         self._prefix = prefix
> 
>     def _generateId(self,hash_string,prefix):
>         seed_string = prefix + hash_string + asctime(gmtime()) + str(clock())
>         return prefix+sha.new(seed_string).hexdigest() 
>         
> 
>     def getZid(self):
>         ''' return unique id '''
>         return self._zid
> 
>     def ZID(self):
>         ''' return unique id named nicely for 
>             portal_catalog index names
>         '''
>         return self.getZid()
> 
>     def manage_afterClone(self, item):
>         self._zid = self._generateId(self.ZID(),self._prefix)
>         for object in item.objectValues():
>             if hasattr(object, 'manage_afterClone'):
>                 object.manage_afterClone(object)
>                 
> InitializeClass(UniqueZid)
> 
> 
> 
> 
> getObjectByZid  python script
> 
> #parameter = zid
> 
> result=context.portal_catalog(ZID=zid)
> if len(result):
>     if len(result) > 1:
>         raise LookupError,"More than one object has the same ZID!"
>     result = result[0]
>     # getObject() needs no record id; the catalog brain knows its own
>     object = result.getObject()
>     return object.view()
> else:
>     return None