[Checkins] SVN: z3c.vcsync/trunk/src/z3c/vcsync/ Adjust to:

Martijn Faassen faassen at infrae.com
Wed Jul 4 11:56:05 EDT 2007


Log message for revision 77405:
  Adjust to:
  
  * remove some superfluous tracking code. py.path already does this properly
    for SVN so we don't have to
  
  * add the concept of a state object which allows for serializing only
    those objects that have been changed or added, and removing those files
    and directories that are not in use anymore.
  

Changed:
  U   z3c.vcsync/trunk/src/z3c/vcsync/README.txt
  U   z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py
  U   z3c.vcsync/trunk/src/z3c/vcsync/tests.py
  U   z3c.vcsync/trunk/src/z3c/vcsync/vc.py

-=-
Modified: z3c.vcsync/trunk/src/z3c/vcsync/README.txt
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/README.txt	2007-07-04 15:55:54 UTC (rev 77404)
+++ z3c.vcsync/trunk/src/z3c/vcsync/README.txt	2007-07-04 15:56:05 UTC (rev 77405)
@@ -29,6 +29,13 @@
 concluded, the state of the persistent objects and that of the local
 SVN checkout will always be perfectly in sync.
 
+During synchronization, the system takes care to synchronize only
+those objects and files that have changed. That is, in
+step 1) only those objects that have been modified, added or removed
+will have an effect on the checkout. In step 4) only those files that
+have been changed, added or removed on the filesystem due to the
+``up`` action will change the persistent object state.
+
 SVN difficulties
 ----------------
 
@@ -114,13 +121,27 @@
 persistent python objects to the version control checkout directory in
 the form of files and directories. 
  
-Content is assumed to consist of two types of objects:
+Content is represented by an ``IState``. This supports two methods:
 
-* containers. These are represented as directories on the filesystem.
+* ``objects(dt)``: any object that has been modified since dt. Returning
+  'too many' objects (objects that weren't modified) is safe, though less
+  efficient as they will then be re-exported. 
 
-* items. These are represented as files on the filesystem. The files
-  will have an extension to indicate the type of item.
+  Typically in your application this would be implemented as the
+  result of a catalog search.
 
+* ``removed(dt)``: any path that has had an object removed from it
+  since dt.  It is safe to return paths that have been removed and
+  have since been replaced by a different object with the same
+  name. It is also safe to return 'too many' paths, though less
+  efficient as the objects in these paths may be re-exported
+  unnecessarily. 
+
+  Typically in your application you would maintain a list of removed
+  objects by hooking into IObjectRemovedEvent and recording the paths
+  of all objects that were removed. After an export it is safe to
+  purge this list.
+
 Let's imagine we have this object structure consisting of a container
 with some items and sub-containers in it::
 
@@ -143,9 +164,20 @@
   >>> testpath = create_test_dir()
   >>> checkout = TestCheckout(testpath)
 
+We also have a test state representing the object data::
+
+  >>> state = TestState(data)
+
+The test state will always return a list of all objects. We pass in
+``None`` for the datetime here, as the TestState ignores this
+information anyway::
+
+  >>> sorted([obj.__name__ for obj in state.objects(None)])
+  ['bar', 'foo', 'qux', 'root', 'sub']
+
 The object structure can now be saved into that checkout::
 
-  >>> checkout.save(data)
+  >>> checkout.save(state, None)
 
 The filesystem should now contain the right objects.
 
@@ -179,30 +211,12 @@
   >>> sub_path.join('qux.test').read()
   '3\n'
 
-We know that no existing files or directories were deleted by this save,
-as the checkout was empty before this::
-
-  >>> checkout.deleted_by_save()
-  []
-
-We also know that certain files have been added::
-
-  >>> rel_paths(checkout, checkout.added_by_save())
-  ['/root', '/root/bar.test', '/root/foo.test', '/root/sub', 
-   '/root/sub/qux.test']
-
 Modifying an existing checkout
 ------------------------------
 
 Now let's assume that the version control checkout is that as
-generated by step 1a). We will bring it to its initial state first::
+generated by step 1a). We will now change some data in the ZODB again.
 
-  >>> checkout.clear()
-
-We will now change some data in the ZODB again to test whether we
-detect additions and deletions (we need to inform the version control
-system about these).
-
 Let's add ``hoi``::
   
   >>> data['hoi'] = Item(payload=4)
@@ -211,53 +225,52 @@
 
   >>> del data['bar']
 
+Since we are removing something, we need to inform the state about it. We
+do this manually here, though in a real application typically you
+would subscribe to the ``IObjectRemovedEvent``.
+
+  >>> removed_paths = ['/root/bar']
+  >>> state.removed_paths = removed_paths
+
 Let's save the object structure again to the same checkout::
   
-  >>> checkout.save(data)
+  >>> checkout.save(state, None)
 
-The checkout will now know which files were added and deleted during
-the save::
+We expect the ``hoi.test`` file to be added::
 
-  >>> rel_paths(checkout, checkout.added_by_save())
-  ['/root/hoi.test']
+  >>> root.join('hoi.test').read()
+  '4\n'
 
-We also know which files got deleted::
+We also expect the ``bar.test`` file to be removed::
 
-  >>> rel_paths(checkout, checkout.deleted_by_save())
-  ['/root/bar.test']
+  >>> root.join('bar.test').check()
+  False
 
 Modifying an existing checkout, some edge cases
 -----------------------------------------------
 
-Let's take our checkout as one fully synched up again::
-
-  >>> checkout.clear()
-
 The ZODB has changed again.  Item 'hoi' has changed from an item into
 a container::
 
   >>> del data['hoi']
   >>> data['hoi'] = Container()
 
-We put some things into the container::
+Let's create a new removed list. The item 'hoi' was removed before it
+was replaced by a new container with the same name, so we have to
+remember this::
 
+  >>> removed_paths = ['/root/hoi']
+  >>> state.removed_paths = removed_paths
+
+We put some things into the new container::
+
   >>> data['hoi']['something'] = Item(payload=15)
 
 We export again into the existing checkout (which still has 'hoi' as a
 file)::
 
-  >>> checkout.save(data)
+  >>> checkout.save(state, None)
 
-The file ``hoi.test`` should now be removed::
-
-  >>> rel_paths(checkout, checkout.deleted_by_save())
-  ['/root/hoi.test']
-
-And the directory ``hoi`` should now be added::
-
-  >>> rel_paths(checkout, checkout.added_by_save())
-  ['/root/hoi', '/root/hoi/something.test']
-
 Let's check the filesystem state::
 
   >>> sorted([entry.basename for entry in root.listdir()])
@@ -270,37 +283,23 @@
   >>> something_path.read()
   '15\n'
 
-Let's now consider the checkout synched up entirely again::
-
-  >>> checkout.clear()
-
 Let's now change the ZODB again and change the ``hoi`` container back
 into a file::
 
   >>> del data['hoi']
   >>> data['hoi'] = Item(payload=16)
-  >>> checkout.save(data)
+  >>> checkout.save(state, None)
 
-The ``hoi`` directory (and everything in it, implicitly) is now
-deleted::
+This means we need to mark the path to the container to be removed::
 
-  >>> rel_paths(checkout, checkout.deleted_by_save())
-  ['/root/hoi']
+  >>> removed_paths = ['/root/hoi']
+  >>> state.removed_paths = removed_paths
 
-We have added ``hoi.test``::
-
-  >>> rel_paths(checkout, checkout.added_by_save())
-  ['/root/hoi.test']
-
 We expect to see a ``hoi.test`` but no ``hoi`` directory anymore::
 
   >>> sorted([entry.basename for entry in root.listdir()])
   ['foo.test', 'hoi.test', 'sub']
 
-Let's be synched-up again::
-
-  >>> checkout.clear()
-
 Note: creating a container with the name ``hoi.test`` (using the
 ``.test`` postfix) will lead to trouble now, as we already have a file
 ``hoi.test``. ``svn`` doesn't allow a single-step replace of a file
@@ -550,7 +549,8 @@
 
 Now we'll synchronize with the memory structure::
 
-  >>> checkout.sync(container2)
+  >>> state = TestState(container2)
+  >>> checkout.sync(state, None)
 
 We expect the checkout to reflect the changed state of the ``hoi`` object::
 

Modified: z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py	2007-07-04 15:55:54 UTC (rev 77404)
+++ z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py	2007-07-04 15:56:05 UTC (rev 77405)
@@ -1,4 +1,4 @@
-from zope.interface import Interface
+from zope.interface import Interface, Attribute
 
 class IVcDump(Interface):
     def save(checkout, path):
@@ -47,6 +47,27 @@
         """Update modification datetime.
         """
 
+class IState(Interface):
+    """Information about Python object state.
+    """
+    root = Attribute('The root container')
+
+    def objects(dt):
+        """Objects present in state.
+
+        Not all objects have to be returned. At a minimum, only those
+        objects that have been modified or added since dt need to
+        be returned.
+        """
+
+    def removed(dt):
+        """Paths removed.
+
+        Any path that has been removed since dt should be returned. This
+        path might have been added again later, so it is safe to return
+        paths of objects returned by the 'objects' method.
+        """
+
 class ICheckout(Interface):
     """A version control system checkout.
     """
@@ -74,22 +95,6 @@
         """Commit checkout to version control system.
         """
 
-    def add(path):
-        """Add a file to the checkout (so it gets committed).
-        """
-
-    def delete(path):
-        """Delete a file from the checkout (so the delete gets committed).
-        """
-
-    def added_by_save():
-        """A list of files and directories that have been added by a save.
-        """
-
-    def deleted_by_save():
-        """A list of files and directories that have been deleted by a save.
-        """
-
     def added_by_up():
         """A list of those files that have been added after 'up'.
         """
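
Editorial illustration (not part of revision 77405): a minimal sketch of an
object that offers the ``root``, ``objects(dt)`` and ``removed(dt)`` members
described by ``IState`` above. It is plain Python 3 with no Zope imports. The
names ``Node``, ``SimpleState``, ``add``, ``remove`` and ``purge_removed`` are
hypothetical, and the ``modified`` timestamp on each object is an assumption
made only for this example. A real application would more likely answer
``objects(dt)`` from a catalog search and fill the removed list from an
``IObjectRemovedEvent`` subscriber.

```python
from datetime import datetime, timedelta


class Node:
    """Hypothetical stand-in for a persistent object with a location."""

    def __init__(self, name, parent=None, modified=None):
        self.__name__ = name
        self.__parent__ = parent
        # assumption: every object records when it was last modified
        self.modified = modified or datetime.now()


class SimpleState:
    """Illustrative state object: root, objects(dt) and removed(dt)."""

    def __init__(self, root):
        self.root = root
        self._objects = [root]
        self._removed = []  # (datetime, path) pairs

    def add(self, obj):
        self._objects.append(obj)

    def remove(self, obj, path, when):
        # record the path so the next export can delete the file or directory
        self._objects.remove(obj)
        self._removed.append((when, path))

    def objects(self, dt):
        # dt=None means "everything"; returning too many objects is safe
        return [o for o in self._objects if dt is None or o.modified >= dt]

    def removed(self, dt):
        return [p for when, p in self._removed if dt is None or when >= dt]

    def purge_removed(self):
        # safe to call once an export has completed
        self._removed = []
```

Returning a superset from either method only costs extra exports, so the
filtering here can stay coarse.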

Modified: z3c.vcsync/trunk/src/z3c/vcsync/tests.py
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/tests.py	2007-07-04 15:55:54 UTC (rev 77404)
+++ z3c.vcsync/trunk/src/z3c/vcsync/tests.py	2007-07-04 15:56:05 UTC (rev 77405)
@@ -30,11 +30,41 @@
     def commit(self, message):
         pass
 
+class TestState(object):
+    def __init__(self, root):
+        self.root = root
+        self.removed_paths = []
+
+    def objects(self, dt):
+        for container in self.containers(dt):
+            for item in container.values():
+                if not IContainer.providedBy(item):
+                    yield item
+            # yield container after items in container,
+            # to test creation of directories when items are
+            # thrown up that don't have directories yet
+            yield container
+
+    def removed(self, dt):
+        return self.removed_paths
+    
+    def containers(self, dt):
+        return self._containers_helper(self.root)
+
+    def _containers_helper(self, container):
+        yield container
+        for obj in container.values():
+            if not IContainer.providedBy(obj):
+                continue
+            for sub_container in self._containers_helper(obj):
+                yield sub_container
+
 class Container(object):
     implements(IContainer)
     
     def __init__(self):
         self.__name__ = None
+        self.__parent__ = None
         self._data = {}
 
     def keys(self):
@@ -51,6 +81,7 @@
             raise DuplicationError
         self._data[name] = value
         value.__name__ = name
+        value.__parent__ = self
         
     def __getitem__(self, name):
         return self._data[name]
@@ -83,6 +114,7 @@
 
 globs = {'Container': Container,
          'TestCheckout': TestCheckout,
+         'TestState': TestState,
          'create_test_dir': create_test_dir,
          'rel_paths': rel_paths}
 

Modified: z3c.vcsync/trunk/src/z3c/vcsync/vc.py
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/vc.py	2007-07-04 15:55:54 UTC (rev 77404)
+++ z3c.vcsync/trunk/src/z3c/vcsync/vc.py	2007-07-04 15:56:05 UTC (rev 77405)
@@ -1,8 +1,10 @@
 import os
+from datetime import datetime
 
 from zope.interface import Interface
-from zope.component import queryUtility
+from zope.component import queryUtility, queryAdapter
 from zope.app.container.interfaces import IContainer
+from zope.traversing.interfaces import IPhysicallyLocatable
 
 from z3c.vcsync.interfaces import (IVcDump, IVcLoad,
                                    ISerializer, IVcFactory,
@@ -21,8 +23,6 @@
     def save(self, checkout, path):
         serializer = ISerializer(self.context)
         path = path.join(serializer.name())
-        if not path.check():
-            checkout.add(path)
         path.ensure()
         f = path.open('w')
         serializer.serialize(f)
@@ -35,18 +35,7 @@
         
     def save(self, checkout, path):
         path = path.join(self.context.__name__)
-        if not path.check():
-            checkout.add(path)
         path.ensure(dir=True)
-        added_paths = []
-        for value in self.context.values():
-            added_paths.append(IVcDump(value).save(checkout, path))
-        # remove any paths not there anymore
-        for existing_path in path.listdir():
-            if existing_path not in added_paths:
-                checkout.delete(existing_path)
-                existing_path.remove()
-        return path
 
 class ContainerVcLoad(grok.Adapter):
     grok.provides(IVcLoad)
@@ -61,8 +50,6 @@
                 object_name = '' # containers are indicated by empty string
             else:
                 object_name = sub.ext
-            #if sub.read().strip() == '200':
-            #    import pdb; pdb.set_trace()
             factory = queryUtility(IVcFactory, name=object_name, default=None)
             # we cannot handle this kind of object, so skip it
             if factory is None:
@@ -89,18 +76,48 @@
     
     def __init__(self, path):
         self.path = path
-        self.clear()
 
-    def sync(self, object, message=''):
-        self.save(object)
+    def sync(self, state, dt, message=''):
+        self.save(state, dt)
         self.up()
         self.resolve()
-        self.load(object)
+        self.load(state.root)
         self.commit(message)
 
-    def save(self, object):
-        IVcDump(object).save(self, self.path)
+    def get_container_path(self, root, obj):
+        steps = []
+        while obj is not root:
+            obj = obj.__parent__
+            steps.append(obj.__name__)
+        steps.reverse()
+        return self.path.join(*steps)
 
+    def save(self, state, dt):
+        root = state.root
+
+        # remove all files that have been removed in the database
+        path = self.path
+        for removed_path in state.removed(dt):
+            # construct path to directory containing file/dir to remove
+            steps = removed_path.split('/')
+            container_dir_path = path.join(*steps[:-1])
+            # construct path to potential directory to remove
+            name = steps[-1]
+            potential_dir_path = container_dir_path.join(name)
+            if potential_dir_path.check():
+                # the directory exists, so remove it
+                potential_dir_path.remove()
+            else:
+                # there is no directory, so it must be a file to remove
+                # find the file and remove it
+                file_paths = list(container_dir_path.listdir('%s.*' % name))
+                assert len(file_paths) == 1
+                file_paths[0].remove()
+        # now save all files that have been modified/added
+        for obj in state.objects(dt):
+            IVcDump(obj).save(self,
+                               self.get_container_path(root, obj))
+
     def load(self, object):
         # XXX can only load containers here, not items
         names = [path.purebasename for path in self.path.listdir()
@@ -108,10 +125,6 @@
         assert len(names) == 1
         IVcLoad(object).load(self, self.path.join(names[0]))
         
-    def clear(self):
-        self._added_by_save = []
-        self._deleted_by_save = []
-        
     def up(self):
         raise NotImplementedError
 
@@ -121,18 +134,6 @@
     def commit(self, message):
         raise NotImplementedError
 
-    def add(self, path):
-        self._added_by_save.append(path)
-
-    def delete(self, path):
-        self._deleted_by_save.append(path)
-
-    def added_by_save(self):
-        return self._added_by_save
-
-    def deleted_by_save(self):
-        return self._deleted_by_save
-
     def added_by_up(self):
         raise NotImplementedError
 
@@ -149,4 +150,3 @@
     def modified_since(self, dt):
         # containers themselves are never modified
         return False
-


