[Checkins] SVN: z3c.vcsync/trunk/ use revision_nr instead of datetime to determine what changed (both in

Martijn Faassen faassen at infrae.com
Wed Nov 21 13:40:10 EST 2007


Log message for revision 81953:
  use revision_nr instead of datetime to determine what changed (both in
  SVN as well as in ZODB) since last synchronization.
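
The clock-skew problem that motivates this change (described in the CHANGES.txt entry in the diff) can be illustrated with a small sketch. The names `Change`, `changed_since_dt` and `changed_since_rev` are hypothetical and not part of z3c.vcsync. The sketch only assumes that the synchronizing machine's clock runs ahead of the server's clock.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical record of a server-side commit: its revision number and
# the timestamp assigned by the server's clock.
Change = namedtuple('Change', 'path revision_nr server_dt')


def changed_since_dt(changes, last_sync_dt):
    """Datetime-based selection (the old approach)."""
    return [c.path for c in changes if c.server_dt > last_sync_dt]


def changed_since_rev(changes, last_sync_rev):
    """Revision-based selection (the new approach)."""
    return [c.path for c in changes if c.revision_nr > last_sync_rev]


server_now = datetime(2007, 11, 21, 18, 0, 0)
skew = timedelta(minutes=5)  # assumption: client clock is 5 minutes ahead

# The client synchronized at revision 10 and stamped that moment with
# its own (fast) clock.
last_sync_rev = 10
last_sync_dt = server_now + skew

# Two minutes later, in server time, someone commits revision 11.
changes = [Change('data/foo', 11, server_now + timedelta(minutes=2))]

missed = changed_since_dt(changes, last_sync_dt)   # the update is lost
found = changed_since_rev(changes, last_sync_rev)  # the update is seen
```

With the datetime comparison the commit falls before the recorded synchronization time and is skipped. The revision comparison does not depend on either clock.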
  

Changed:
  U   z3c.vcsync/trunk/CHANGES.txt
  U   z3c.vcsync/trunk/setup.py
  U   z3c.vcsync/trunk/src/z3c/vcsync/README.txt
  U   z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py
  U   z3c.vcsync/trunk/src/z3c/vcsync/svn.py
  U   z3c.vcsync/trunk/src/z3c/vcsync/tests.py
  U   z3c.vcsync/trunk/src/z3c/vcsync/vc.py

-=-
Modified: z3c.vcsync/trunk/CHANGES.txt
===================================================================
--- z3c.vcsync/trunk/CHANGES.txt	2007-11-21 13:31:41 UTC (rev 81952)
+++ z3c.vcsync/trunk/CHANGES.txt	2007-11-21 18:40:10 UTC (rev 81953)
@@ -1,8 +1,8 @@
 z3c.vcsync changes
 ==================
 
-0.8.2 (unreleased)
------------------
+0.9 (unreleased)
+----------------
 
 Bugs fixed
 ~~~~~~~~~~
@@ -19,6 +19,22 @@
   starts, but after up (as rm-ing a directory marked for removal
   before svn up will actually re-add this directory!).
 
+Restructuring
+~~~~~~~~~~~~~
+
+* Previously the datetime of last synchronization was used to
+  determine what to synchronize both in the ZODB as well as in the
+  checkout. This has a significant drawback if the datetime setting of
+  the computer the synchronization code is running on is ahead of the
+  datetime setting of the version control server: updates could be
+  lost. 
+
+  Changed the code to use a revision_nr instead. This is a number that
+  increments with each synchronization, and the number can be used to
+  determine both what changes have been made since last
+  synchronization in the ZODB as well as in the version control
+  system. This is a more robust approach.
+
 0.8.1 (2007-11-07)
 ------------------
 

Modified: z3c.vcsync/trunk/setup.py
===================================================================
--- z3c.vcsync/trunk/setup.py	2007-11-21 13:31:41 UTC (rev 81952)
+++ z3c.vcsync/trunk/setup.py	2007-11-21 18:40:10 UTC (rev 81953)
@@ -2,7 +2,7 @@
 import sys, os
 
 setup(name='z3c.vcsync',
-      version='0.8.1dev',
+      version='0.9dev',
       description="Sync ZODB data with version control system, currently SVN",
       package_dir={'': 'src'},
       packages=find_packages('src'),

Modified: z3c.vcsync/trunk/src/z3c/vcsync/README.txt
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/README.txt	2007-11-21 13:31:41 UTC (rev 81952)
+++ z3c.vcsync/trunk/src/z3c/vcsync/README.txt	2007-11-21 18:40:10 UTC (rev 81953)
@@ -123,19 +123,20 @@
  
 Content is represented by an ``IState``. This supports two methods:
 
-* ``objects(dt)``: any object that has been modified since dt. Returning
-  'too many' objects (objects that weren't modified) is safe, though less
-  efficient as they will then be re-exported. 
+* ``objects(revision_nr)``: any object that has been modified since
+  revision_nr. Returning 'too many' objects (objects that weren't
+  modified) is safe, though less efficient as they will then be
+  re-exported.
 
   Typically in your application this would be implemented as the
   result of a catalog search.
 
-* ``removed(dt)``: any path that has had an object removed from it
-  since dt.  It is safe to return paths that have been removed and
-  have since been replaced by a different object with the same
-  name. It is also safe to return 'too many' paths, though less
-  efficient as the objects in these paths may be re-exported
-  unnecessarily. 
+* ``removed(revision_nr)``: any path that has had an object removed
+  from it since revision_nr.  It is safe to return paths that have
+  been removed and have since been replaced by a different object with
+  the same name. It is also safe to return 'too many' paths, though
+  less efficient as the objects in these paths may be re-exported
+  unnecessarily.
 
   Typically in your application you would maintain a list of removed
   objects by hooking into IObjectRemovedEvent and recording the paths
@@ -169,7 +170,7 @@
   >>> state = TestState(data)
 
 The test state will always return a list of all objects. We pass in
-``None`` for the datetime here, as the TestState ignores this
+``None`` for the revision_nr here, as the TestState ignores this
 information anyway::
 
   >>> sorted([obj.__name__ for obj in state.objects(None)])
@@ -182,7 +183,7 @@
   >>> s = Synchronizer(checkout, state)
 
 We now save the state into that checkout. We are passing ``None`` for
-the dt for the time being::
+the revision_nr for the time being::
 
   >>> s.save(None)
 

Modified: z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py	2007-11-21 13:31:41 UTC (rev 81952)
+++ z3c.vcsync/trunk/src/z3c/vcsync/interfaces.py	2007-11-21 18:40:10 UTC (rev 81953)
@@ -39,24 +39,31 @@
     checkout = Attribute('Version control system checkout')
     state = Attribute('Persistent state')
     
-    def sync(dt, message=''):
+    def sync(revision_nr, message=''):
         """Synchronize persistent Python state with version control system.
 
-        dt - date since when to look for state changes. datestamp should
-             have actual timezone identifier (non-naive).
+        revision_nr - Revision number since when we want to synchronize.
+             Revision numbers are assumed to increment over time as new
+             revisions are made (through synchronisation). It is
+             possible to identify changes to both the checkout as well
+             as the ZODB by this revision number.  Normally a version
+             control system such as SVN controls these.
         message - message to commit any version control changes.
+
+        Returns the revision number of the version control system that
+        we have now synchronized with.
         """
-
-    def save(dt):
+        
+    def save(revision_nr):
         """Save state to filesystem location of checkout.
 
-        dt - timestamp after which to look for state changes.
+        revision_nr - revision_nr since when there have been state changes.
         """
 
-    def load(dt):
+    def load(revision_nr):
         """Load the filesystem information into persistent state.
 
-        dt - timestamp after which to look for filesystem changes.
+        revision_nr - revision_nr after which to look for filesystem changes.
         """
     
 class ICheckout(Interface):
@@ -68,6 +75,10 @@
         """Update the checkout with the state of the version control system.
         """
 
+    def revision_nr():
+        """Current revision number of the checkout.
+        """
+        
     def resolve():
         """Resolve all conflicts that may be in the checkout.
         """
@@ -76,14 +87,14 @@
         """Commit checkout to version control system.
         """
 
-    def files(dt):
-        """Files added/modified in state since dt.
+    def files(revision_nr):
+        """Files added/modified in state since revision_nr.
 
-        Returns paths to files that were added/modified since dt.
+        Returns paths to files that were added/modified since revision_nr.
         """
 
-    def removed(dt):
-        """Files removed in state since dt.
+    def removed(revision_nr):
+        """Files removed in state since revision_nr.
 
         Returns filesystem (py) paths to files that were removed.
         """
@@ -93,24 +104,25 @@
     """
     root = Attribute('The root container')
 
-    def objects(dt):
-        """Objects modified/added in state since dt.
+    def objects(revision_nr):
+        """Objects modified/added in state since revision_nr.
 
         Ideally, only those objects that have been modified or added
-        since dt should be returned. Returning more objects (as long
-        as they exist) is safe, however, though less efficient.
+        since the synchronisation marked by revision_nr should be
+        returned. Returning more objects (as long as they exist) is
+        safe, however, though less efficient.
         """
 
-    def removed(dt):
-        """Paths removed since dt.
+    def removed(revision_nr):
+        """Paths removed since revision_nr.
 
         The path is a path from the state root object to the actual
         object that was removed. It is therefore not the same as the
         physically locatable path.
 
-        Ideally, only those paths that have been removed since dt
-        should be returned. It is safe to return paths that were added
-        again later, so it is safe to return paths of objects returned
-        by the 'objects' method.
+        Ideally, only those paths that have been removed since the
+        synchronisation marked by revision_nr should be returned. It
+        is safe to return paths that were added again later, so it is
+        safe to return paths of objects returned by the 'objects'
+        method.
         """
-
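
One way an application might satisfy the `objects(revision_nr)` and `removed(revision_nr)` contract above is sketched here. `RevisionTrackingState` and its helper methods are hypothetical, the `root` attribute required by the interface is left out for brevity, and stamping changes with "last synchronized revision plus one" is an assumption of this sketch rather than something the package prescribes.

```python
class RevisionTrackingState(object):
    """Hypothetical in-memory state that stamps changes with a revision."""

    def __init__(self):
        self.last_synced = 0   # revision_nr returned by the last sync
        self._modified = {}    # path -> (stamp, object)
        self._removed = {}     # path -> stamp

    def _stamp(self):
        # Changes made now belong to whatever revision comes next.
        return self.last_synced + 1

    def set(self, path, obj):
        self._modified[path] = (self._stamp(), obj)

    def remove(self, path):
        self._modified.pop(path, None)
        self._removed[path] = self._stamp()

    def mark_synced(self, revision_nr):
        self.last_synced = revision_nr

    def objects(self, revision_nr):
        """Objects added or modified since revision_nr."""
        return [obj for stamp, obj in self._modified.values()
                if stamp > revision_nr]

    def removed(self, revision_nr):
        """Paths that had an object removed since revision_nr."""
        return [path for path, stamp in self._removed.items()
                if stamp > revision_nr]


state = RevisionTrackingState()
state.set('data/foo', 'foo v1')
state.set('data/baz', 'baz v1')
state.mark_synced(5)           # pretend sync() returned revision 5
state.set('data/bar', 'bar v1')
state.remove('data/foo')
```

After these calls, `state.objects(5)` contains only the object stored after the synchronization and `state.removed(5)` contains `'data/foo'`. In a real application the bookkeeping would more likely live in a catalog index and an `IObjectRemovedEvent` subscriber, as the README suggests.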

Modified: z3c.vcsync/trunk/src/z3c/vcsync/svn.py
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/svn.py	2007-11-21 13:31:41 UTC (rev 81952)
+++ z3c.vcsync/trunk/src/z3c/vcsync/svn.py	2007-11-21 18:40:10 UTC (rev 81953)
@@ -1,9 +1,6 @@
 import py
 from datetime import datetime
 
-# amount of log entries to search through in a single step
-LOG_STEP = 5
-
 class SvnCheckout(object):
     """A checkout for SVN.
 
@@ -14,7 +11,7 @@
         self.path = path
         self._files = set()
         self._removed = set()
-        self._updated_dt = None
+        self._updated_revision_nr = None
     
     def _repository_url(self):
         prefix = 'Repository Root: '
@@ -33,7 +30,7 @@
     
     def up(self):        
         self.path.update()
-        self._updated_dt = None
+        self._updated_revision_nr = None
         
     def resolve(self):
         _resolve_helper(self.path)
@@ -41,53 +38,48 @@
     def commit(self, message):
         self.path.commit(message)
 
-    def files(self, dt):
-        self._update_files(dt)
+    def files(self, revision_nr):
+        self._update_files(revision_nr)
         return list(self._files)
     
-    def removed(self, dt):
-        self._update_files(dt)
+    def removed(self, revision_nr):
+        self._update_files(revision_nr)
         return list(self._removed)
 
-    def _update_files(self, dt):
+    def revision_nr(self):
+        return int(self.path.status().rev)
+    
+    def _update_files(self, revision_nr):
         """Go through svn log and update self._files and self._removed.
         """
-        if self._updated_dt == dt:
+        new_revision_nr = int(self.path.status().rev)
+        if self._updated_revision_nr == new_revision_nr:
             return
-        files = set()
-        removed = set()
+        # logs won't include revision_nr itself, but that's what we want
+        if new_revision_nr > revision_nr:
+            logs = self.path.log(revision_nr, new_revision_nr, verbose=True)
+        else:
+            # the log function always seems to return at least one log
+            # entry (the latest one). This way we skip that check if not
+            # needed
+            logs = []
         checkout_path = self._checkout_path()
+        files, removed = self._info_from_logs(logs, checkout_path)
 
-        # step backwards through svn log until we're done
-        rev = int(self.path.status().rev)
-        while True:
-            prev_rev = rev - LOG_STEP
-            try:
-                logs = self.path.log(prev_rev, rev, verbose=True)
-            except ValueError:
-                # no more revisions available, bail out too
-                break
-            done = self._update_from_logs(logs, dt, checkout_path,
-                                          files, removed)
-            if done:
-                break
-            rev = prev_rev - 1
-        
         self._files = files
         self._removed = removed
-        self._updated_dt = dt
+        self._updated_revision_nr = new_revision_nr
 
-    def _update_from_logs(self, logs, dt, checkout_path, files, removed):
-        """Update files and removed from logs.
+    def _info_from_logs(self, logs, checkout_path):
+        """Get files and removed lists from logs.
+        """        
+        files = set()
+        removed = set()
 
-        Return True if we're done.
-        """
         # go from newest to oldest
         logs.reverse()
+        
         for log in logs:
-            log_dt = datetime.fromtimestamp(log.date).replace(tzinfo=dt.tzinfo)
-            if log_dt < dt:
-                return True
             for p in log.strpaths:
                 rel_path = p.strpath[len(checkout_path):]
                 steps = rel_path.split(self.path.sep)
@@ -97,8 +89,8 @@
                     removed.add(path)
                 else:
                     files.add(path)                
-        return False
-    
+        return files, removed
+
 def _resolve_helper(path):
     for p in path.listdir():
         if not p.check(dir=True):
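
The idea behind `_update_files` and `_info_from_logs` (collect changed and removed paths from the log entries newer than `revision_nr`) can be sketched without py.path. This is not the code from the diff: the function `info_from_logs` and its input format are hypothetical, and classifying paths by the `svn log -v` action letters (A, M, D, R) is a substitute technique chosen for the sketch.

```python
def info_from_logs(logs, revision_nr):
    """Return (files, removed) for log entries newer than revision_nr.

    logs is an iterable of (rev, [(action, path), ...]) pairs, where
    action is one of the 'svn log -v' letters: A (added), M (modified),
    D (deleted) or R (replaced).
    """
    files = set()
    removed = set()
    # Walk from oldest to newest so the latest action on a path wins.
    for rev, changes in sorted(logs, key=lambda entry: entry[0]):
        if rev <= revision_nr:
            continue  # already covered by the last synchronization
        for action, path in changes:
            if action == 'D':
                files.discard(path)
                removed.add(path)
            else:
                removed.discard(path)
                files.add(path)
    return files, removed


logs = [
    (10, [('A', '/data/a')]),
    (11, [('M', '/data/a'), ('A', '/data/b')]),
    (12, [('D', '/data/b')]),
]
files, removed = info_from_logs(logs, 10)
```

Entries at or below `revision_nr` are skipped, which mirrors the comment in the diff that the log for `revision_nr` itself should not be included.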

Modified: z3c.vcsync/trunk/src/z3c/vcsync/tests.py
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/tests.py	2007-11-21 13:31:41 UTC (rev 81952)
+++ z3c.vcsync/trunk/src/z3c/vcsync/tests.py	2007-11-21 18:40:10 UTC (rev 81953)
@@ -35,10 +35,10 @@
     def commit(self, message):
         pass
 
-    def files(self, dt):
+    def files(self, revision_nr):
         return self._files
 
-    def removed(self, dt):
+    def removed(self, revision_nr):
         return self._removed
     
 class TestState(vc.AllState):
@@ -47,7 +47,7 @@
         super(TestState, self).__init__(root)
         self.removed_paths = []
 
-    def removed(self, dt):
+    def removed(self, revision_nr):
         return self.removed_paths
 
 class Container(object):

Modified: z3c.vcsync/trunk/src/z3c/vcsync/vc.py
===================================================================
--- z3c.vcsync/trunk/src/z3c/vcsync/vc.py	2007-11-21 13:31:41 UTC (rev 81952)
+++ z3c.vcsync/trunk/src/z3c/vcsync/vc.py	2007-11-21 18:40:10 UTC (rev 81953)
@@ -74,8 +74,8 @@
         self.state = state
         self._to_remove = []
 
-    def sync(self, dt, message=''):
-        self.save(dt)
+    def sync(self, revision_nr, message=''):
+        self.save(revision_nr)
         self.checkout.up()
         self.checkout.resolve()
         # now after doing an up, remove dirs that can be removed
@@ -85,13 +85,14 @@
         # the ZODB when we do a load.
         for to_remove in self._to_remove:
             py.path.local(to_remove).remove(rec=True)
-        self.load(dt)
+        self.load(revision_nr)
         self.checkout.commit(message)
-
-    def save(self, dt):
+        return self.checkout.revision_nr()
+    
+    def save(self, revision_nr):
         # remove all files that have been removed in the database
         path = self.checkout.path
-        for removed_path in self.state.removed(dt):
+        for removed_path in self.state.removed(revision_nr):
             # construct path to directory containing file/dir to remove
             steps = removed_path.split('/')
             container_dir_path = path.join(*steps[:-1])
@@ -118,14 +119,14 @@
 
         # now save all files that have been modified/added
         root = self.state.root
-        for obj in self.state.objects(dt):
+        for obj in self.state.objects(revision_nr):
             IVcDump(obj).save(self._get_container_path(root, obj))
 
-    def load(self, dt):
+    def load(self, revision_nr):
         # remove all objects that have been removed in the checkout
         root = self.state.root
         # sort to ensure that containers are deleted before items in them
-        removed_paths = self.checkout.removed(dt)
+        removed_paths = self.checkout.removed(revision_nr)
         removed_paths.sort()
         for removed_path in removed_paths:
             obj = resolve(root, self.checkout.path, removed_path)
@@ -135,7 +136,7 @@
                 del obj.__parent__[obj.__name__]
         # now modify/add all objects that have been modified/added in the
         # checkout
-        file_paths = self.checkout.files(dt)
+        file_paths = self.checkout.files(revision_nr)
         # to ensure that containers are created before items we sort them
         file_paths.sort()
         for file_path in file_paths:
@@ -171,17 +172,17 @@
     def __init__(self, root):
         self.root = root
 
-    def objects(self, dt):
-        for container in self._containers(dt):
+    def objects(self, revision_nr):
+        for container in self._containers(revision_nr):
             for item in container.values():
                 if not IContainer.providedBy(item):
                     yield item
             yield container
 
-    def removed(self, dt):
+    def removed(self, revision_nr):
         return []
     
-    def _containers(self, dt):
+    def _containers(self, revision_nr):
         return self._containers_helper(self.root)
 
     def _containers_helper(self, container):


