[Checkins] SVN: zc.zodbdgc/branches/dev/src/zc/zodbdgc/README.txt Updated documentation.

Jim Fulton jim at zope.com
Thu Jun 11 15:36:37 EDT 2009


Log message for revision 100886:
  Updated documentation.
  

Changed:
  U   zc.zodbdgc/branches/dev/src/zc/zodbdgc/README.txt

-=-
Modified: zc.zodbdgc/branches/dev/src/zc/zodbdgc/README.txt
===================================================================
--- zc.zodbdgc/branches/dev/src/zc/zodbdgc/README.txt	2009-06-11 19:36:15 UTC (rev 100885)
+++ zc.zodbdgc/branches/dev/src/zc/zodbdgc/README.txt	2009-06-11 19:36:37 UTC (rev 100886)
@@ -1,31 +1,63 @@
 ZODB Distributed GC
 ===================
 
-This package provides a script for performing distributed garbage
-collection for a collection of ZODB storages, which will typically be
-ZEO clients.
+This package provides two scripts: one for multi-database garbage
+collection and one for database validation.
 
-Note that this script will likely be included in future ZODB
-releases. It's being developed independently now because it is new and
-we don't want to be limited by or to affect the ZODB release cycle.
+The scripts require that the databases provided to them use 64-bit
+object ids.  The garbage-collection script also assumes that the
+databases support efficient iteration from transactions near the end
+of the databases.
 
-The script takes the fillowing options:
+multi-zodb-gc
+-------------
 
--d n, --days n
+The multi-zodb-gc script takes one or two configuration files.  If a
+single configuration file is given, garbage collection is performed on
+the databases specified by that configuration file.  If garbage is
+found, then delete records are written to the databases.  When the
+databases are subsequently packed to a time after the delete records
+are written, the garbage objects will be removed.
 
-   Provide the number of days in the past to garbage collect to.  And
-   objects written after than number of days will be considered to be
-   non garbage.  This defaults to 3.
+If a second configuration file is given, then the databases specified
+in the second configuration file will be used to find garbage.
+Delete records are still written to the databases given in the first
+configuration file.  When using replicated-database technology,
+analysis can be performed using secondary storages, which are usually
+lightly loaded.  This is helpful because finding garbage places a
+significant load on the databases used to find garbage.
 
--s config, --storage config
+Some number of trailing days (1 by default) of database records are
+considered good, meaning the objects referenced by them are not
+garbage. This allows the garbage-collection algorithm to work more
+efficiently and avoids problems when applications (incorrectly) do
+things that cause objects to be temporarily unreferenced, such as
+moving objects in two transactions.
 
-   The name of a configuration file defining storages to be garbage
-   collected.
+Options can be used to control the number of days of trailing data to
+be treated as non-garbage and to specify the logging level.  Use the
+``--help`` option to get details.
 
--a config, --analyze config
+multi-zodb-check-refs
+---------------------
 
-   The name of a configuration file defining storage servers to use
-   for analysis.  This is useful with replicated storages, as it
-   allows analysis to take place using stprage servers that are under
-   lighter load.  If not provided, then the storages specified using
-   the --storage option are used for analysis.
+The multi-zodb-check-refs script validates a collection of databases
+by starting with their roots and traversing the databases to make sure
+all referenced objects are reachable.  Any unreachable objects are
+reported. If any databases are configured to disallow implicit
+cross-database references, then invalid references are reported as
+well.  Blob records are checked to make sure their blob files can be
+loaded.
+
+Optionally, a database of reference information can be generated. This
+database allows you to find objects referencing a given object id in a
+database. This can be very useful for debugging missing objects.
+Generation of the references database increases the analysis time
+substantially. The references database can become quite large, often a
+substantial percentage of the size of the databases being analyzed.
+Typically, you'll perform an initial analysis without a references
+database and only create one in a subsequent run if
+problems are found.
+
+You can run the script with the ``--help`` option to get usage
+information.
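
Editorial sketch (not part of the commit): the reference check described in the new README text can be illustrated with a toy traversal. This is a minimal sketch, not the multi-zodb-check-refs implementation. It assumes each database is modelled as a plain dict mapping an object id to a list of (database name, object id) references, and it reads "unreachable" as "referenced but not loadable". The function name ``check_refs`` and the data layout are hypothetical.

```python
from collections import deque


def check_refs(databases, roots):
    """Walk references from the roots and report dangling ones.

    databases: {db_name: {oid: [(target_db, target_oid), ...]}}
               (a toy stand-in for real storages, assumed for illustration)
    roots:     iterable of (db_name, oid) pairs to start from

    Returns (missing, referrers).
    """
    roots = list(roots)
    seen = set(roots)
    queue = deque(roots)
    missing = []    # (referrer, target) pairs whose target cannot be loaded
    referrers = {}  # target -> list of objects that reference it

    while queue:
        source = queue.popleft()
        db_name, oid = source
        for target in databases[db_name][oid]:
            # Record the reverse edge; this plays the role of the optional
            # references database described in the README.
            referrers.setdefault(target, []).append(source)
            target_db, target_oid = target
            if target_oid not in databases.get(target_db, {}):
                missing.append((source, target))
            elif target not in seen:
                seen.add(target)
                queue.append(target)
    return missing, referrers


if __name__ == "__main__":
    dbs = {
        "db1": {0: [("db1", 1), ("db2", 0)], 1: [("db1", 2)]},
        "db2": {0: []},
    }
    missing, referrers = check_refs(dbs, [("db1", 0), ("db2", 0)])
    for source, target in missing:
        print("missing", target, "referenced by", source)
```

Keeping the reverse index is what makes it possible to ask which objects point at a missing object id. It also stores one entry per reference, which is consistent with the README's warning that the references database can grow to a substantial fraction of the size of the analyzed databases.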


