[Checkins] SVN: zc.async/trunk/src/zc/async/README.txt Mostly an attempt to improve docs, with some improvements of tests.

Gary Poster gary at zope.com
Wed Aug 16 23:21:29 EDT 2006


Log message for revision 69579:
  Mostly an attempt to improve docs, with some improvements of tests.
  

Changed:
  U   zc.async/trunk/src/zc/async/README.txt

-=-
Modified: zc.async/trunk/src/zc/async/README.txt
===================================================================
--- zc.async/trunk/src/zc/async/README.txt	2006-08-17 02:37:14 UTC (rev 69578)
+++ zc.async/trunk/src/zc/async/README.txt	2006-08-17 03:21:28 UTC (rev 69579)
@@ -20,12 +20,11 @@
 
 Worker data objects have queues representing potential or current tasks
 for the worker, in the main thread or a secondary thread.  Each worker
-has a virtual loop, part of the Twisted or asyncore main loop, for every
-worker process, which is responsible for responding to system calls
-(like pings) and for claiming pending main thread calls by moving them
-from the datamanager async queue to their own.  Each worker thread queue
-also represents spots for claiming and performing pending thread
-calls.
+has a virtual loop, part of the Twisted main loop, for every worker
+process, which is responsible for responding to system calls (like
+pings) and for claiming pending main thread calls by moving them from
+the datamanager async queue to their own.  Each worker thread queue also
+represents spots for claiming and performing pending thread calls.
 
 Set Up
 ======
@@ -34,35 +33,19 @@
 the ZODB, alongside the application object, with a key of
 'zc.async.datamanager'.  The package includes subscribers to
 zope.app.appsetup.interfaces.IDatabaseOpenedEvent that sets an instance
-up in this location if one does not exist.
+up in this location if one does not exist [#subscribers]_.
 
-One version of the subscriber expects to put the object in the same
-database as the main application (`basicInstallerAndNotifier`), and the
-other version expects to put the object in a secondary database, with a
-reference to it in the main database (`installerAndNotifier`).  The
-second approach keeps the database churn generated by zc.async, which
-can be significant, separate from your main data.  You can use either
-(or your own); the first version is the default, since it requires no
-additional set-up. When this documentation is run as a test, it is run
-twice, once with each setup.  To accomodate this, in our example below
-we appear to pull the "installerAndNotifier" out of the air: it is
-installed as a global when the test is run.  You will want to use one of
-the two subscribers mentioned above, or roll your own.
-
-XXX explain possible gotchas if you run the separate database (i.e., you may
-have to explicitly add objects to connections if you create an object for the
-main database and put it as a partial callable or argument in the same
-transaction).
-
 Let's assume we have a reference to a database named `db`, a connection
-named `conn`, a `root`, and an application in the 'app' key
-[#setup]_.  If we provide a handler, fire the event and examine the
-root, we will see the new datamanager.
+named `conn`, a `root`, an application in the 'app' key [#setup]_, and a
+handler named `installerAndNotifier` [#handlers]_.  If we provide the
+handler, fire the event, and examine the root, we will see the new
+datamanager.
 
     >>> import zope.component
     >>> import zc.async.subscribers
-    >>> zope.component.provideHandler(installerAndNotifier) # see above
-    ... # for explanation of where installerAndNotifier came from
+    >>> zope.component.provideHandler(installerAndNotifier) # see footnotes
+    ... # for explanation of where installerAndNotifier came from, and what
+    ... # it is.
     >>> import zope.event
     >>> import zope.app.appsetup.interfaces
     >>> zope.event.notify(zope.app.appsetup.interfaces.DatabaseOpened(db))
@@ -72,7 +55,8 @@
     <zc.async.datamanager.DataManager object at ...>
 
 The default adapter from persistent object to datamanager will get us
-the same result.
+the same result; adapting a persistent object to IDataManager is the
+preferred spelling.
 
     >>> import zc.async.adapters
     >>> zope.component.provideAdapter(
@@ -81,10 +65,13 @@
     >>> zc.async.interfaces.IDataManager(app) # doctest: +ELLIPSIS
     <zc.async.datamanager.DataManager object at ...>
 
-Normally, each process discovers or creates its UUID and registers
-itself with the data manager as a worker.  This would have happened when
-the data manager was announced as available in the InstallerAndNotifier
-above.
+Normally, each process discovers or creates its UUID, and starts an
+engine to do work.  The engine is a non-persistent object that
+participates in the Twisted main loop.  It discovers or creates the
+persistent worker object associated with the instance UUID in the
+datamanager's `workers` mapping, and starts polling.  This would have
+happened when the data manager was announced as available in the
+`installerAndNotifier` above.
 
     >>> from zope.component import eventtesting
     >>> evs = eventtesting.getEvents(
@@ -102,7 +89,10 @@
 Let's install the subscriber we need and refire the event.  Our worker
 will have a UUID created for it, and then it will be installed with the
 UUID as key.  We can't actually use the same event because it has an
-object from a different connection, so we'll recreate it.
+object from a different connection, so we'll recreate it.  We'll then use
+a magic `time_passes` function to simulate the Twisted reactor cycling and
+firing scheduled calls.  After we sync our connection with the database,
+the worker appears.  It is tied to the engineUUID of the current engine.
 
     >>> zope.component.provideHandler(
     ...     zc.async.subscribers.installTwistedEngine)
@@ -121,7 +111,7 @@
     ...  is not None)
     True
 
-The new UUID, in hex, is stored in INSTANCE_HOME/etc/uuid.txt
+The instance UUID, in hex, is stored in INSTANCE_HOME/etc/uuid.txt
 
     >>> import uuid
     >>> import os
@@ -137,7 +127,10 @@
 The file is intended to stay in the instance home as a persistent identifier
 of this particular worker.
 
-Our worker has `thread` and `reactor` jobs, with all jobs available.
+Our worker has `thread` and `reactor` jobs, with all jobs available.  By
+default, a worker begins by offering a single thread job and four
+"simultaneous" reactor jobs.  This can be changed simply by changing the
+value on the worker and committing.
 
     >>> worker.thread.size
     1
@@ -148,6 +141,19 @@
     >>> len(worker.reactor)
     0
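+
+For example, to allow only two "simultaneous" reactor jobs instead of
+four, you would do something like the following (a sketch only, not run
+as part of this document; it assumes the reactor queue's `size` value
+shown above may simply be set and committed)::
+
+    worker.reactor.size = 2
+    transaction.commit()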
 
+But what are `thread` and `reactor` jobs?
+
+A `thread` job is one that is performed in a thread with a dedicated
+ZODB connection.  It's the simplest to use for typical tasks.
+
+On the other hand, a thread job may be overkill for jobs that don't
+need a constant connection, and it is not friendly to Twisted services.
+
+A `reactor` job is performed in the main thread, in a call scheduled by
+the Twisted reactor.  It has some gotchas (see zc.twist's README), but it
+can be good for jobs that don't need a constant connection, and for jobs
+that can leverage Twisted code.
+
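+As a quick sketch of the API used below (the data manager `dm` is
+obtained in the next section, and `some_callable` stands for any
+persistable callable), the difference is simply which queue you use::
+
+    dm.thread.put(some_callable)   # performed in a thread with its own
+                                   # dedicated ZODB connection
+    dm.reactor.put(some_callable)  # performed in the Twisted main loop
+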
 We now have a simple set up: a data manager with a single worker.  Let's start
 making some asynchronous calls!
 
@@ -155,7 +161,7 @@
 =========================
 
 The simplest case is simple to perform: pass a persistable callable to the
-manager's .add method.
+`put` method of one of the manager's queues.  We'll make reactor calls first.
 
     >>> from zc.async import interfaces
     >>> dm = zc.async.interfaces.IDataManager(app)
@@ -168,18 +174,47 @@
 use a helper function called `time_flies` to simulate the asynchronous
 cycles necessary for the manager and workers to perform the task.
 
-    >>> count = time_flies(dm.workers.values()[0].poll_seconds)
+    >>> dm.workers.values()[0].poll_seconds
+    5
+    >>> count = time_flies(5)
     imagine this sent a message to another machine
 
-You can also pass a datetime.datetime to schedule the call: the
-safest thing to use is a UTC timezone. The datetime is interpreted as a
-UTC datetime.
+We could also have used a method of a persistent object.  Here's
+another quick example.
 
+    >>> import persistent
+    >>> class Demo(persistent.Persistent):
+    ...     counter = 0
+    ...     def increase(self, value=1):
+    ...         self.counter += value
+    ...
+    >>> app['demo'] = Demo()
+    >>> transaction.commit()
+    >>> app['demo'].counter
+    0
+    >>> partial = dm.reactor.put(app['demo'].increase)
+    >>> transaction.commit()
+    >>> count = time_flies(5)
+
+We need to sync our connection so that we see the changes made in other
+connections: we can do that with a transaction begin, commit, or abort.
+
+    >>> app['demo'].counter
+    0
     >>> t = transaction.begin()
+    >>> app['demo'].counter
+    1
+
+The method was called, and the persistent object modified!
+
+You can also pass a timezone-aware datetime.datetime to schedule a
+call.  The safest thing to use is a UTC timezone.
+
+    >>> t = transaction.begin()
     >>> import datetime
     >>> import pytz
     >>> datetime.datetime.now(pytz.UTC)
-    datetime.datetime(2006, 8, 10, 15, 44, 27, 211, tzinfo=<UTC>)
+    datetime.datetime(2006, 8, 10, 15, 44, 32, 211, tzinfo=<UTC>)
     >>> partial = dm.reactor.put(
     ...     send_message, datetime.datetime(
     ...         2006, 8, 10, 15, 45, tzinfo=pytz.UTC))
@@ -188,8 +223,8 @@
     >>> transaction.commit()
     >>> count = time_flies(10)
     >>> count = time_flies(10)
-    >>> count = time_flies(10)
     >>> count = time_flies(5)
+    >>> count = time_flies(5)
     imagine this sent a message to another machine
     >>> datetime.datetime.now(pytz.UTC)
     datetime.datetime(2006, 8, 10, 15, 45, 2, 211, tzinfo=<UTC>)
@@ -204,56 +239,49 @@
     >>> count = time_flies(5)
     imagine this sent a message to another machine
 
-The `add` method of the thread and reactor queues is the manager's
+The `put` method of the thread and reactor queues is the manager's
 entire application API.  Other methods are used to introspect, but are
-not needed for basic usage.  We will examine the introspection API below
-(`Manager Introspection`_), and will discuss an advanced feature of the
-`add` method (`Specifying Workers`), but let's explore some more usage
-patterns first.
+not needed for basic usage.
 
-Typical Usage: zc.async.Partial
-================================
+But what is the result of the `put` call in the examples above?  A
+partial?  What do you do with that?
 
-...(currently tests and discussion are in partial.txt and datamanager.txt.
-We need user-friendly docs, as well as stress tests.  The remainder of the
-below is somewhat unedited and incomplete at the moment)...
+Partials
+========
 
-    >>> t = transaction.begin()
-    >>> import zc.async
-    >>> import persistent
-    >>> import transaction
-    >>> import zc.async.partial
-    >>> class Demo(persistent.Persistent):
-    ...     counter = 0
-    ...     def increase(self, value=1):
-    ...         self.counter += value
-    ...
-    >>> app['demo'] = Demo()
-    >>> transaction.commit() # XXX example of gotcha for multiple databases:
-    ... # connection.add or commit before adding to partial
-    >>> app['demo'].counter
-    0
-    >>> partial = dm.reactor.put(
-    ...     zc.async.partial.Partial(app['demo'].increase))
-    >>> transaction.commit()
-    >>> count = time_flies(5)
+A call to `put` returns an IDataManagerPartial.  The partial represents
+the pending call.  This object has a lot of functionality that's
+explored in other documents in this package, and demonstrated a bit
+below, but here's a summary.
 
-We need to commit the transaction in our connection so that we get the
-changes in other connections (beginning and committing transactions sync
-connections).
+- You can introspect it to look at, and even modify, the call and its
+  arguments.
 
-    >>> app['demo'].counter
-    0
-    >>> t = transaction.begin()
-    >>> app['demo'].counter
-    1
+- You can specify that the partial may or may not be run by given
+  workers (identifying them by their UUID).
 
-The deferred class can take arguments and keyword arguments for the
-wrapped callable as well, similar to Python 2.5's `partial`.  For this
-use case, though, realize that the partial will be called with no
-arguments, so you must supply all necessary arguments for the callable
-on creation time.
+- You can specify other calls that should be made on the basis of the
+  result of this call.
 
+- You can persist a reference to it, and periodically (after syncing
+  your connection with the database, which happens whenever you begin or
+  commit a transaction) check its `state` to see if it is equal to
+  zc.async.interfaces.COMPLETED.  When it is, the call has run to
+  completion, either to success or an exception.
+
+- You can look at the result of the call (once COMPLETED).  It might be
+  the result you expect, or a twisted.python.failure.Failure, which is a
+  way to safely communicate exceptions across connections, processes,
+  and machines (see the sketch after this list).
+
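+Here is a minimal sketch of the last two points (not run as part of
+this document; it assumes the partial's result is available as a
+`result` attribute once the `state` is COMPLETED)::
+
+    import time
+    import twisted.python.failure
+
+    while partial.state != zc.async.interfaces.COMPLETED:
+        time.sleep(1)
+        t = transaction.begin()  # sync this connection with the database
+    if isinstance(partial.result, twisted.python.failure.Failure):
+        print 'the call raised an exception'
+    else:
+        print 'the result is', partial.result
+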
+What's more, you can pass a Partial to the `put` call.  This means that
+you aren't constrained to simple no-argument calls performed
+asynchronously: you can pass a partial with a call, arguments, and
+keyword arguments.  Here's a quick example.  We'll use the same demo
+object, and its increase method, as in the example above, but this time
+we'll include some arguments [#partial]_.
+
+    >>> import zc.async.partial
+    >>> t = transaction.begin()
     >>> partial = dm.reactor.put(
     ...     zc.async.partial.Partial(app['demo'].increase, 5))
     >>> transaction.commit()
@@ -269,6 +297,11 @@
     >>> app['demo'].counter
     16
 
+Thread Calls And Reactor Calls
+==============================
+
+...
+
 Optimized Usage
 ===============
 
@@ -395,12 +428,12 @@
     ...
     >>> p = dm.thread.put(zc.async.partial.Partial.bind(callWithProgressReport))
     >>> transaction.commit()
-    >>> ignore = time_flies(5) # get the reactor to kick for main call
+    >>> ignore = time_flies(10); acquired = main_lock.acquire()
+    ... # get the reactor to kick for main call; then get the reactor to
+    ... # kick for progress report; then wait for lock release.
     do some work
     more work
     about half done
-    >>> ignore = time_flies(5) # get the reactor to kick for progress report
-    >>> acquired = main_lock.acquire()
     >>> t = transaction.begin() # sync
     >>> p.annotations.get('zc.async.partial_txt.half_done')
     True
@@ -457,6 +490,72 @@
     this file will be created and populated with a new UUID if it does
     not exist.
 
+.. [#subscribers] The zc.async.subscribers module provides two different
+    subscribers to set up a datamanager.  One subscriber expects to put
+    the object in the same database as the main application
+    (`zc.async.subscribers.basicInstallerAndNotifier`).  This is the
+    default, and should probably be used if you are a casual user.
+    
+    The other subscriber expects to put the object in a secondary
+    database, with a reference to it in the main database
+    (`zc.async.subscribers.installerAndNotifier`).  This approach keeps
+    the database churn generated by zc.async, which can be significant,
+    separate from your main data.  However, it also requires that you
+    set up two databases in your zope.conf (or equivalent, if this is
+    used outside of Zope 3).  And possibly even more onerously, it means
+    that persistent objects used for calls must either already be
+    committed, or be explicitly added to a connection; otherwise you
+    will get an InvalidObjectReference (see
+    cross-database-references.txt in the ZODB package).  The possible
+    annoyances may be worth it to someone building a more demanding
+    application.
+    
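+    For instance, if you create a new persistent object and want to use
+    one of its methods as an asynchronous call in the same transaction,
+    you would first give the object a connection explicitly (a sketch,
+    reusing the `Demo` class and the `conn` connection from the main
+    text; the `demo2` key is just a placeholder)::
+
+      demo = Demo()
+      app['demo2'] = demo
+      conn.add(demo)  # or commit first; otherwise the cross-database
+                      # reference raises InvalidObjectReference
+      partial = dm.reactor.put(demo.increase)
+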
+    Again, the first subscriber is the easier to use, and is the default.
+    You can use either one (or your own).
+
+    If you do want to use the second subscriber, here's a start on what
+    you might need to do in your zope.conf.  In a Zope without ZEO you
+    would set something like this up::
+
+      <zodb>
+        <filestorage>
+          path $DATADIR/Data.fs
+        </filestorage>
+      </zodb>
+      <zodb zc.async>
+        <filestorage>
+          path $DATADIR/zc.async.fs
+        </filestorage>
+      </zodb>
+
+    For ZEO, you could have the two databases on one server... ::
+
+      <filestorage 1>
+        path Data.fs
+      </filestorage>
+      <filestorage 2>
+        path zc.async.fs
+      </filestorage>
+
+    ...and then set up ZEO clients something like this::
+
+      <zodb>
+        <zeoclient>
+          server localhost:8100
+          storage 1
+          # ZEO client cache, in bytes
+          cache-size 20MB
+        </zeoclient>
+      </zodb>
+      <zodb zc.async>
+        <zeoclient>
+          server localhost:8100
+          storage 2
+          # ZEO client cache, in bytes
+          cache-size 20MB
+        </zeoclient>
+      </zodb>
+
 .. [#setup] This is a bit more than standard set-up code for a ZODB test,
     because it sets up a multi-database.
 
@@ -611,6 +710,19 @@
     >>> time_flies = faux.time_flies
     >>> time_passes = faux.time_passes
 
+.. [#handlers] The earlier footnote on subscribers describes the two
+    available subscribers.  When this documentation is run as a test, it
+    is run twice, once with each.  To accommodate this, in our example
+    below we appear to pull the "installerAndNotifier" out of the air:
+    it is installed as a global when the test is run.
+
+.. [#partial] The Partial class can take arguments and keyword arguments
+    for the wrapped callable at call time as well, similar to Python
+    2.5's `partial`.  This will be important when we use the Partial as
+    a callback.  For this use case, though, realize that the partial
+    will be called with no arguments, so you must supply all necessary
+    arguments for the callable at creation time.
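+
+    A sketch of the calling convention described above (not run as part
+    of this document)::
+
+      p = zc.async.partial.Partial(app['demo'].increase, 5)
+      p()  # calls app['demo'].increase(5); when queued with `put`, the
+           # data manager calls the partial with no arguments like this,
+           # so all arguments must be supplied at creation time.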
+
 .. [#tear_down]
 
     >>> twisted.internet.reactor.callLater = oldCallLater


