[Checkins] SVN: zc.async/trunk/src/zc/async/README.txt Mostly an
attempt to improve docs, with some improvements of tests.
Gary Poster
gary at zope.com
Wed Aug 16 23:21:29 EDT 2006
Log message for revision 69579:
Mostly an attempt to improve docs, with some improvements of tests.
Changed:
U zc.async/trunk/src/zc/async/README.txt
-=-
Modified: zc.async/trunk/src/zc/async/README.txt
===================================================================
--- zc.async/trunk/src/zc/async/README.txt 2006-08-17 02:37:14 UTC (rev 69578)
+++ zc.async/trunk/src/zc/async/README.txt 2006-08-17 03:21:28 UTC (rev 69579)
@@ -20,12 +20,11 @@
Worker data objects have queues representing potential or current tasks
for the worker, in the main thread or a secondary thread. Each worker
-has a virtual loop, part of the Twisted or asyncore main loop, for every
-worker process, which is responsible for responding to system calls
-(like pings) and for claiming pending main thread calls by moving them
-from the datamanager async queue to their own. Each worker thread queue
-also represents spots for claiming and performing pending thread
-calls.
+has a virtual loop, part of the Twisted main loop, for every worker
+process, which is responsible for responding to system calls (like
+pings) and for claiming pending main thread calls by moving them from
+the datamanager async queue to their own. Each worker thread queue also
+represents spots for claiming and performing pending thread calls.
Set Up
======
@@ -34,35 +33,19 @@
the ZODB, alongside the application object, with a key of
'zc.async.datamanager'. The package includes subscribers to
zope.app.appsetup.interfaces.IDatabaseOpenedEvent that sets an instance
-up in this location if one does not exist.
+up in this location if one does not exist [#subscribers]_.
-One version of the subscriber expects to put the object in the same
-database as the main application (`basicInstallerAndNotifier`), and the
-other version expects to put the object in a secondary database, with a
-reference to it in the main database (`installerAndNotifier`). The
-second approach keeps the database churn generated by zc.async, which
-can be significant, separate from your main data. You can use either
-(or your own); the first version is the default, since it requires no
-additional set-up. When this documentation is run as a test, it is run
-twice, once with each setup. To accomodate this, in our example below
-we appear to pull the "installerAndNotifier" out of the air: it is
-installed as a global when the test is run. You will want to use one of
-the two subscribers mentioned above, or roll your own.
-
-XXX explain possible gotchas if you run the separate database (i.e., you may
-have to explicitly add objects to connections if you create an object for the
-main database and put it as a partial callable or argument in the same
-transaction).
-
Let's assume we have a reference to a database named `db`, a connection
-named `conn`, a `root`, and an application in the 'app' key
-[#setup]_. If we provide a handler, fire the event and examine the
-root, we will see the new datamanager.
+named `conn`, a `root`, an application in the 'app' key [#setup]_, and a
+handler named `installerAndNotifier` [#handlers]_. If we provide a
+handler, fire the event and examine the root, we will see the new
+datamanager.
>>> import zope.component
>>> import zc.async.subscribers
- >>> zope.component.provideHandler(installerAndNotifier) # see above
- ... # for explanation of where installerAndNotifier came from
+ >>> zope.component.provideHandler(installerAndNotifier) # see footnotes
+ ... # for explanation of where installerAndNotifier came from, and what
+ ... # it is.
>>> import zope.event
>>> import zope.app.appsetup.interfaces
>>> zope.event.notify(zope.app.appsetup.interfaces.DatabaseOpened(db))
@@ -72,7 +55,8 @@
<zc.async.datamanager.DataManager object at ...>
The default adapter from persistent object to datamanager will get us
-the same result.
+the same result; adapting a persistent object to IDataManager is the
+preferred spelling.
>>> import zc.async.adapters
>>> zope.component.provideAdapter(
@@ -81,10 +65,13 @@
>>> zc.async.interfaces.IDataManager(app) # doctest: +ELLIPSIS
<zc.async.datamanager.DataManager object at ...>
-Normally, each process discovers or creates its UUID and registers
-itself with the data manager as a worker. This would have happened when
-the data manager was announced as available in the InstallerAndNotifier
-above.
+Normally, each process discovers or creates its UUID, and starts an
+engine to do work. The engine is a non-persistent object that
+participates in the Twisted main loop. It discovers or creates the
+persistent worker object associated with the instance UUID in the
+datamanager's `workers` mapping, and starts polling. This would have
+happened when the data manager was announced as available in the
+InstallerAndNotifier above.
>>> from zope.component import eventtesting
>>> evs = eventtesting.getEvents(
@@ -102,7 +89,10 @@
Let's install the subscriber we need and refire the event. Our worker
will have a UUID created for it, and then it will be installed with the
UUID as key. We can't actually use the same event because it has an
-object from a different connection, so we'll recreate it.
+object from a different connection, so we'll recreate it. We'll then use
+a magic `time_passes` function to simulate the Twisted reactor cycling and
+firing scheduled calls. After we sync our connection with the database,
+the worker appears. It is tied to the engineUUID of the current engine.
>>> zope.component.provideHandler(
... zc.async.subscribers.installTwistedEngine)
@@ -121,7 +111,7 @@
... is not None)
True
-The new UUID, in hex, is stored in INSTANCE_HOME/etc/uuid.txt
+The instance UUID, in hex, is stored in INSTANCE_HOME/etc/uuid.txt
>>> import uuid
>>> import os
@@ -137,7 +127,10 @@
The file is intended to stay in the instance home as a persistent identifier
of this particular worker.
-Our worker has `thread` and `reactor` jobs, with all jobs available.
+Our worker has `thread` and `reactor` jobs, with all jobs available. By
+default, a worker begins offering a single thread job and four
+"simultaneous" reactor jobs. This can be changed simply by changing the value
+on the worker and committing.
>>> worker.thread.size
1
@@ -148,6 +141,19 @@
>>> len(worker.reactor)
0
+But what are `thread` and `reactor` jobs?
+
+A `thread` job is one that is performed in a thread with a dedicated
+ZODB connection. It's the simplest to use for typical tasks.
+
+A thread job also may be overkill for some jobs that don't need a
+connection constantly. It also is not friendly to Twisted services.
+
+A `reactor` job is performed in the main thread, in a call scheduled by
+the Twisted reactor. It has some gotchas (see zc.twist's README), but it
+can be good for jobs that don't need a constant connection, and for jobs
+that can leverage Twisted code.
+
We now have a simple set up: a data manager with a single worker. Let's start
making some asynchronous calls!
@@ -155,7 +161,7 @@
=========================
The simplest case is simple to perform: pass a persistable callable to the
-manager's .add method.
+`put` method of one of the manager's queues. We'll make reactor calls first.
>>> from zc.async import interfaces
>>> dm = zc.async.interfaces.IDataManager(app)
@@ -168,18 +174,47 @@
use a helper function called `time_flies` to simulate the asynchronous
cycles necessary for the manager and workers to perform the task.
- >>> count = time_flies(dm.workers.values()[0].poll_seconds)
+ >>> dm.workers.values()[0].poll_seconds
+ 5
+ >>> count = time_flies(5)
imagine this sent a message to another machine
-You can also pass a datetime.datetime to schedule the call: the
-safest thing to use is a UTC timezone. The datetime is interpreted as a
-UTC datetime.
+We also could have used the method of a persistent object. Here's another
+quick example.
+ >>> import persistent
+ >>> class Demo(persistent.Persistent):
+ ... counter = 0
+ ... def increase(self, value=1):
+ ... self.counter += value
+ ...
+ >>> app['demo'] = Demo()
+ >>> transaction.commit()
+ >>> app['demo'].counter
+ 0
+ >>> partial = dm.reactor.put(app['demo'].increase)
+ >>> transaction.commit()
+ >>> count = time_flies(5)
+
+We need to sync our connection so that we get the changes in other
+connections: we can do that with a transaction begin, commit, or abort.
+
+ >>> app['demo'].counter
+ 0
>>> t = transaction.begin()
+ >>> app['demo'].counter
+ 1
+
+The method was called, and the persistent object modified!
+
+You can also pass a timezone-aware datetime.datetime to schedule a
+call. The safest thing to use is a UTC timezone.
+
+ >>> t = transaction.begin()
>>> import datetime
>>> import pytz
>>> datetime.datetime.now(pytz.UTC)
- datetime.datetime(2006, 8, 10, 15, 44, 27, 211, tzinfo=<UTC>)
+ datetime.datetime(2006, 8, 10, 15, 44, 32, 211, tzinfo=<UTC>)
>>> partial = dm.reactor.put(
... send_message, datetime.datetime(
... 2006, 8, 10, 15, 45, tzinfo=pytz.UTC))
@@ -188,8 +223,8 @@
>>> transaction.commit()
>>> count = time_flies(10)
>>> count = time_flies(10)
- >>> count = time_flies(10)
>>> count = time_flies(5)
+ >>> count = time_flies(5)
imagine this sent a message to another machine
>>> datetime.datetime.now(pytz.UTC)
datetime.datetime(2006, 8, 10, 15, 45, 2, 211, tzinfo=<UTC>)
@@ -204,56 +239,49 @@
>>> count = time_flies(5)
imagine this sent a message to another machine
-The `add` method of the thread and reactor queues is the manager's
+The `put` method of the thread and reactor queues is the manager's
entire application API. Other methods are used to introspect, but are
-not needed for basic usage. We will examine the introspection API below
-(`Manager Introspection`_), and will discuss an advanced feature of the
-`add` method (`Specifying Workers`), but let's explore some more usage
-patterns first.
+not needed for basic usage.
-Typical Usage: zc.async.Partial
-================================
+But what is that result of the `put` call in the examples above? A
+partial? What do you do with that?
-...(currently tests and discussion are in partial.txt and datamanager.txt.
-We need user-friendly docs, as well as stress tests. The remainder of the
-below is somewhat unedited and incomplete at the moment)...
+Partials
+========
- >>> t = transaction.begin()
- >>> import zc.async
- >>> import persistent
- >>> import transaction
- >>> import zc.async.partial
- >>> class Demo(persistent.Persistent):
- ... counter = 0
- ... def increase(self, value=1):
- ... self.counter += value
- ...
- >>> app['demo'] = Demo()
- >>> transaction.commit() # XXX example of gotcha for multiple databases:
- ... # connection.add or commit before adding to partial
- >>> app['demo'].counter
- 0
- >>> partial = dm.reactor.put(
- ... zc.async.partial.Partial(app['demo'].increase))
- >>> transaction.commit()
- >>> count = time_flies(5)
+A call to `put` returns an IDataManagerPartial. The
+partial represents the pending call. This object has a lot of
+functionality that's explored in other documents in this package, and
+demonstrated a bit below, but here's a summary.
-We need to commit the transaction in our connection so that we get the
-changes in other connections (beginning and committing transactions sync
-connections).
+- You can introspect it to look at, and even modify, the call and its
+ arguments.
- >>> app['demo'].counter
- 0
- >>> t = transaction.begin()
- >>> app['demo'].counter
- 1
+- You can specify that the partial may or may not be run by given
+ workers (identifying them by their UUID).
-The deferred class can take arguments and keyword arguments for the
-wrapped callable as well, similar to Python 2.5's `partial`. For this
-use case, though, realize that the partial will be called with no
-arguments, so you must supply all necessary arguments for the callable
-on creation time.
+- You can specify other calls that should be made on the basis of the
+ result of this call.
+- You can persist a reference to it, and periodically (after syncing
+ your connection with the database, which happens whenever you begin or
+ commit a transaction) check its `state` to see if it is equal to
+ zc.async.interfaces.COMPLETED. When it is, the call has run to
+ completion, either to success or an exception.
+
+- You can look at the result of the call (once COMPLETED). It might be
+ the result you expect, or a twisted.python.failure.Failure, which is a
+ way to safely communicate exceptions across connections and machines
+ and processes.
+
+What's more, you can pass a Partial to the `put` call. This means that
+you aren't limited to having simple no-argument calls performed
+asynchronously: you can pass a partial with a call, arguments, and
+keyword arguments. Here's a quick example. We'll use the same demo
+object, and its increase method, as our example above, but this time
+we'll include some arguments [#partial]_.
+
+ >>> t = transaction.begin()
>>> partial = dm.reactor.put(
... zc.async.partial.Partial(app['demo'].increase, 5))
>>> transaction.commit()
@@ -269,6 +297,11 @@
>>> app['demo'].counter
16
+Thread Calls And Reactor Calls
+==============================
+
+...
+
Optimized Usage
===============
@@ -395,12 +428,12 @@
...
>>> p = dm.thread.put(zc.async.partial.Partial.bind(callWithProgressReport))
>>> transaction.commit()
- >>> ignore = time_flies(5) # get the reactor to kick for main call
+ >>> ignore = time_flies(10); acquired = main_lock.acquire()
+ ... # get the reactor to kick for main call; then get the reactor to
+ ... # kick for progress report; then wait for lock release.
do some work
more work
about half done
- >>> ignore = time_flies(5) # get the reactor to kick for progress report
- >>> acquired = main_lock.acquire()
>>> t = transaction.begin() # sync
>>> p.annotations.get('zc.async.partial_txt.half_done')
True
@@ -457,6 +490,72 @@
this file will be created and populated with a new UUID if it does
not exist.
+.. [#subscribers] The zc.async.subscribers module provides two different
+ subscribers to set up a datamanager. One subscriber expects to put
+ the object in the same database as the main application
+ (`zc.async.subscribers.basicInstallerAndNotifier`). This is the
+ default, and should probably be used if you are a casual user.
+
+ The other subscriber expects to put the object in a secondary
+ database, with a reference to it in the main database
+ (`zc.async.subscribers.installerAndNotifier`). This approach keeps
+ the database churn generated by zc.async, which can be significant,
+ separate from your main data. However, it also requires that you
+ set up two databases in your zope.conf (or equivalent, if this is
+ used outside of Zope 3). And possibly even more onerously, it means
+ that persistent objects used for calls must either already be
+ committed, or be explicitly added to a connection; otherwise you
+ will get an InvalidObjectReference (see
+ cross-database-references.txt in the ZODB package). The possible
+ annoyances may be worth it to someone building a more demanding
+ application.
+
+ Again, the first subscriber is the easier to use, and is the default.
+ You can use either one (or your own).
+
+ If you do want to use the second subscriber, here's a start on what
+ you might need to do in your zope.conf. In a Zope without ZEO you
+ would set something like this up.
+
+ <zodb>
+ <filestorage>
+ path $DATADIR/Data.fs
+ </filestorage>
+ </zodb>
+ <zodb zc.async>
+ <filestorage>
+ path $DATADIR/zc.async.fs
+ </filestorage>
+ </zodb>
+
+ For ZEO, you could have the two databases on one server...
+
+ <filestorage 1>
+ path Data.fs
+ </filestorage>
+ <filestorage 2>
+ path zc.async.fs
+ </filestorage>
+
+ ...and then set up ZEO clients something like this.
+
+ <zodb>
+ <zeoclient>
+ server localhost:8100
+ storage 1
+ # ZEO client cache, in bytes
+ cache-size 20MB
+ </zeoclient>
+ </zodb>
+ <zodb zc.async>
+ <zeoclient>
+ server localhost:8100
+ storage 2
+ # ZEO client cache, in bytes
+ cache-size 20MB
+ </zeoclient>
+ </zodb>
+
.. [#setup] This is a bit more than standard set-up code for a ZODB test,
because it sets up a multi-database.
@@ -611,6 +710,19 @@
>>> time_flies = faux.time_flies
>>> time_passes = faux.time_passes
+.. [#handlers] In the second footnote above, the text describes two
+ available subscribers. When this documentation is run as a test, it
+ is run twice, once with each. To accommodate this, in our example
+ below we appear to pull the "installerAndNotifier" out of the air:
+ it is installed as a global when the test is run.
+
+.. [#partial] The Partial class can take arguments and keyword arguments
+ for the wrapped callable at call time as well, similar to Python
+ 2.5's `partial`. This will be important when we use the Partial as
+ a callback. For this use case, though, realize that the partial
+ will be called with no arguments, so you must supply all necessary
+ arguments for the callable at creation time.
+
.. [#tear_down]
>>> twisted.internet.reactor.callLater = oldCallLater
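
A note on the [#partial] footnote in the diff above: the point that a queued partial is later called with no arguments, so everything must be bound at creation time, can be sketched with the standard library alone. The block below is only an illustration under stated assumptions: it uses `functools.partial` as a stand-in for `zc.async.partial.Partial`, a plain (non-persistent) `Demo` class, and a simple loop in place of the worker that would run the jobs. None of this is zc.async's own API.

```python
import functools


class Demo(object):
    """Plain-Python stand-in for the persistent Demo class in the README."""
    counter = 0

    def increase(self, value=1):
        self.counter += value


demo = Demo()

# Bind every argument when the partial is created...
bump_by_five = functools.partial(demo.increase, 5)
bump_by_two = functools.partial(demo.increase, value=2)

# ...because whatever runs the job later (here, a hypothetical loop standing
# in for a worker) calls it with no arguments at all.
for job in (bump_by_five, bump_by_two):
    job()

print(demo.counter)  # 7
```

With zc.async itself, the README's own doctest (`dm.reactor.put(zc.async.partial.Partial(app['demo'].increase, 5))`) remains the authoritative spelling; the sketch only shows why the arguments have to travel with the partial.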