[Checkins] SVN: ZEO/trunk/ Move ZEO-server specific documentation here from ZODB trunk.

Tres Seaver cvs-admin at zope.org
Tue Jan 22 20:07:41 UTC 2013


Log message for revision 129080:
  Move ZEO-server specific documentation here from ZODB trunk.

Changed:
  _U  ZEO/trunk/
  A   ZEO/trunk/doc/zeo.txt

-=-
Added: ZEO/trunk/doc/zeo.txt
===================================================================
--- ZEO/trunk/doc/zeo.txt	                        (rev 0)
+++ ZEO/trunk/doc/zeo.txt	2013-01-22 20:07:40 UTC (rev 129080)
@@ -0,0 +1,392 @@
+==========================
+Running a ZEO Server HOWTO
+==========================
+
+Introduction
+------------
+
+ZEO (Zope Enterprise Objects) is a client-server system for sharing a
+single storage among many clients.  Normally, a ZODB storage can only
+be used by a single process.  When you use ZEO, the storage is opened
+in the ZEO server process.  Client programs connect to this process
+using a ZEO ClientStorage.  ZEO provides a consistent view of the
+database to all clients.  The ZEO client and server communicate using
+a custom RPC protocol layered on top of TCP.
+
+There are several configuration options that affect the behavior of a
+ZEO server.  This section describes how a few of these features
+working.  Subsequent sections describe how to configure every option.
+
+Client cache
+~~~~~~~~~~~~
+
+Each ZEO client keeps an on-disk cache of recently used objects to
+avoid fetching those objects from the server each time they are
+requested.  It is usually faster to read the objects from disk than it
+is to fetch them over the network.  The cache can also provide
+read-only copies of objects during server outages.
+
+The cache may be persistent or transient. If the cache is persistent,
+then the cache files are retained for use after process restarts. A
+non-persistent cache uses temporary files that are removed when the
+client storage is closed.
+
+The client cache size is configured when the ClientStorage is created.
+The default size is 20MB, but the right size depends entirely on the
+particular database.  Setting the cache size too small can hurt
+performance, but in most cases making it too big just wastes disk
+space.  The document "Client cache tracing" describes how to collect a
+cache trace that can be used to determine a good cache size.
+
+ZEO uses invalidations for cache consistency.  Every time an object is
+modified, the server sends a message to each client informing it of
+the change.  The client will discard the object from its cache when it
+receives an invalidation.  These invalidations are often batched.
+
+Each time a client connects to a server, it must verify that its cache
+contents are still valid.  (It did not receive any invalidation
+messages while it was disconnected.)  There are several mechanisms
+used to perform cache verification.  In the worst case, the client
+sends the server a list of all objects in its cache along with their
+timestamps; the server sends back an invalidation message for each
+stale object.  The cost of verification is one drawback to making the
+cache too large.
+
+Note that every time a client crashes or disconnects, it must verify
+its cache.  Every time a server crashes, all of its clients must
+verify their caches.
+
+The cache verification process is optimized in two ways to eliminate
+costs when restarting clients and servers.  Each client keeps the
+timestamp of the last invalidation message it has seen.  When it
+connects to the server, it checks to see if any invalidation messages
+were sent after that timestamp.  If not, then the cache is up-to-date
+and no further verification occurs.  The other optimization is the
+invalidation queue, described below.
+
+Invalidation queue
+~~~~~~~~~~~~~~~~~~
+
+The ZEO server keeps a queue of recent invalidation messages in
+memory.  When a client connects to the server, it sends the timestamp
+of the most recent invalidation message it has received.  If that
+message is still in the invalidation queue, then the server sends the
+client all the missing invalidations.  This is often cheaper than
+perform full cache verification.
+
+The default size of the invalidation queue is 100.  If the
+invalidation queue is larger, it will be more likely that a client
+that reconnects will be able to verify its cache using the queue.  On
+the other hand, a large queue uses more memory on the server to store
+the message.  Invalidation messages tend to be small, perhaps a few
+hundred bytes each on average; it depends on the number of objects
+modified by a transaction.
+
+Transaction timeouts
+~~~~~~~~~~~~~~~~~~~~
+
+A ZEO server can be configured to timeout a transaction if it takes
+too long to complete.  Only a single transaction can commit at a time;
+so if one transaction takes too long, all other clients will be
+delayed waiting for it.  In the extreme, a client can hang during the
+commit process.  If the client hangs, the server will be unable to
+commit other transactions until it restarts.  A well-behaved client
+will not hang, but the server can be configured with a transaction
+timeout to guard against bugs that cause a client to hang.
+
+If any transaction exceeds the timeout threshold, the client's
+connection to the server will be closed and the transaction aborted.
+Once the transaction is aborted, the server can start processing other
+client's requests.  Most transactions should take very little time to
+commit.  The timer begins for a transaction after all the data has
+been sent to the server.  At this point, the cost of commit should be
+dominated by the cost of writing data to disk; it should be unusual
+for a commit to take longer than 1 second.  A transaction timeout of
+30 seconds should tolerate heavy load and slow communications between
+client and server, while guarding against hung servers.
+
+When a transaction times out, the client can be left in an awkward
+position.  If the timeout occurs during the second phase of the two
+phase commit, the client will log a panic message.  This should only
+cause problems if the client transaction involved multiple storages.
+If it did, it is possible that some storages committed the client
+changes and others did not.
+
+Connection management
+~~~~~~~~~~~~~~~~~~~~~
+
+A ZEO client manages its connection to the ZEO server.  If it loses
+the connection, it attempts to reconnect.  While
+it is disconnected, it can satisfy some reads by using its cache.
+
+The client can be configured to wait for a connection when it is created
+or to return immediately and provide data from its persistent cache.
+It usually simplifies programming to have the client wait for a
+connection on startup.
+
+When the client is disconnected, it polls periodically to see if the
+server is available.  The rate at which it polls is configurable.
+
+The client can be configured with multiple server addresses.  In this
+case, it assumes that each server has identical content and will use
+any server that is available.  It is possible to configure the client
+to accept a read-only connection to one of these servers if no
+read-write connection is available.  If it has a read-only connection,
+it will continue to poll for a read-write connection.  This feature
+supports the Zope Replication Services product,
+http://www.zope.com/Products/ZopeProducts/ZRS.  In general, it could
+be used to with a system that arranges to provide hot backups of
+servers in the case of failure.
+
+If a single address resolves to multiple IPv4 or IPv6 addresses,
+the client will connect to an arbitrary of these addresses.
+
+Authentication
+~~~~~~~~~~~~~~
+
+ZEO supports optional authentication of client and server using a
+password scheme similar to HTTP digest authentication (RFC 2069).  It
+is a simple challenge-response protocol that does not send passwords
+in the clear, but does not offer strong security.  The RFC discusses
+many of the limitations of this kind of protocol.  Note that this
+feature provides authentication only.  It does not provide encryption
+or confidentiality.
+
+The challenge-response also produces a session key that is used to
+generate message authentication codes for each ZEO message.  This
+should prevent session hijacking.
+
+Guard the password database as if it contained plaintext passwords.
+It stores the hash of a username and password.  This does not expose
+the plaintext password, but it is sensitive nonetheless.  An attacker
+with the hash can impersonate the real user.  This is a limitation of
+the simple digest scheme.
+
+The authentication framework allows third-party developers to provide
+new authentication modules.
+
+Installing software
+-------------------
+
+ZEO is distributed as part of the ZODB3 package and with Zope,
+starting with Zope 2.7.  You can download it from
+http://pypi.python.org/pypi/ZODB3.
+
+Configuring server
+------------------
+
+The script runzeo.py runs the ZEO server.  The server can be
+configured using command-line arguments or a config file.  This
+document only describes the config file.  Run runzeo.py
+-h to see the list of command-line arguments.
+
+The runzeo.py script imports the ZEO package.  ZEO must either be
+installed in Python's site-packages directory or be in a directory on
+PYTHONPATH.  
+
+The configuration file specifies the underlying storage the server
+uses, the address it binds, and a few other optional parameters.
+An example is::
+
+    <zeo>
+    address zeo.example.com:8090
+    monitor-address zeo.example.com:8091
+    </zeo>
+
+    <filestorage 1>
+    path /var/tmp/Data.fs
+    </filestorage>
+
+    <eventlog>
+    <logfile>
+    path /var/tmp/zeo.log
+    format %(asctime)s %(message)s
+    </logfile>
+    </eventlog>
+
+This file configures a server to use a FileStorage from
+/var/tmp/Data.fs.  The server listens on port 8090 of zeo.example.com.
+It also starts a monitor server that lists in port 8091.  The ZEO
+server writes its log file to /var/tmp/zeo.log and uses a custom
+format for each line.  Assuming the example configuration it stored in
+zeo.config, you can run a server by typing::
+
+    python /usr/local/bin/runzeo.py -C zeo.config
+
+A configuration file consists of a <zeo> section and a storage
+section, where the storage section can use any of the valid ZODB
+storage types.  It may also contain an eventlog configuration.  See
+the document "Configuring a ZODB database" for more information about
+configuring storages and eventlogs.
+
+The zeo section must list the address.  All the other keys are
+optional.
+
+address
+        The address at which the server should listen.  This can be in
+        the form 'host:port' to signify a TCP/IP connection or a
+        pathname string to signify a Unix domain socket connection (at
+        least one '/' is required).  A hostname may be a DNS name or a
+        dotted IP address.  If the hostname is omitted, the platform's
+        default behavior is used when binding the listening socket (''
+        is passed to socket.bind() as the hostname portion of the
+        address).
+
+read-only
+        Flag indicating whether the server should operate in read-only
+        mode.  Defaults to false.  Note that even if the server is
+        operating in writable mode, individual storages may still be
+        read-only.  But if the server is in read-only mode, no write
+        operations are allowed, even if the storages are writable.  Note
+        that pack() is considered a read-only operation.
+
+invalidation-queue-size
+        The storage server keeps a queue of the objects modified by the
+        last N transactions, where N == invalidation_queue_size.  This
+        queue is used to speed client cache verification when a client
+        disconnects for a short period of time.
+
+monitor-address
+        The address at which the monitor server should listen.  If
+        specified, a monitor server is started.  The monitor server
+        provides server statistics in a simple text format.  This can
+        be in the form 'host:port' to signify a TCP/IP connection or a
+        pathname string to signify a Unix domain socket connection (at
+        least one '/' is required).  A hostname may be a DNS name or a
+        dotted IP address.  If the hostname is omitted, the platform's
+        default behavior is used when binding the listening socket (''
+        is passed to socket.bind() as the hostname portion of the
+        address).
+
+transaction-timeout
+        The maximum amount of time to wait for a transaction to commit
+        after acquiring the storage lock, specified in seconds.  If the
+        transaction takes too long, the client connection will be closed
+        and the transaction aborted.
+
+authentication-protocol
+        The name of the protocol used for authentication.  The
+        only protocol provided with ZEO is "digest," but extensions
+        may provide other protocols.
+
+authentication-database
+        The path of the database containing authentication credentials.
+
+authentication-realm
+        The authentication realm of the server.  Some authentication
+        schemes use a realm to identify the logic set of usernames
+        that are accepted by this server.
+
+Configuring clients
+-------------------
+
+The ZEO client can also be configured using ZConfig.  The ZODB.config
+module provides several function for opening a storage based on its
+configuration.
+
+- ZODB.config.storageFromString()
+- ZODB.config.storageFromFile()
+- ZODB.config.storageFromURL()
+
+The ZEO client configuration requires the server address be
+specified.  Everything else is optional.  An example configuration is::
+
+    <zeoclient>
+    server zeo.example.com:8090
+    </zeoclient>
+
+The other configuration options are listed below.
+
+storage
+        The name of the storage that the client wants to use.  If the
+        ZEO server serves more than one storage, the client selects
+        the storage it wants to use by name.  The default name is '1',
+        which is also the default name for the ZEO server.
+
+cache-size
+        The maximum size of the client cache, in bytes.
+
+name
+        The storage name.  If unspecified, the address of the server
+        will be used as the name.
+
+client
+        Enables persistent cache files.  The string passed here is
+        used to construct the cache filenames.  If it is not
+        specified, the client creates a temporary cache that will
+        only be used by the current object.
+
+var
+        The directory where persistent cache files are stored.  By
+        default cache files, if they are persistent, are stored in 
+        the current directory.
+
+min-disconnect-poll
+        The minimum delay in seconds between attempts to connect to
+        the server, in seconds.  Defaults to 5 seconds.
+
+max-disconnect-poll
+        The maximum delay in seconds between attempts to connect to
+        the server, in seconds.  Defaults to 300 seconds.
+
+wait
+        A boolean indicating whether the constructor should wait
+        for the client to connect to the server and verify the cache
+        before returning.  The default is true.
+
+read-only
+        A flag indicating whether this should be a read-only storage,
+        defaulting to false (i.e. writing is allowed by default).
+
+read-only-fallback
+        A flag indicating whether a read-only remote storage should be
+        acceptable as a fallback when no writable storages are
+        available.  Defaults to false.  At most one of read_only and
+        read_only_fallback should be true.
+realm
+        The authentication realm of the server.  Some authentication
+        schemes use a realm to identify the logic set of usernames
+        that are accepted by this server.
+
+A ZEO client can also be created by calling the ClientStorage
+constructor explicitly.  For example::
+
+    from ZEO.ClientStorage import ClientStorage
+    storage = ClientStorage(("zeo.example.com", 8090))
+
+Running the ZEO server as a daemon
+----------------------------------
+
+In an operational setting, you will want to run the ZEO server a
+daemon process that is restarted when it dies.  The zdaemon package
+provides two tools for running daemons: zdrun.py and zdctl.py. You can
+find zdaemon and it's documentation at
+http://pypi.python.org/pypi/zdaemon.
+
+Rotating log files
+~~~~~~~~~~~~~~~~~~
+
+ZEO will re-initialize its logging subsystem when it receives a
+SIGUSR2 signal.  If you are using the standard event logger, you
+should first rename the log file and then send the signal to the
+server.  The server will continue writing to the renamed log file
+until it receives the signal.  After it receives the signal, the
+server will create a new file with the old name and write to it.
+
+Tools
+-----
+
+There are a few scripts that may help running a ZEO server.  The
+zeopack.py script connects to a server and packs the storage.  It can
+be run as a cron job.  The zeoup.py script attempts to connect to a
+ZEO server and verify that is is functioning.  The zeopasswd.py script
+manages a ZEO servers password database.
+
+Diagnosing problems
+-------------------
+
+If an exception occurs on the server, the server will log a traceback
+and send an exception to the client.  The traceback on the client will
+show a ZEO protocol library as the source of the error.  If you need
+to diagnose the problem, you will have to look in the server log for
+the rest of the traceback.



More information about the checkins mailing list