[Checkins] SVN: Sandbox/J1m/zodb-doc/ checkpoint

Jim Fulton jim at zope.com
Sun May 9 14:35:44 EDT 2010


Log message for revision 112218:
  checkpoint

Changed:
  A   Sandbox/J1m/zodb-doc/intro.txt
  A   Sandbox/J1m/zodb-doc/topics.txt

-=-
Added: Sandbox/J1m/zodb-doc/intro.txt
===================================================================
--- Sandbox/J1m/zodb-doc/intro.txt	                        (rev 0)
+++ Sandbox/J1m/zodb-doc/intro.txt	2010-05-09 18:35:44 UTC (rev 112218)
@@ -0,0 +1,515 @@
+================
+Introducing ZODB
+================
+
+The ZODB provides an object-persistence facility for Python.  It
+provides many features, which we'll get into later, but first, let's
+take a very quick look at the basics.
+
+We start by creating a file-based database:
+
+    >>> import ZODB
+    >>> conn = ZODB.connection('data.fs')
+
+The connection function opens a database and returns a connection to it.
+Because the database didn't already exist, it's created automatically.
+
+Initially, the database contains a single object, the root object.
+Connections have a root attribute that retrieves the root object:
+
+    >>> conn.root
+    <root >
+
+When using the ZODB, we always start with the root object and use it
+to access other objects.  The root object is an object to which you
+can add items to store your own data.
+
+To store data in the database, you must use "persistent" objects.  To
+create a persistent object, start by creating a class that subclasses
+persistent.Persistent.  Let's create a database to manage books and
+authors. We'll start by creating a book module that defines the
+classes we'll use::
+
+    import persistent, BTrees.OOBTree
+
+    class Book(persistent.Persistent):
+        def __init__(self, title):
+            self.title = title
+
+    class Author(persistent.Persistent):
+        def __init__(self, id, name):
+            self.id = id
+            self.name = name
+            self.books = BTrees.OOBTree.OOBTree()
+
+        def new_book(self, title):
+            book = Book(title)
+            book.author = self
+            self.books[title] = book
+            return book
+
+The OOBTree class implements a persistent mapping object that keeps
+its keys sorted.
+
+Now, we'll use the book module we created to store books in our database:
+
+    >>> import book, BTrees.OOBTree
+    >>> conn.root.authors = BTrees.OOBTree.OOBTree()
+    >>> conn.root.books = BTrees.OOBTree.OOBTree()
+
+    >>> author = book.Author('tolkien', 'J.R.R. Tolkien')
+    >>> conn.root.authors[author.id] = author
+    >>> for title in ["The Fellowship of the Ring",
+    ...               "The Two Towers",
+    ...               "The Return of the King"]:
+    ...     conn.root.books[title] = author.new_book(title)
+
+In the root of our database, we added a collection of authors
+arranged by author id, and a collection of books arranged by
+book title.
+
+So, now we have a small database with 3 books and an author.  We can
+query it by author or by book using Python as our query language:
+
+    >>> conn.root.books["The Two Towers"].author.name
+    'J.R.R. Tolkien'
+
+We used BTrees to implement the books and authors collections. BTrees
+have a number of advantages and limitations:
+
+- BTrees are highly scalable and are therefore appropriate for
+  collections containing a large number of items.
+
+- BTrees manage their keys in sorted order and support range searches.
+
+- BTrees require that their keys be orderable and that the ordering is
+  stable. For example, the order must not change with new Python
+  versions. This generally implies that the keys be of a simple type.
+
+The database root object isn't a BTree, but is based on a persistent
+dictionary. It is a bit more flexible than BTrees but isn't very
+scalable.  For this reason, you don't want to use the database root
+object to implement a large collection of objects, but rather use a
+BTree stored in the root object (or in some other object reachable
+from the root object).
+
+We've made a number of changes to the database, but so far the changes
+exist only in memory. To make the changes permanent, we need to commit
+them. To do this, we use the transaction module:
+
+    >>> import transaction
+    >>> transaction.commit()
+
+We can close and reopen the database and see that our changes are
+permanent:
+
+    >>> conn.close()
+    >>> conn = ZODB.connection('data.fs')
+    >>> conn.root.books["The Two Towers"].author.name
+    'J.R.R. Tolkien'
+
+Let's add another book to the database:
+
+    >>> author = conn.root.books["The Two Towers"].author
+    >>> conn.root.books['The Hobbit'] = author.new_book("The Hobit")
+    >>> conn.root.books["The Hobbit"].author.name
+    'J.R.R. Tolkien'
+
+We made a typo when we created the book.  We can fix it, or we can
+start over.  Remember that changes aren't permanent until we commit.
+We can discard changes at any time by aborting the transaction::
+
+    >>> transaction.abort()
+    >>> conn.root.books["The Hobbit"]
+    Traceback (most recent call last):
+    ...
+    KeyError: 'The Hobbit'
+
+Let's expand the initials in the author's name:
+
+    >>> conn.root.authors['tolkien'].name = 'John Ronald Reuel Tolkien'
+
+Note that we didn't use any specialized API to perform the update.  We
+simply set an object attribute as we normally would without a
+database.  In fact, almost all of the operations we've performed were
+accomplished through normal object operations.  The only exception has
+been the use of transaction commit and abort calls to indicate that changes
+made should be saved or discarded.  We didn't have to keep track of
+the changes made.  The ZODB did that for us.
+
+Persistence
+===========
+
+We saw above that the ZODB keeps track of object modifications for us.
+It turns out that applications have to provide some assistance.  In
+many cases it's sufficient to subclass ``persistent.Persistent``, but
+in some cases, some extra care is needed.  To see what's needed, we
+need to understand what the basic responsibilities of persistent objects
+are and how the ``Persistent`` base class helps implement those
+responsibilities.
+
+ZODB needs to know when objects are accessed so their state can be
+loaded, if necessary.  The ``Persistent`` base class tracks attribute
+access and causes an object's state to be loaded when needed.  For
+objects implemented in Python [#c]_, this is really all that's needed
+to get objects loaded.
+
+ZODB needs to know when objects are modified so that changes are
+saved.  The ``Persistent`` base class tracks attribute assignments and
+marks an object as changed when attributes are
+assigned. This strategy works for many persistent objects.
+
+Care is required when a persistent object changes subobjects. Consider
+an alternative, naive author class implementation::
+
+    class BrokenAuthor(persistent.Persistent):
+
+        def __init__(self, id, name):
+            self.id = id
+            self.name = name
+            self.books = {}
+
+        def new_book(self, title):
+            book = Book(title)
+            book.author = self
+            self.books[title] = book
+            return book
+
+In this implementation, the books attribute is a dictionary, rather
+than a BTree.  The ``new_book`` method modifies author instances,
+but doesn't assign an attribute, so the change isn't seen by
+ZODB. This leads to the change being lost when the affected author
+object is reloaded from its database.  There are two basic approaches
+for dealing with mutable subobjects:
+
+- Use persistent subobjects.
+
+  This is the approach taken by the ``Author`` class shown earlier.
+  When we use a persistent subobject, the containing object isn't
+  responsible for managing the persistence of subobject changes; the
+  subobject is responsible.  In the (non-broken) ``Author`` class,
+  adding a book doesn't change the author, it changes the author's
+  books.
+
+- Tell ZODB about changes explicitly.
+
+  We can tell ZODB about object changes explicitly by assigning the
+  ``_p_changed`` attribute. Here's a non-broken author implementation::
+
+    class NonBrokenAuthor(persistent.Persistent):
+
+        def __init__(self, id, name):
+            self.id = id
+            self.name = name
+            self.books = {}
+
+        def new_book(self, title):
+            book = Book(title)
+            book.author = self
+            self.books[title] = book
+            self._p_changed = True
+            return book
+
+  Here we assigned the ``_p_changed`` attribute to signal that the author
+  object has changed.
+
+Finally, ZODB needs to keep track of certain metadata for persistent
+objects.  The ``Persistent`` base class takes care of this too.  The
+standard metadata includes:
+
+``_p_oid``
+    Every persistent object that has been stored in a database has a
+    database-unique object identifier.  Before an object has been
+    added to a database, the value of this attribute is None.
+
+``_p_jar``
+    The ``_p_jar`` [#jar]_ attribute is the database connection
+    managing the object. If an object hasn't been stored in a
+    database, then the value of this attribute is None.
+
+``_p_serial``
+    The ``_p_serial`` attribute has the identifier of the last
+    transaction to update the object.  This acts as a revision
+    identifier for the object.  This has a value of None if the object
+    hasn't been stored yet.
+
+``_p_changed``
+    The ``_p_changed`` attribute provides access to and control of the
+    changed state of an object.  It can have one of three values:
+
+    True
+       The object has changed.
+
+    False
+       The object hasn't been changed.
+
+    None
+       The object's state hasn't been loaded from the database.
+       (We call such objects "ghosts".)
+
+    Typically, this attribute is used to specifically signal that an
+    object has changed, as we saw earlier.
+
+Databases, connections, and storages
+====================================
+
+ZODB separates persistent data management from low-level storage.
+This allows storage implementations to vary independently from the
+core persistence support.  When we create database objects in Python,
+we have to supply a storage object to manage low-level database
+records.
+
+When using databases, we create database objects and then open
+connections to them.  Connections allow using multiple threads within
+a process to access the same database.  We'll say more about that
+later in the section on concurrency.
+
+When we began this introduction, we used a simplified API to access
+our database:
+
+    >>> conn = ZODB.connection('data.fs')
+
+The connection function actually did three things for us:
+
+- It instantiated a file storage,
+
+- It instantiated a database object using the storage, and
+
+- It opened a connection to the database [#itdidmore]_.
+
+Here's what these steps would look like if we didn't use this API::
+
+    >>> import ZODB, ZODB.FileStorage
+    >>> storage = ZODB.FileStorage.FileStorage('data.fs')
+    >>> db = ZODB.DB(storage)
+    >>> conn = db.open()
+
+Why would we use the low-level APIs?  Doing so allows us greater
+control.  We can supply special storage and connection options not
+supported by the high-level API. The high-level API also supports only
+a few standard storages.
+
+If we want to open multiple connections, we can use a slightly
+lower-level API, by passing a database file name to ``ZODB.DB``:
+
+    >>> db = ZODB.DB('data.fs')
+
+Standard storages
+-----------------
+
+ZODB comes with some standard storages:
+
+FileStorage
+    File storages provide basic data storage in a single file
+    [#exceptforblobs]_.  Most ZODB installations use FileStorage,
+    sometimes in combination with other storages.
+
+    To use a file storage with the high-level APIs, just pass the
+    file-storage file name as a string::
+
+       >>> db = ZODB.DB('data.fs')
+
+ZEO
+    ZEO (Zope Enterprise Objects) provides a client-server facility
+    for ZODB.  ZEO allows multiple processes to share a single
+    storage.  When you use ZEO, you run a ZEO storage server and
+    configure your applications to use ZEO client storages to access
+    your storage server. The storage server uses some underlying
+    storage, such as a file storage, to store data.
+
+    To use a ZEO client storage with the high-level APIs, just pass
+    the ZEO server address as a host and port tuple or as an integer
+    port on the local host::
+
+       >>> db = ZODB.DB(('storage.example.com', 8100))
+       >>> connection = ZODB.connection(8100) # localhost
+
+
+MappingStorage
+    Mapping storages store database records in memory.  Mapping
+    storages are typically used for testing or experimenting with
+    ZODB.
+
+    A goal of MappingStorage is to provide a fairly simple storage
+    implementation to study when learning how to implement storages.
+
+    To use a mapping storage with the high-level APIs, just pass
+    None::
+
+       >>> connection = ZODB.connection(None)
+
+DemoStorage
+    Demo storages allow you to take an unchanging base storage and
+    store changes in a separate changes storage.  They were originally
+    implemented to allow demonstrations of applications in which a
+    populated sample database was provided on CD and users could make
+    changes that were stored in memory.
+
+    Demo storages don't actually store anything themselves. They
+    delegate to two other storages, an unchanging base storage and a
+    storage that holds changes.  This is an example of a storage
+    wrapper.  It's common to compose storages from base storages and
+    one or more storage wrappers.
+
+3rd-party storages
+------------------
+
+There are a number of 3rd-party storages.  Writing additional storage
+implementations is relatively straightforward.  Here are some examples of
+3rd-party storage implementations:
+
+RelStorage
+    RelStorage stores data in relational databases.
+
+zc.beforestorage
+    zc.beforestorage is a storage wrapper that provides a snapshot of
+    an underlying storage at a moment in time.  This is useful because
+    it can take a changing storage, such as a ZEO client storage, and
+    freeze it at a point in time, allowing it to be used as a base for
+    a demo storage.
+
+zc.zrs and zeoraid
+    zc.zrs and zeoraid provide database replication.  zc.zrs is a
+    commercial storage implementation, while zeoraid is open source.
+
+Configuration strings
+---------------------
+
+ZODB supports the use of textual configuration files to define
+databases, storages, and ZEO servers.  Production applications
+typically create database objects by loading configuration strings
+from application configuration files [#zconfig]_.
+
+To create a database from a configuration string, use the
+``ZODB.config.databaseFromString`` function.  Here's an example that
+creates a database using a file storage::
+
+    >>> import ZODB.config
+    >>> db = ZODB.config.databaseFromString("""
+    ... <zodb>
+    ...     <filestorage>
+    ...         path data.fs
+    ...     </filestorage>
+    ... </zodb>
+    ... """)
+
+The configuration syntax was inspired by the Apache configuration
+syntax. Configuration sections are bracketed by opening and closing
+tags and can be nested. Options are given as names and values
+separated by spaces.
+
+In the example above, a ``zodb`` tag defines a database object
+[#multipledbtags]_.  It contains a ``filestorage`` tag and this uses a
+file storage at the path ``data.fs``.
+
+To find out about the database options supported by the ``zodb`` tag,
+see the database reference documentation. To find out about storage
+options, see the storage reference documentation.
+
+Concurrency
+===========
+
+ZODB supports accessing databases from multiple threads.  Each thread
+operates as if it has its own copy of the database.  Threads are
+synchronized through transaction commit.
+
+Each thread opens a separate connection to a database.  Each
+connection has its own object cache. If multiple connections access
+the same object, they each get their own copy. Let's look at an example:
+
+    >>> conn1 = db.open()
+    >>> author1 = conn1.root.authors['tolkien']
+
+    >>> conn2 = db.open()
+    >>> author2 = conn2.root.authors['tolkien']
+
+Here we've opened two connections and fetched the author object for
+J.R.R. Tolkien. From a database perspective, these are the same
+object:
+
+    >>> author1._p_oid == author2._p_oid
+    True
+
+    >>> author1.name
+    'J.R.R. Tolkien'
+    >>> author2.name
+    'J.R.R. Tolkien'
+
+But they're different Python objects:
+
+    >>> author1 is author2
+    False
+
+If we modify one, we don't see the change in the other:
+
+    >>> author1.name = 'John Ronald Reuel Tolkien'
+    >>> author2.name
+    'J.R.R. Tolkien'
+
+Until we commit the change:
+
+    >>> transaction.commit()
+    >>> author2.name
+    'John Ronald Reuel Tolkien'
+
+Transaction managers
+--------------------
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+.. [#c] Implementing objects in C requires a lot more care. It's
+   really hard. :) See "Implementing persistent objects in C" for more
+   details.
+
+.. [#jar] The name ``_p_jar`` comes from early implementations of ZODB
+   in which databases were called "pickle jars", because objects were
+   stored using the Python pickle format.  In those early versions,
+   there weren't separate database connections.
+
+.. [#itdidmore] It also arranged that when we closed the connection,
+   the underlying database was closed.
+
+.. [#zconfig] ZODB uses the ``ZConfig`` configuration
+   system. Applications that use ``ZConfig`` can also merge the ZODB
+   configuration schemas with their own configuration schemas.
+
+.. [#multipledbtags] You can define multiple databases, so there can
+   be multiple ``zodb`` tags. See "Using multiple databases."
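
Note on the "Persistence" section of intro.txt above: the change-tracking rule it describes (attribute assignment is noticed, mutation of a plain dict subobject is not) can be tried out without a database. The sketch below is a toy, standard-library-only illustration. ``Tracked`` and its ``changed`` flag are hypothetical stand-ins invented for this example; they are not ``persistent.Persistent`` or ``_p_changed``, and the real implementation may work quite differently.

```python
# Toy illustration only: "Tracked" is a hypothetical stand-in, NOT the real
# persistent.Persistent class, and "changed" stands in for _p_changed.

class Tracked:
    """Flags itself as changed whenever an attribute is assigned."""

    def __init__(self):
        # Bypass our own __setattr__ so a new object starts out unchanged.
        object.__setattr__(self, "changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name != "changed":
            object.__setattr__(self, "changed", True)


class BrokenAuthor(Tracked):
    def __init__(self, name):
        super().__init__()
        self.name = name
        self.books = {}        # plain dict: its mutations are invisible
        self.changed = False   # pretend the new object was just saved

    def new_book(self, title):
        self.books[title] = title   # mutates the dict, assigns no attribute


class NonBrokenAuthor(BrokenAuthor):
    def new_book(self, title):
        self.books[title] = title
        self.changed = True         # explicit signal, like _p_changed = True


broken = BrokenAuthor("J.R.R. Tolkien")
broken.new_book("The Hobbit")

fixed = NonBrokenAuthor("J.R.R. Tolkien")
fixed.new_book("The Hobbit")

renamed = BrokenAuthor("J.R.R. Tolkien")
renamed.name = "John Ronald Reuel Tolkien"

print(broken.changed)    # False: the dict mutation went unnoticed
print(fixed.changed)     # True: the change was signalled explicitly
print(renamed.changed)   # True: attribute assignment was tracked
```

In the toy, as in the text, the book is present in ``broken.books`` in memory, but nothing records that the author needs saving, which is why the document recommends either a persistent subobject such as a BTree or an explicit ``_p_changed`` assignment.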


Property changes on: Sandbox/J1m/zodb-doc/intro.txt
___________________________________________________________________
Added: svn:eol-style
   + native

Added: Sandbox/J1m/zodb-doc/topics.txt
===================================================================
--- Sandbox/J1m/zodb-doc/topics.txt	                        (rev 0)
+++ Sandbox/J1m/zodb-doc/topics.txt	2010-05-09 18:35:44 UTC (rev 112218)
@@ -0,0 +1,4 @@
+
+implementing storages
+implementing storages in C
+multidatabases


Property changes on: Sandbox/J1m/zodb-doc/topics.txt
___________________________________________________________________
Added: svn:eol-style
   + native


