[Zodb-checkins] CVS: ZODB3/Doc/guide - modules.tex:1.9

Tim Peters tim.one at comcast.net
Thu Dec 25 00:03:16 EST 2003


Update of /cvs-repository/ZODB3/Doc/guide
In directory cvs.zope.org:/tmp/cvs-serv18825/Doc/guide

Modified Files:
	modules.tex 
Log Message:
Updated the BTree intro to include operations new in ZODB 3.3 (play nice
with Python's newer iteration and __contains__ protocols; optional
Boolean args to exclude endpoints on range searches).


=== ZODB3/Doc/guide/modules.tex 1.8 => 1.9 ===
--- ZODB3/Doc/guide/modules.tex:1.8	Sun Oct  5 21:24:50 2003
+++ ZODB3/Doc/guide/modules.tex	Thu Dec 25 00:03:15 2003
@@ -56,28 +56,31 @@
 
 The BTrees package provides a large collection of related data
 structures.  There are variants of the data structures specialized to
-handle integer values, which are faster and use less memory.  There
+integers, which are faster and use less memory.  There
 are four modules that handle the different variants.  The first two
 letters of the module name specify the types of the keys and values in
 mappings -- O for any object and I for integer.  For example, the
 \module{BTrees.IOBTree} module provides a mapping with integer
 keys and arbitrary objects as values.
 
-The four data structures provide by each module are a btree, a bucket,
-a tree set, and a set.  The btree and bucket types are mappings and
+The four data structures provide by each module are a BTree, a Bucket,
+a TreeSet, and a Set.  The BTree and Bucket types are mappings and
 support all the usual mapping methods, e.g. \function{update()} and
-\function{keys()}.  The tree set and set types are similar to mappings
+\function{keys()}.  The TreeSet and Set types are similar to mappings
 but they have no values; they support the methods that make sense for
 a mapping with no keys, e.g. \function{keys()} but not
-\function{items()}.  The bucket and set types are the individual
-building blocks for btrees and tree sets, respectively.  A bucket or
-set can be used when you are sure that it will have few elements.  If
-the data structure will grow large, you should use a btree or tree
-set.  Like Python lists, buckets and sets are allocated in one
+\function{items()}.  The Bucket and Set types are the individual
+building blocks for BTrees and TreeSets, respectively.  A Bucket or
+Set can be used when you are sure that it will have few elements.  If
+the data structure will grow large, you should use a BTree or TreeSet.
+Like Python lists, Buckets and Sets are allocated in one
 contiguous piece, and insertions and deletions can take time
-proportional to the number of existing elements.  Btrees and tree sets
-are multi-level tree structures with much better (logarithmic) worst-case
-time bounds.
+proportional to the number of existing elements.  Also like Python lists,
+a Bucket or Set is a single object, and is pickled and unpickled in its
+entirety.  BTrees and TreeSets are multi-level tree structures with
+much better (logarithmic) worst-case time bounds, and the tree structure
+is built out of multiple objects, which ZODB can load individually
+as needed.
 
 The four modules are named \module{OOBTree}, \module{IOBTree},
 \module{OIBTree}, and \module{IIBTree}.  The two letter prefixes are
@@ -87,7 +90,7 @@
 each define their own variants of those four types.
 
 The \function{keys()}, \function{values()}, and \function{items()}
-methods on btree and tree set types do not materialize a list with all
+methods on BTree and TreeSet types do not materialize a list with all
 of the data.  Instead, they return lazy sequences that fetch data
 from the BTree as needed.  They also support optional arguments to
 specify the minimum and maximum values to return, often called "range
@@ -95,24 +98,29 @@
 searching is very efficient.
 
 The \function{keys()}, \function{values()}, and \function{items()}
-methods on bucket and set types do return lists with all the data.
-Starting in ZODB4, there are also \function{iterkeys()},
+methods on Bucket and Set types do return lists with all the data.
+Starting in ZODB 3.3, there are also \function{iterkeys()},
 \function{itervalues()}, and \function{iteritems()} methods that
-return iterators (in the Python 2.2 sense).
+return iterators (in the Python 2.2 sense).  Those methods also apply to
+BTree and TreeSet objects.
 
-A BTree object supports all the methods you would expect of a mapping
+A BTree object supports all the methods you would expect of a mapping,
 with a few extensions that exploit the fact that the keys are sorted.
 The example below demonstrates how some of the methods work.  The
 extra methods are \function{minKey()} and \function{maxKey()}, which
 find the minimum and maximum key value subject to an optional bound
 argument, and \function{byValue()}, which should probably be ignored
 (it's hard to explain exactly what it does, and as a result it's
-almost never used -- best to consider it deprecated).
+almost never used -- best to consider it deprecated).  The various
+methods for enumerating keys, values and items also accept minimum
+and maximum key arguments ("range search"), and (new in ZODB 3.3)
+optional Boolean arguments to control whether a range search is
+inclusive or exclusive of the range's endpoints.
 
 \begin{verbatim}
 >>> from BTrees.OOBTree import OOBTree
 >>> t = OOBTree()
->>> t.update({ 1: "red", 2: "green", 3: "blue", 4: "spades" })
+>>> t.update({1: "red", 2: "green", 3: "blue", 4: "spades"})
 >>> len(t)
 4
 >>> t[2]
@@ -128,14 +136,38 @@
 [1, 2, 3, 4]
 >>> list(t.values())
 ['red', 'green', 'blue', 'spades']
->>> list(t.values(1, 2))
+>>> list(t.values(1, 2)) # values at keys in 1 to 2 inclusive
 ['red', 'green']
->>> list(t.values(2))
+>>> list(t.values(2))    # values at keys >= 2
 ['green', 'blue', 'spades']
+>>> list(t.values(min=1, max=4))  # keyword args new in ZODB 3.3
+['red', 'green', 'blue', 'spades']
+>>> list(t.values(min=1, max=4, excludemin=True, excludemax=True))
+['green', 'blue']
 >>> t.minKey()     # smallest key
 1
 >>> t.minKey(1.5)  # smallest key >= 1.5
 2
+>>> for k in t.keys():
+...     print k,
+1 2 3 4
+>>> for k in t:    # new in ZODB 3.3
+...     print k,
+1 2 3 4
+>>> for pair in t.iteritems():  # new in ZODB 3.3
+...     print pair,
+...
+(1, 'red') (2, 'green') (3, 'blue') (4, 'spades')
+>>> t.has_key(4)  # returns a true value, but exactly what undefined
+2
+>>> t.has_key(5)
+0
+>>> 4 in t  # new in ZODB 3.3
+True
+>>> 5 in t  # new in ZODB 3.3
+False
+>>>
+
 \end{verbatim}
 
 % XXX I'm not sure all of the following is actually correct.  The
@@ -143,19 +175,32 @@
 Each of the modules also defines some functions that operate on
 BTrees -- \function{difference()}, \function{union()}, and
 \function{intersection()}.  The \function{difference()} function returns
-a bucket, while the other two methods return a set.
+a Bucket, while the other two methods return a Set.
 If the keys are integers, then the module also defines
 \function{multiunion()}.  If the values are integers, then the module
 also defines \function{weightedIntersection()} and
 \function{weightedUnion()}.  The function doc strings describe each
 function briefly.
 
+\code{BTrees/Interfaces.py} defines the operations, and is the official
+documentation.  Note that the interfaces don't define the concrete types
+returned by most operations, and you shouldn't rely on the concrete types
+that happen to be returned:  stick to operations guaranteed by the
+interface.  In particular, note that the interfaces don't specify anything
+about comparison behavior, and so nothing about it is guaranteed.  In ZODB
+3.3, for example, two BTrees happen to use Python's default object
+comparison, which amounts to comparing the (arbitrary but fixed) memory
+addresses of the BTrees. This may or may not be true in future releases.
+If the interfaces don't specify a behavior, then whether that behavior
+appears to work, and exactly happens if it does appear to work, are
+undefined and should not be relied on.
+
 \subsubsection{Total Ordering and Persistence}
 
 The BTree-based data structures differ from Python dicts in several
 fundamental ways.  One of the most important is that while dicts
 require that keys support hash codes and equality comparison,
-the btree-based structures don't use hash codes and require a total
+the BTree-based structures don't use hash codes and require a total
 ordering on keys.
 
 Total ordering means three things:
@@ -178,11 +223,11 @@
 does not satisfy these rules:  complex numbers only support \code{==}
 and \code{!=} comparisons, and raise an exception if you try to compare
 them in any other way.  They don't satisfy the trichotomy rule, and must
-not be used as keys in btree-based data structures (although note that
+not be used as keys in BTree-based data structures (although note that
 complex numbers can be used as keys in Python dicts, which do not require
 a total ordering).
 
-Examples of objects that are wholly safe to use as keys in btree-based
+Examples of objects that are wholly safe to use as keys in BTree-based
 structures include ints, longs, floats, 8-bit strings, Unicode strings,
 and tuples composed (possibly recursively) of objects of wholly safe
 types.
@@ -194,11 +239,11 @@
 \code{u'x' == 'x'}, but trying to compare \code{chr(255)} to
 \code{u'x'} raises an exception.  Partly for this reason (another is
 given later), it can be dangerous to use keys with multiple types in
-a single btree-based structure.  Don't try to do that, and you don't
+a single BTree-based structure.  Don't try to do that, and you don't
 have to worry about it.
 
 Another potential problem is mutability:  when a key is inserted in a
-btree-based structure, it must retain the same order relative to the
+BTree-based structure, it must retain the same order relative to the
 other keys over time.  This is easy to run afoul of if you use mutable
 objects as keys.  For example, lists supply a total ordering, and then
 
@@ -228,7 +273,7 @@
 addresses of two objects.  Because Python never moves an object in memory,
 this does supply a usable (albeit arbitrary) total ordering across the
 life of a program run (an object's memory address doesn't change).  But
-if objects compared in this way are used as keys of a btree-based
+if objects compared in this way are used as keys of a BTree-based
 structure that's stored in a database, when the objects are loaded from
 the database again they will almost certainly wind up at different
 memory addresses.  There's no guarantee then that if key K1 had a memory
@@ -244,8 +289,8 @@
 common mistake is to use keys that are instances of a user-defined class
 that doesn't supply its own \method{__cmp__()} method.  Python compares
 such instances by memory address.  This is fine if such instances are
-used as keys in temporary btree-based structures used only in a single
-program run.  It can be disastrous if that btree-based structure is
+used as keys in temporary BTree-based structures used only in a single
+program run.  It can be disastrous if that BTree-based structure is
 stored to a database, though.
 
 \begin{verbatim}
@@ -270,7 +315,7 @@
 \method{__cmp__()}, but define it incorrectly.  It's possible but
 rare for a custom \method{__cmp__()} implementation to violate one
 of the three required formal properties directly.  It's more common for
-it to fall back" to address-based comparison by mistake.
+it to "fall back" to address-based comparison by mistake.
 For example,
 
 \begin{verbatim}
@@ -284,7 +329,7 @@
 
 It's quite possible there that the \keyword{else} clause allows
 a result to be computed based on memory address.  The bug won't show
-up until a btree-based structure uses objects of class \class{Mine} as
+up until a BTree-based structure uses objects of class \class{Mine} as
 keys, and also objects of other types as keys, and the structure is
 loaded from a database, and a sequence of comparisons happens to execute
 the \keyword{else} clause in a case where the relative order of object
@@ -298,9 +343,9 @@
 
 \begin{enumerate}
 \item  Use objects of simple immutable types as keys in
-       btree-based data structures.
+       BTree-based data structures.
 
-\item  Within a single btree-based data structure, use objects of
+\item  Within a single BTree-based data structure, use objects of
        a single type as keys.  Don't use multiple key types in a
        single structure.
 




More information about the Zodb-checkins mailing list