[ZODB-Dev] Iterating objects in the ZODB

Casey Duncan cduncan@kaivo.com
Tue, 21 Aug 2001 10:37:54 -0600


Attached is some code I worked up for efficiently getting a sequence of
objects referenced from a base object stored in the ZODB. AFAICT it
should be storage-independent (although it could be made more efficient
for ClientStorage). Here is a quick API ref:

getObjectReferenceOids(ob):

    Returns a list of the oids of all objects reachable from ob,
    including ob's own oid.

getObjectReferences(ob):

    Returns a LazyObjectList of the objects referenced by ob.

LazyObjectList:

    A class (like Lazy.py) that efficiently stores references to a large
    number of persistent objects. Objects are loaded into memory only
    when they are accessed. Supports __len__, __getitem__, __getslice__
    (aliased as slice for use inside Zope) and __add__.

    The constructor takes two arguments: the ZODB connection and a list
    of oids.

    There are two methods, getOid and getOids, for getting the oids back
    out.
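
The lazy-loading behavior described above can be tried without a ZODB. What follows is only a minimal sketch in modern Python 3, not the attached code: FakeConnection and LazyList are hypothetical stand-ins, and the connection is assumed to be anything that maps an oid to an object via indexing. Python 3 has no __getslice__, so slices arrive in __getitem__ as slice objects.

```python
class FakeConnection:
    """Hypothetical stand-in for a ZODB connection: maps oid -> object."""

    def __init__(self, objects):
        self._objects = objects
        self.loads = []  # records every oid that was actually fetched

    def __getitem__(self, oid):
        self.loads.append(oid)
        return self._objects[oid]


class LazyList:
    """Sketch of a lazy sequence: holds oids, loads objects on access."""

    def __init__(self, connection, oids):
        self._connection = connection
        self._oids = list(oids)

    def __len__(self):
        return len(self._oids)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing only copies oids; nothing is loaded
            return LazyList(self._connection, self._oids[index])
        return self._connection[self._oids[index]]

    def getOids(self):
        return self._oids[:]


conn = FakeConnection({"a": "apple", "b": "banana", "c": "cherry"})
items = LazyList(conn, ["a", "b", "c"])
tail = items[1:]   # a new lazy list; no object has been loaded yet
first = tail[0]    # loads only the object stored under oid "b"
```

Taking the length or a slice touches only the oid list, so conn.loads stays empty until an individual item is indexed.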

So, what is this for exactly? This is the start of a query engine for
the ZODB (and Zope). Somebody posted a Python OQL parser the other day,
so I thought I would work from the other end. Eventually, this code can
be used for searching and indexing objects in the ZODB. Basically you
could use it anytime you need to walk over an area of the ZODB.

Enjoy. As this is certainly alpha code, please let me know if you find
anything anomalous.
-- 
| Casey Duncan
| Kaivo, Inc.
| cduncan@kaivo.com
`------------------>
Attachment: FindRefs.py

from ZODB.referencesf import referencesf
from string import join

def getObjectReferenceOids(ob):
    """Get the oids of all objects reachable from ob (including ob itself)"""

    oids = [ob._p_oid]            # stack of oids still to visit
    pop_oid = oids.pop
    version = ob._p_jar._version
    load = ob._p_jar._storage.load
    references = {}               # oids already visited, used as a set
    has_reference = references.has_key

    while oids:
        oid = pop_oid()
        if has_reference(oid): continue
        # Get the pickle and see what objects it references
        p, serial = load(oid, version)
        references[oid] = None
        # referencesf appends the oids found in the pickle to the stack
        referencesf(p, oids)

    return references.keys()

class LazyObjectList:
    __allow_access_to_unprotected_subobjects__ = 1

    def __init__(self, connection, oids):
        self._connection = connection
        self._oids = oids

    def __len__(self):
        return len(self._oids)

    def __getitem__(self, index):
        return self._connection[self._oids[index]]

    def __getslice__(self, low, high):
        return LazyObjectList(self._connection, self._oids[low:high])

    slice = __getslice__

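    def __add__(self, other):
        # Only an object that looks like a LazyObjectList can be added.
        # The attribute lookup is kept separate from the connection check
        # so that the ValueError below is not swallowed by a bare except.
        try:
            other_connection = other._connection
            other_oids = other._oids
        except AttributeError:
            raise TypeError("Incompatible types for add operator")

        if self._connection is not other_connection:
            raise ValueError("Cannot add object lists from different connections")

        return LazyObjectList(self._connection, self._oids + other_oids)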


    def getOid(self, index):
        return self._oids[index]

    def getOids(self):
        return self._oids[:]

    def __repr__(self):
        reprs = []
        for ob in self:
            reprs.append(`ob`)
        return '[%s]' % join(reprs, ', ')
    

def getObjectReferences(ob):
    """Get all objects referenced in ob"""
    return LazyObjectList(ob._p_jar, getObjectReferenceOids(ob))

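
The traversal in getObjectReferenceOids is an iterative depth-first walk with a visited set. Here is a minimal sketch of that idea in modern Python 3, runnable without a ZODB. The names reachable_oids and references_of are hypothetical: references_of is assumed to be any callable that maps an oid to the oids its record refers to, standing in for the storage load() call plus referencesf().

```python
def reachable_oids(root_oid, references_of):
    """Return every oid reachable from root_oid, including root_oid.

    references_of is a hypothetical callable mapping an oid to the oids
    its record refers to; it stands in for load() plus referencesf().
    """
    pending = [root_oid]   # stack of oids still to visit
    seen = set()
    while pending:
        oid = pending.pop()
        if oid in seen:
            continue       # already visited; this also guards against cycles
        seen.add(oid)
        pending.extend(references_of(oid))
    return seen


# A small in-memory graph with a cycle (c -> a) and an unreachable node (z)
graph = {"a": ["b", "c"], "b": ["d"], "c": ["a", "d"], "d": [], "z": ["a"]}
found = reachable_oids("a", graph.__getitem__)
```

Because each oid is marked as seen before its references are pushed, every record is read at most once, even when objects refer back to one another.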