[ZODB-Dev] Feature Request 2381 : Persistent object iterator in Storage

Greg Ward gward@mems-exchange.org
Fri, 6 Jul 2001 17:00:41 -0400


--X1bOJ3K7DJ5YkBrT
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On 06 July 2001, John D. Heintz said:
> Just wanted to share with everyone a feature request I've just logged.  
> Please provide any feedback/criticism.
> 
> ----------
> In order to support bulk database conversion scripts we need to be able to 
> iterate over all the objects in a given Storage.
> 
> Please provide some useful way to do this that is defined in BaseStorage.py.

Yes, I would like to see an official API for this too.  I'll attach a
script I wrote to walk over an entire ZODB by incrementing OIDs.  It's
more interesting than useful, but it *is* interesting.  ;-)  It won't
work out of the box because it relies on our init_database(),
get_connection(), and get_database() utility routines.  Exercise for the
reader, etc. etc.
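
For anyone who wants the idea without our utility routines, here is a minimal sketch of the same OID-incrementing walk. It assumes only that OIDs are 8-byte big-endian strings and that the "connection" is some mapping from packed OID to object that raises KeyError for an unused OID. The names oid_to_bytes(), census(), and fake_conn are made up for illustration and are not part of ZODB; a plain dict stands in for a real connection.

```python
from collections import Counter
from struct import pack


def oid_to_bytes(oid):
    """Pack an integer OID as an 8-byte big-endian string."""
    return pack(">Q", oid)


def census(conn, max_oid):
    """Count objects by type name for OIDs 0 .. max_oid - 1.

    `conn` is a hypothetical stand-in: any mapping from packed OID to
    object that raises KeyError when no object has that OID.
    """
    counts = Counter()
    empty_slots = 0
    for oid in range(max_oid):
        try:
            obj = conn[oid_to_bytes(oid)]
        except KeyError:
            empty_slots += 1          # no object stored under this OID
        else:
            counts[type(obj).__name__] += 1
    return counts, empty_slots


# A plain dict plays the role of the connection; OID 2 is left unused.
fake_conn = {
    oid_to_bytes(0): {},
    oid_to_bytes(1): [],
    oid_to_bytes(3): {},
}
counts, empty_slots = census(fake_conn, 4)
for name in sorted(counts):
    print("%-25.25s %10d" % (name, counts[name]))
print("empty slots seen: %d" % empty_slots)
```

Note that the walk only sees OIDs below max_oid, so if the database has unused OIDs, the number of OIDs to try has to be larger than the number of objects.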

(BTW, when you say "Pid", do you really mean "OID"?  Or am I missing
something?)

        Greg
-- 
Greg Ward - software developer                gward@mems-exchange.org
MEMS Exchange                            http://www.mems-exchange.org

--X1bOJ3K7DJ5YkBrT
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="zodb_census.py"

#!/www/python/bin/python

"""zodb_census

Inspect every object in a ZODB and count how many times each type
occurs.  (Note that each ExtensionClass is a separate type, so
we'll get a class-by-class breakdown from this.)
"""

import sys
from struct import pack, unpack
from mems.lib.base import init_database, get_connection, get_database

init_database()
conn = get_connection()
oid = 0L
empty_slots = 0L
total_objects = 0L                      # number of actual objects seen
object_count = {}                       # maps type name to count

expected_count = get_database().objectCount()
print "expecting to see %d objects" % expected_count
try:
    while 1:
        if (oid % 0x0800) == 0:
            sys.stdout.write("\rOID: %016x" % oid)
            sys.stdout.flush()

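        # An OID is an 8-byte big-endian string: pack the high and low
        # 32-bit halves of the 64-bit integer.
        oid_s = pack(">LL", (oid >> 32) & 0xffffffffL, oid & 0xffffffffL)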
        try:
            object = conn[oid_s]
        except KeyError:
            #print "%016x  *empty slot*" % oid
            empty_slots += 1
        else:
            #print "%016x  %s" % (oid, `object`)
            typename = type(object).__name__
            object_count[typename] = object_count.get(typename, 0L) + 1
            total_objects += 1

        oid += 1
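        # Stop after trying as many OIDs as the database reports objects.
        # If there are empty slots, objects with higher OIDs are not visited.
        if oid >= expected_count: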
            sys.stdout.write("\rOID: %016x\n" % oid)
            print "census completed"
            break
        
except KeyboardInterrupt:
    sys.stdout.write("\rOID: %016x\n" % oid)
    print "census interrupted prematurely"

print "total OIDs attempted: %d" % oid
print "empty slots seen: %d" % empty_slots
print "actual objects seen: %d" % total_objects
typenames = object_count.keys()
typenames.sort()
print "objects seen by type:"
for name in typenames:
    print "%-25.25s %10d" % (name, object_count[name])

--X1bOJ3K7DJ5YkBrT--