[ZODB-Dev] ZODB idioms

Ury Marshak um@hottech-israel.com
Sat, 22 Jun 2002 18:47:31 +0200


(Is this list OK with postings related not to development _of_ ZODB,
but to development _with_ ZODB? ;)


Now to the question at hand - the 'sequential numbers' idiom.
Being new to ZODB (and object DBs in general) I'm sure there
are other options that I missed, but these are the two approaches
I've come up with - please comment on them, because for a newbie
their [dis]advantages are not immediately obvious, especially
what would happen with a large number of objects...

Let's assume we are writing a bug-tracking application. The app
will have to be multiuser, so it seems that a ZEO server will be
serving several stations. Let's say we want to keep 'BugReport'
objects together somehow. For a large number of objects a list
would be very inefficient, so we would probably use some sort
of a BTree.

    from BTrees.IOBTree import IOBTree

    zodb_root = ...
    if not zodb_root.has_key('AllBugReports'):
        zodb_root['AllBugReports'] = IOBTree()
        get_transaction().commit()
    bugreports_tree = zodb_root['AllBugReports']

Now the BugReport will need a unique number associated with it.
One option is to use it as a key in a tree:

    brep = BugReport('Your software is broken!', tester115, '15 Jan')
    br_id = ... # our guess at the current maximum id
    while not bugreports_tree.insert(br_id, brep):
        br_id += 1
    #   btw, do we need a commit here, or is 'insert' autocommiting for us?

It seems that this isn't a good idea, since the tree would become
extremely unbalanced (I haven't read the BTrees C source - could it
be rebalancing trees behind the scenes? Or is there a method to do
it?)
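
To make the insert-and-retry loop above concrete, here is a minimal
sketch that runs without ZODB. ToyTree is a hypothetical plain-dict
stand-in, not the real IOBTree; it only mimics the 'insert' return
convention (1 if the key was added, 0 if it was already present) and
'maxKey'. The helper name add_with_next_id and the choice of "largest
key plus one" as the initial guess are assumptions made for
illustration.

```python
class ToyTree:
    """Plain-dict stand-in for an IOBTree (hypothetical, illustration only)."""

    def __init__(self):
        self._data = {}

    def insert(self, key, value):
        """Return 1 if the key was added, 0 if it was already taken."""
        if key in self._data:
            return 0
        self._data[key] = value
        return 1

    def maxKey(self):
        """Return the largest key currently stored."""
        return max(self._data)

    def __len__(self):
        return len(self._data)


def add_with_next_id(tree, report):
    """Store report under the first free id at or above the guess."""
    # Assumed guess: one past the largest key, or 1 for an empty tree.
    br_id = tree.maxKey() + 1 if len(tree) else 1
    while not tree.insert(br_id, report):
        br_id += 1
    return br_id


tree = ToyTree()
ids = [add_with_next_id(tree, "report %d" % n) for n in range(3)]
print(ids)
```

This only models a single process. It says nothing about what happens
when two ZEO clients pick the same id before either has committed.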

The other option would seem to be to generate the tree keys randomly
and keep a separate 'NumberKeeper' object to keep track of the highest
report number.

initializing the database:

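    import ZODB
    from Persistence import Persistent   # assumed import path; it varies between ZODB releases

    class NumberKeeper(Persistent):   # persistent, so changes to num get saved
        pass
    nk = NumberKeeper()
    nk.num = 1
    zodb_root['BugReportsNumberKeeper'] = nk
    get_transaction().commit()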

creating new object:

    while 1:
        try:
            #     get next number
            br_id = nk.num
            nk.num += 1

            #     create object
            brep = BugReport('Your software has a virus!', tester5, '22 Jan')
            brep.rep_num = br_id

            #     try to commit, should raise a ConflictError if somebody
            #     had already modified nk.num
            rand_id = randrange(0, ......     # maxint? also have to handle conflicts ...
            bugreports_tree[rand_id] = brep
            get_transaction().commit()

        except ConflictError:
            get_connection().sync()   #   do we need this or is it synced
                                      #   automatically on ConflictError?
        else:
            break
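
The retry-on-conflict pattern above can be tried out without a ZEO
server. The following is only a toy model, assuming a single shared
counter guarded by a version stamp: ToyStore, its snapshot/commit
methods, the local ConflictError class and the 'interfere' hook are all
hypothetical names invented for this sketch, not ZODB API. It shows
that re-reading the counter after a failed commit gives numbers with
no gaps and no duplicates.

```python
class ConflictError(Exception):
    """Stand-in for ZODB's ConflictError (hypothetical, stdlib only)."""


class ToyStore:
    """Shared counter plus reports; a commit fails if the snapshot is stale."""

    def __init__(self):
        self.version = 0
        self.next_num = 1
        self.reports = {}

    def snapshot(self):
        """Return the version stamp and the counter value seen by a client."""
        return self.version, self.next_num

    def commit(self, seen_version, num, report):
        """Store the report, or raise if somebody committed in between."""
        if seen_version != self.version:
            raise ConflictError("counter changed since snapshot")
        self.reports[num] = report
        self.next_num = num + 1
        self.version += 1


def add_report(store, text, interfere=None):
    """Retry until the commit succeeds; return the number assigned."""
    while True:
        seen_version, num = store.snapshot()
        if interfere is not None:
            interfere()        # simulate another client committing first
            interfere = None
        try:
            store.commit(seen_version, num, text)
        except ConflictError:
            continue           # stale snapshot: re-read and try again
        return num


store = ToyStore()
first = add_report(store, "Your software is broken!")
# A second client sneaks in between our snapshot and our commit.
second = add_report(store, "Your software has a virus!",
                    interfere=lambda: add_report(store, "Intruding report"))
print(first, second, sorted(store.reports))
```

In this model the second call loses the race once, retries, and ends up
with number 3, while the intruding report keeps number 2.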


Are there other possibilities (considering that skipping some numbers
or having duplicates is not an option)? Which is the best approach?
How well will it scale to hundreds of thousands of objects?
Which is going to be easier to use?

Thanks for bearing with me,
Ury