[ZODB-Dev] ZODB queries (was ZODB and ORBit-Python problems)

Greg Ward gward@mems-exchange.org
Tue, 22 May 2001 19:26:22 -0400


On 22 May 2001, Christian Robottom Reis said:
> Well, AFAICS a pickle is quite opaque - you have to read it all before you
> figure out what it contains. Structured storage would mean saving things
> organized so you could look into the object and read its data without
> reading it all. I think pickling doesn't let us do this, but I don't
> know if this matters _too much_ performance-wise.

It seems to me that the opacity of pickles is only a problem if you
don't have indices: because pickles are opaque, reading large swaths of
the database means reading and unpickling a large number of objects.  If
you could avoid the overhead of unpickling (parse the pickle format,
create instances, fill in their dict), you'd be down to just I/O
overhead, plus whatever overhead your mythical "structured storage"
imposes.  Is this a win?  Who knows, until you break down the cost of
reading objects into I/O overhead and unpickling overhead.
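
To make that breakdown concrete, here is a rough sketch of how one might
time the two phases separately.  It is not ZODB code: it uses only the
standard pickle module, plain dicts stand in for instance state, the
length-prefixed record layout is made up for illustration, and
measure() is a hypothetical helper.  The buffer lives in memory, so
the "read" figure understates real disk I/O; treat the numbers as
relative at best.

```python
import io
import pickle
import time


def measure(count=10000):
    """Time raw reads and unpickling separately for `count` small records."""
    # Serialize each record on its own, roughly as an object database would.
    buf = io.BytesIO()
    for i in range(count):
        state = {"id": i, "name": "record-%d" % i, "tags": ["a", "b", "c"]}
        blob = pickle.dumps(state)
        buf.write(len(blob).to_bytes(4, "big"))  # made-up 4-byte length prefix
        buf.write(blob)
    buf.seek(0)

    # Phase 1: read the raw bytes only (the "just I/O" cost).
    t0 = time.perf_counter()
    raw = []
    while True:
        header = buf.read(4)
        if not header:
            break
        raw.append(buf.read(int.from_bytes(header, "big")))
    t1 = time.perf_counter()

    # Phase 2: parse the pickles and rebuild the objects.
    objects = [pickle.loads(blob) for blob in raw]
    t2 = time.perf_counter()

    return {"read_s": t1 - t0, "unpickle_s": t2 - t1, "objects": len(objects)}


if __name__ == "__main__":
    print(measure())
```

Pointing the same loop at a real file, with a cold cache, would give a
fairer picture of the I/O side.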

However, in the presence of a useful indexing infrastructure, I don't
think it matters.  As I see it, indices would be an alternative path
through the object graph to supplement the two that we now have (direct
lookup by OID on Connection objects, or playing follow-the-pointers from
the database root).  Constructing an index from scratch would be as
expensive as traversing the whole DB to find something, but that's no
surprise.  Once the index is constructed (and maintained with every
update!), then unpickling overhead is immaterial -- you only unpickle
the objects that the index tells you you need.
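
A toy sketch of what I mean, again using only the standard pickle
module rather than anything in ZODB.  IndexedStore and all of its
methods are hypothetical names invented for this example; the store
keeps one pickle per OID plus a single field index that is updated on
every store(), and a counter shows how many objects each access path
has to unpickle.

```python
import pickle


class IndexedStore:
    """Toy object store: OID -> pickle, plus one maintained field index."""

    def __init__(self, field):
        self.field = field
        self.pickles = {}    # oid -> pickled state (a plain dict here)
        self.index = {}      # field value -> set of oids
        self.unpickled = 0   # loads performed by load(), query() and scan()

    def store(self, oid, state):
        # Maintain the index on every update: drop the stale entry first.
        if oid in self.pickles:
            old = pickle.loads(self.pickles[oid])[self.field]
            self.index[old].discard(oid)
        self.pickles[oid] = pickle.dumps(state)
        self.index.setdefault(state[self.field], set()).add(oid)

    def load(self, oid):
        # Direct lookup by OID; every call pays for one unpickle.
        self.unpickled += 1
        return pickle.loads(self.pickles[oid])

    def query(self, value):
        # Indexed path: unpickle only the objects the index points at.
        return [self.load(oid) for oid in sorted(self.index.get(value, ()))]

    def scan(self, value):
        # Unindexed path: unpickle everything, then filter.
        states = [self.load(oid) for oid in sorted(self.pickles)]
        return [s for s in states if s[self.field] == value]
```

With 100 objects of which 10 match, query() unpickles 10 and scan()
unpickles all 100, which is the whole argument in miniature.  A real
index would also have to live in the database and survive transactions,
which this sketch ignores.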

        Greg
-- 
Greg Ward - software developer                gward@mems-exchange.org
MEMS Exchange                            http://www.mems-exchange.org