[Zope-dev] objectValues performance

Ender kthangavelu@earthlink.net
Tue, 28 Nov 2000 18:04:07 -0800


Casey Duncan wrote:
> 
> Brett Carter wrote:
> 
> > Ok, I'll bite.  Why doesn't the standard folder scale?  Seems like a
> > design flaw to me - why doesn't the default folder use catalogs or BTrees?
> > -Brett

i believe folders store objects as attributes, which implies a hash
lookup in the instance __dict__ rather than a linear search.
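
a minimal sketch of what i mean, in plain python. ToyFolder and its add()
method are made up for illustration and are not zope's actual folder
class; the only point is that objects set as attributes end up in the
instance __dict__, so fetching one by id is a dictionary lookup.

```python
class ToyFolder:
    """hypothetical stand-in for a standard folder (not zope code)."""

    def add(self, id, ob):
        # store the child as a plain instance attribute
        setattr(self, id, ob)
        return id


folder = ToyFolder()
folder.add("doc1", "first document")
folder.add("doc2", "second document")

# the children live in the instance __dict__, keyed by id
print(folder.__dict__)

# attribute access and getattr() both go through that hash lookup
print(folder.doc1)
print(getattr(folder, "doc2"))
```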

some quick conclusions: the btree folder is slower on average for
creation by anywhere from 15-25%, and the btree folder's __getattr__ is
on average almost twice as slow as the standard folder's getattr.

btree folders are great, but they currently pay a performance penalty in
exchange for larger folders that behave better during concurrent
changes. the penalty is mainly because they implement a __getattr__ hook
in python instead of c. i did some tests a few months ago (posted to
zope-dev) and i believe the penalty varied depending on the operation:
approx 15-25% slower on object creation and 40-50% slower on access.
unless you know you'll have a large number of objects or lots of
concurrent changes, you'd be better off with a standard folder. as for
what constitutes a large number, i'm not sure; it depends on object size
and number of objects. during my tests i didn't see any changes when
scaling the numbers up to 5000, but granted my objects were lightweight
and poor examples of anything resembling the real world.
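
to illustrate where a python-level __getattr__ hook costs time, here is a
rough sketch. PlainFolder and HookFolder are hypothetical classes, and a
plain dict stands in for the btree, so this is not the BTreeFolder
implementation and the timings it prints are not the numbers from my
earlier tests. it only shows that the hook runs after the normal
attribute lookup has already failed, which adds a python function call
to every access.

```python
import timeit


class PlainFolder:
    """children are ordinary instance attributes."""


class HookFolder:
    """children live in a separate mapping, reached via __getattr__."""

    def __init__(self):
        self._tree = {}  # plain dict standing in for a btree

    def __getattr__(self, name):
        # only called after normal attribute lookup has failed
        try:
            return self._tree[name]
        except KeyError:
            raise AttributeError(name)


plain = PlainFolder()
hooked = HookFolder()
for i in range(1000):
    setattr(plain, "ob%d" % i, i)
    hooked._tree["ob%d" % i] = i

# time the same lookup through both paths
t_plain = timeit.timeit(lambda: plain.ob500, number=20000)
t_hook = timeit.timeit(lambda: hooked.ob500, number=20000)
print("plain attribute:  %.4fs" % t_plain)
print("__getattr__ hook: %.4fs" % t_hook)
```

the absolute numbers will vary by machine and python version, so treat
them as a relative comparison only.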

regarding objectIds: in this context it's the difference between
dict.keys() and dict.items().
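
a toy version of that difference, under the assumption that touching an
object is the expensive part. ToyFolder and its loads counter are
invented for this example; only the method names are borrowed from the
folder api.

```python
class ToyFolder:
    """toy stand-in for a folder; counts how many objects get touched."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.loads = 0

    def _load(self, id):
        # stands in for pulling an object into memory
        self.loads += 1
        return self._objects[id]

    def objectIds(self):
        # like dict.keys(): ids only, no objects touched
        return list(self._objects.keys())

    def objectItems(self):
        # like dict.items(): every object is touched
        return [(id, self._load(id)) for id in self._objects]


folder = ToyFolder({"doc1": "a", "doc2": "b", "doc3": "c"})

ids = folder.objectIds()
loads_after_ids = folder.loads

items = folder.objectItems()
loads_after_items = folder.loads

print(ids, loads_after_ids)
print(items, loads_after_items)
```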

my catalog suggestion was under the assumption that getting a filtered
set from the catalog would allow you to do dict[id] on individual items
in the set, using the metadata to perform any runtime application
filtering computations.
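
roughly what i had in mind, sketched with plain dicts. catalog_metadata,
folder and filtered_ids are all hypothetical names and this is not the
ZCatalog api; it just shows filtering on metadata first and then doing
dict[id] only for the ids that survive.

```python
# hypothetical catalog metadata: id -> small dict of indexed values
catalog_metadata = {
    "doc1": {"title": "spam", "size": 120},
    "doc2": {"title": "eggs", "size": 40},
    "doc3": {"title": "ham", "size": 300},
}

# hypothetical folder contents: id -> the (possibly heavy) object itself
folder = {
    "doc1": "<full object doc1>",
    "doc2": "<full object doc2>",
    "doc3": "<full object doc3>",
}


def filtered_ids(metadata, predicate):
    """return the ids whose metadata passes the filter, without
    touching the objects themselves."""
    return sorted(id for id, meta in metadata.items() if predicate(meta))


# filter on metadata only
big_ids = filtered_ids(catalog_metadata, lambda meta: meta["size"] > 100)

# then fetch just those objects by id
big_objects = [folder[id] for id in big_ids]

print(big_ids)
print(big_objects)
```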

cheers 

kapil


> AFAIK a standard folder uses a linear search when you request an object from it
> (ala Python dictionaries, someone please correct me if I'm wrong). This works
> great except that the search time grows linearly (by n) as you add objects. The
> BTreeFolder as the name implies creates a binary tree of the objects where the
> search time grows by only log n. For small folders the search time difference
> is minimal to non-existent, but as n increases the BTreeFolder search time
> increases minimally. B-trees are fairly complex entities to manage and for the
> vast majority of folders are total overkill. That is why standard folders work
> the way they do, the implementation is simple and efficient for 99.9% of
> applications. Your case is fairly atypical of most Zope folders.
> 
> Perhaps a future implementation of Zope folders could automatically use a
> b-tree after a certain threshold is reached, for now you must explicitly select
> them.
> 
> Andy's idea of using objectIds instead of objectValues is also a good one which
> will save significant amounts of memory. You can always access each object
> individually via id if you need to. Using a ZCatalog could also help in this
> because you can query the objects without loading them into memory and the
> returned result does not load the objects themselves, only the meta-data and
> only once a result item is explicitly accessed (By using so-called lazy
> sequences). However the catalog will not speed up your actual object access
> time unless you divide them up amongst several folders or use a BTreeFolder.
> The latter being a simpler solution from a design standpoint.
> 
> Good luck!
> 
> Casey Duncan