[ZODB-Dev] What makes the ZODB slow?

Sat Jun 24 02:29:25 EDT 2006

On Fri, 2006-06-23 at 21:02 +0200, Dieter Maurer wrote:
> Roché Compaan wrote at 2006-6-22 21:53 +0200:
> > ...
> >What overhead does undo add to performance?
> 
> Very little -- apart from a fast-growing storage file.
> 
> However, the append-only log behaviour of "FileStorage" means that
> you get a very different notion of locality.
> 
> In a relational database, records in the same table have
> some chance of being near one another. With a "FileStorage",
> records modified in the same transaction are near one another.
> 
> In general, locality in a "FileStorage" is much poorer than
> in a relational database. This means that the equivalent
> of a "full table scan" would be much less efficient.
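
The locality point can be illustrated with a toy model. This is only a sketch and does not use ZODB or FileStorage: the object kinds, the transaction sizes and the "one contiguous run is roughly one seek" cost model are all assumptions made up for the illustration.

```python
import random

random.seed(0)
KINDS = ("invoice", "customer", "product")  # hypothetical object types

# Each transaction touches a few objects of mixed kinds.
transactions = [[random.choice(KINDS) for _ in range(4)] for _ in range(250)]

# Log layout: records land in the file in commit order.
log_layout = [kind for txn in transactions for kind in txn]

# Table layout: records of one kind are stored together.
table_layout = sorted(log_layout)


def runs(layout, kind):
    """Count contiguous runs of `kind`; each run stands in for one seek."""
    count = 0
    previous = None
    for current in layout:
        if current == kind and previous != kind:
            count += 1
        previous = current
    return count


print("runs in log layout:  ", runs(log_layout, "invoice"))
print("runs in table layout:", runs(table_layout, "invoice"))
```

In the table layout all records of one kind form a single run, while in the log layout they are scattered across many small runs, which is what makes a scan over "all invoices" expensive.
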
> 
> >Can state be serialised more economically to reduce disk IO?
> 
> Sure: the ZODB uses a very bulky serialization format:
> 
>   Each object contains the full path to its class
>   and the state is described in a self contained way
>   (explicitly naming all attributes and their value).
> 
>   This gives you lots of redundancy (compared to a relational
>   system where the field structure is not carried in each row).
> 
> For your most frequent object types, you may work with slots
> rather than dicts (this means that the class determines the fields,
> not each individual object).
> 
> The newest pickle formats can also handle the class references
> a bit more efficiently -- at least when a single transaction
> modifies many objects of the same class.
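
The redundancy of a self-describing state can be measured with the standard pickle module alone. This is a rough sketch, not ZODB code: FIELDS and the sample records are invented, each record is pickled separately as a stand-in for one stored object, and a positional tuple is just one assumed way of letting the class, rather than each object, determine the fields.

```python
import pickle

FIELDS = ("title", "author", "year")  # hypothetical record layout


def dict_state(record):
    """Self-describing state: every record repeats its attribute names."""
    return dict(zip(FIELDS, record))


def tuple_state(record):
    """Positional state: the field names live once, in FIELDS."""
    return tuple(record)


# Made-up sample data.
records = [("Title %d" % i, "Author %d" % i, 2000 + i % 7) for i in range(1000)]

# Pickle each record on its own and add up the sizes.
dict_bytes = sum(len(pickle.dumps(dict_state(r), protocol=2)) for r in records)
tuple_bytes = sum(len(pickle.dumps(tuple_state(r), protocol=2)) for r in records)

print("dict state: ", dict_bytes, "bytes")
print("tuple state:", tuple_bytes, "bytes")
```

The difference is the cost of naming every attribute in every record. The full class path stored with each object is not modelled here.
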
> 
> >Is the ZODB really slow
> 
> For highly structured data, the ZODB is necessarily considerably slower than
> a (well designed) relational database.
> 
> That's because the relational database makes use of the "highly structured"
> property while the ZODB ignores it.
> 
> 
> We have an additional reason why object oriented databases
> tend to be considerably slower than relational ones:
> 
>   With a standard relational database, the querying operations
>   are executed on the server -- near to the data.
>   Relatively little data travels from the server to the client.
> 
>   Object oriented databases tend to have a stupid server --
>   one that knows only state but no behaviour.
>   Consequently, the server cannot do anything with the objects
>   it stores -- all operations must be done on the clients (which have
>   the behaviour). This means that the operations are not performed
>   near to the data and lots of data needs to travel from the server to
>   the client (often to be discarded there).
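
A small sketch of the difference in data movement, under loud assumptions: an in-memory sqlite3 database stands in for the relational server, a plain dict of pickles stands in for a server that only knows state, the customer data is made up, and pickled size is used as a crude proxy for bytes on the wire. No ZODB or ZEO code is involved.

```python
import pickle
import sqlite3

# Made-up data: (id, name, region)
rows = [(i, "customer %d" % i, i % 100) for i in range(10000)]

# Relational style: the predicate is evaluated inside the database engine,
# so only matching rows come back.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER, name TEXT, region INTEGER)")
db.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
sql_hits = db.execute(
    "SELECT id, name, region FROM customer WHERE region = ?", (42,)
).fetchall()
sql_bytes = sum(len(pickle.dumps(r, protocol=2)) for r in sql_hits)

# Object-store style: the store hands out opaque pickles by id, so the
# client must load every record and filter locally.
store = {r[0]: pickle.dumps(r, protocol=2) for r in rows}
loaded_bytes = 0
obj_hits = []
for oid, blob in store.items():
    loaded_bytes += len(blob)
    record = pickle.loads(blob)
    if record[2] == 42:
        obj_hits.append(record)

print("bytes returned by the query:   ", sql_bytes)
print("bytes loaded by the client scan:", loaded_bytes)
```

Both approaches find the same records, but the client-side scan has to pull in every record to do so, most of which are then discarded.
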

Thanks for your detailed answer. I guess I should make peace with a
hybrid backend if I want the best of both worlds.

-- 
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za