[Zope-dev] Alternative Storages: (was RelationalStorage (was LocalFS))

Michel Pelletier michel@digicool.com
Wed, 03 May 2000 23:43:56 -0700


Jimmie Houchin wrote:
> 
> As far as usage is concerned I generally like the ZODB best because it
> is reasonably transparent to the building of a web app with Zope.
> However there are areas in which it does not currently excel, and if
> your site requires those abilities then alternatives must be used. A
> couple of areas are data size and heavy writes.

This is often thought to be a deficiency in ZODB, but the root of this
particular problem is really in FileStorage.  Other storages could
handle write-intensive workloads much better.
 
> Some people use an RDBMS to solve these issues.

Right.

> While this will work it
> does expect more from the developer. Some do not have the skills or the
> tools. Even if one does it still requires leaving the transparency of
> developing with the ZODB.

In that case you use a relational database for an orthogonal benefit,
write intensity, instead of using a relational database to solve
relational tasks.
 
> Multiple file storage for the ZODB has been proposed as a solution

to a different problem.  The question is not using multiple file
storages in Zope, but just using multiple storages.  This way you could
use a FileStorage when you want its properties, BerkeleyStorage when you
want something different, or some sort of storage based on a relational
database at the same time in the same Zope system.  An analogy of
allowing Zope to 'mount' various storages into the object tree has been
proposed, but it's a very tricky problem and not the one that creating a
relational based storage will solve.
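
As a rough illustration of the 'mount' analogy only, here is a minimal
sketch that routes an object path to whichever storage is mounted at the
longest matching prefix.  The class name MountTableSketch and its methods
are hypothetical, plain dicts stand in for real storages, and the
genuinely tricky parts (transactions and references that cross storages)
are not addressed at all.

```python
class MountTableSketch:
    """Hypothetical mount table: path prefix -> storage-like object."""

    def __init__(self, root_storage):
        # The root storage is always mounted at "/".
        self.mounts = {"/": root_storage}

    def mount(self, prefix, storage):
        # Normalise so that "/logs" and "/logs/" name the same mount point.
        self.mounts[prefix.rstrip("/") + "/"] = storage

    def storage_for(self, path):
        if not path.startswith("/"):
            raise ValueError("path must be absolute: %r" % (path,))
        # Pick the longest mounted prefix that contains the path.
        candidate = path.rstrip("/") + "/"
        best = max(
            (p for p in self.mounts if candidate.startswith(p)),
            key=len,
        )
        return self.mounts[best]


root = {}        # stand-in for a FileStorage
write_heavy = {} # stand-in for a storage tuned for heavy writes
table = MountTableSketch(root)
table.mount("/logs", write_heavy)
table.storage_for("/logs/2000/05")["entry"] = "hit"
```

Everything under /logs lands in the second stand-in, and every other
path falls through to the root.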

> and
> there are 2 proposals currently on the ZODB ZWiki. I will add another.
> 
> Class/Object based db files.
> 
> Each class gets its own db file. This could be similar to the current
> ZODB file except specific to a class. As objects are created they are
> appended to the db file for their class. This could be somewhat
> analogous to tables in an RDBMS.
> 
> Advantages would be spreading out the data space over multiple files
> which would help with some oses. Also I think that each class has
> different characteristics which would be able to be managed better if
> separate.

I'm not much of an expert on RDBMSs, but we have thought this through
pretty thoroughly in-house, and this is not really the model we came up
with.

The interface of a storage object makes no assumptions about how objects
are stored.  A Storage is, in simple terms, a mapping from object id to
a pickle.  This can easily be a relational database table that is keyed
on object id and contains CLOBs or BLOBs or whatever that represent the
pickle.  When ZODB needs to resolve a reference to an object id into an
object, it selects out of the object table the pickle (or pickles, who
knows) it is looking for.  When 'writes' are done, ZODB inserts new
pickles with certain object ids.  Some extra columns containing backlink
references could allow undoing (and thus give the storage the same
fast-growing behavior as FileStorage).
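
To make the "mapping from object id to a pickle" idea concrete, here is
a minimal sketch using Python's standard sqlite3 and pickle modules.
This is not the real ZODB storage interface: the class name
RelationalStorageSketch, the table layout, and the store/load method
names are all assumptions made for illustration.

```python
import pickle
import sqlite3


class RelationalStorageSketch:
    """Toy mapping from object id to pickle, kept in one SQL table."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS objects ("
            " oid TEXT PRIMARY KEY,"
            " data BLOB NOT NULL)"
        )

    def store(self, oid, obj):
        # A 'write' inserts (or overwrites) the pickle for this object id.
        data = pickle.dumps(obj)
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO objects (oid, data) VALUES (?, ?)",
                (oid, data),
            )

    def load(self, oid):
        # Resolving a reference selects the pickle out of the object table.
        row = self.conn.execute(
            "SELECT data FROM objects WHERE oid = ?", (oid,)
        ).fetchone()
        if row is None:
            raise KeyError(oid)
        return pickle.loads(row[0])


storage = RelationalStorageSketch()
storage.store("0001", {"title": "Front page"})
print(storage.load("0001"))
```

Nothing in the table depends on the class of the stored object, which is
the point: the storage only ever sees ids and opaque pickles.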

Also, the reason FileStorage *can't* be made non-undoable (and so can't
avoid growing so rapidly) is that FileStorage just appends to the end of
a file.  A relational database does not have any such restriction, and a
relational Storage could be either undoable or not undoable.
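
A sketch of that choice, under loud assumptions: the class
RevisionStorageSketch, its keep_history flag, and the per-object undo
are invented for illustration (real undo in ZODB works on transactions,
not on single objects).  With history kept, every write adds a row and
the table grows; without it, old rows are dropped and undo is simply
unavailable.

```python
import pickle
import sqlite3


class RevisionStorageSketch:
    """Toy relational storage that is undoable only if keep_history is set."""

    def __init__(self, keep_history=True):
        self.keep_history = keep_history
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE revisions ("
            " oid TEXT NOT NULL,"
            " serial INTEGER NOT NULL,"
            " data BLOB NOT NULL,"
            " PRIMARY KEY (oid, serial))"
        )

    def _latest_serial(self, oid):
        row = self.conn.execute(
            "SELECT MAX(serial) FROM revisions WHERE oid = ?", (oid,)
        ).fetchone()
        return row[0]  # None when the object has no revisions

    def store(self, oid, obj):
        latest = self._latest_serial(oid)
        serial = 1 if latest is None else latest + 1
        with self.conn:
            if not self.keep_history:
                # Non-undoable mode: drop old revisions so the table stays small.
                self.conn.execute("DELETE FROM revisions WHERE oid = ?", (oid,))
            self.conn.execute(
                "INSERT INTO revisions (oid, serial, data) VALUES (?, ?, ?)",
                (oid, serial, pickle.dumps(obj)),
            )

    def load(self, oid):
        # The current state is the revision with the highest serial.
        row = self.conn.execute(
            "SELECT data FROM revisions WHERE oid = ?"
            " ORDER BY serial DESC LIMIT 1",
            (oid,),
        ).fetchone()
        if row is None:
            raise KeyError(oid)
        return pickle.loads(row[0])

    def undo(self, oid):
        # Undo discards the newest revision, exposing the previous one.
        if not self.keep_history:
            raise RuntimeError("this storage was created without history")
        latest = self._latest_serial(oid)
        if latest is None or latest == 1:
            raise ValueError("nothing to undo")
        with self.conn:
            self.conn.execute(
                "DELETE FROM revisions WHERE oid = ? AND serial = ?",
                (oid, latest),
            )
```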

I'm not sure how your suggestion would be better than this.  I don't
know much about RDBMSs; do they assume one file per table or per
database?  That sounds like an implementation issue.  Also, objects
would not be very evenly distributed across classes.  There would be
gobs of ImplicitAcquirerWrappers and very few
OFS.Application.Application.  On Zope.org there are hundreds of classes,
and over one hundred thousand objects in the database, not including
previous versions.  There could very well be over a million objects and
their previous revisions.
 
> Example:
> AutoParts
> You have an AutoParts class. The objects will change very little once
> created. However there are a lot of objects and new ones are added
> periodically.
> This file will need packed seldom. It will also be simple to backup and
> not need backed up often as changes are periodic and regular.
> 
> RetailStore
> In a retail store the product objects are very volatile. Vendors can
> change. Prices do change. A productObject file would have different
> usage characteristics than the AutoParts object.
> 
> Some classes are perfect for few writes and many reads. Others less so.
> 
> Earlier Andrew Kuchling was wanting to walk the object tree. This would
> provide a relatively easy way to walk the object tree.
> 
> This could be implemented with some support classes which have to be
> inherited from to create a class.db file. Any class not so doing would
> go into the standard ZODB. This could help provide desired management
> features for the characteristics of each. It would be nice if in the
> management you could set the path to the file. This would allow for
> multiple disks or partitions for data storage. This too would help with
> backups and such.
> 
> Just a few ideas. They may not stand up to examination, but that's okay.
> I just thought I would put them on the table.

It absolutely does make sense to analyze your problems like this.  Your
needs here could be met if Zope supported multiple storages.  Currently,
BerkeleyStorage has proven itself under high write intensity, and
FileStorage is, of course, wildly useful for often-read, seldom-written
objects.

A relational storage is also definitely needed.

I'm surprised that more people don't use BerkeleyStorage today.  Is this
because it doesn't undo?  I don't imagine it would be too difficult to
extend it to support undo.  Ty, what do you think?

-Michel