[Zope] large images to a database via zope.

marc lindahl marc@bowery.com
Tue, 17 Apr 2001 02:41:05 -0400


I'm glad someone is keeping this thread alive!

I did some tests, adding images of different sizes with Add Image, to see
what the ZODB bloat might be.  Results:

image size (on Linux FS)      ZODB bloat (Data.fs after - before - image size)
16203040 bytes (15.5MB)       32467 bytes
447464 bytes (437K)           1475 bytes
9287 bytes (9.1K)             788 bytes
799 bytes (one-pixel GIF)     940 bytes

I can't explain the last figure; it's a little odd that the one-pixel GIF
costs more than the 9.1K image...
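
For anyone who wants to repeat the measurement, here is a rough Python
sketch of what I did by hand.  The add_image callback is hypothetical: it
stands in for whatever actually adds the image to Zope (the Add Image form,
in my case), and the paths are whatever applies to your install.  Nothing
else should be writing to Data.fs while it runs.

```python
import os


def zodb_overhead(size_before, size_after, image_size):
    """Bytes added to Data.fs beyond the raw image data."""
    return size_after - size_before - image_size


def measure_overhead(data_fs_path, image_path, add_image):
    """Measure Data.fs growth caused by adding one image.

    add_image is a hypothetical callback that stores the file at
    image_path in Zope; it is an assumption, not a Zope API.
    """
    before = os.path.getsize(data_fs_path)
    add_image(image_path)
    after = os.path.getsize(data_fs_path)
    return zodb_overhead(before, after, os.path.getsize(image_path))
```

With the first row of the table, zodb_overhead(before, after, 16203040)
is the 32467 bytes reported there.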

In general, it's clear that ZODB is storing these things in raw binary
format.  This might seem obvious to everyone at Digicool, but I looked
around for 2 days for the answer to this question and couldn't find it...
I'm surprised no one has asked and answered this long ago (maybe I just
looked in the wrong places).

It also seems that the 'bloat' (the size added to the ZODB beyond the raw
data size) grows roughly linearly with image size, at least for the larger
images; the two small ones both cost a little under 1K, which looks more
like a fixed per-object overhead.  That's a very rough guess - hopefully
someone intimately familiar with Python and ZODB could provide a more
accurate answer.

For the two larger images the bloat is only 0.2% - 0.3% of the file size,
which is pretty insignificant.
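
Working the percentages out from the table above, as a quick Python
sketch (the numbers are the ones I measured; the names are just for
illustration):

```python
# (label, image size in bytes, ZODB bloat in bytes), from the table above
measurements = [
    ("15.5MB image", 16203040, 32467),
    ("437K image", 447464, 1475),
    ("9.1K image", 9287, 788),
    ("one-pixel GIF", 799, 940),
]


def overhead_percent(image_size, bloat):
    """ZODB bloat as a percentage of the raw image size."""
    return 100.0 * bloat / image_size


for label, image_size, bloat in measurements:
    print("%-14s %9d bytes  bloat %6d bytes  %8.2f%%"
          % (label, image_size, bloat, overhead_percent(image_size, bloat)))
```

The two big images come out at roughly 0.2% and 0.3%.  The 9.1K image is
about 8.5% and the one-pixel GIF is over 100%, but in absolute terms that
is still under 1K each.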

My summary from this is that it is perfectly valid to store binary stuff
(like images, audio files, etc.) in the ZODB, and for something in the range
of 10K - 5MB, it seems like there could be more benefits in storing them in
ZODB than in the local FS, not to mention SQL.  For example, there's no
risk of exceeding the maximum number of directory entries.  It's also easier
to back up.  You also get undo and versioning.

So much for size, now for performance.  Ethan, though Zope isn't 'optimized
for the rapid delivery of large binary objects', is it better at pulling
them out of a ZODB object than out of the local FS, or via a DB adapter?
If so, for any particular reason (multithreading, maybe)?

Why the hard timeout?  Wouldn't it make more sense to have a timeout based
on the inter-packet time (or can't that be seen from Zope)?
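
To make the distinction concrete, here is a toy Python sketch of the two
policies.  It is not how Zope or ZServer does it; the function names and
the 60-second idle limit are my own assumptions, and chunk_times is just a
list of seconds-since-request-start at which each chunk went out.

```python
def aborted_by_hard_timeout(chunk_times, hard_limit):
    """True if the last chunk would go out after the total time limit."""
    return bool(chunk_times) and chunk_times[-1] > hard_limit


def aborted_by_idle_timeout(chunk_times, idle_limit):
    """True if any gap between consecutive chunks exceeds idle_limit.

    The gap before the first chunk is measured from time 0.
    """
    previous = 0.0
    for t in chunk_times:
        if t - previous > idle_limit:
            return True
        previous = t
    return False


# A slow but steady client: one chunk every 10 seconds for 40 minutes.
steady = [10 * i for i in range(1, 241)]
# A client that stalls for several minutes early on.
stalled = [1, 2, 500]

print(aborted_by_hard_timeout(steady, 30 * 60))   # hard 30-minute limit
print(aborted_by_idle_timeout(steady, 60))        # assumed 60s idle limit
print(aborted_by_hard_timeout(stalled, 30 * 60))
print(aborted_by_idle_timeout(stalled, 60))
```

The steady download is killed by the hard limit but survives the idle
limit; the stalled one is the other way around, which is the behaviour I'd
have expected to want.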

Are there any how-tos or anything on proxy caches?  I noticed Linux comes
with one called Squid...  has anyone set up ZServer with it?



> From: ethan mindlace fremen <mindlace@digicool.com>
> 
> The zope.org Data.fs recently cleared 2GB in size, with something on the
> order of 200,000 objects.  We store a number of reasonably sized objects
> (1.5MB) in the ZODB, and while I am confident that the ZODB does not
> corrupt them, I do recommend using some proxy-cache technology as the ZODB
> is not optimized for the rapid delivery of large binary objects.  The other
> issue is that Zope currently has a "hard" timeout where if a request takes
> more than 30 minutes to send, it will give up ... which is another reason
> for the proxy-cache technology.
> 
> Until I see compelling evidence otherwise, my stance is that the ZODB is
> just as safe to store your data in as a relational database, and soon (with
> replicated storage and Berkeley storage) it will offer compelling
> scalability for read-predominant environments that I don't think any
> relational database can match for less than six figures.
> 
> --
> -mindlace-
> zopatista community liaison