[Zope] Size of Data.fs: how big is too big?

Richard Barrett R.Barrett@ftel.co.uk
Fri, 30 Nov 2001 11:23:50 +0000


At 14:09 29/11/2001 -0700, Paul Horbal wrote:
>Hi everyone,
>
>My Zope site has been growing fairly large lately and I'm beginning to 
>wonder at what point I should consider moving files out of the Zope filing 
>system and onto a static filesystem.
>
>At this point, I haven't noticed any performance issues at 
>all.  Currently, Data.fs is about 300 MB in size.  The site is running on 
>a Sun Netra X1 (400 MHz UltraSparc IIe) with 512 MB of RAM.
>
>Should I be worried about adding more large files into Data.fs?
>
>thanks,
>Paul.

Here is what I decided when faced with the same question. One of our Zope 
sites had a Data.fs of similar size to yours, and it was growing fairly 
rapidly. When I looked at the contents it was clear that a large part of 
the size came from big blob objects such as PDFs, GIFs and JPEGs, and 
various Microsoft product data files (Word and PowerPoint being favorites).

I wrote an external method which selectively decants the content of 
qualifying objects into ExtFile and ExtImage objects. The big blobs end up 
in the UNIX file system and the stub objects left in Data.fs are much 
smaller; in my case the Data.fs shrank from over 350 MB to less than 30 MB. 
I now run the decanting function regularly as well as advising content 
providers to use ExtFile/ExtImage for PDFs and such.
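
As a rough illustration of the decanting idea, here is a minimal sketch in 
plain Python. It is not the external method described above and does not 
use the Zope or ExtFile APIs: the `store` dictionary stands in for the 
object database, and the names `BLOB_TYPES`, `MIN_SIZE`, `qualifies` and 
`decant`, along with the content-type list and the 64 KB cut-off, are all 
assumptions made for the example.

```python
import os

# Hypothetical selection rules; the post does not state the real criteria.
BLOB_TYPES = {
    "application/pdf",
    "image/gif",
    "image/jpeg",
    "application/msword",
    "application/vnd.ms-powerpoint",
}
MIN_SIZE = 64 * 1024  # bytes; assumed cut-off for "big"


def qualifies(content_type, data, min_size=MIN_SIZE):
    """Return True if an object looks like a big, opaque blob."""
    return content_type in BLOB_TYPES and len(data) >= min_size


def decant(store, repository, min_size=MIN_SIZE):
    """Move qualifying blobs out of `store` into files under `repository`.

    `store` maps an object id to a dict with 'content_type' and 'data'
    keys. Each moved object is replaced by a small stub that records where
    the bytes now live. Object ids are assumed to be safe file names.
    """
    moved = []
    for oid, obj in list(store.items()):
        data = obj.get("data")
        # Stubs from an earlier run have no 'data', so re-running is harmless.
        if data is None or not qualifies(obj["content_type"], data, min_size):
            continue
        path = os.path.join(repository, oid)
        with open(path, "wb") as fh:
            fh.write(data)
        store[oid] = {
            "content_type": obj["content_type"],
            "path": path,
            "size": len(data),
        }
        moved.append(oid)
    return moved
```

Because stubs are skipped on later passes, a function along these lines 
can be run on a schedule, which matches the "run it regularly" approach.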

My rationale for this approach was twofold:

1. I couldn't see any real benefit of inflating Zope's object database with 
big blobs of fairly static, opaque data. Indeed, my guess was that it was 
more likely to damage the performance of Zope and inflate its process size, 
although I never tried to measure whether this was the case.

2. Big, fairly static blobs of opaque data adversely affect incremental 
backup performance for the file system containing the Data.fs. A one-byte 
change in any object modifies the Data.fs file itself, so all the big, 
unchanged blobs inside it also become candidates for being backed up yet 
again. With the big blobs out in the file system, incremental backups go a 
lot quicker.
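
The backup point can be sketched in a few lines of Python. This assumes a 
file-level incremental backup that picks up any file modified since the 
previous run; real backup tools vary, and the function name 
`backup_candidates` is made up for the example.

```python
import os


def backup_candidates(root, last_backup_time):
    """Return (relative paths, total bytes) of files under `root` whose
    modification time is later than `last_backup_time` (seconds since the
    epoch). This mimics a simple mtime-based incremental backup."""
    paths = []
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            if st.st_mtime > last_backup_time:
                paths.append(os.path.relpath(full, root))
                total += st.st_size
    return sorted(paths), total
```

Under that assumption, a single small write to Data.fs puts the whole file, 
blobs included, back on the list, whereas blobs stored as separate files 
stay off the list until they themselves change.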

This approach won't reduce the total disk space occupied - it wasn't 
intended to - but I figure that it plays to the strengths and away from the 
weaknesses of both the Zope object database and the UNIX file system.