[ZODB-Dev] Some interesting (to some:) numbers

Jim Fulton jim at zope.com
Tue May 11 14:36:46 EDT 2010


On Sun, May 9, 2010 at 4:59 PM, Roel Bruggink <roel at fourdigits.nl> wrote:
> On Sun, May 9, 2010 at 8:33 PM, Jim Fulton <jim at zope.com> wrote:
>>
>> Our recent discussion of compression made me curious so I did some
>> analysis of pickle sizes in one of our large databases. This is for a
>> content management system.  The database is packed weekly.  It doesn't
>> include media, which are in blobs.
>>
>> There were ~19 million transactions in the database and around 130
>> million data records. About 60% of the size was taken up by BTrees.
>> Compressing pickles using zlib with default compression reduced the
>> pickle sizes by ~58%. The average uncompressed record size was 1163
>> bytes.  The average compressed size was ~493 bytes.
>>
>> This is probably enough of a savings to make compression interesting.

...

> That's really interesting! Did you notice any issues performance wise, or
> didn't you check that yet?

OK, I did some crude tests.  It looks like compressing is a little
less expensive than pickling and decompressing is a little more
expensive than unpickling, which is to say this is pretty cheap.  For
example, decompressing a data record took around 20 microseconds on my
machine. A typical ZEO load takes tens of milliseconds. Even in Shane's
zodb shootout benchmark, which loads data from RAM, load times are
several hundred microseconds or more.
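
For anyone who wants to repeat this kind of crude test, here is a minimal
sketch using only the standard library. It is not the script behind the
numbers above. The record built by make_record() is a hypothetical stand-in
for a content object's state, and the function names are made up for
illustration, so expect different absolute timings on real data records.

```python
import pickle
import timeit
import zlib


def make_record(n_items=15):
    # Hypothetical stand-in for a persistent object's state; real ZODB
    # data records will differ in shape and size.
    return {
        "key-%04d" % i: {
            "title": "Document %d" % i,
            "tags": ["news", "draft"],
            "size": i * 17,
        }
        for i in range(n_items)
    }


def time_steps(record, number=2000):
    """Return sizes and average seconds per call for each step."""
    raw = pickle.dumps(record, protocol=2)
    packed = zlib.compress(raw)  # zlib's default compression level

    def per_call(fn):
        return timeit.timeit(fn, number=number) / number

    return {
        "raw_bytes": len(raw),
        "compressed_bytes": len(packed),
        "pickle_s": per_call(lambda: pickle.dumps(record, protocol=2)),
        "compress_s": per_call(lambda: zlib.compress(raw)),
        "unpickle_s": per_call(lambda: pickle.loads(raw)),
        "decompress_s": per_call(lambda: zlib.decompress(packed)),
    }


if __name__ == "__main__":
    results = time_steps(make_record())
    for name in sorted(results):
        print(name, results[name])
```

Comparing compress_s with pickle_s and decompress_s with unpickle_s gives
the same kind of rough comparison described above, for whatever record you
feed it.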

I don't think compression will hurt performance.  It is likely to
help it in practice because:

- There will be less data to send back and forth to remote servers.

- Smaller databases will get more benefit from disk caches.
  (Databases will be more likely to fit on SSDs.)

- ZEO caches (and RelStorage memcached caches) will be able to hold
  more object records.
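
To put a rough number on the cache point, here is a back-of-the-envelope
sketch using the average record sizes quoted earlier in the thread (1163
bytes uncompressed, ~493 bytes compressed). The 200 MB cache size is a
made-up example, and the calculation ignores any per-record overhead in
the cache, so treat the result as an upper bound rather than a measurement.

```python
AVG_RAW_BYTES = 1163        # average uncompressed record size from the thread
AVG_COMPRESSED_BYTES = 493  # approximate average compressed size from the thread


def records_that_fit(cache_bytes, avg_record_bytes):
    # Ignores per-record bookkeeping overhead, so this is optimistic.
    return cache_bytes // avg_record_bytes


def savings_fraction(raw_bytes, compressed_bytes):
    return 1.0 - float(compressed_bytes) / raw_bytes


if __name__ == "__main__":
    cache_bytes = 200 * 1024 * 1024  # hypothetical cache size
    plain = records_that_fit(cache_bytes, AVG_RAW_BYTES)
    packed = records_that_fit(cache_bytes, AVG_COMPRESSED_BYTES)
    print("uncompressed records:", plain)
    print("compressed records:  ", packed)
    print("ratio:               ", float(packed) / plain)
    print("size reduction:      ", savings_fraction(AVG_RAW_BYTES,
                                                    AVG_COMPRESSED_BYTES))
```

With those averages the same cache holds a bit more than twice as many
records, which matches the ~58% size reduction.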

Jim

--
Jim Fulton

