[ZODB-Dev] Some interesting (to some:) numbers

Adam GROSZER agroszer at gmail.com
Wed May 12 02:50:09 EDT 2010


Hello Jim,

Tuesday, May 11, 2010, 8:36:46 PM, you wrote:

JF> On Sun, May 9, 2010 at 4:59 PM, Roel Bruggink <roel at fourdigits.nl> wrote:
>> On Sun, May 9, 2010 at 8:33 PM, Jim Fulton <jim at zope.com> wrote:
>>>
>>> Our recent discussion of compression made me curious so I did some
>>> analysis of pickle sizes in one of our large databases. This is for a
>>> content management system.  The database is packed weekly.  It doesn't
>>> include media, which are in blobs.
>>>
>>> There were ~19 million transactions in the database and around 130
>>> million data records. About 60% of the size was taken up by BTrees.
>>> Compressing pickles using zlib with default compression reduced the
>>> pickle sizes by ~58%. The average uncompressed record size was 1163
>>> bytes.  The average compressed size was ~493 bytes.
>>>
>>> This is probably enough of a savings to make compression interesting.

JF> ...

>> That's really interesting! Did you notice any performance issues, or
>> haven't you checked that yet?

JF> OK, I did some crude tests.  It looks like compressing is a little
JF> less expensive than pickling and decompressing is a little more
JF> expensive than unpickling, which is to say this is pretty cheap.  For
JF> example, decompressing a data record took around 20 microseconds on my
JF> machine. A typical ZEO load takes tens of milliseconds. Even in Shane's
JF> zodb shootout benchmark, which loads data from RAM, load times are
JF> several hundred microseconds or more.
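
For anyone who wants to repeat that kind of crude test on their own data,
here is a minimal sketch. It is not the script used for the numbers above;
the sample record, the function name measure() and the repeat count are all
made up for illustration, and it only assumes the stdlib pickle and zlib
modules.

```python
import pickle
import time
import zlib


def measure(record, repeat=1000):
    """Return (raw_size, compressed_size, seconds_per_decompress).

    `record` is any picklable object standing in for a database record.
    """
    raw = pickle.dumps(record, protocol=2)
    packed = zlib.compress(raw)  # zlib's default compression level

    # Time only the decompression step, averaged over `repeat` calls.
    start = time.perf_counter()
    for _ in range(repeat):
        zlib.decompress(packed)
    per_call = (time.perf_counter() - start) / repeat

    return len(raw), len(packed), per_call


# Hypothetical record; real content objects will compress differently.
sample = {
    "title": "Example page",
    "body": "some fairly repetitive body text " * 30,
    "tags": ["news", "example", "draft"],
}

raw_size, packed_size, per_call = measure(sample)
saving = 1 - packed_size / raw_size
print("raw bytes:        %d" % raw_size)
print("compressed bytes: %d" % packed_size)
print("size reduction:   %.0f%%" % (saving * 100))
print("decompress time:  %.1f microseconds" % (per_call * 1e6))
```

By the same arithmetic, the averages quoted earlier (1163 bytes down to
~493 bytes) work out to 1 - 493/1163, or about 58%.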

JF> I don't think compression will hurt performance.  It is likely to
JF> help it in practice because:

JF> - There will be less data to send back and forth to remote servers.

JF> - Smaller databases will get more benefit from disk caches.
JF>   (Databases will be more likely to fit on SSDs.)

JF> - ZEO caches (and RelStorage memcached caches) will be able to hold
JF>   more object records.

I was thinking about using other compressors.
I found this:
http://tukaani.org/lzma/benchmarks.html
It looks like gzip/zlib is the fastest, at some cost in compression ratio.
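
That benchmark is not about pickles, so the picture may differ for
record-sized inputs. A rough way to compare compressors on something closer
to a data record is sketched below. Assumptions: a Python whose standard
library ships the bz2 and lzma modules (lzma only arrived in Python 3.3),
and a made-up sample record; CODECS and compare() are names invented for
this sketch.

```python
import bz2
import lzma
import pickle
import time
import zlib

# name -> (compress function, decompress function), all from the stdlib
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compare(data, repeat=200):
    """Return {name: (compressed_size, compress_secs, decompress_secs)}.

    Times are averages per call over `repeat` runs on the same input.
    """
    results = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(data)
        if decompress(packed) != data:
            raise ValueError("round trip failed for %s" % name)

        start = time.perf_counter()
        for _ in range(repeat):
            compress(data)
        compress_secs = (time.perf_counter() - start) / repeat

        start = time.perf_counter()
        for _ in range(repeat):
            decompress(packed)
        decompress_secs = (time.perf_counter() - start) / repeat

        results[name] = (len(packed), compress_secs, decompress_secs)
    return results


# Hypothetical record, pickled the same way a small persistent object might be.
data = pickle.dumps(
    {"title": "Example page", "body": "some fairly repetitive body text " * 30},
    protocol=2,
)

results = compare(data)
print("uncompressed: %d bytes" % len(data))
for name, (size, c_secs, d_secs) in results.items():
    print("%-5s %5d bytes  compress %7.1f us  decompress %7.1f us"
          % (name, size, c_secs * 1e6, d_secs * 1e6))
```

The numbers depend heavily on record size and content, so this is only
useful when run against a sample of real records.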

-- 
Best regards,
 Adam GROSZER                            mailto:agroszer at gmail.com
--
Quote of the day:
What this country needs is a good five-cent microcomputer.


