[ZODB-Dev] Changing the pickle protocol?

Jim Fulton jim at zope.com
Thu Apr 29 10:17:58 EDT 2010


On Wed, Apr 28, 2010 at 9:18 PM, Laurence Rowe <l at lrowe.co.uk> wrote:
> I suspect that something like 90% of ZODB pickle data will be string
> values, so the scope for reducing the space used by a ZODB through the
> newer pickle protocol – and even the class registry – is limited.

I disagree wrt the class registry.

In fact, to disagree with myself :), there *may* be a significant win
in just using the class registry for certain classes built into ZODB
and commonly used in ZODB applications, most notably BTrees. That
would require less effort than implementing a general registry.

> What would make a significant impact on data size is compression. With
> lots of short strings it's probably best to use a preset dictionary
> (which sadly does not seem to be exposed through the python zlib
> module). Text is usually very amenable to compression, and now we have
> blobs most binary data will no longer be in the Data.fs.
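
On the preset-dictionary idea in the quoted text: Python releases from
3.3 onward accept a zdict argument to zlib.compressobj() and
zlib.decompressobj(). A minimal sketch of how short records might benefit,
assuming a hand-made dictionary (PRESET, the helper names and the sample
record are all made up for illustration):

```python
import zlib

# Hypothetical preset dictionary: byte strings expected to recur in many records.
PRESET = b"BTrees.OOBTree\nOOBTree\npersistent.mapping\nPersistentMapping\n"


def compress_with_dict(data, zdict=PRESET):
    """Compress one record, priming zlib with the preset dictionary."""
    compressor = zlib.compressobj(zdict=zdict)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob, zdict=PRESET):
    """Reverse compress_with_dict(); the same dictionary must be supplied."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


# A made-up short record that shares substrings with the dictionary.
record = (b"cBTrees.OOBTree\nOOBTree\nq\x01."
          b"cpersistent.mapping\nPersistentMapping\nq\x02.")

with_dict = compress_with_dict(record)
without_dict = zlib.compress(record)
restored = decompress_with_dict(with_dict)

print(len(record), len(without_dict), len(with_dict))
```

The dictionary has to be available, unchanged, to every reader of the data,
so it effectively becomes part of the storage format.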

Maybe. The best way to find out is to do an experiment. The
experiment should be pretty easy.  Start with a representative
database and use the storage copying APIs (iteration and restore) to
make a copy with the data compressed and see what you end up with.
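
A rough sketch of the measuring half of such an experiment, using only the
standard library. The function name and the sample records are made up; with
a real database the records would presumably come from walking the storage
iterator (each transaction yields data records carrying the pickle bytes)
rather than from the generated list used here:

```python
import pickle
import zlib


def compression_report(records, level=6):
    """Return (raw_bytes, compressed_bytes) totals for an iterable of records.

    Each record is compressed on its own, as a per-record storage scheme
    would have to do.
    """
    raw = 0
    packed = 0
    for data in records:
        raw += len(data)
        packed += len(zlib.compress(data, level))
    return raw, packed


# Stand-in data: pickles of small text-heavy dicts. This is NOT representative
# of any real database; substitute records read from the storage under test.
sample = [
    pickle.dumps({"title": "Document %d" % i, "body": "lorem ipsum " * 50},
                 protocol=1)
    for i in range(100)
]

raw, packed = compression_report(sample)
print("raw=%d compressed=%d ratio=%.2f" % (raw, packed, packed / raw))
```

The numbers from generated data like this say nothing about a real Data.fs;
the point is only the shape of the measurement.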

> Compression could either be implemented on the database level (which
> is probably cleanest) or on the application level (which would also
> reduce the size of content objects in memory). This would bring clear
> wins where I/O or memory bandwidth are the limiting factors - CPUs
> spend most of their time waiting for data to be copied into their
> cache from memory.

The right place to implement compression is at the storage level.
This can most simply be done as a storage wrapper.
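
To illustrate the wrapper idea, here is a toy sketch. DictStorage is a
made-up in-memory stand-in, and its store()/load() methods are much simpler
than the real storage interface (no serials, no transactions). The ".z"
marker is an assumption of this sketch; it relies on "." being the pickle
STOP opcode, so that no valid pickle begins with it:

```python
import pickle
import zlib


class DictStorage:
    """Hypothetical in-memory storage, standing in for a real one."""

    def __init__(self):
        self._data = {}

    def store(self, oid, data):
        self._data[oid] = data

    def load(self, oid):
        return self._data[oid]


class CompressingWrapper:
    """Wrap a storage, compressing records on store and expanding on load."""

    PREFIX = b".z"  # marks a compressed record (assumed never to start a pickle)

    def __init__(self, base):
        self.base = base

    def store(self, oid, data):
        packed = self.PREFIX + zlib.compress(data)
        # Keep the original bytes when compression does not save space.
        self.base.store(oid, packed if len(packed) < len(data) else data)

    def load(self, oid):
        data = self.base.load(oid)
        if data.startswith(self.PREFIX):
            return zlib.decompress(data[len(self.PREFIX):])
        return data  # stored uncompressed


base = DictStorage()
storage = CompressingWrapper(base)

big = pickle.dumps({"body": "some text " * 200})
small = pickle.dumps(1)
storage.store(b"\x00" * 8, big)
storage.store(b"\x00" * 7 + b"\x01", small)

print(len(big), len(base.load(b"\x00" * 8)))
```

Because uncompressed records pass through untouched, a wrapper of this kind
can read data written before it was introduced.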

For applications that use ZEO, it would be nice to get ZEO to conspire
a bit so compressed data is sent over the wire. This makes the
implementation a little bit more complicated, but not much.

Of course, the right way to start is to do an experiment to see if
it's worth it. :)

Jim

--
Jim Fulton

