[ZODB-Dev] RFC: Python2 - Py3k database compatibility

Sun Apr 28 23:19:22 UTC 2013

On Wed, Apr 17, 2013 at 2:54 PM, Tres Seaver <tseaver at palladion.com> wrote:
> On 04/16/2013 05:13 PM, Stephan Richter wrote:
>> On Tuesday, April 16, 2013 04:38:06 PM Tres Seaver wrote:
>>> Comments?
>
> (I don't know why Stephan's e-mail didn't make it to the list).
>
>
>> The big omission that I noticed while reading the text carefully is a
>> note saying that you will never be able to use stock Py3k pickle,
>> because it does not support noload(). Thus ``zodbpickle`` is needed
>> for any Py3k code. (I think this is a correction to your last
>> bullet in _replace_py2_cPickle.)
>
> Hmm, I think you are correct.
>
>> That reminds me, originally we forked pickle.py from Python 3.3.
>> During PyCon I think you decided to start by using cPickle from Python
>> 2.7 instead. If you are starting from Py2.7 cPickle, then supporting
>> Protocol 3 is not easy.
>
> Already done (as you note in your follow-up).
>
>> Given your writeup, I think you are implicitly saying to start from
>> Py3.3 pickle and add the special support for Python 2 binary via the
>> special new type. That sounds good to me.
>
> I would actually prefer to fork the Python 3.2 version:  the one from 3.3
> pulls in a bunch of grotty internal-only usage.

I'm confused.  I don't understand why we need a Python 3 pickler
change to support the new Python 2 binary type.  I thought we were
going to pickle Python 2 binary objects using the standard Python 3
protocol 3 code?
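
To make concrete what I mean by "the standard Python 3 protocol 3 code",
here is a minimal sketch using only the stock Python 3 pickle and
pickletools modules.  It is not the zodbpickle implementation.  The
payload value is a made-up stand-in for a Python 2 binary string, and
the variable names are just for illustration:

```python
import pickle
import pickletools

# Hypothetical stand-in for a binary value that came from Python 2.
payload = b"\x00\xffpy2-binary"

# Protocol 3 has native opcodes for bytes, so the value round-trips as-is.
data_p3 = pickle.dumps(payload, protocol=3)
restored = pickle.loads(data_p3)

# Collect the opcode names the pickler emitted for each protocol.
ops_p3 = [op.name for op, arg, pos in pickletools.genops(data_p3)]

# Protocol 2 has no bytes opcode; Python 3 falls back to a reduce-style
# encoding, which still loads back to bytes under Python 3.
data_p2 = pickle.dumps(payload, protocol=2)
ops_p2 = [op.name for op, arg, pos in pickletools.genops(data_p2)]

print(type(restored).__name__, restored == payload)
print("protocol 3 uses a bytes opcode:", "SHORT_BINBYTES" in ops_p3)
print("protocol 2 uses a bytes opcode:", "SHORT_BINBYTES" in ops_p2)
```

If that is all we need on the writing side, then the open question is
only what the Python 2 side does when it reads such a record.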

>> BTW, what are your motivations for all the different strategies?
>
> I wanted to document them all, because some of the strategies suit
> different cases better than others.
>
>> _ignore_compat is obvious. If you can easily create the ZODB from
>> other data sources, then you can do a one-time switch. In fact, at
>> CipherHealth we have this case, since the ZODB only contains config
>> (which is loaded from text files) and session data.
>
> Yup.  Even for large CMS systems, I would still make "dump-to-filesystem,
> then reload" a requirement.  Others disagree, of course (and may have
> legitimate reasons).  Leo Rochael Almeida has clients with databases "too
> big to convert", for instance (the downtime required to do the conversion
> would be prohibitive, I believe).
>
>> But which strategy would be useful for a large Plone site for example?
>> I think we should focus on that and provide one good way to do it.
>
> Plone has historically preferred in-place migration to dump-reload.  Maybe
> jumping the Py3k curb is enough reason for them to reconsider.

I'm hoping to be able to provide some help with in-place conversion in the
near future.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton