[ZODB-Dev] Plone in P2P using Zope over DHT

Laurence Rowe l at lrowe.co.uk
Tue Jan 4 09:35:30 EST 2011


I'm not very optimistic about this, I'm afraid. First, the problems with
using Plone:

 * Plone relies heavily on its in-ZODB indexes of all content
(portal_catalog). This means that every edit changes lots of objects:
without versioning, roughly 15-20, most of which are in the catalogue
(see the sketch after this list).

 * At least with Archetypes, a content object's data is spread over
multiple objects. (This should be better with Dexterity, though you
will still have multiple objects for locking and workflow.)

 * If you use versioning, you'll see ~100 objects changed in a single
edit.

 * Even loading the front page will take a long time. In my
experiments writing an Amazon S3 backend for ZODB, the extra latency
of fetching each object individually was really noticeable.
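
To make the catalogue point concrete, here's a minimal sketch (plain
ZODB with an in-memory storage, no Plone) of how index-style writes
dirty many persistent objects in one transaction. The four index names
are made up, and _registered_objects is a private Connection
attribute, used here only to count the dirtied objects:

    import transaction
    from ZODB import DB
    from ZODB.MappingStorage import MappingStorage
    from BTrees.OOBTree import OOBTree

    db = DB(MappingStorage())
    conn = db.open()
    root = conn.root()
    root['catalog'] = OOBTree()
    for name in ('title', 'modified', 'path', 'uid'):
        root['catalog'][name] = OOBTree()  # one BTree per index
    transaction.commit()

    # One "edit" touches every index, so every index BTree is dirtied:
    for name in ('title', 'modified', 'path', 'uid'):
        root['catalog'][name]['doc-1'] = 'indexed value'
    print(len(conn._registered_objects))  # several objects, not one
    transaction.commit()

In a DHT, every one of those dirtied objects is a separate store to a
separate set of peers.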

But I'm not sure even a simpler ZODB CMS would be a good fit for a p2p DHT:

 * ZODB is transactional, using two-phase commit. With p2p latencies
these commits will be horribly slow: all clients storing changed
objects would need to participate in the transaction (see the
back-of-envelope figures after this list).

 * Each client's object cache will need to know about invalidations,
and I don't see any way of supplying these from a DHT.
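
As a back-of-envelope illustration of the commit cost (the 150 ms
round-trip time is an assumption, not a measurement):

    # Two-phase commit over p2p links: even with all stores issued in
    # parallel, a commit can't finish before the slowest peer has
    # answered in both the vote and the finish phase.
    rtt = 0.150     # assumed p2p round-trip time, in seconds
    phases = 2      # tpc_vote + tpc_finish, at minimum
    print(phases * rtt)  # ~0.3 s floor per commit, before any retries

And that floor assumes no participating peer is slow, unreachable or
churning, which on a DHT is hardly a safe assumption.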

I expect you'd have more success storing content items as single
content objects / pages in the DHT and then generating indexes based
on that. You'll also need some way of storing parent-child
relationships between the content objects, as updating a single
list-of-children object will be incredibly difficult to get right in a
distributed system.
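
A minimal sketch of that shape, with a plain dict standing in for a
real DHT client (the key scheme here is hypothetical): each content
item is one value, and each parent-child link gets its own key, so
concurrent additions of different children never rewrite the same
value:

    import json

    dht = {}  # stand-in for a real DHT client with get/put semantics

    def put(key, value):
        dht[key] = json.dumps(value)

    def edge_key(parent_id, child_id):
        # One key per parent->child edge, instead of one mutable
        # list-of-children value that every writer would contend on.
        return 'edge:%s:%s' % (parent_id, child_id)

    put('content:home', {'title': 'Home', 'body': '...'})
    put('content:about', {'title': 'About', 'body': '...'})
    put(edge_key('home', 'about'), {})

Enumerating the children of 'home' then means finding the
'edge:home:*' keys, which a plain Kademlia-style DHT (exact-key lookup
only) can't do directly; you'd need some prefix or secondary-index
scheme on top.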

Laurence


On 4 January 2011 11:40, Aran Dunkley <aran at organicdesign.co.nz> wrote:
> Thanks for the feedback Vincent :-) it sounds like NEO is pretty close
> to being SQL-free. As one of the NEO team, what are your thoughts on the
> practicality of running Plone in a P2P environment with the latencies
> experienced in standard DHT implementations (such as, for example,
> those based on Kademlia)?
>
> On 04/01/11 22:27, Vincent Pelletier wrote:
>> Hi.
>>
>> On Tuesday, 4 January 2011 at 07:18:34, Aran Dunkley wrote:
>>> The problem is that it uses SQL for its indexing queries (they quote
>>> "NoSQL" as meaning "Not only SQL"). SQL cannot work in P2P space, but
>>> can be made to work on server clusters.
>>
>> Yes, we use MySQL, and it bites us in both worlds actually:
>> - in the relational world, we irritate developers by asking questions like
>>   "why does InnoDB load a whole row when we only select primary key
>>   columns?", which ends up with "don't store blobs in MySQL"
>> - in the key-value world, because a NoSQL system built on MySQL doesn't
>>   look consistent
>>
>> So, why do we use MySQL in NEO?
>> We use InnoDB as an efficient BTree implementation which handles
>> persistence. We use MySQL as a handy data definition language (NEO is
>> still evolving; we need an easy way to tweak table structure when a new
>> feature requires it), but we don't need any transactional isolation
>> (each MySQL process used for NEO is accessed by only one process
>> through one connection).
>> We want to stop using MySQL & InnoDB in favour of leaner-and-meaner
>> back-ends. I would especially like to try Kyoto Cabinet[1] in on-disk
>> BTree mode, but it requires more work than the existing MySQL adaptor
>> and there are more urgent tasks in NEO.
>>
>> Just as a proof of concept, NEO can use a Python BTree implementation
>> as an alternative (RAM-only) storage back-end. We use ZODB's BTree
>> implementation, which might look surprising as it's designed to be
>> stored in a ZODB... but BTrees work just as well in RAM, and that's all
>> I needed for such a proof of concept.
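>>
>> As a minimal illustration (not NEO code, just the BTrees package used
>> standalone, outside any ZODB):
>>
>>     from BTrees.OOBTree import OOBTree
>>
>>     tree = OOBTree()
>>     tree['oid-1'] = b'pickled state'
>>     tree['oid-2'] = b'more state'
>>     # Sorted iteration and range queries, no storage attached:
>>     print(list(tree.keys(min='oid-1', max='oid-2')))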
>>
>> [1] http://fallabs.com/kyotocabinet/
>>
>> Regards,
>