[ZODB-Dev] SAN/RHCS use for ZEO server?

Chris Withers chris at simplistix.co.uk
Fri Apr 3 13:12:23 EDT 2009


Andrew Sawyers wrote:
>> I'm particularly interested in how you'll move the SAN from the primary
>> to the secondary node in the event of primary node failure,
> This won't be done by me; it's handled by another team.

Will it be done by software on the nodes or something else completely?

>> and how 
>> you'll bring the secondary's zeo server up when that happens.
> A script....I don't know yet if it's automatable beyond that.

What will fire that script?
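
If it helps, the bringing-ZEO-up part could be as small as the (entirely
hypothetical) sketch below - assuming zeo.conf and the Data.fs sit on the
shared SAN volume; the hostnames and paths are made up:

  import socket
  import subprocess

  PRIMARY = ('zeo-primary.example.com', 8100)   # hypothetical primary
  ZEO_CONF = '/san/zeo/zeo.conf'                # config on the shared volume

  def primary_alive(timeout=5):
      # crude liveness check: can we still open a TCP connection to the
      # primary's ZEO port?
      try:
          socket.create_connection(PRIMARY, timeout).close()
          return True
      except socket.error:
          return False

  if not primary_alive():
      # runzeo is ZEO's stock start-up script; -C points it at the config
      subprocess.Popen(['runzeo', '-C', ZEO_CONF])

The interesting bit is still what decides to run it, of course...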

>> I'm also interested in how the zeo clients fail over to the secondary
>> once it's up, or will you plan on doing a shared ip between the two nodes?
> We won't - if we fail over, it will be an entirely new set of clients

Eh? So what happens to the old clients? There's nothing wrong with them, 
it's just the storage server node that's failed...

> connecting to the new storage....I'd prefer to be able to fall back onto the
> new storage tbh - but I'm not certain about the internal policies.  If we
> were to do failover, it would be by VIP - yes.

I'm not sure a VIP is necessary. AFAIK, ClientStorage can be given several 
back-end addresses and will round-robin through them until it finds one 
that works. That's how ZRS does it, right?
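
For what it's worth, the client side of that should be no more than
something like this (untested sketch, hostnames made up):

  from ZEO.ClientStorage import ClientStorage
  from ZODB import DB

  # ClientStorage accepts a list of (host, port) pairs and tries each
  # address until it gets a working connection
  addrs = [('zeo-primary.example.com', 8100),
           ('zeo-secondary.example.com', 8100)]
  storage = ClientStorage(addrs)
  db = DB(storage)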

>> In both cases, I'm also interested in how pathological cases such as
>> failure mid-transaction, or worse, failure during a pack, are expected
>> to pan out.
> SRDF is basically transactional - so it would be just like restarting the
> primary during this condition... I'll keep this in mind for testing and
> report back though....

Indeed, but is the storage server code similarly sane: what if it writes 
one chunk (say to Data.fs) and then fails to write the blob file 
associated with that transaction?

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk
