[Zope] ZEO and a front end...

Bill Anderson bill@libc.org
Tue, 18 Jul 2000 04:22:16 -0600


Curtis Maloney wrote:
> 
> On Tue, 18 Jul 2000, ethan mindlace fremen wrote:
> > Curtis Maloney wrote:
> > > Yes, however his point is that by having each Zope instance
> > > 'predominantly' serving one portion of the site, its cache will contain
> > > more objects relevant, and thus be just that little bit faster.
> > >
> > > Personally, I find this such a simple idea that it MUST be good. (o8
> > > So much so, in fact, that I've decided to have a crack at writing just
> > > such a redirector.  I feel the Zope world (and others, most likely) could
> > > benefit from a 'preferential' redirector.
> >
> > The way I would do this is have
> >
> > section1.contrived-example.com
> > section2.contrived-example.com
> > section3.contrived-example.com
> >
> > with siteAccess, and then each zope would serve it according to it's IP
> > (though each "could" serve each site).  Then you can use whatever IP/DNS
> > load balancing tool your heart desires.
> 
> I think most people seem to be missing the point here.
> 
> The idea is that ALL servers can serve ALL content.  HOWEVER, the 'load
> balancer' will opt for a certain server for a certain URL, in order to
> improve cache hits.
> 
> So, for www.contrived-example.com/dir1  it will first try server1, but if
> it's busy (or down) it will try others.  This way, the cache on server1 is
> more likely to contain objects relevant to /dir1  and thus have a higher hit
> rate, therefore improving performance.

No, I understand what is being discussed; I just doubt it is a problem. :-)

Given an equal distribution*, all the back-end (BE) servers will
have fairly consistent cache contents from server to server. You are
_equally_ likely to hit a server with that object in cache. The more
requests you have for a given object, the greater the odds you'll see it
in the caches of all BE servers.

* Now, not all systems are equal, this is true. However, in an
intelligent load balancing system, you 'weight' the faster/better
performing machines, such that they are hit more often. Since these
machines will be used more frequently, they will have the best chance to
have what you want in cache already. I just don't see that the
additional effort is worth it. The job is already done, and the
additional overhead would seem to outweigh any perceived increases in
performance.  See below.
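
To make the 'weighting' concrete, here is a minimal sketch of weighted
selection. It is not how any particular load balancer does it; the
server names and weights are made up for illustration.

```python
import random
from collections import Counter

# Hypothetical back-end names and weights; a higher weight means more traffic.
WEIGHTS = {"fast-be": 3, "slow-be": 1}

def pick_backend(weights, rng=random):
    """Return one server name, chosen with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

def tally(weights, requests, seed=0):
    """Count how many of `requests` simulated hits land on each server."""
    rng = random.Random(seed)
    return Counter(pick_backend(weights, rng) for _ in range(requests))
```

Calling tally(WEIGHTS, 10000) should put roughly three quarters of the
hits on "fast-be", which is why its cache tends to be the warmest.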

 
> An enforced 'mapping', as you were suggesting, removes ALL redundancy from
> the site, but would likely provide even better cache hits.

How so?

http://my.site.com/sec1 is mapped to: sec1.site.com, which is load
balanced across as many machines as possible, using ZEO and a load
balancing tool. Any of the machines in the pool known as sec1 (nobody
said it had to be a single machine) could respond. Since these machines
serve out sec1 predominantly (they can also participate in the general
site load balancing servers), they would have a better cache hit rate
on sec1 stuff than the primary BE servers.

Perhaps this can help:

www.libc.org (real site, fictional setup :) is a ZEO cluster.
 o The site's primary ZEO Clients number 5.
 o My load balancing tool lets me weight some servers over others.

/Members is a heavily trafficked section, so I want it to be separated
out using a rewrite tool (SiteAccess, Roxen, Apache mod_rewrite,
whatever) to send all /Members URLs to members.libc.org.

I set up two ZEO clients, M1 and M2. These two talk to the same ZSS as
the other 5, and respond to members.libc.org.

So, when you go to www.libc.org/Members, you will wind up on either M1
or M2. These machines are set up as low-weighted primary site servers
(bringing the total up to 7), so they will have a cache that is biased
towards /Members, but can still serve up any part of www.libc.org.

If M1 or M2 goes down, you stay up.

For added redundancy, you can add the other 5 primary servers as
low-weighted servers for members.libc.org, such that if both M1 and M2
die, or get heavily loaded, one or more of the other 5 can pick up the
overage, just as M1 and M2 can for the 5 primary servers for the main
site.
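
Roughly, the whole arrangement could be sketched like this. The names
P1..P5 for the five primary clients and the 10-vs-1 weights are
assumptions of mine for illustration, and the /Members prefix test
stands in for whatever rewrite tool you actually use.

```python
import random

# Hypothetical names for the five primary ZEO clients.
PRIMARIES = ["P1", "P2", "P3", "P4", "P5"]
MEMBERS = ["M1", "M2"]

# Each host name has its preferred clients at a high weight and the
# other clients at a low weight, so every client can serve either host.
POOLS = {
    "www.libc.org": {**{p: 10 for p in PRIMARIES}, **{m: 1 for m in MEMBERS}},
    "members.libc.org": {**{m: 10 for m in MEMBERS}, **{p: 1 for p in PRIMARIES}},
}

def host_for(path):
    """Stand-in for the rewrite rule: /Members goes to members.libc.org."""
    if path == "/Members" or path.startswith("/Members/"):
        return "members.libc.org"
    return "www.libc.org"

def pick_client(host, down=frozenset(), rng=random):
    """Weighted pick among the clients of `host` that are not marked down."""
    pool = {name: w for name, w in POOLS[host].items() if name not in down}
    if not pool:
        raise RuntimeError("no ZEO clients available for " + host)
    names = list(pool)
    return rng.choices(names, weights=[pool[n] for n in names], k=1)[0]
```

With both M1 and M2 marked down, pick_client("members.libc.org", ...)
still returns one of the primaries, so /Members stays up.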

Now you have 'preferred' machines, to improve cache-hit-rate for certain
heavily trafficked sections of your site, and maintain (or even improve)
overall performance and redundancy of the system.  Of course, you still
have ZSS as a SPOF, but even that can be gotten around with good design
and planning. :^)

If that isn't enough, you can throw eddieware into the mix, which
*already* has the ability to redirect based upon the URL.

And-yes,-McGuyver-is-my-hero-ly y'rs Bill

--
Do not meddle in the affairs of sysadmins, for they are easy to annoy,
and have the root password.