[Zope] One big Zope or many small Zopes? Or perhaps a ZEO?

sean.upton@uniontrib.com sean.upton@uniontrib.com
Fri, 19 Oct 2001 12:14:28 -0700


We use ZEO + Squid, with plenty of fast hardware and segmented VLANs for
performance and security.  We proxy directly from Squid to ZServer, use
Squid for load-balancing, and use a Squid redirector for virtual host
support.  Squid does indeed help out a ton.  A caching proxy lets you keep
images in Zope and still serve them at the speed of an in-memory cache (for
in-transit and frequently used images).  We currently run up to 150,000
page views per day through this setup (our classifieds site), and
eventually expect 1 million+ page views per day on peak days as we move
more portions of our site onto it and away from primarily static
publishing.  We feel pretty confident with the combo, especially for what
you might call "semi-dynamic" content: published items that get requested
many times.  That fits newspaper web site content well, such as classified
ads, editorial, and vertical advertising content like MLS listings.
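
To give a rough idea of what a redirector for virtual hosts does, here is a
minimal sketch in Python.  It is not our production redirector.  It assumes
the classic Squid redirector interface (one request per line on stdin,
starting with the URL; one reply line on stdout, blank meaning "leave the
URL alone").  The host names, backend address, and folder names in VHOSTS
are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: public host name -> (backend host:port, Zope folder).
VHOSTS = {
    "classifieds.example.com": ("10.1.1.1:8080", "/classifieds"),
    "www.example.com": ("10.1.1.1:8080", "/www"),
}


def rewrite(line, vhosts=VHOSTS):
    """Return the rewritten URL for one input line, or '' to leave it unchanged."""
    fields = line.split()
    if not fields:
        return ""
    parts = urlsplit(fields[0])  # the first field is the requested URL
    target = vhosts.get((parts.hostname or "").lower())
    if target is None:
        return ""  # unknown host: tell Squid not to rewrite
    backend, folder = target
    return urlunsplit(("http", backend, folder + parts.path, parts.query, ""))


def serve(stdin, stdout):
    """Answer one line per request until Squid closes the pipe."""
    for line in stdin:
        stdout.write(rewrite(line) + "\n")
        stdout.flush()  # Squid waits for each reply, so do not buffer
```

A real redirector would call serve(sys.stdin, sys.stdout) at startup.  The
lookup itself is cheap; the point is only to show where the virtual host
decision is made before the request reaches ZServer.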

We are likely to move a lot of stuff onto Zope gradually over the next year,
so we expect a lot of traffic and need to be able to deal with it.  Here's
a simplified breakdown of what we are doing:

     3 cluster tiers, between each a different VLAN
     ==============================================    

                                  Caching Proxy
                              =====================
       [cache1]::::[cache2]
            | \                ZEO Client / Apache    
            v  \              =====================
       [node1]<-x->[node2]
            |   ___/          ZSS/NFS/MySQL Cluster
            v  /              =====================
       [storage]::::[hot_backup_storage]

For reliability, each node in each of these three tiers uses clustering
software (Linux-HA/heartbeat, which is really simple, free, and just does IP
takeover when it doesn't see the heartbeat of its peer). We are likely to
add ZEO client nodes as our traffic and application needs grow.
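
The takeover idea is simple enough to model in a few lines.  The sketch
below is a toy illustration in Python, not the Linux-HA implementation: the
class name, the deadtime parameter, and the injected clock are all
assumptions made for the example.  A node records when it last heard from
its peer and claims the shared service IP once the peer has been silent for
longer than the allowed time.

```python
import time


class PeerMonitor:
    """Toy model of heartbeat-based IP takeover (illustration only)."""

    def __init__(self, deadtime=10.0, clock=time.monotonic):
        self.deadtime = deadtime       # seconds of silence tolerated
        self.clock = clock             # injectable so it can be tested
        self.last_seen = clock()       # time of the last heartbeat
        self.holds_service_ip = False  # True once this node has taken over

    def heartbeat_received(self):
        """Record that the peer is still alive."""
        self.last_seen = self.clock()

    def check(self):
        """Claim the shared IP if the peer has been silent too long."""
        if self.clock() - self.last_seen > self.deadtime:
            self.holds_service_ip = True
        return self.holds_service_ip
```

The real software also has to bring the address up on an interface and
announce it on the network; this only shows the decision.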

If you wanted to get a lot of the same benefits with much less hardware, you
could set up a 2-box arrangement:

1 - Get a fast dual-CPU box, with internal hardware RAID, and 1GB+ RAM,
running ZEO/ZSS and 2 ZEO client processes, as well as your relational
database software.  You use ZEO so your Zope processes can take advantage of
multiple processors.

2 - Get another similar box, and run Squid on it, with a bunch of redirector
processes (which are CPU intensive, justifying the investment in a dual CPU
box).  In order to load-balance the 2 Zopes on server 1, you would have to
use multiple IP addresses on server 1's interface (one per Zope process).

This wouldn't reduce single points of failure, but would be quite nice from
a performance standpoint, and really doesn't involve a substantial hardware
investment. This would look like:

[squid]  Running Squid, load-balancing 2 ZEO client
  | |    processes
  v v
 [node]  For example, serving Zope at 10.1.1.1:8080
         and 10.1.1.2:8080
         This would require you to bind z2.py to an interface.
         This box would also run your relational databases and the ZSS.
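
As an illustration of spreading requests over the two Zope processes, here
is a minimal round-robin sketch in Python.  It only models the idea of
alternating between the two example addresses above; it is not how Squid
chooses a backend internally, and the function name and request paths are
made up.

```python
from itertools import cycle

# The two example ZEO client addresses from the diagram above.
BACKENDS = ["10.1.1.1:8080", "10.1.1.2:8080"]


def assign(requests, backends=BACKENDS):
    """Pair each request path with a backend, alternating in order."""
    picker = cycle(backends)  # endless iterator: first, second, first, ...
    return [(path, next(picker)) for path in requests]
```

With three requests, the first and third go to 10.1.1.1:8080 and the second
to 10.1.1.2:8080, so each Zope process (and each CPU) gets a share of the
work.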

Servers we are looking into for next year that fit this kind of description
include, for example, the Appro 1124, a 1 rack unit box with dual 1.5GHz
Athlon CPUs, designed around the Tyan Thunder K7 mainboard.  It is likely
the fastest dual-CPU x86 box you can buy, and it is reasonably priced, given
that it is built from fairly standard commodity components (and it looks to
handle heat well, from the reviews I have read).

Anyway, this is just my take on the best way to address this; others may
feel differently, but I feel this is an excellent strategy for an online
newspaper site, or another site that "publishes" content accessed in similar
ways by many users.

Sean

=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================


-----Original Message-----
From: Kirk Strauser [mailto:kirk@strauser.com]
Sent: Friday, October 19, 2001 9:44 AM
To: zope@zope.org
Subject: Re: [Zope] One big Zope or many small Zopes? Or perhaps a ZEO?



At 2001-10-19T16:22:27Z, Chris Muldrow <muldrow@mac.com> writes:

> Also, we're running at traffic somewhere around 60,000-80,000 page views a
> day--not huge traffic, but more than we had a year ago, certainly. At what
> traffic point have most folks noticed a need for more server power?  Is it
> 100,000 page views? More? Less?
> 
> We are also serving ads to the Zopes from a different Windows 2000 server
> running Apache and using a PERL process to serve the ads at a rate of
> between 400,000 and 700,000 ads per day.

Note: I'm a Zope newbie, as anyone reading my last week's worth of postings
can tell, but I'm not completely inexperienced at network design.

I would strongly recommend the use of a proxy/cache in front of your
servers.  It sounds as if much of your content is pseudo-static.  That is,
although it may change, it's likely to do so slowly.  Caching servers can
make a vast difference in performance in setups like this.  For example,
suppose that users often go to:

  http://mynewspaper.com/sports/todays_headlines

Why force your Zope to regenerate that page 30,000 times per day when it may
only change 3 or 4 times?  Zope even has built-in methods for cache
management so that you can have it send special headers to the cache servers
to tell them how often to re-query specific objects.  You may not want to
cache a stock ticker at all.  OTOH, the current temperature won't
drastically change in any given 5 minute interval.
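
To make the idea concrete, here is a small Python sketch of picking a
Cache-Control header per kind of content.  It is only an illustration: the
content names and lifetimes in MAX_AGE are assumptions chosen to match the
examples above, and the function is hypothetical rather than part of Zope.

```python
# Hypothetical cache lifetimes in seconds; 0 means "do not cache".
MAX_AGE = {
    "headlines": 6 * 60 * 60,  # changes a few times a day
    "temperature": 5 * 60,     # fine to be 5 minutes stale
    "stock_ticker": 0,         # always fetch fresh
}


def cache_headers(kind, max_age=MAX_AGE):
    """Return the response headers a cache should see for this content."""
    seconds = max_age.get(kind, 0)  # unknown content is not cached
    if seconds <= 0:
        return {"Cache-Control": "no-cache"}
    return {"Cache-Control": "max-age=%d" % seconds}
```

Inside Zope, the resulting header would be set on the response (for example
with RESPONSE.setHeader) so that the proxy in front knows how long it may
reuse the page.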

I haven't personally used these methods yet (see the first line of my post),
but I *can* certify that a properly-configured Squid server can increase
your current platform's potential throughput by several hundred percent.
-- 
Kirk Strauser

_______________________________________________
Zope maillist  -  Zope@zope.org
http://lists.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )