[Zope] zope.org Members list

Martijn Pieters mj@digicool.com
Sat, 11 Mar 2000 09:20:30 +0100


From: "Michel Pelletier" <michel@digicool.com>
> alan runyan wrote:
> >
> > is it me or does the Members list take FOREVER to navigate through?  and
> > why would it take so long (more than a minute) to go from the first page
> > of members to the next page?  it wasn't like this a few months ago.  is
> > it the load on the server? (I tried last night @4AM with the same
> > results)  is it the way it's fetching and ordering members from the
> > catalog (this definitely shouldn't be the problem - it's only 6000
> > people, w/ listed boolean and alphabetizing results).
>
> It doesn't use the Catalog.  It's a plain Zope folder.  This means that
> the folder object, including its dictionary of children, is loaded into
> memory every time it is accessed.  This is the inefficient part.
> Folders were never meant to contain this many objects.  This is not
> necessarily a lack; Linux directories are not that efficient either.
> Try putting 6000 files in one ext2 directory and then type 'ls'.  Pack a
> lunch.

Actually, the Members roster is a bit more efficient. It uses caching to
store the results. This was added only recently; before that, the roster was
offline for a while because it was taking too many resources.

What happens is this:

If the cache has expired, or is empty, we do go through all 6000 Member
folders, and check whether or not they are listed. From the listed Folders
(just over 1000 now), we store the id and title in a list. This list is
sorted on id. Next we store the total number of Members found.

Then, we loop over the 6000 Member objects, and check whether or not they
have been active in the past two weeks. The number of active accounts is
also stored.

Together with a timestamp, this info is then stored as a volatile dictionary
attribute, and returned.

The next call to Members will return this cached information. The cache
timeout is currently 60 minutes. So, the batched pages and the searches use
this cached information.
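
The steps above can be sketched roughly as follows. This is not the actual
Zope.org code, only a minimal illustration of the scheme. The names Member,
Roster, info, CACHE_TIMEOUT and ACTIVE_WINDOW are hypothetical, and I assume
here that the stored total counts all Member folders, not just the listed
ones. The _v_ prefix follows the Zope convention for volatile attributes.

```python
import time
from dataclasses import dataclass

CACHE_TIMEOUT = 60 * 60             # 60 minutes, in seconds
ACTIVE_WINDOW = 14 * 24 * 60 * 60   # two weeks, in seconds


@dataclass
class Member:
    # Hypothetical stand-in for a Member folder.
    id: str
    title: str
    listed: bool
    last_active: float  # seconds since the epoch


class Roster:
    def __init__(self, members):
        self.members = members
        self._v_cache = None  # volatile: lives in memory only

    def _fill_cache(self, now):
        # First pass: collect (id, title) of listed members, sorted on id.
        listed = sorted(
            ((m.id, m.title) for m in self.members if m.listed),
            key=lambda pair: pair[0],
        )
        # Second pass: count members active in the past two weeks.
        active = sum(
            1 for m in self.members if now - m.last_active <= ACTIVE_WINDOW
        )
        self._v_cache = {
            "listed": listed,
            "total": len(self.members),  # assumption: all members counted
            "active": active,
            "timestamp": now,
        }
        return self._v_cache

    def info(self, now=None):
        # Return the cached dictionary, refilling it when empty or expired.
        now = time.time() if now is None else now
        cache = self._v_cache
        if cache is None or now - cache["timestamp"] > CACHE_TIMEOUT:
            cache = self._fill_cache(now)
        return cache
```

Batching and searching would then work on the "listed" entry of the
returned dictionary, without touching the Member folders again until the
timestamp is more than 60 minutes old.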

There are a few reasons why it still may be slow:

1/ The cache has expired or is empty: we have to fill it again.

2/ The cache is per-thread. It isn't persistent, and therefore not shared
between threads. Your request could hit one thread first, your next request
a different thread, and then you have to wait for the cache to fill twice.

3/ The volatile attributes don't hang around as long as I expect them to.
Maybe the object cache is a bit too aggressive; I haven't figured this out
yet. I haven't been able to get the cache to stay around for more than 20
minutes, but it is hard to verify and debug with 7 threads. I am not sure
about this one, because in the course of some extra testing I did get to
see a cache older than 20 minutes again.

4/ The dictionary containing the cache may be coming from swapped memory.
Currently, the Zope.org process grows quite large. This may have to do with
the up to 42000 Folder objects loaded into memory at any time. The Members
Folder holds references to 6000 Member Folders, and there can be up to 7 of
them (7 threads). This is an area where a BTree based folder would _really_
help.

5/ Plain network lag. Check other pages on Zope.org.

6/ CPU is hogged. The rendering of the ZQR is very expensive. If it runs,
the whole of Zope.org suffers. I am seriously considering disabling it.
People really only need the pre-rendered version. I was testing just now,
had to wait way too long, and noticed someone was looking at both the normal
ZQR (rendering from XML), and the print preview version. The Members roster
came from cache.
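
To illustrate points 2/ and 4/, here is a small simulation of per-thread
caches. It is only a sketch under simplifying assumptions: PerThreadCaches,
request and worst_case_loaded_folders are hypothetical names, no real
threads are started, and a "thread" is just an index into a list. It shows
why two consecutive requests can both be slow, and where the 42000 figure
comes from.

```python
THREADS = 7            # number of Zope threads mentioned above
MEMBER_FOLDERS = 6000  # Member folders referenced by the Members Folder


def worst_case_loaded_folders(threads=THREADS, folders=MEMBER_FOLDERS):
    # Each thread can hold its own copy of every Member folder.
    return threads * folders


class PerThreadCaches:
    def __init__(self, threads=THREADS):
        self.caches = [None] * threads  # one independent cache per thread
        self.fills = 0                  # how often a cache had to be filled

    def request(self, thread_index):
        # A request is slow when the cache of the thread it lands on is empty.
        if self.caches[thread_index] is None:
            self.fills += 1
            self.caches[thread_index] = {"filled_by": thread_index}
            return "slow"
        return "fast"
```

With this model, a first request on thread 0 is slow, a second one on thread
0 is fast, but a request that lands on thread 3 is slow again, because that
thread has never filled its own cache.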

I CCed David Kankiewicz on this, so he knows =)

Martijn Pieters
| Software Engineer    mailto:mj@digicool.com
| Digital Creations  http://www.digicool.com/
| Creators of Zope       http://www.zope.org/
|   The Open Source Web Application Server
---------------------------------------------