[Zope] Acquisition loops and web robots

Andrews, Martin mandrews@netgenics.com
Thu, 11 Jan 2001 17:44:50 -0500


I have run into several cases where authors at our site have accidentally
employed acquisition to link documents in such a way that an infinite tree
of URLs is possible - for example:

/a/foo contains a link to "b/bar"
/b/bar contains a link to "a/foo"

This really causes a problem when our web indexer (htdig) hits the site and
indexes pages like:

	/a/b/a/b/a/b/a/b/a/b/a/b ... a/foo
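To illustrate how the loop arises, here is a minimal Python sketch of what a crawler does with those two relative links. It assumes a hypothetical host (example.com) and a made-up LINKS table standing in for the two pages; it only mimics standard relative-URL resolution and is not how htdig or Zope is actually implemented. Because acquisition lets Zope resolve each of the longer paths, every hop yields a new, valid-looking URL.

```python
from urllib.parse import urljoin

# Hypothetical stand-in for the two documents above:
# page name -> the relative href that page contains.
LINKS = {"foo": "b/bar", "bar": "a/foo"}


def crawl(start, hops):
    """Follow the single link on each page for `hops` steps, as a crawler would."""
    url = start
    seen = [url]
    for _ in range(hops):
        page = url.rsplit("/", 1)[-1]      # last path segment: "foo" or "bar"
        url = urljoin(url, LINKS[page])    # resolve the relative link against the current URL
        seen.append(url)
    return seen


# example.com is a placeholder host, not our real site.
for u in crawl("http://example.com/a/foo", 4):
    print(u)
```

Each hop adds another path segment, so the crawler never sees the same URL twice and has no natural stopping point.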

Has anyone found a way to avoid this sort of problem - other than just not
indexing Zope sites? I already limit htdig with the max_hop_count setting,
but that is tricky to tune correctly (and still index all valid files).

Martin

---
Martin Andrews
mandrews@netgenics.com