[Grok-dev] the installation problems saga, part 2
faassen at startifact.com
Wed Apr 23 19:57:32 EDT 2008
Last year we had a bad set of installation problems with Grok: you'd get
an arbitrary set of versions, which would sometimes break. We didn't
control what versions you used. We fixed that by pinning versions.
Our installation problems are not over yet, however. We have two kinds
of problems:
Problem 1: relying on too many servers for installation
Some packages we use don't actually upload their tarballs to PyPI.
Instead, they upload their tarballs somewhere else, and then point the
PyPI index page to their homepage. Thanks to the magic of setuptools, the
installer goes off to the homepage to find the download URLs, and
downloads the tarball from there.
mechanize is an example. See its index page here:
There are no tarballs to download, just links to sourceforge. The 'simple'
page is actually the most instructive to see what setuptools looks at:
What actually happens is that setuptools doesn't appear to use the
download URL in this page, but instead goes off to the home page of
mechanize, parses it and then downloads the zip file.
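To make the failure mode concrete, here is a minimal sketch of that kind of link scraping. It is not setuptools' actual code, just an illustration in modern Python. The names `LinkCollector`, `find_archive_links` and `hosts_needed`, and the list of archive suffixes, are my own assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of file endings that look like a downloadable distribution.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect every <a href="..."> on a page as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links are resolved against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def find_archive_links(html, base_url):
    """Return the links on a page whose path ends in an archive suffix."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [
        link
        for link in collector.links
        if urlparse(link).path.lower().endswith(ARCHIVE_SUFFIXES)
    ]


def hosts_needed(links):
    """Return the distinct servers that must be up for these downloads."""
    return sorted({urlparse(link).netloc for link in links})
```

Every host that `hosts_needed` returns is one more server that has to be up at the moment someone runs the installation.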
Unfortunately sometimes this other website is down. It might be, say,
sourceforge. Sourceforge is not always very well-behaved. I've also had
problems installing psycopg recently, because the initd.org website it
is hosted on seems to be rather flaky.
This means that people's installation procedure will sometimes break in
the middle. That sucks. I'd rather rely on PyPI than on a lot of
different websites that can fail.
How to fix this one? Somehow fix PyPI so that it sucks in all packages?
That's one alternative.
We could also modify KGS, Zope 3's package indexing system, to suck in
all packages. Then we'd run our own Grok KGS. Right now it doesn't do
that: it just mirrors PyPI, and thus has the same problems PyPI does.
Drawback: we'd need to mirror *all* the packages on PyPI. This might take
quite a lot of storage space and bandwidth, and the mirror would need to
be maintained.
Yet another alternative would be to create a 'big tarball download'
installer for Grok. It'd come with all the proper files already
available. We'd need to make sure we also include the Windows
binaries for those packages that need them. This would work for Grok,
though it could still lead to problems as soon as someone adds some
other package to their setup.py. It therefore won't work for all
Grok-based *applications*, and ideally we should find something that
works for both.
Problem 2: versions for non-Grok dependencies
When someone develops an application, they pull in dependencies that are
not listed in our versions.cfg, such as megrok.form. This in turn has
other dependencies, such as zc.datetimewidget. I just now discovered
that certain combinations of megrok.form and zc.datetimewidget actually
result in ZCML conflict errors. Ick!
We need to solve the problem of version management for applications, not
just for Grok. What I'd like to avoid is that everybody has to become
their own version manager - developers would need to maintain a list of
'versions' in their buildouts and just magically have to know which
versions of dependencies work together. The package developer knows
which versions work together, and the developer that uses the package
shouldn't really have to worry about it unless there's a special case.
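As a rough illustration of what such version management involves, here is a small sketch that merges several sets of pins and reports where they disagree. The function name `merge_pins` and the dict-of-pins representation are assumptions of mine, not anything buildout provides. Note that it can only detect disagreeing pins; it cannot know whether two versions actually work together.

```python
def merge_pins(*pin_sets):
    """Merge several {package: version} dicts, earlier sets taking priority.

    Returns (merged, conflicts). `conflicts` maps each package that was
    pinned to more than one version onto the set of versions seen.
    """
    merged = {}
    conflicts = {}
    for pins in pin_sets:
        for name, version in pins.items():
            key = name.lower()  # package names are compared case-insensitively
            if key in merged and merged[key] != version:
                # Keep the earlier pin, but remember the disagreement.
                conflicts.setdefault(key, {merged[key]}).add(version)
            else:
                merged[key] = version
    return merged, conflicts
```

Passing Grok's own pins first and a package's pins second would keep Grok's choices and list every package where the two disagree.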
The easy fix would be for the package developer to pin down *all* the
versions of *all* the packages (directly or indirectly) that the package
depends on in setup.py's requirements. Except the ones that Grok already
depends on, as that'd result in a conflict. This has two consequences,
* the package might break with new Grok releases which have newer
* the application developer is absolutely locked into using those
versions, there's no flexibility to upgrade a dependency to a higher
version, needed for some other package, etc.
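A sketch of what this pinning could look like in practice: a helper that turns a package developer's known-good versions into exact requirement strings for setup.py, leaving out anything Grok already pins. The helper name `pinned_requirements` is hypothetical, and the version numbers in the usage example are made up for illustration.

```python
def pinned_requirements(known_good, grok_pins):
    """Build exact 'name==version' requirement strings for setup.py.

    known_good: {package: version} combinations the package developer tested.
    grok_pins:  {package: version} pins Grok already ships; these are
                skipped so the package does not conflict with Grok.
    """
    grok_names = {name.lower() for name in grok_pins}
    return sorted(
        f"{name}=={version}"
        for name, version in known_good.items()
        if name.lower() not in grok_names
    )


# Usage with made-up version numbers:
requirements = pinned_requirements(
    {"zc.datetimewidget": "0.0.1", "zope.interface": "0.0.2"},
    {"zope.interface": "0.0.2"},
)
# `requirements` could then be passed as install_requires in setup.py.
```

The exact `==` pins are what cause the lock-in described in the second consequence above: the application developer cannot move any of them without editing the package.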
I think pinning things down in setup.py is the only route we can really
take properly now, but we need to think of a better solution. Maintain a
package index with KGS for Grok *and* all possible Grok extensions? But
that's potentially all Python packages in the world...
Do people have ideas on this one?
Solving Problem 1 and Problem 2 does sound like we'll eventually need to
move into the "distribution management" business, similar to how a Linux
distribution manages its packages. That's a big burden to take on,
though. I still hope someone has some great idea that we haven't thought
of yet.
Perhaps this is a direction to explore: we could write a tool that for a
given package downloads all the right versions of the dependencies (as
we specify somewhere), and then packages them up in some form of special
tarball that contains the package and all its dependencies. We'd then also
have a tool which can find these big balls somewhere and install them
into a place on the user's filesystem where buildout can find them as
normal eggs. Unfortunately I can't think of a way to fit this into the
whole setuptools/eggs system, and it'd be a bother to have to step
outside it. It'd be nice if setuptools, when it saw a dependency in
setup.py, looked for these balls first and got the dependency from there.
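The two tools could be sketched roughly like this, using only Python's standard tarfile module. This deliberately steps outside setuptools, so it does not solve the integration problem. The function names `build_bundle` and `unpack_bundle` are hypothetical, and pointing buildout at the target directory (for instance through its find-links option) is an assumption about how the result would be consumed.

```python
import tarfile
from pathlib import Path


def build_bundle(dist_files, bundle_path):
    """Pack already-downloaded distribution files into one gzipped tarball."""
    with tarfile.open(str(bundle_path), "w:gz") as bundle:
        for dist in dist_files:
            dist = Path(dist)
            # Store each file flat, under its bare file name.
            bundle.add(str(dist), arcname=dist.name)
    return Path(bundle_path)


def unpack_bundle(bundle_path, target_dir):
    """Unpack a bundle into target_dir and return the installed file names."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    installed = []
    with tarfile.open(str(bundle_path), "r:gz") as bundle:
        for member in bundle.getmembers():
            # Accept only plain top-level files, so a bundle cannot write
            # outside target_dir through names like "../x" or "dir/x".
            if not member.isfile() or Path(member.name).name != member.name:
                continue
            source = bundle.extractfile(member)
            (target_dir / member.name).write_bytes(source.read())
            installed.append(member.name)
    return sorted(installed)
```

Deciding which versions go into `dist_files` is the hard part, and it is the same version management problem as in Problem 2.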