[Python-Dev] Re: [Zope3-dev] Zip import and sys.path manipulation (was Re: directory hierarchy proposal)

Just van Rossum just@letterror.com
Mon, 16 Dec 2002 20:14:36 +0100


FWIW, I think it's generally a bad idea to depend on module.__file__ and
pkg.__path__ to find data files. Playing tricks with the contents of
pkg.__path__ apparently has its uses, but in general I think that's a bad idea
as well. module.__file__ is mostly an introspective aid, and pkg.__path__
should IMO be seen as merely an implementation detail.

module.__file__: in a frozen module it will be set to "<frozen>". Expect
__file__ to be a path to a file and you're screwed.
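
To make that concrete, here's a minimal sketch of a defensive check. The helper
name file_path_or_none and the stand-in module fake_frozen are made up for
illustration; the only thing taken from the above is that __file__ may hold a
placeholder such as "<frozen>" (or be missing altogether) rather than a real
path.

```python
import os
import sys
import types


def file_path_or_none(module):
    """Return module.__file__ only if it names an existing file, else None.

    Hypothetical helper: it treats __file__ as a hint, not a guarantee.
    """
    path = getattr(module, "__file__", None)
    if isinstance(path, str) and os.path.isfile(path):
        return path
    return None


# A stand-in for a frozen module: __file__ is a placeholder, not a path.
fake_frozen = types.ModuleType("fake_frozen")
fake_frozen.__file__ = "<frozen>"

frozen_result = file_path_or_none(fake_frozen)
# Builtin modules such as sys carry no __file__ attribute at all.
builtin_result = file_path_or_none(sys)
```

Code that gets None back has to find its data some other way; the point is
only that it should be prepared for that case.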

To the lower levels of the import mechanism, the *existence* of pkg.__path__ is
all that's looked at: "hey, it's a package". Then there's freeze again: a frozen
package has a __path__ variable, but it's not a list, it's a *string*. Only when
an import goes through a sys.path item is __path__ (more or less) guaranteed to
be a list.
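
A rough sketch of what that means for code that inspects packages: test for the
existence of __path__ to decide "is this a package", and don't assume its type.
Both helper names below are hypothetical, and treating a string __path__ as a
single opaque entry is my assumption for the example, not something the import
machinery prescribes.

```python
import types


def looks_like_package(module):
    """Mirror the low-level check: only the existence of __path__ matters."""
    return hasattr(module, "__path__")


def search_locations(module):
    """Return __path__ as a list without assuming it already is one."""
    path = getattr(module, "__path__", None)
    if path is None:
        return []
    if isinstance(path, str):
        # Assumption for this sketch: a string is one opaque entry.
        return [path]
    return list(path)


plain = types.ModuleType("plain")            # no __path__: not a package
frozen_like = types.ModuleType("frozen_like")
frozen_like.__path__ = "frozen_like"         # a string, as described above
hooked = types.ModuleType("hooked")
hooked.__path__ = None                       # set, but carries no information
```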

The sys.meta_path import hook mechanism in my patch (the idea is stolen from
Gordon McMillan) acts on the same level as builtin module imports and frozen
module imports: it doesn't need sys.path. So it doesn't need any meaningful
object as pkg.__path__ either. I just uploaded a new version of the patch; it
now contains a test_importhooks.py script with a sys.meta_path test case that
actually sets pkg.__path__ to None. Works like a charm. Here's a (slightly
modified) comment from the test script:

    Depending on the kind of importer, there are different
    levels of freedom of what you can use as pkg.__path__.
    
    Importer object on sys.meta_path:
        it can use anything it pleases (even None), as long
        as a __path__ variable is set.
    Importer object on sys.path:
        pkg.__path__ must be a list; it's most logical to use
        an importer object as the only item. Could be the same
        importer instance that imported the package itself.
    A hook on sys.path_hooks:
        pkg.__path__ must be a list and its only item should
        be a string that the hook can handle itself.
    
    These are just guidelines: a set of hooks could in theory
    deliberately set pkg.__path__ up so that submodule imports are
    handled by an entirely different importer. Not sure how
    useful that would be...
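
The first case in that comment (an importer on sys.meta_path with
pkg.__path__ set to None) can be sketched roughly as follows. This is not the
code from the patch or from test_importhooks.py: it assumes the present-day
importlib interface (find_spec/exec_module), and the DictImporter class, the
SOURCES table and the demo_pkg names are all invented for the example.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical importer serving modules from a dict of source strings."""

    def __init__(self, sources):
        # Maps full module name -> (source code, is_package flag).
        self.sources = sources

    def find_spec(self, fullname, path=None, target=None):
        # 'path' is the parent's __path__; this importer never looks at it.
        if fullname not in self.sources:
            return None
        return importlib.machinery.ModuleSpec(fullname, self)

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        source, is_package = self.sources[module.__name__]
        if is_package:
            # Only the existence of __path__ marks this as a package.
            module.__path__ = None
        exec(source, module.__dict__)


SOURCES = {
    "demo_pkg": ("NAME = 'demo_pkg'", True),
    "demo_pkg.sub": ("VALUE = 42", False),
}
# Put the importer first so it sees the submodule before any path search.
sys.meta_path.insert(0, DictImporter(SOURCES))

demo_pkg = importlib.import_module("demo_pkg")
sub = importlib.import_module("demo_pkg.sub")
```

The submodule import succeeds even though demo_pkg.__path__ is None, because
the importer resolves everything from the full module name alone.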

Just-my-2-eurocents.