[Checkins] SVN: mechanize/tags/0.1.10/ Vendor import mechanize-0.1.10.

Tres Seaver tseaver at palladion.com
Sun Dec 7 08:26:33 EST 2008


Log message for revision 93744:
  Vendor import mechanize-0.1.10.

Changed:
  A   mechanize/tags/0.1.10/
  A   mechanize/tags/0.1.10/0.1-changes.txt
  A   mechanize/tags/0.1.10/COPYING.txt
  A   mechanize/tags/0.1.10/ChangeLog.txt
  A   mechanize/tags/0.1.10/GeneralFAQ.html
  A   mechanize/tags/0.1.10/INSTALL.txt
  A   mechanize/tags/0.1.10/MANIFEST.in
  A   mechanize/tags/0.1.10/PKG-INFO
  A   mechanize/tags/0.1.10/README.html
  A   mechanize/tags/0.1.10/README.html.in
  A   mechanize/tags/0.1.10/README.txt
  A   mechanize/tags/0.1.10/attic/
  A   mechanize/tags/0.1.10/attic/BSDDBCookieJar.py
  A   mechanize/tags/0.1.10/attic/MSIEDBCookieJar.py
  A   mechanize/tags/0.1.10/doc.html
  A   mechanize/tags/0.1.10/doc.html.in
  A   mechanize/tags/0.1.10/examples/
  A   mechanize/tags/0.1.10/examples/hack21.py
  A   mechanize/tags/0.1.10/examples/pypi.py
  A   mechanize/tags/0.1.10/ez_setup.py
  A   mechanize/tags/0.1.10/functional_tests.py
  A   mechanize/tags/0.1.10/mechanize/
  A   mechanize/tags/0.1.10/mechanize/__init__.py
  A   mechanize/tags/0.1.10/mechanize/_auth.py
  A   mechanize/tags/0.1.10/mechanize/_beautifulsoup.py
  A   mechanize/tags/0.1.10/mechanize/_clientcookie.py
  A   mechanize/tags/0.1.10/mechanize/_debug.py
  A   mechanize/tags/0.1.10/mechanize/_file.py
  A   mechanize/tags/0.1.10/mechanize/_firefox3cookiejar.py
  A   mechanize/tags/0.1.10/mechanize/_gzip.py
  A   mechanize/tags/0.1.10/mechanize/_headersutil.py
  A   mechanize/tags/0.1.10/mechanize/_html.py
  A   mechanize/tags/0.1.10/mechanize/_http.py
  A   mechanize/tags/0.1.10/mechanize/_lwpcookiejar.py
  A   mechanize/tags/0.1.10/mechanize/_mechanize.py
  A   mechanize/tags/0.1.10/mechanize/_mozillacookiejar.py
  A   mechanize/tags/0.1.10/mechanize/_msiecookiejar.py
  A   mechanize/tags/0.1.10/mechanize/_opener.py
  A   mechanize/tags/0.1.10/mechanize/_pullparser.py
  A   mechanize/tags/0.1.10/mechanize/_request.py
  A   mechanize/tags/0.1.10/mechanize/_response.py
  A   mechanize/tags/0.1.10/mechanize/_rfc3986.py
  A   mechanize/tags/0.1.10/mechanize/_seek.py
  A   mechanize/tags/0.1.10/mechanize/_sockettimeout.py
  A   mechanize/tags/0.1.10/mechanize/_testcase.py
  A   mechanize/tags/0.1.10/mechanize/_upgrade.py
  A   mechanize/tags/0.1.10/mechanize/_urllib2.py
  A   mechanize/tags/0.1.10/mechanize/_useragent.py
  A   mechanize/tags/0.1.10/mechanize/_util.py
  A   mechanize/tags/0.1.10/mechanize.egg-info/
  A   mechanize/tags/0.1.10/mechanize.egg-info/PKG-INFO
  A   mechanize/tags/0.1.10/mechanize.egg-info/SOURCES.txt
  A   mechanize/tags/0.1.10/mechanize.egg-info/dependency_links.txt
  A   mechanize/tags/0.1.10/mechanize.egg-info/requires.txt
  A   mechanize/tags/0.1.10/mechanize.egg-info/top_level.txt
  A   mechanize/tags/0.1.10/mechanize.egg-info/zip-safe
  A   mechanize/tags/0.1.10/setup.cfg
  A   mechanize/tags/0.1.10/setup.py
  A   mechanize/tags/0.1.10/test/
  A   mechanize/tags/0.1.10/test/test_browser.doctest
  A   mechanize/tags/0.1.10/test/test_browser.py
  A   mechanize/tags/0.1.10/test/test_cookies.py
  A   mechanize/tags/0.1.10/test/test_date.py
  A   mechanize/tags/0.1.10/test/test_forms.doctest
  A   mechanize/tags/0.1.10/test/test_headers.py
  A   mechanize/tags/0.1.10/test/test_history.doctest
  A   mechanize/tags/0.1.10/test/test_html.doctest
  A   mechanize/tags/0.1.10/test/test_html.py
  A   mechanize/tags/0.1.10/test/test_opener.doctest
  A   mechanize/tags/0.1.10/test/test_opener.py
  A   mechanize/tags/0.1.10/test/test_password_manager.special_doctest
  A   mechanize/tags/0.1.10/test/test_pullparser.py
  A   mechanize/tags/0.1.10/test/test_request.doctest
  A   mechanize/tags/0.1.10/test/test_response.doctest
  A   mechanize/tags/0.1.10/test/test_response.py
  A   mechanize/tags/0.1.10/test/test_rfc3986.doctest
  A   mechanize/tags/0.1.10/test/test_robotfileparser.special_doctest
  A   mechanize/tags/0.1.10/test/test_urllib2.py
  A   mechanize/tags/0.1.10/test/test_useragent.py
  A   mechanize/tags/0.1.10/test-tools/
  A   mechanize/tags/0.1.10/test-tools/cookietest.cgi
  A   mechanize/tags/0.1.10/test-tools/doctest.py
  A   mechanize/tags/0.1.10/test-tools/linecache_copy.py
  A   mechanize/tags/0.1.10/test-tools/testprogram.py
  A   mechanize/tags/0.1.10/test-tools/twisted-localserver.py
  A   mechanize/tags/0.1.10/test.py

-=-
Added: mechanize/tags/0.1.10/0.1-changes.txt
===================================================================
--- mechanize/tags/0.1.10/0.1-changes.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/0.1-changes.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,65 @@
+Recent public API changes:
+
+- Since 0.1.2b beta release: Factory now takes EncodingFinder and
+  ResponseTypeFinder class instances instead of functions (since
+  closures don't play well with module pickle).
+
+- ClientCookie has been moved into the mechanize package and is no
+  longer a separate package.  The ClientCookie interface is still
+  supported, but all names must be imported from module mechanize
+  instead of from module ClientCookie.  Python 2.3 is now required. (I
+  have no plans to merge ClientForm with mechanize.)  Note that the
+  logging work-alike facility is gone, and the base logger has been
+  renamed from "ClientCookie" to "mechanize".  Also, the experimental
+  BSDDB support is now only included as example code, and not
+  installed, and the VERSION attribute has been removed (mechanize
+  still has its __version__ attribute).
+
+- pullparser has been moved into the mechanize package and is no
+  longer a separate package.  Also, the interface of pullparser is no
+  longer supported.  Instead, it's just a purely internal
+  implementation detail of mechanize.
+
+- Removed mechanize.Browser.set_seekable_responses() (they're always
+  seekable anyway).
+
+- Some mechanize.Browser constructor args have been moved to
+  mechanize.Factory (default_encoding, ...).
+
+- .get_links_iter() is gone (use .links() instead).
+
+- .forms() and .links() now both return iterators (in fact, generators),
+  not sequences (not really an interface change: these were always
+  documented to return iterables, but it will no doubt break some client
+  code).  Use e.g. list(browser.forms()) if you want a list.
+
+- .links() no longer raises LinkNotFoundError (was accidental -- only
+  .click_link() / .find_link() should raise this).
+
+- Rename set_credentials --> set_password_manager (and add some new
+  methods to improve auth and proxy support).
+
+- Added response.get_data() and .set_data() methods, and made responses
+  copy.copy()able.  Browser has a .set_response() method.  Responses
+  returned by the Browser are now copies, which means that other code
+  altering headers and data and calling .seek() won't mess up your copy of
+  a response.
+
+- mechanize.Factory has changed completely, to make it easier to avoid
+  re-parsing (principally: add .set_response() method and make
+  factory methods take no args)
+
+- mechanize.Browser.default_encoding is gone.
+
+- mechanize.Browser.set_seekable_responses() is gone (they're always
+  .seek()able).  Browser and UserAgent now both inherit from
+  mechanize.UserAgentBase, and UserAgent is now there only to add the
+  single method .set_seekable_responses().
+
+- Added Browser.encoding().
+
+- Factory() takes an i_want_broken_xhtml_support argument, as a stop
+  gap until I actually make a proper job of it.  Without a true value
+  for that argument, mechanize is ignorant of XML/XHTML.
+
+- _authen handler name renamed --> _basicauth
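
For example, a minimal sketch of the 0.1-style interface described above:
.forms() and .links() now return generators, so wrap them in list() when a
sequence is needed, and responses returned by the Browser are copies that
support .get_data() / .set_data().  The URL below is only a placeholder.

    import mechanize

    br = mechanize.Browser()
    br.open("http://www.example.com/")   # placeholder URL
    forms = list(br.forms())             # .forms() is a generator now
    links = list(br.links())             # so is .links()
    response = br.response()             # a copy of the current response
    data = response.get_data()
    response.set_data(data.replace("<!---", "<!--"))
    br.set_response(response)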

Added: mechanize/tags/0.1.10/COPYING.txt
===================================================================
--- mechanize/tags/0.1.10/COPYING.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/COPYING.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,101 @@
+All the code with the exception of _gzip.py is covered under either
+the BSD-style license immediately below, or (at your choice) the ZPL
+2.1.  The code in _gzip.py is taken from the effbot.org library, and
+falls under the effbot.org license (also BSD-style) that appears at
+the end of this file.
+
+
+Copyright (c) 2002-2006 John J. Lee <jjl at pobox.com>
+Copyright (c) 1997-1999 Gisle Aas
+Copyright (c) 1997-1999 Johnny Lee
+Copyright (c) 2003 Andy Lester
+
+
+BSD-style License
+==================
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+Neither the name of the contributors nor the names of their employers
+may be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+
+
+
+ZPL 2.1
+==================
+
+Zope Public License (ZPL) Version 2.1
+
+A copyright notice accompanies this license document that identifies the copyright holders.
+
+This license has been certified as open source. It has also been designated as GPL compatible by the Free Software Foundation (FSF).
+
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+   1. Redistributions in source code must retain the accompanying copyright notice, this list of conditions, and the following disclaimer.
+   2. Redistributions in binary form must reproduce the accompanying copyright notice, this list of conditions, and the following disclaimer in the documentation and/or other materials provided with the distribution.
+   3. Names of the copyright holders must not be used to endorse or promote products derived from this software without prior written permission from the copyright holders.
+   4. The right to distribute this software or to use it for any purpose does not give you the right to use Servicemarks (sm) or Trademarks (tm) of the copyright holders. Use of them is covered by separate agreement with the copyright holders.
+   5. If any files are modified, you must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.
+
+Disclaimer
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+
+
+
+--------------------------------------------------------------------
+The effbot.org Library is
+
+Copyright (c) 1999-2004 by Secret Labs AB
+Copyright (c) 1999-2004 by Fredrik Lundh
+
+By obtaining, using, and/or copying this software and/or its
+associated documentation, you agree that you have read, understood,
+and will comply with the following terms and conditions:
+
+Permission to use, copy, modify, and distribute this software and its
+associated documentation for any purpose and without fee is hereby
+granted, provided that the above copyright notice appears in all
+copies, and that both that copyright notice and this permission notice
+appear in supporting documentation, and that the name of Secret Labs
+AB or the author not be used in advertising or publicity pertaining to
+distribution of the software without specific, written prior
+permission.
+
+SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO
+THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
+FITNESS.  IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR
+ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
+OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+--------------------------------------------------------------------

Added: mechanize/tags/0.1.10/ChangeLog.txt
===================================================================
--- mechanize/tags/0.1.10/ChangeLog.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/ChangeLog.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,462 @@
+This isn't really in proper GNU ChangeLog format; it just happens to
+look that way.
+
+2008-12-03 John J Lee <jjl at pobox.com>
+	* 0.1.10 release.
+	* Add support for Python 2.6: Raise URLError on file: URL errors,
+	  not IOError (port of upstream urllib2 fix).  Add support for
+	  Python 2.6's per-connection timeouts: Add timeout arguments to
+	  urlopen(), Request constructor, .open(), and .open_novisit().
+	* Drop support for Python 2.3
+	* Add Content-length header to Request object (httplib bug that
+	  prevented doing that was fixed in Python 2.4).  There's no
+	  change in what is actually sent over the wire here, just in what
+	  headers get added to the Request object.
+	* Fix AttributeError on .retrieve() with a Request (as opposed to
+	  URL string) argument
+	* Don't change CookieJar state in .make_cookies().
+	* Fix AttributeError in case where .make_cookies() or
+	  .cookies_for_request() is called before other methods like
+	  .extract_cookies() or .make_cookie_header()
+	* Fixes affecting version cookie-attribute
+	  (http://bugs.python.org/issue3924).
+	* Silence module logging's "no handlers could be found for logger
+	  mechanize" warning in a way that doesn't clobber attempts to set
+	  log level sometimes
+	* Don't use private attribute of request in request upgrade
+	  handler (what was I thinking??)
+	* Don't call setup() on import of setup.py
+	* Add new public function effective_request_host
+	* Add .get_policy() method to CookieJar
+	* Add method CookieJar.cookies_for_request()
+	* Fix documented interface required of requests and responses (and
+	  add some tests for this!)
+	* Allow either .is_unverifiable() or .unverifiable on request
+	  objects (preferring the former)
+	* Note that there's a new functional test for digest auth, which fails when
+	  run against the sourceforge site (which is the default).  It
+	  looks like this reflects the fact that digest auth has been
+	  fairly broken since it was introduced in urllib2.  I don't plan
+	  to fix this myself.
+
+2008-09-24 John J Lee <jjl at pobox.com>
+	* 0.1.9 release.
+	* Fix ImportError if sqlite3 not available
+	* Fix a couple of functional tests not to wait 5 seconds each
+
+2008-09-13 John J Lee <jjl at pobox.com>
+	* 0.1.8 release.
+	* Close sockets.  This only affects Python 2.5 (and later) -
+	  earlier versions of Python were unaffected.  See
+	  http://bugs.python.org/issue1627441
+	* Make title parsing follow Firefox behaviour wrt child
+	  elements (previously the behaviour differed between Factory and
+	  RobustFactory).
+	* Fix BeautifulSoup RobustLinksFactory (hence RobustFactory) link
+	  text parsing for case of link text containing tags (Titus Brown)
+	* Fix issue where more tags after <title> caused default parser to
+	  raise an exception
+	* Handle missing cookie max-age value.  Previously, a warning was
+	  emitted in this case.
+	* Fix thoroughly broken digest auth (still need functional
+	  test!) (trebor74hr at ...)
+	* Handle cookies containing embedded tabs in mozilla format files
+	* Remove an assertion about mozilla format cookies file
+	  contents (raise LoadError instead)
+	* Fix MechanizeRobotFileParser.set_opener()
+	* Fix selection of global form using .select_form() (Titus Brown)
+	* Log skipped Refreshes
+	* Stop tests from clobbering files that happen to be lying around
+	  in cwd (!)
+	* Use SO_REUSEADDR for local test server.
+	* Raise exception if local test server fails to start.
+	* Tests no longer (accidentally) depend on third-party coverage
+	  module
+	* The usual docs and test fixes.
+	* Add convenience method Browser.open_local_file(filename)
+	* Add experimental support for Firefox 3 cookie jars
+	  ("cookies.sqlite").  Requires Python 2.5
+	* Fix a _gzip.py NameError (gzip support is experimental)
+
+2007-05-31 John J Lee <jjl at pobox.com>
+	* 0.1.7b release.
+	* Sub-requests should not usually be visiting, so make it so.  In
+	  fact the visible behaviour wasn't really broken here, since
+	  .back() skips over None responses (which is odd in itself, but
+	  won't be changed until after stable release is branched).
+	  However, this patch does change visible behaviour in that it
+	  creates new Request objects for sub-requests (e.g. basic auth
+	  retries) where previously we just mutated the existing Request
+	  object.
+	* Changes to sort out abuse of by SeekableProcessor and
+	  ResponseUpgradeProcessor (latter shouldn't have been public in
+	  the first place) and resulting confusing / unclear / broken
+	  behaviour.  Deprecate SeekableProcessor and
+	  ResponseUpgradeProcessor.  Add SeekableResponseOpener.  Remove
+	  SeekableProcessor and ResponseUpgradeProcessor from Browser.
+	  Move UserAgentBase.add_referer_header() to Browser (it was on by
+	  default, breaking UserAgent, and should never really have been
+	  there).
+	* Fix HTTP proxy support: r29110 meant that Request.get_selector()
+	  didn't take into account the change to .__r_host
+	  (Thanks tgates at ...).
+	* Redirected robots.txt fetch no longer results in another
+	  attempted robots.txt fetch to check the redirection is allowed!
+	* Fix exception raised by RFC 3986 implementation with
+	  urljoin(base, '/..')
+	* Fix two multiple-response-wrapping bugs.
+	* Add missing import in tests (caused failure on Windows).
+	* Set svn:eol-style to native for all text files in SVN.
+	* Add some tests for upgrade_response().
+	* Add a functional test for 302 + 404 case.
+	* Add an -l option to run the functional tests against a local
+	  twisted.web2-based server (you need Twisted installed for this
+	  to work).  This is much faster than running against
+	  wwwsearch.sourceforge.net
+	* Add -u switch to skip unittests (and only run the doctests).
+
+2007-01-07 John J Lee <jjl at pobox.com>
+
+	* 0.1.6b release
+	* Add mechanize.ParseError class, document it as part of the
+	  mechanize.Factory interface, and raise it from all Factory
+	  implementations.  This is backwards-compatible, since the new
+	  exception derives from the old exceptions.
+	* Bug fix: Truncation when there is no full .read() before
+	  navigating to the next page, and an old response is read after
+	  navigation.  This happened e.g. with r = br.open();
+	  r.readline(); br.open(url); r.read(); br.back() .
+	* Bug fix: when .back() caused a reload, it was returning the old
+	  response, not the .reload()ed one.
+	* Bug fix: .back() was not returning a copy of the response, which
+	  presumably would cause seek position problems.
+	* Bug fix: base tag without href attribute would override document
+	  URL with a None value, causing a crash (thanks Nathan Eror).
+	* Fix .set_response() to close current response first.
+	* Fix non-idempotent behaviour of Factory.forms() / .links() .
+	  Previously, if for example you got a ParseError during execution
+	  of .forms(), you could call it again and have it not raise an
+	  exception, because it started out where it left off!
+	* Add a missing copy.copy() to RobustFactory .
+	* Fix redirection to 'URIs' that contain characters that are not
+	  allowed in URIs (thanks Riko Wichmann).  Also, Request
+	  constructor now logs a module logging warning about any such bad
+	  URIs.
+	* Add .global_form() method to Browser to support form controls
+	  whose HTML elements are not descendants of any FORM element.
+	* Add a new method .visit_response() .  This creates a new history
+	  entry from a response object, rather than just changing the
+	  current visited response.  This is useful e.g. when you want to
+	  use Browser features in a handler.
+	* Misc minor bug fixes.
+
+2006-10-25 John J Lee <jjl at pobox.com>
+
+	* 0.1.5b release: Update setuptools dependencies to depend on
+	  ClientForm>=0.2.5 (for an important bug fix affecting fragments
+	  in URLs).  There are no other changes in this release -- this
+	  release was done purely so that people upgrading to the latest
+	  version of mechanize will get the latest ClientForm too.
+
+2006-10-14 John J Lee <jjl at pobox.com>
+	* 0.1.4b release: (skipped a version deliberately for obscure
+	  reasons)
+	* Improved auth & proxies support.
+	* Follow RFC 3986.
+	* Add a .set_cookie() method to Browser .
+	* Add Browser.open_novisit() and Request.visit to allow fetching
+	  files without affecting Browser state.
+	* UserAgent and Browser are now subclasses of UserAgentBase.
+	  UserAgent's only role in life above what UserAgentBase does is
+	  to provide the .set_seekable_responses() method (it lives there
+	  because Browser depends on seekable responses, because that's
+	  how browser history is implemented).
+	* Bundle BeautifulSoup 2.1.1.  No more dependency pain!  Note that
+	  BeautifulSoup is, and always was, optional, and that mechanize
+	  will eventually switch to BeautifulSoup version 3, at which
+	  point it may well stop bundling BeautifulSoup.  Note also that
+	  the module is only used internally, and is not available as a
+	  public attribute of the package.  If you dare, you can import it
+	  ("from mechanize import _beautifulsoup"), but beware that it
+	  will go away later, and that the API of BeautifulSoup will
+	  change when the upgrade to 3 happens.  Also, BeautifulSoup
+	  support (mainly RobustFactory) is still a little experimental
+	  and buggy.
+	* Fix HTTP-EQUIV with no content attribute case (thanks Pratik
+	  Dam).
+	* Fix bug with quoted META Refresh URL (thanks Nilton Volpato).
+	* Fix crash with </base> tag (yajdbgr02 at ...).
+	* Somebody found a server that (incorrectly) depends on HTTP
+	  header case, so follow the Title-Case convention.  Note that the
+	  Request headers interface(s) were always case-sensitive (somewhat
+	  oddly -- this is an inheritance from urllib2 that should really
+	  be fixed in a better way than it is currently), and still are;
+	  the only thing that changed is what actually eventually gets
+	  sent over the wire.
+	* Use mechanize (not urllib) to open robots.txt.  Don't consult
+	  RobotFileParser instance about non-HTTP URLs.
+	* Fix OpenerDirector.retrieve(), which was very broken (thanks
+	  Duncan Booth).
+	* Crash in a much more obvious way if trying to use OpenerDirector
+	  after .close() .
+	* .reload() on .back() if necessary (necessary iff response was
+	  not fully .read() on first .open()ing).
+	* Strip fragments before retrieving URLs (fixed
+	  Request.get_selector() to strip fragment).
+	* Fix catching HTTPError subclasses while still preserving all
+	  their response behaviour
+	* Correct over-enthusiastic documented guarantees of
+	  closeable_response .
+	* Fix assumption that httplib.HTTPMessage treats dict-style
+	  __setitem__ as append rather than set (where on earth did I get
+	  that from?).
+	* Expose History in mechanize/__init__.py (though interface is
+	  still experimental).
+	* Lots of other "internals" bugs fixed (thanks to reports /
+	  patches from Benji York especially, also Titus Brown, Duncan
+	  Booth, and me ;-), where I'm not 100% sure exactly when they
+	  were introduced, so not listing them here in detail.
+	* Numerous other minor fixes.
+	* Some code cleanup.
+
+2006-05-21 John J Lee <jjl at pobox.com>
+	* 0.1.2b release:
+	* mechanize now exports the whole urllib2 interface.
+	* Pull in bugfixed auth/proxy support code from Python 2.5.
+	* Bugfix: strip leading and trailing whitespace from link URLs
+	* Fix .any_response() / .any_request() methods to have ordering
+	  consistent with the rest of the handlers rather than coming
+	  before all of them.
+	* Tell cookie-handling code about new TLDs.
+	* Remove Browser.set_seekable_responses() (they always are
+	  anyway).
+	* Show in web page examples how to munge responses and how to do
+	  proxy/auth.
+	* Rename 0.1.* changes document 0.1.0-changes.txt -->
+	  0.1-changes.txt.
+	* In 0.1 changes document, note change of logger name from
+	  "ClientCookie" to "mechanize"
+	* Add something about response objects to changes document
+	* Improve Browser.__str__
+	* Accept regexp strings as well as regexp objects when finding
+	  links.
+	* Add crappy gzip transfer encoding support.  This is off by
+	  default and warns if you turn it on (hopefully will get better
+	  later :-).
+	* A bit of internal cleanup following merge with pullparser /
+	  ClientCookie.
+
+2006-05-06 John J Lee <jjl at pobox.com>
+	* 0.1.1a release:
+	* Merge ClientCookie and pullparser with mechanize.
+	* Response object fixes.
+	* Remove accidental dependency on BeautifulSoup introduced in
+	  0.1.0a (the BeautifulSoup support is still here, but
+	  BeautifulSoup is not required to use mechanize).
+
+2006-05-03 John J Lee <jjl at pobox.com>
+	* 0.1.0a release:
+	* Stop trying to record precise dates in changelog, since that's
+	  silly ;-)
+	* A fair number of interface changes: see 0.1.0-changes.txt.
+	* Depend on recent ClientCookie with copy.copy()able response
+	  objects.
+	* Don't do broken XHTML handling by default (need to review code
+	  before switching this back on, e.g. should use a real XML parser
+	  for first-try at parsing).  To get the old behaviour, pass
+	  i_want_broken_xhtml_support=True to mechanize.DefaultFactory /
+	  .RobustFactory constructor.
+	* Numerous small bug fixes.
+	* Documentation & setup.py fixes.
+	* Don't use cookielib, to avoid having to work around Python 2.4
+	  RFC 2109 bug, and to avoid my braindead thread synchronisation
+	  code in cookielib :-((((( (I haven't encountered specific
+	  breakage due to latter, but since it's braindead I may as well
+	  avoid it).
+
+2005-11-30 John J Lee <jjl at pobox.com>
+	* Fixed setuptools support.
+	* Release 0.0.11a.
+
+2005-11-19 John J Lee <jjl at pobox.com>
+	* Release 0.0.10a.
+
+2005-11-17 John J Lee <jjl at pobox.com>
+	* Fix set_handle_referer.
+
+2005-11-12 John J Lee <jjl at pobox.com>
+	* Fix history (Gary Poster).
+	* Close responses on reload (Gary Poster).
+	* Don't depend on SSL support (Gary Poster).
+
+2005-10-31 John J Lee <jjl at pobox.com>
+	* Add setuptools support.
+
+2005-10-30 John J Lee <jjl at pobox.com>
+	* Don't mask AttributeError exception messages from ClientForm.
+	* Document intent of .links() vs. .get_links_iter(); Rename
+	  LinksFactory method.
+	* Remove pullparser import dependency.
+	* Remove Browser.urltags (now an argument to LinksFactory).
+	* Document Browser constructor as taking keyword args only (and
+	  change positional arg spec).
+	* Cleanup of lazy parsing (may fix bugs, not sure...).
+
+2005-10-28 John J Lee <jjl at pobox.com>
+	* Support ClientForm backwards_compat switch.
+
+2005-08-28 John J Lee <jjl at pobox.com>
+	* Apply optimisation patch (Stephan Richter).
+
+2005-08-15 John J Lee <jjl at pobox.com>
+	* Close responses (ie. close the file handles but leave response
+	  still .read()able &c., thanks to the response objects we're
+	  using) (aurel at nexedi.com).
+
+2005-08-14 John J Lee <jjl at pobox.com>
+	* Add missing argument to UserAgent's _add_referer_header stub.
+	* Doc and comment improvements.
+
+2005-06-28 John J Lee <jjl at pobox.com>
+	* Allow specifying parser class for equiv handling.
+	* Ensure correct default constructor args are passed to
+	  HTTPRefererProcessor.
+	* Allow configuring details of Refresh handling.
+	* Switch to tolerant parser.
+
+2005-06-11 John J Lee <jjl at pobox.com>
+	* Do .seek(0) after link parsing in a finally block.
+	* Regard text/xhtml as HTML.
+	* Fix 2.4-compatibility bugs.
+	* Fix spelling of '_equiv' feature string.
+
+2005-05-30 John J Lee <jjl at pobox.com>
+	* Turn on Referer, Refresh and HTTP-Equiv handling by default.
+
+2005-05-08 John J Lee <jjl at pobox.com>
+	* Fix .reload() to not update history (thanks to Titus Brown).
+	* Use cookielib where available
+
+2005-03-01 John J Lee <jjl at pobox.com>
+	* Fix referer bugs: Don't send URL fragments; Don't add in Referer
+	  header in redirected request unless original request had a
+	  Referer header.
+
+2005-02-19 John J Lee <jjl at pobox.com>
+        * Allow supplying own mechanize.FormsFactory, so eg. can use
+          ClientForm.XHTMLFormParser.  Also allow supplying own Request
+          class, and use sensible defaults for this.  Now depends on
+          ClientForm 0.1.17.  Side effect is that, since we use the
+          correct Request class by default, there's (I hope) no need for
+          using RequestUpgradeProcessor in Browser._add_referer_header()
+          :-)
+
+2005-01-30 John J Lee <jjl at pobox.com>
+	* Released 0.0.9a.
+
+2005-01-05 John J Lee <jjl at pobox.com>
+	* Fix examples (scraped sites have changed).
+	* Fix .set_*() method boolean arguments.
+	* The .response attribute is now a method, .response()
+	* Don't depend on BaseProcessor (no longer exists).
+
+2004-05-18 John J Lee <jjl at pobox.com>
+	* Released 0.0.8a:
+	* Added robots.txt observance, controlled by
+	* BASE element has attribute 'href', not 'uri'! (patch from Jochen
+	  Knuth)
+	* Fixed several bugs in handling of Referer header.
+	* Link.__eq__ now returns False instead of raising AttributeError
+	  on comparison with non-Link (patch from Jim Jewett)
+	* Removed dependencies on HTTPS support in Python and on
+	  ClientCookie.HTTPRobotRulesProcessor
+
+2004-01-18 John J Lee <jjl at pobox.com>
+	* Added robots.txt observance, controlled by
+	  UserAgent.set_handle_robots().  This is now on by default.
+	* Removed set_persistent_headers() method -- just use .addheaders,
+	  as in base class.
+
+2004-01-09 John J Lee <jjl at pobox.com>
+	* Removed unnecessary dependence on SSL support in Python.  Thanks
+	  to Krzysztof Kowalczyk for bug report.
+	* Released 0.0.7a.
+
+2004-01-06 John J Lee <jjl at pobox.com>
+	* Link instances may now be passed to .click_link() and
+	  .follow_link().
+	* Added a new example program, pypi.py.
+
+2004-01-05 John J Lee <jjl at pobox.com>
+	* Released 0.0.5a.
+	* If <title> tag was missing, links and forms would not be parsed.
+	  Also, base element (giving base URI) was ignored.  Now parse
+	  title lazily, and get base URI while parsing links.  Also, fixed
+	  ClientForm to take note of base element.  Thanks to Phillip J.
+	  Eby for bug report.
+	* Released 0.0.6a.
+
+2004-01-04 John J Lee <jjl at pobox.com>
+	* Fixed _useragent._replace_handler() to update self.handlers
+	  correctly.
+	* Updated required pullparser version check.
+	* Visiting a URL now deselects form (sets self.form to None).
+	* Only first Content-Type header is now checked by
+	  ._viewing_html(), if there are more than one.
+	* Stopped using getheaders from ClientCookie -- don't need it,
+	  since depend on Python 2.2, which has .getheaders() method on
+	  responses.  Improved comments.
+	* .open() now resets .response to None.  Also rearranged .open() a
+	  bit so instance remains in consistent state on failure.
+	* .geturl() now checks for non-None .response, and raises Browser.
+	* .back() now checks for non-None .response, and doesn't attempt
+	  to parse if it's None.
+	* .reload() no longer adds new history item.
+	* Documented tag argument to .find_link().
+	* Fixed a few places where non-keyword arguments for .find_link()
+	  were silently ignored.  Now raises ValueError.
+
+2004-01-02 John J Lee <jjl at pobox.com>
+	* Use response_seek_wrapper instead of seek_wrapper, which broke
+	  use of responses after they're closed.
+	* (Fixed response_seek_wrapper in ClientCookie.)
+	* Fixed adding of Referer header.  Thanks to Per Cederqvist for
+	  bug report.
+	* Released 0.0.4a.
+	* Updated required ClientCookie version check.
+
+2003-12-30 John J Lee <jjl at pobox.com>
+	* Added support for character encodings (for matching link text).
+	* Released 0.0.3a.
+
+2003-12-28 John J Lee <jjl at pobox.com>
+	* Attribute lookups are no longer forwarded to .response --
+	  you have to do it explicitly.
+	* Added .geturl() method, which just delegates to .response.
+	* Big rehash of UserAgent, which was broken.  Added a test.
+	* Discovered that zip() doesn't raise an exception when its
+	  arguments are of different length, so several tests could pass
+	  when they should have failed.  Fixed.
+	* Fixed <A/> case in ._parse_html().
+	* Released 0.0.2a.
+
+2003-12-27 John J Lee <jjl at pobox.com>
+	* Added and improved docstrings.
+	* Browser.form is now a public attribute.  Also documented
+	  Browser's public attributes.
+	* Added base_url and absolute_url attributes to Link.
+	* Tidied up .open().  Relative URL Request objects are no longer
+	  converted to absolute URLs -- they should probably be absolute
+	  in the first place anyway.
+	* Added proper Referer handling (the handler in ClientCookie is a
+	  hack that only covers a restricted case).
+	* Added click_link method, for symmetry with .click() / .submit()
+	  methods (which latter apply to forms).  Of these methods,
+	  .click/.click_link() returns a request, and .submit/
+	  .follow_link() actually .open()s the request.
+	* Updated broken example code.
+
+2003-12-24 John J Lee <jjl at pobox.com>
+	* Modified setup.py so can easily register with PyPI.
+
+2003-12-22 John J Lee <jjl at pobox.com>
+	* Released 0.0.1a.
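
For example, the per-connection timeout support added in 0.1.10 (see the
2008-12-03 entry above) can be exercised roughly as follows.  The URL and the
ten-second value are placeholders, and the keyword spelling assumes the
timeout argument named in that entry (it needs Python 2.6 underneath).

    import mechanize

    # timeout arguments were added to urlopen(), the Request constructor,
    # .open() and .open_novisit() in 0.1.10
    response = mechanize.urlopen("http://www.example.com/", timeout=10.0)

    br = mechanize.Browser()
    response = br.open("http://www.example.com/", timeout=10.0)
    request = mechanize.Request("http://www.example.com/", timeout=10.0)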

Added: mechanize/tags/0.1.10/GeneralFAQ.html
===================================================================
--- mechanize/tags/0.1.10/GeneralFAQ.html	                        (rev 0)
+++ mechanize/tags/0.1.10/GeneralFAQ.html	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,170 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+        "http://www.w3.org/TR/html4/strict.dtd">
+
+
+
+
+<html>
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+  <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
+  <meta name="date" content="2006-05-06">
+  <meta name="keywords" content="FAQ,cookie,HTTP,HTML,form,table,Python,web,client,client-side,testing,sniffer,https,script,embedded">
+  <title>Python web-client programming general FAQs</title>
+  <style type="text/css" media="screen">@import "../styles/style.css";</style>
+  <base href="http://wwwsearch.sourceforge.net/bits/clientx.html">
+</head>
+<body>
+
+<div id="sf"><a href="http://sourceforge.net">
+<img src="http://sourceforge.net/sflogo.php?group_id=48205&amp;type=2"
+ width="125" height="37" alt="SourceForge.net Logo"></a></div>
+<!--<img src="../images/sflogo.png"-->
+
+<h1>Python web-client programming general FAQs</h1>
+
+<div id="Content">
+<ul>
+  <li>Is there any example code?
+     <p>Look in the examples directory of <a href="../mechanize">mechanize</a>.
+     Note that the examples on the <a href="../ClientForm">ClientForm page</a>
+     are executable as-is.  Contributions of example code would be very
+     welcome!
+  <li>HTTPS on Windows?
+     <p>Use this <a href="http://pypgsql.sourceforge.net/misc/python22-win32-ssl.zip">
+     _socket.pyd</a>, or use Python 2.3.
+  <li>I want to see what my web browser is doing, but standard network sniffers
+     like <a href="http://www.ethereal.com/">ethereal</a> or netcat (nc) don't
+     work for HTTPS.  How do I sniff HTTPS traffic?
+  <p>Three good options:
+  <ul>
+    <li>Mozilla plugin: <a href="http://livehttpheaders.mozdev.org/">
+     livehttpheaders</a>.
+    <li><a href="http://www.blunck.info/iehttpheaders.html">ieHTTPHeaders</a>
+     does the same for MSIE.
+    <li>Use <a href="http://lynx.browser.org/">lynx</a> <code>-trace</code>,
+     and filter out the junk with a script.
+  </ul>
+  <p>I'm told you can also use a proxy like <a
+  href="http://www.proxomitron.info/">proxomitron</a> (never tried it
+  myself).  There's also a commercial <a href="http://www.simtec.ltd.uk/">MSIE
+  plugin</a>.
+  <li>Embedded script is messing up my web-scraping.  What do I do?
+     <p>It is possible to embed script in HTML pages (sandwiched between
+     <code>&lt;SCRIPT&gt;here&lt;/SCRIPT&gt;</code> tags, and in
+     <code>javascript:</code> URLs) - JavaScript / ECMAScript, VBScript, or
+     even Python.  These scripts can do all sorts of things, including causing
+     cookies to be set in a browser, submitting or filling in parts of forms in
+     response to user actions, changing link colours as the mouse moves over a
+     link, etc.
+
+     <p>If you come across this in a page you want to automate, you
+     have four options.  Here they are, roughly in order of simplicity.
+
+     <ul>
+       <li>Simply figure out what the embedded script is doing and emulate it
+       in your Python code: for example, by manually adding cookies to your
+       <code>CookieJar</code> instance, calling methods on
+       <code>HTMLForm</code>s, calling <code>urlopen</code>, etc.
+
+       <li>Dump mechanize and ClientForm and automate a browser instead.
+       For example use MS Internet Explorer via its COM automation interfaces, using
+       the <a href="http://starship.python.net/crew/mhammond/">Python for
+       Windows extensions</a>, aka pywin32, aka win32all (eg.
+       <a href="http://vsbabu.org/mt/archives/2003/06/13/ie_automation.html">simple
+       function</a>, <a href="http://pamie.sourceforge.net/">pamie</a>;
+       <a href="http://www.oreilly.com/catalog/pythonwin32/chapter/ch12.html">
+       pywin32 chapter from the O'Reilly book</a>) or
+       <a href="http://starship.python.net/crew/theller/ctypes/">ctypes</a>
+       (<a href="http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/305273">
+       example</a>: may be out of date, since <code>ctypes</code>' COM support is
+       still evolving).
+       <a href="http://www.brunningonline.net/simon/blog/archives/winGuiAuto.py.html">This</a>
+       kind of thing may also come in useful on Windows for cases where the
+       automation API is lacking.
+       <a href="http://ftp.acc.umu.se/pub/GNOME/sources/pyphany/">pyphany</a>
+       is a binding to the <a href="http://www.gnome.org/projects/epiphany/">
+       epiphany web browser</a>, allowing both plugins and automation code to be
+       written in Python.
+       XXX Mozilla automation &amp; XPCOM / PyXPCOM, Konqueror &amp; DCOP / KParts / PyKDE).
+
+       <li>Use Java's <a href="http://httpunit.sourceforge.net/">httpunit</a> from
+       Jython, since it knows some JavaScript.
+       <li>Get ambitious and automatically delegate the work to an appropriate
+       interpreter (Mozilla's JavaScript interpreter, for instance).  This
+       approach is the one taken by <a href="../DOMForm">DOMForm</a> (the
+       JavaScript support is &quot;very alpha&quot;, though!).
+     </ul>
+  <li>Misc links
+     <ul>
+       <li><a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful
+       Soup</a> is a widely recommended HTML-parsing module.
+       <li><a href="http://linux.duke.edu/projects/urlgrabber/">urlgrabber</a>
+       contains useful stuff like persistent connections, mirroring and
+       throttling, and it looks like most or all of it is well-integrated with
+       <code>urllib2</code> (originally part of the yum package manager, but
+       now becoming a separate project).
+       <li>Another Java thing: <a href="http://maxq.tigris.org/">maxq</a>,
+       which provides a proxy to aid automatic generation of functional tests
+       written in Jython using the standard library unittest module (PyUnit)
+       and the &quot;Jakarta Commons&quot; HttpClient library.
+       <li>A useful set of Zope-oriented links on <a
+       href="http://viii.dclxvi.org/bookmarks/tech/zope/test">tools for testing
+       web applications</a>.
+       <li>O'Reilly book: <a href="">Spidering Hacks</a>.  Very Perl-oriented.
+       <li>Useful
+       <a href="http://chrispederick.com/work/webdeveloper/"> Firefox extension
+       </a> which, amongst other things, can display HTML form information and
+       HTML table structure (thanks to Erno Kuusela for this link).
+       <li>
+       <a href="http://www.openqa.org/selenium/">Selenium</a>: In-browser web
+       functional testing.
+       <li><a href="http://www.opensourcetesting.org/functional.php">Open
+       source functional testing tools</a>.  A nice list.
+       <li><a href="http://www.rexx.com/~dkuhlman/quixote_htmlscraping.html">
+       A HOWTO on web scraping</a> from Dave Kuhlman.
+     </ul>
+  <li>Will any of this code make its way into the Python standard library?
+
+  <p>The request / response processing extensions to <code>urllib2</code> from
+     mechanize have been merged into <code>urllib2</code> for Python 2.4.
+     The cookie processing has been added, as module <code>cookielib</code>.
+     Eventually, I'll submit patches to get the http-equiv, refresh, and
+     robots.txt code in there too, and maybe <code>mechanize.UserAgent</code>
+     too (but <em>not</em> <code>mechanize.Browser</code>).  The rest, probably
+     not.
+
+</ul>
+
+<p>I prefer questions and comments to be sent to the <a
+href="http://lists.sourceforge.net/lists/listinfo/wwwsearch-general">
+mailing list</a> rather than direct to me.
+
+<p><a href="mailto:jjl at pobox.com">John J. Lee</a>,
+May 2006.
+
+</div> <!--id="Content"-->
+
+<div id="Menu">
+
+<a href="..">Home</a><br>
+<br>
+<span class="thispage">General FAQs</span><br>
+<br>
+<a href="../mechanize/">mechanize</a><br>
+<a href="../mechanize/doc.html"><span class="subpage">mechanize docs</span></a><br>
+<a href="../ClientForm/">ClientForm</a><br>
+<br>
+<a href="../ClientCookie/">ClientCookie</a><br>
+<a href="../ClientCookie/doc.html"><span class="subpage">ClientCookie docs</span></a><br>
+<a href="../pullparser/">pullparser</a><br>
+<a href="../DOMForm/">DOMForm</a><br>
+<a href="../python-spidermonkey/">python-spidermonkey</a><br>
+<a href="../ClientTable/">ClientTable</a><br>
+<a href="../bits/urllib2_152.py">1.5.2 urllib2.py</a><br>
+<a href="../bits/urllib_152.py">1.5.2 urllib.py</a><br>
+
+<br>
+
+</body>
+</html>
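
To illustrate the first option in the embedded-script item above (work out
what the page's script does and emulate it by driving mechanize directly),
a rough sketch using the documented Browser interface; the URL, form name
and control names here are hypothetical.

    import mechanize

    br = mechanize.Browser()
    br.open("http://www.example.com/login")   # hypothetical page
    br.select_form(name="login")              # hypothetical form name
    # fill in the controls that the page's script would have filled in
    br["user"] = "joe"
    br["password"] = "secret"
    response = br.submit()
    print response.geturl()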

Added: mechanize/tags/0.1.10/INSTALL.txt
===================================================================
--- mechanize/tags/0.1.10/INSTALL.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/INSTALL.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,83 @@
+mechanize installation instructions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+NOTE: This file describes the old-fashioned install.
+
+I now recommend using EasyInstall instead.
+
+See the web page for EasyInstall instructions (included here as
+README.html).
+
+If you use EasyInstall, you should ignore the rest of this file.
+
+
+Dependencies
+~~~~~~~~~~~~
+
+See the web page for the version of Python required (included here as
+README.html).
+
+See setup.py for the required Python packages.
+
+
+Installation
+~~~~~~~~~~~~
+
+To install the package, run the following command:
+
+ python setup.py build
+
+then (with appropriate permissions)
+
+ python setup.py easy_install --no-deps .
+
+
+Alternatively, just copy the whole mechanize directory into a directory
+on your Python path (eg. unix: /usr/local/lib/python2.4/site-packages,
+Windows: C:\Python24\Lib\site-packages).  Only copy the mechanize
+directory that's inside the distributed tarball / zip archive, not the
+entire mechanize-x.x.x directory!
+
+
+To run the tests (none of which access the network), run the following
+command:
+
+ python test.py
+
+This runs the tests against the source files extracted from the package.
+For help on command line options:
+
+ python test.py --help
+
+To run the functional tests (which DO access the network), run the
+following command:
+
+ python functional_tests.py
+
+
+Please send bugs and comments to the mailing list (or failing that, to
+jjl at pobox.com):
+
+https://lists.sourceforge.net/lists/listinfo/wwwsearch-general
+
+
+NO WARRANTY
+
+THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
+WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
+
+Copyright Notices
+
+  (C) 2002-2006 John J. Lee.  All rights reserved.
+  (C) 1995-2001 Gisle Aas.  All rights reserved.   (Original LWP code)
+  (C) 2002-2003 Johnny Lee.  All rights reserved.  (MSIE Perl code)
+  (C) 2003 Andy Lester.  All rights reserved.  (Original WWW::Mechanize
+      Perl code)
+
+The code in this package is free software; you can redistribute it
+and/or modify it under the terms of the BSD or ZPL 2.1 licenses (see the
+file COPYING.txt).
+
+John J. Lee <jjl at pobox.com>
+May 2006

Added: mechanize/tags/0.1.10/MANIFEST.in
===================================================================
--- mechanize/tags/0.1.10/MANIFEST.in	                        (rev 0)
+++ mechanize/tags/0.1.10/MANIFEST.in	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,16 @@
+include MANIFEST.in
+include COPYING.txt
+include INSTALL.txt
+include GeneralFAQ.html
+include README.html.in
+include README.html
+include README.txt
+include doc.html.in
+include doc.html
+include ChangeLog.txt
+include 0.1-changes.txt
+include *.py
+prune docs-in-progress
+recursive-include examples *.py
+recursive-include attic *.py
+recursive-include test-tools *.py

Added: mechanize/tags/0.1.10/PKG-INFO
===================================================================
--- mechanize/tags/0.1.10/PKG-INFO	                        (rev 0)
+++ mechanize/tags/0.1.10/PKG-INFO	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,57 @@
+Metadata-Version: 1.0
+Name: mechanize
+Version: 0.1.10
+Summary: Stateful programmatic web browsing.
+Home-page: http://wwwsearch.sourceforge.net/mechanize/
+Author: John J. Lee
+Author-email: jjl at pobox.com
+License: BSD
+Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.10.tar.gz
+Description: Stateful programmatic web browsing, after Andy Lester's Perl module
+        WWW::Mechanize.
+        
+        The library is layered: mechanize.Browser (stateful web browser),
+        mechanize.UserAgent (configurable URL opener), plus urllib2 handlers.
+        
+        Features include: ftp:, http: and file: URL schemes, browser history,
+        high-level hyperlink and HTML form support, HTTP cookies, HTTP-EQUIV and
+        Refresh, Referer [sic] header, robots.txt, redirections, proxies, and
+        Basic and Digest HTTP authentication.  mechanize's response objects are
+        (lazily-) .seek()able and still work after .close().
+        
+        Much of the code originally derived from Perl code by Gisle Aas
+        (libwww-perl), Johnny Lee (MSIE Cookie support) and last but not least
+        Andy Lester (WWW::Mechanize).  urllib2 was written by Jeremy Hylton.
+        
+        
+Platform: any
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: System Administrators
+Classifier: License :: OSI Approved :: BSD License
+Classifier: License :: OSI Approved :: Zope Public License
+Classifier: Natural Language :: English
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python
+Classifier: Programming Language :: Python :: 2
+Classifier: Programming Language :: Python :: 2.4
+Classifier: Programming Language :: Python :: 2.5
+Classifier: Programming Language :: Python :: 2.6
+Classifier: Topic :: Internet
+Classifier: Topic :: Internet :: File Transfer Protocol (FTP)
+Classifier: Topic :: Internet :: WWW/HTTP
+Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
+Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
+Classifier: Topic :: Internet :: WWW/HTTP :: Site Management
+Classifier: Topic :: Internet :: WWW/HTTP :: Site Management :: Link Checking
+Classifier: Topic :: Software Development :: Libraries
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: Software Development :: Testing
+Classifier: Topic :: Software Development :: Testing :: Traffic Generation
+Classifier: Topic :: System :: Archiving :: Mirroring
+Classifier: Topic :: System :: Networking :: Monitoring
+Classifier: Topic :: System :: Systems Administration
+Classifier: Topic :: Text Processing
+Classifier: Topic :: Text Processing :: Markup
+Classifier: Topic :: Text Processing :: Markup :: HTML
+Classifier: Topic :: Text Processing :: Markup :: XML

Added: mechanize/tags/0.1.10/README.html
===================================================================
--- mechanize/tags/0.1.10/README.html	                        (rev 0)
+++ mechanize/tags/0.1.10/README.html	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,610 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+        "http://www.w3.org/TR/html4/strict.dtd">
+
+<html>
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+  <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
+  <meta name="date" content="2008-12-02">
+  <meta name="keywords" content="Python,HTML,HTTP,browser,stateful,web,client,client-side,mechanize,cookie,form,META,HTTP-EQUIV,Refresh,ClientForm,ClientCookie,pullparser,WWW::Mechanize">
+  <meta name="keywords" content="cookie,HTTP,Python,web,client,client-side,HTML,META,HTTP-EQUIV,Refresh">
+  <title>mechanize</title>
+  <style type="text/css" media="screen">@import "../styles/style.css";</style>
+  
+</head>
+<body>
+
+<div id="sf"><a href="http://sourceforge.net">
+<img src="http://sourceforge.net/sflogo.php?group_id=48205&amp;type=2"
+ width="125" height="37" alt="SourceForge.net Logo"></a></div>
+<!--<img src="../images/sflogo.png"-->
+
+<h1>mechanize</h1>
+
+<div id="Content">
+
+<p>Stateful programmatic web browsing in Python, after Andy Lester's Perl
+module <a
+href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
+</a>.
+
+<ul>
+
+  <li><code>mechanize.Browser</code> is a subclass of
+    <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
+    <code>urllib2.OpenerDirector</code> (in fact, of
+    <code>mechanize.OpenerDirector</code>), so:
+    <ul>
+      <li>any URL can be opened, not just <code>http:</code>
+
+      <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+      configuration of user-agent features like protocol, cookie,
+      redirection and <code>robots.txt</code> handling, without having
+      to make a new <code>OpenerDirector</code> each time, e.g.  by
+      calling <code>build_opener()</code>.
+
+    </ul>
+  <li>Easy HTML form filling, using the <a href="../ClientForm/">ClientForm</a>
+    interface.
+  <li>Convenient link parsing and following.
+  <li>Browser history (<code>.back()</code> and <code>.reload()</code>
+    methods).
+  <li>The <code>Referer</code> HTTP header is added properly (optional).
+  <li>Automatic observance of <a
+    href="http://www.robotstxt.org/wc/norobots.html">
+    <code>robots.txt</code></a>.
+  <li>Automatic handling of HTTP-Equiv and Refresh.
+</ul>
+
+
+<a name="examples"></a>
+<h2>Examples</h2>
+
+<p class="docwarning">This documentation is in need of reorganisation and
+extension!</p>
+
+<p>The two below are just to give the gist.  There are also some <a
+href="./#tests">actual working examples</a>.
+
+<pre>
+<span class="pykw">import</span> re
+<span class="pykw">from</span> mechanize <span class="pykw">import</span> Browser
+
+br = Browser()
+br.open(<span class="pystr">"http://www.example.com/"</span>)
+<span class="pycmt"># follow second link with element text matching regular expression
+</span>response1 = br.follow_link(text_regex=<span class="pystr">r"cheese\s*shop"</span>, nr=1)
+<span class="pykw">assert</span> br.viewing_html()
+<span class="pykw">print</span> br.title()
+<span class="pykw">print</span> response1.geturl()
+<span class="pykw">print</span> response1.info()  <span class="pycmt"># headers</span>
+<span class="pykw">print</span> response1.read()  <span class="pycmt"># body</span>
+response1.close()  <span class="pycmt"># (shown for clarity; in fact Browser does this for you)</span>
+
+br.select_form(name=<span class="pystr">"order"</span>)
+<span class="pycmt"># Browser passes through unknown attributes (including methods)
+</span><span class="pycmt"># to the selected HTMLForm (from ClientForm).
+</span>br[<span class="pystr">"cheeses"</span>] = [<span class="pystr">"mozzarella"</span>, <span class="pystr">"caerphilly"</span>]  <span class="pycmt"># (the method here is __setitem__)</span>
+response2 = br.submit()  <span class="pycmt"># submit current form</span>
+
+<span class="pycmt"># print currently selected form (don't call .submit() on this, use br.submit())
+</span><span class="pykw">print</span> br.form
+
+response3 = br.back()  <span class="pycmt"># back to cheese shop (same data as response1)</span>
+<span class="pycmt"># the history mechanism returns cached response objects
+</span><span class="pycmt"># we can still use the response, even though we closed it:
+</span>response3.seek(0)
+response3.read()
+response4 = br.reload()  <span class="pycmt"># fetches from server</span>
+
+<span class="pykw">for</span> form <span class="pykw">in</span> br.forms():
+    <span class="pykw">print</span> form
+<span class="pycmt"># .links() optionally accepts the keyword args of .follow_/.find_link()
+</span><span class="pykw">for</span> link <span class="pykw">in</span> br.links(url_regex=<span class="pystr">"python.org"</span>):
+    <span class="pykw">print</span> link
+    br.follow_link(link)  <span class="pycmt"># takes EITHER Link instance OR keyword args</span>
+    br.back()</pre>
+
+
+<p>You may control the browser's policy by using the methods of
+<code>mechanize.Browser</code>'s base class, <code>mechanize.UserAgentBase</code>.
+For example:
+
+<pre>
+br = Browser()
+<span class="pycmt"># Explicitly configure proxies (Browser will attempt to set good defaults).
+</span><span class="pycmt"># Note the userinfo ("joe:password@") and port number (":3128") are optional.
+</span>br.set_proxies({<span class="pystr">"http"</span>: <span class="pystr">"joe:password@myproxy.example.com:3128"</span>,
+                <span class="pystr">"ftp"</span>: <span class="pystr">"proxy.example.com"</span>,
+                })
+<span class="pycmt"># Add HTTP Basic/Digest auth username and password for HTTP proxy access.
+</span><span class="pycmt"># (equivalent to using "joe:password@..." form above)
+</span>br.add_proxy_password(<span class="pystr">"joe"</span>, <span class="pystr">"password"</span>)
+<span class="pycmt"># Add HTTP Basic/Digest auth username and password for website access.
+</span>br.add_password(<span class="pystr">"http://example.com/protected/"</span>, <span class="pystr">"joe"</span>, <span class="pystr">"password"</span>)
+<span class="pycmt"># Don't handle HTTP-EQUIV headers (HTTP headers embedded in HTML).
+</span>br.set_handle_equiv(False)
+<span class="pycmt"># Ignore robots.txt.  Do not do this without thought and consideration.
+</span>br.set_handle_robots(False)
+<span class="pycmt"># Don't add Referer (sic) header
+</span>br.set_handle_referer(False)
+<span class="pycmt"># Don't handle Refresh redirections
+</span>br.set_handle_refresh(False)
+<span class="pycmt"># Don't handle cookies
+</span>br.set_cookiejar()
+<span class="pycmt"># Supply your own mechanize.CookieJar (NOTE: cookie handling is ON by
+</span><span class="pycmt"># default: no need to do this unless you have some reason to use a
+</span><span class="pycmt"># particular cookiejar)
+</span>br.set_cookiejar(cj)
+<span class="pycmt"># Log information about HTTP redirects and Refreshes.
+</span>br.set_debug_redirects(True)
+<span class="pycmt"># Log HTTP response bodies (ie. the HTML, most of the time).
+</span>br.set_debug_responses(True)
+<span class="pycmt"># Print HTTP headers.
+</span>br.set_debug_http(True)
+
+<span class="pycmt"># To make sure you're seeing all debug output:
+</span>logger = logging.getLogger(<span class="pystr">"mechanize"</span>)
+logger.addHandler(logging.StreamHandler(sys.stdout))
+logger.setLevel(logging.INFO)
+
+<span class="pycmt"># Sometimes it's useful to process bad headers or bad HTML:
+</span>response = br.response()  <span class="pycmt"># this is a copy of response</span>
+headers = response.info()  <span class="pycmt"># currently, this is a mimetools.Message</span>
+headers[<span class="pystr">"Content-type"</span>] = <span class="pystr">"text/html; charset=utf-8"</span>
+response.set_data(response.get_data().replace(<span class="pystr">"&lt;!---"</span>, <span class="pystr">"&lt;!--"</span>))
+br.set_response(response)</pre>
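+
+<p>Note that the snippet above assumes a pre-existing cookiejar <code>cj</code>
+and the standard <code>logging</code> and <code>sys</code> modules.  A minimal
+sketch of that setup (the cookie file name is only illustrative) might be:
+
+<pre>
+import sys, logging
+import mechanize
+
+# any mechanize.CookieJar will do; LWPCookieJar can also save to / load from a file
+cj = mechanize.LWPCookieJar("cookies.lwp")
+br = mechanize.Browser()
+br.set_cookiejar(cj)</pre>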
+
+
+<p>mechanize exports the complete interface of <code>urllib2</code>:
+
+<pre>
+<span class="pykw">import</span> mechanize
+response = mechanize.urlopen(<span class="pystr">"http://www.example.com/"</span>)
+<span class="pykw">print</span> response.read()</pre>
+
+
+
+<p>so anything you would normally import from <code>urllib2</code> can
+(and should, by preference, to insulate you from future changes) be
+imported from mechanize instead.  In many cases if you import an
+object from mechanize it will be the very same object you would get if
+you imported from urllib2.  In many other cases, though, the
+implementation comes from mechanize, either because bug fixes have
+been applied or the functionality of urllib2 has been extended in some
+way.
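+
+<p>For example, request objects and openers can be built through mechanize
+using the familiar <code>urllib2</code> names (a minimal sketch; the header
+value is just an illustration):
+
+<pre>
+import mechanize
+
+# the same names urllib2 exports are also exported by mechanize
+request = mechanize.Request("http://www.example.com/")
+request.add_header("User-Agent", "my-script/0.1")
+opener = mechanize.build_opener()
+response = opener.open(request)
+print response.geturl()</pre>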
+
+
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
+
+<p><code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code> (see the <a
+href="./doc.html#seekable">documentation on seekable responses</a>).
+
+<p>The reason for the extra class is that
+<code>mechanize.Browser</code> depends on seekable response objects
+(because response objects are used to implement the browser history).
+
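+<p>If you want seekable responses but not <code>mechanize.Browser</code>'s form
+and link handling, <code>mechanize.UserAgent</code> provides them directly (a
+minimal sketch):
+
+<pre>
+import mechanize
+
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(True)  # ask for responses that support .seek()
+response = ua.open("http://www.example.com/")
+response.seek(0)</pre>
+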
+
+<a name="compatnotes"></a>
+<h2>Compatibility</h2>
+
+<p>These notes explain the relationship between mechanize, ClientCookie,
+<code>cookielib</code> and <code>urllib2</code>, and which to use when.  If
+you're just using mechanize, and not any of those other libraries, you can
+ignore this section.
+
+<ol>
+
+  <li>mechanize works with Python 2.4, Python 2.5, and Python 2.6.
+
+  <li>ClientCookie is no longer maintained as a separate package.  The code is
+      now part of mechanize, and its interface is now exported through module
+      mechanize (since mechanize 0.1.0).  Old code can simply be changed to
+      <code>import mechanize as ClientCookie</code> and should continue to
+      work.
+
+  <li>The cookie handling parts of mechanize are in Python 2.4 standard library
+      as module <code>cookielib</code> and extensions to module
+      <code>urllib2</code>.
+
+</ol>
+
+<p><strong>IMPORTANT:</strong> The following are the ONLY cases where
+<code>mechanize</code> and <code>urllib2</code> code are intended to work
+together.  For all other code, use mechanize
+<em><strong>exclusively</strong></em>: do NOT mix use of mechanize and
+<code>urllib2</code>!
+
+<ol>
+
+  <li>Handler classes that are missing from 2.4's <code>urllib2</code>
+      (e.g. <code>HTTPRefreshProcessor</code>, <code>HTTPEquivProcessor</code>,
+      <code>HTTPRobotRulesProcessor</code>) may be used with the
+      <code>urllib2</code> of Python 2.4 or newer.  There are not currently any
+      functional tests for this in mechanize, however, so this feature may be
+      broken (a sketch of this usage appears after this list).
+
+  <li>If you want to use <code>mechanize.RefreshProcessor</code> with Python >=
+      2.4's <code>urllib2</code>, you must also use
+      <code>mechanize.HTTPRedirectHandler</code>.
+
+  <li><code>mechanize.HTTPRefererProcessor</code> requires special support from
+      <code>mechanize.Browser</code>, so cannot be used with vanilla
+      <code>urllib2</code>.
+
+  <li><code>mechanize.HTTPRequestUpgradeProcessor</code> and
+      <code>mechanize.ResponseUpgradeProcessor</code> are not useful outside of
+      mechanize.
+
+  <li>Request and response objects from code based on <code>urllib2</code> work
+      with mechanize, and vice-versa.
+
+  <li>The classes and functions exported by mechanize in its public interface
+      that come straight from <code>urllib2</code>
+      (e.g. <code>FTPHandler</code>, at the time of writing) do work with
+      mechanize (duh ;-).  Exactly which of these classes and functions come
+      straight from <code>urllib2</code> without extension or modification will
+      change over time, though, so don't rely on it; instead, just import
+      everything you need from mechanize, never from <code>urllib2</code>.  The
+      exception is usage as described in the first item in this list, which is
+      explicitly OK (though not well tested ATM), subject to the other
+      restrictions in the list above.
+
+</ol>
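+
+<p>As a sketch of the first case above (untested, as noted there), a mechanize
+handler instance can be passed to a plain <code>urllib2</code> opener on Python
+2.4 or newer:
+
+<pre>
+import urllib2
+import mechanize
+
+# HTTPEquivProcessor is one of the handlers missing from urllib2 itself
+opener = urllib2.build_opener(mechanize.HTTPEquivProcessor())
+response = opener.open("http://www.example.com/")
+print response.info().getheader("content-type")</pre>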
+
+
+<a name="docs"></a>
+<h2>Documentation</h2>
+
+<p>Full documentation is in the docstrings.
+
+<p>The documentation in the web pages is in need of reorganisation at the
+moment, after the merge of ClientCookie into mechanize.
+
+
+<a name="credits"></a>
+<h2>Credits</h2>
+
+<p>Thanks to all the too-numerous-to-list people who reported bugs and provided
+patches.  Also thanks to Ian Bicking, for persuading me that a
+<code>UserAgent</code> class would be useful, and to Ronald Tschalar for advice
+on Netscape cookies.
+
+<p>A lot of credit must go to Gisle Aas, who wrote libwww-perl, from which
+large parts of mechanize originally derived, and Andy Lester for the original,
+<a href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
+</a>.  Finally, thanks to the (coincidentally-named) Johnny Lee for the MSIE
+CookieJar Perl code from which mechanize's support for that is derived.
+
+
+<a name="todo"></a>
+<h2>To do</h2>
+
+<p>Contributions welcome!
+
+<p>The documentation to-do list has moved to the new "docs-in-progress"
+directory in SVN.
+
+<p><em>This is <strong>very</strong> roughly in order of priority</em>
+
+<ul>
+  <li>Test <code>.any_response()</code> two handlers case: ordering.
+  <li>Test referer bugs (frags and don't add in redirect unless orig
+    req had Referer)
+  <li>Remove use of urlparse from _auth.py.
+  <li>Proper XHTML support!
+  <li>Fix BeautifulSoup support to use a single BeautifulSoup instance
+    per page.
+  <li>Test BeautifulSoup support better / fix encoding issue.
+  <li>Support BeautifulSoup 3.
+  <li>Add another History implementation or two and finalise interface.
+  <li>History cache expiration.
+  <li>Investigate possible leak further (see Balazs Ree's list posting).
+  <li>Make <code>EncodingFinder</code> public, I guess (but probably
+    improve it first).  (For example: support Mark Pilgrim's universal
+    encoding detector?)
+  <li>Add two-way links between BeautifulSoup &amp; ClientForm object
+    models.
+  <li>In 0.2: switch to Python unicode strings everywhere appropriate
+    (HTTP level should still use byte strings, of course).
+  <li><code>clean_url()</code>: test browser behaviour.  I <em>think</em>
+    this is correct...
+  <li>Use a nicer RFC 3986 join / split / unsplit implementation.
+  <li>Figure out the Right Thing (if such a thing exists) for %-encoding.
+  <li>How do IRIs fit into the world?
+  <li>IDNA -- must read about security stuff first.
+  <li>Unicode support in general.
+  <li>Provide per-connection access to timeouts.
+  <li>Keep-alive / connection caching.
+  <li>Pipelining??
+  <li>Content negotiation.
+  <li>gzip transfer encoding (there's already a handler for this in
+    mechanize, but it's poorly implemented ATM).
+  <li>proxy.pac parsing (I don't think this needs JS interpretation)
+  <li>Topological sort for handlers, instead of .handler_order
+    attribute.  Ordering and other dependencies (where unavoidable)
+    should be defined separate from handlers themselves.  Add new
+    build_opener and deprecate the old one?  Actually, _useragent is
+    probably not far off what I'd have in mind (would just need a
+    method or two and a base class adding I think), and it's not a high
+    priority since I guess most people will just use the UserAgent and
+    Browser classes.
+
+ </ul>
+
+
+<a name="download"></a>
+<h2>Getting mechanize</h2>
+
+<p>You can install the <a href="./#source">old-fashioned way</a>, or using <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a>.  I
+recommend the latter even though EasyInstall is still in alpha, because it will
+automatically ensure you have the necessary dependencies, downloading if
+necessary.
+
+<p><a href="./#svn">Subversion (SVN) access</a> is also available.
+
+<p>Since EasyInstall is new, I include some instructions below, but mechanize
+follows standard EasyInstall / <code>setuptools</code> conventions, so you
+should refer to the <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a> and
+<a href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a>
+documentation if you need more detailed or up-to-date instructions.
+
+<h2>EasyInstall / setuptools</h2>
+
+<p>The benefit of EasyInstall and the new <code>setuptools</code>-supporting
+<code>setup.py</code> is that they grab all dependencies for you.  Also, using
+EasyInstall is a one-liner for the common case, to be compared with the usual
+download-unpack-install cycle with <code>setup.py</code>.
+
+<h3>Using EasyInstall to download and install mechanize</h3>
+
+<ol>
+  <li><a href="http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install">
+Install easy_install</a>
+  <li><code>easy_install mechanize</code>
+</ol>
+
+<p>If you're on a Unix-like OS, you may need root permissions for that last
+step (or see the <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall
+documentation</a> for other installation options).
+
+<p>If you already have mechanize installed as a <a
+href="http://peak.telecommunity.com/DevCenter/PythonEggs">Python Egg</a> (as
+you do if you installed using EasyInstall, or using <code>setup.py
+install</code> from mechanize 0.0.10a or newer), you can upgrade to the latest
+version using:
+
+<pre>easy_install --upgrade mechanize</pre>
+
+<p>You may want to read up on the <code>-m</code> option to
+<code>easy_install</code>, which lets you install multiple versions of a
+package.
+
+<a name="svnhead"></a>
+<h3>Using EasyInstall to download and install the latest in-development (SVN HEAD) version of mechanize</h3>
+
+<pre>easy_install "mechanize==dev"</pre>
+
+<p>Note that that will not necessarily grab the SVN versions of dependencies,
+such as ClientForm: It will use SVN to fetch dependencies if and only if the
+SVN HEAD version of mechanize declares itself to depend on the SVN versions of
+those dependencies; even then, those declared dependencies won't necessarily be
+on SVN HEAD, but rather a particular revision.  If you want SVN HEAD for a
+dependency project, you should ask for it explicitly by running
+<code>easy_install "projectname=dev"</code> for that project.
+
+<p>Note also that you can still carry on using a plain old SVN checkout as
+usual if you like.
+
+<h3>Using setup.py from a .tar.gz, .zip or an SVN checkout to download and install mechanize</h3>
+
+<p><code>setup.py</code> should correctly resolve and download dependencies:
+
+<pre>python setup.py install</pre>
+
+<p>Or, to get access to the same options that <code>easy_install</code>
+accepts, use the <code>easy_install</code> distutils command instead of
+<code>install</code> (see <code>python setup.py --help easy_install</code>):
+
+<pre>python setup.py easy_install mechanize</pre>
+
+
+<a name="source"></a>
+<h2>Download</h2>
+<p>All documentation (including this web page) is included in the distribution.
+
+<p>This is a stable release.
+
+<p><em>Development release.</em>
+
+<ul>
+
+<li><a href="./src/mechanize-0.1.10.tar.gz">mechanize-0.1.10.tar.gz</a>
+<li><a href="./src/mechanize-0.1.10.zip">mechanize-0.1.10.zip</a>
+<li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
+<li><a href="./src/">Older versions.</a>
+</ul>
+
+<p>For old-style installation instructions, see the INSTALL file included in
+the distribution.  Better, <a href="./#download">use EasyInstall</a>.
+
+
+<a name="svn"></a>
+<h2>Subversion</h2>
+
+<p>The <a href="http://subversion.tigris.org/">Subversion (SVN)</a> trunk is <a href="http://codespeak.net/svn/wwwsearch/mechanize/trunk#egg=mechanize-dev">http://codespeak.net/svn/wwwsearch/mechanize/trunk</a>, so to check out the source:
+
+<pre>
+svn co http://codespeak.net/svn/wwwsearch/mechanize/trunk mechanize
+</pre>
+
+<a name="tests"></a>
+<h2>Tests and examples</h2>
+
+<h3>Examples</h3>
+
+<p>The <code>examples</code> directory in the <a href="./#source">source
+packages</a> contains a couple of silly, but working, scripts to demonstrate
+basic use of the module.  Note that it's in the nature of web scraping for such
+scripts to break, so don't be too surprised if that happens &#8211; do let me
+know, though!
+
+<p>It's worth knowing also that the examples on the <a
+href="../ClientForm/">ClientForm web page</a> are useful for mechanize users,
+and are now real run-able scripts rather than just documentation.
+
+<h3>Functional tests</h3>
+
+<p>To run the functional tests (which <strong>do</strong> access the network),
+run the following command:
+
+<pre>python functional_tests.py</pre>
+
+<h3>Unit tests</h3>
+
+<p>Note that ClientForm (a dependency of mechanize) has its own unit tests,
+which must be run separately.
+
+<p>To run the unit tests (none of which access the network), run the following
+command:
+
+<pre>python test.py</pre>
+
+<p>This runs the tests against the source files extracted from the
+package.  For help on command line options:
+
+<pre>python test.py --help</pre>
+
+
+<h2>See also</h2>
+
+<p>There are several wrappers around mechanize designed for functional testing
+of web applications:
+
+<ul>
+
+  <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
+    <code>zope.testbrowser</code></a> (or
+    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
+    <code>ZopeTestBrowser</code></a>, the standalone version).
+  <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
+</ul>
+
+<p>Richard Jones' <a href="http://mechanicalcat.net/tech/webunit/">webunit</a>
+(this is not the same as Steven Purcell's <a
+href="http://webunit.sourceforge.net/">code of the same name</a>).  webunit and
+mechanize are quite similar.  On the minus side, webunit is missing things like
+browser history, high-level forms and links handling, thorough cookie handling,
+refresh redirection, adding of the Referer header, observance of robots.txt and
+easy extensibility.  On the plus side, webunit has a bunch of utility functions
+bound up in its WebFetcher class, which look useful for writing tests (though
+they'd be easy to duplicate using mechanize).  In general, webunit has more of
+a frameworky emphasis, with aims limited to writing tests, where mechanize and
+the modules it depends on try hard to be general-purpose libraries.
+
+<p>There are many related links in the <a
+href="../bits/GeneralFAQ.html">General FAQ</a> page, too.
+
+
+<a name="faq"></a>
+<h2>FAQs - pre install</h2>
+<ul>
+  <li>Which version of Python do I need?
+  <p>Python 2.4, 2.5 or 2.6.
+  <li>What else do I need?
+  <p>mechanize depends on <a href="../ClientForm/">ClientForm</a>.
+  <li>Does mechanize depend on BeautifulSoup?
+  <p>No.  mechanize offers a few (still rather experimental) classes that make
+     use of BeautifulSoup, but these classes are not required to use mechanize.
+     mechanize bundles BeautifulSoup version 2, so that module is no longer
+     required.  A future version of mechanize will support BeautifulSoup
+     version 3, at which point mechanize will likely no longer bundle the
+     module.
+  <p>The versions of those required modules are listed in the
+     <code>setup.py</code> for mechanize (included with the download).  The
+     dependencies are automatically fetched by <a
+     href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a>
+     (or by <a href="./#source">downloading</a> a mechanize source package and
+     running <code>python setup.py install</code>).  If you like you can fetch
+     and install them manually, instead &#8211; see the <code>INSTALL.txt</code>
+     file (included with the distribution).
+  <li>Which license?
+  <p>mechanize is dual-licensed: you may pick either the
+     <a href="http://www.opensource.org/licenses/bsd-license.php">BSD license</a>,
+     or the <a href="http://www.zope.org/Resources/ZPL">ZPL 2.1</a> (both are
+     included in the distribution).
+</ul>
+
+<a name="usagefaq"></a>
+<h2>FAQs - usage</h2>
+<ul>
+  <li>I'm not getting the HTML page I expected to see.
+    <ul>
+      <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
+      <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
+     </ul>
+  <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
+     <code>mechanize.Browser</code> think otherwise?
+<pre>
+b = mechanize.Browser(
+    <span class="pycmt"># mechanize's XHTML support needs work, so is currently switched off.  If</span>
+    <span class="pycmt"># we want to get our work done, we have to turn it on by supplying a</span>
+    <span class="pycmt"># mechanize.Factory (with XHTML support turned on):</span>
+    factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
+    )</pre>
+
+</ul>
+
+<p>I prefer questions and comments to be sent to the <a
+href="http://lists.sourceforge.net/lists/listinfo/wwwsearch-general">
+mailing list</a> rather than direct to me.
+
+<p><a href="mailto:jjl at pobox.com">John J. Lee</a>,
+December 2008.
+
+<hr>
+
+</div>
+
+<div id="Menu">
+
+<a href="..">Home</a><br>
+<br>
+<a href="../bits/GeneralFAQ.html">General FAQs</a><br>
+<br>
+<span class="thispage">mechanize</span><br>
+<a href="../mechanize/doc.html"><span class="subpage">mechanize docs</span></a><br>
+<a href="../ClientForm/">ClientForm</a><br>
+<br>
+<a href="../ClientCookie/">ClientCookie</a><br>
+<a href="../ClientCookie/doc.html"><span class="subpage">ClientCookie docs</span></a><br>
+<a href="../pullparser/">pullparser</a><br>
+<a href="../DOMForm/">DOMForm</a><br>
+<a href="../python-spidermonkey/">python-spidermonkey</a><br>
+<a href="../ClientTable/">ClientTable</a><br>
+<a href="../bits/urllib2_152.py">1.5.2 urllib2.py</a><br>
+<a href="../bits/urllib_152.py">1.5.2 urllib.py</a><br>
+
+<br>
+
+<a href="./#examples">Examples</a><br>
+<a href="./#compatnotes">Compatibility</a><br>
+<a href="./#docs">Documentation</a><br>
+<a href="./#todo">To-do</a><br>
+<a href="./#download">Download</a><br>
+<a href="./#svn">Subversion</a><br>
+<a href="./#tests">More examples</a><br>
+<a href="./#faq">FAQs</a><br>
+
+</div>
+
+
+</body>
+</html>

Added: mechanize/tags/0.1.10/README.html.in
===================================================================
--- mechanize/tags/0.1.10/README.html.in	                        (rev 0)
+++ mechanize/tags/0.1.10/README.html.in	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,606 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+        "http://www.w3.org/TR/html4/strict.dtd">
+@# This file is processed by EmPy: do not edit
+@# http://wwwsearch.sf.net/bits/colorize.py
+@{
+from colorize import colorize
+import time
+import release
+last_modified = release.svn_id_to_time("$Id: README.html.in 60274 2008-12-02 00:18:37Z jjlee $")
+try:
+    base
+except NameError:
+    base = False
+}
+<html>
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+  <meta name="author" content="John J. Lee &lt;jjl@@pobox.com&gt;">
+  <meta name="date" content="@(time.strftime("%Y-%m-%d", last_modified))">
+  <meta name="keywords" content="Python,HTML,HTTP,browser,stateful,web,client,client-side,mechanize,cookie,form,META,HTTP-EQUIV,Refresh,ClientForm,ClientCookie,pullparser,WWW::Mechanize">
+  <meta name="keywords" content="cookie,HTTP,Python,web,client,client-side,HTML,META,HTTP-EQUIV,Refresh">
+  <title>mechanize</title>
+  <style type="text/css" media="screen">@@import "../styles/style.css";</style>
+  @[if base]<base href="http://wwwsearch.sourceforge.net/mechanize/">@[end if]
+</head>
+<body>
+
+<div id="sf"><a href="http://sourceforge.net">
+<img src="http://sourceforge.net/sflogo.php?group_id=48205&amp;type=2"
+ width="125" height="37" alt="SourceForge.net Logo"></a></div>
+<!--<img src="../images/sflogo.png"-->
+
+<h1>mechanize</h1>
+
+<div id="Content">
+
+<p>Stateful programmatic web browsing in Python, after Andy Lester's Perl
+module <a
+href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
+</a>.
+
+<ul>
+
+  <li><code>mechanize.Browser</code> is a subclass of
+    <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
+    <code>urllib2.OpenerDirector</code> (in fact, of
+    <code>mechanize.OpenerDirector</code>), so:
+    <ul>
+      <li>any URL can be opened, not just <code>http:</code>
+
+      <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+      configuration of user-agent features like protocol, cookie,
+      redirection and <code>robots.txt</code> handling, without having
+      to make a new <code>OpenerDirector</code> each time, e.g.  by
+      calling <code>build_opener()</code>.
+
+    </ul>
+  <li>Easy HTML form filling, using <a href="../ClientForm/">ClientForm</a>
+    interface.
+  <li>Convenient link parsing and following.
+  <li>Browser history (<code>.back()</code> and <code>.reload()</code>
+    methods).
+  <li>The <code>Referer</code> HTTP header is added properly (optional).
+  <li>Automatic observance of <a
+    href="http://www.robotstxt.org/wc/norobots.html">
+    <code>robots.txt</code></a>.
+  <li>Automatic handling of HTTP-Equiv and Refresh.
+</ul>
+
+
+<a name="examples"></a>
+<h2>Examples</h2>
+
+<p class="docwarning">This documentation is in need of reorganisation and
+extension!</p>
+
+<p>The two below are just to give the gist.  There are also some <a
+href="./#tests">actual working examples</a>.
+
+@{colorize(r"""
+import re
+from mechanize import Browser
+
+br = Browser()
+br.open("http://www.example.com/")
+# follow second link with element text matching regular expression
+response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)
+assert br.viewing_html()
+print br.title()
+print response1.geturl()
+print response1.info()  # headers
+print response1.read()  # body
+response1.close()  # (shown for clarity; in fact Browser does this for you)
+
+br.select_form(name="order")
+# Browser passes through unknown attributes (including methods)
+# to the selected HTMLForm (from ClientForm).
+br["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__)
+response2 = br.submit()  # submit current form
+
+# print currently selected form (don't call .submit() on this, use br.submit())
+print br.form
+
+response3 = br.back()  # back to cheese shop (same data as response1)
+# the history mechanism returns cached response objects
+# we can still use the response, even though we closed it:
+response3.seek(0)
+response3.read()
+response4 = br.reload()  # fetches from server
+
+for form in br.forms():
+    print form
+# .links() optionally accepts the keyword args of .follow_/.find_link()
+for link in br.links(url_regex="python.org"):
+    print link
+    br.follow_link(link)  # takes EITHER Link instance OR keyword args
+    br.back()
+""")}
+
+<p>You may control the browser's policy by using the methods of
+<code>mechanize.Browser</code>'s base class, <code>mechanize.UserAgentBase</code>.
+For example:
+
+@{colorize("""
+br = Browser()
+# Explicitly configure proxies (Browser will attempt to set good defaults).
+# Note the userinfo ("joe:password@") and port number (":3128") are optional.
+br.set_proxies({"http": "joe:password at myproxy.example.com:3128",
+                "ftp": "proxy.example.com",
+                })
+# Add HTTP Basic/Digest auth username and password for HTTP proxy access.
+# (equivalent to using "joe:password at ..." form above)
+br.add_proxy_password("joe", "password")
+# Add HTTP Basic/Digest auth username and password for website access.
+br.add_password("http://example.com/protected/", "joe", "password")
+# Don't handle HTTP-EQUIV headers (HTTP headers embedded in HTML).
+br.set_handle_equiv(False)
+# Ignore robots.txt.  Do not do this without thought and consideration.
+br.set_handle_robots(False)
+# Don't add Referer (sic) header
+br.set_handle_referer(False)
+# Don't handle Refresh redirections
+br.set_handle_refresh(False)
+# Don't handle cookies
+br.set_cookiejar()
+# Supply your own mechanize.CookieJar (NOTE: cookie handling is ON by
+# default: no need to do this unless you have some reason to use a
+# particular cookiejar)
+br.set_cookiejar(cj)
+# Log information about HTTP redirects and Refreshes.
+br.set_debug_redirects(True)
+# Log HTTP response bodies (ie. the HTML, most of the time).
+br.set_debug_responses(True)
+# Print HTTP headers.
+br.set_debug_http(True)
+
+# To make sure you're seeing all debug output:
+logger = logging.getLogger("mechanize")
+logger.addHandler(logging.StreamHandler(sys.stdout))
+logger.setLevel(logging.INFO)
+
+# Sometimes it's useful to process bad headers or bad HTML:
+response = br.response()  # this is a copy of response
+headers = response.info()  # currently, this is a mimetools.Message
+headers["Content-type"] = "text/html; charset=utf-8"
+response.set_data(response.get_data().replace("<!---", "<!--"))
+br.set_response(response)
+""")}
+
+<p>mechanize exports the complete interface of <code>urllib2</code>:
+
+@{colorize("""
+import mechanize
+response = mechanize.urlopen("http://www.example.com/")
+print response.read()
+""")}
+
+
+<p>so anything you would normally import from <code>urllib2</code> can
+(and should, by preference, to insulate you from future changes) be
+imported from mechanize instead.  In many cases if you import an
+object from mechanize it will be the very same object you would get if
+you imported from urllib2.  In many other cases, though, the
+implementation comes from mechanize, either because bug fixes have
+been applied or the functionality of urllib2 has been extended in some
+way.
+
+
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
+
+<p><code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code> (see the <a
+href="./doc.html#seekable">documentation on seekable responses</a>).
+
+<p>The reason for the extra class is that
+<code>mechanize.Browser</code> depends on seekable response objects
+(because response objects are used to implement the browser history).
+
+
+<a name="compatnotes"></a>
+<h2>Compatibility</h2>
+
+<p>These notes explain the relationship between mechanize, ClientCookie,
+<code>cookielib</code> and <code>urllib2</code>, and which to use when.  If
+you're just using mechanize, and not any of those other libraries, you can
+ignore this section.
+
+<ol>
+
+  <li>mechanize works with Python 2.4, Python 2.5, and Python 2.6.
+
+  <li>ClientCookie is no longer maintained as a separate package.  The code is
+      now part of mechanize, and its interface is now exported through module
+      mechanize (since mechanize 0.1.0).  Old code can simply be changed to
+      <code>import mechanize as ClientCookie</code> and should continue to
+      work.
+
+  <li>The cookie handling parts of mechanize are in Python 2.4 standard library
+      as module <code>cookielib</code> and extensions to module
+      <code>urllib2</code>.
+
+</ol>
+
+<p><strong>IMPORTANT:</strong> The following are the ONLY cases where
+<code>mechanize</code> and <code>urllib2</code> code are intended to work
+together.  For all other code, use mechanize
+<em><strong>exclusively</strong></em>: do NOT mix use of mechanize and
+<code>urllib2</code>!
+
+<ol>
+
+  <li>Handler classes that are missing from 2.4's <code>urllib2</code>
+      (e.g. <code>HTTPRefreshProcessor</code>, <code>HTTPEquivProcessor</code>,
+      <code>HTTPRobotRulesProcessor</code>) may be used with the
+      <code>urllib2</code> of Python 2.4 or newer.  There are not currently any
+      functional tests for this in mechanize, however, so this feature may be
+      broken.
+
+  <li>If you want to use <code>mechanize.RefreshProcessor</code> with Python >=
+      2.4's <code>urllib2</code>, you must also use
+      <code>mechanize.HTTPRedirectHandler</code>.
+
+  <li><code>mechanize.HTTPRefererProcessor</code> requires special support from
+      <code>mechanize.Browser</code>, so cannot be used with vanilla
+      <code>urllib2</code>.
+
+  <li><code>mechanize.HTTPRequestUpgradeProcessor</code> and
+      <code>mechanize.ResponseUpgradeProcessor</code> are not useful outside of
+      mechanize.
+
+  <li>Request and response objects from code based on <code>urllib2</code> work
+      with mechanize, and vice-versa.
+
+  <li>The classes and functions exported by mechanize in its public interface
+      that come straight from <code>urllib2</code>
+      (e.g. <code>FTPHandler</code>, at the time of writing) do work with
+      mechanize (duh ;-).  Exactly which of these classes and functions come
+      straight from <code>urllib2</code> without extension or modification will
+      change over time, though, so don't rely on it; instead, just import
+      everything you need from mechanize, never from <code>urllib2</code>.  The
+      exception is usage as described in the first item in this list, which is
+      explicitly OK (though not well tested ATM), subject to the other
+      restrictions in the list above.
+
+</ol>
+
+
+<a name="docs"></a>
+<h2>Documentation</h2>
+
+<p>Full documentation is in the docstrings.
+
+<p>The documentation in the web pages is in need of reorganisation at the
+moment, after the merge of ClientCookie into mechanize.
+
+
+<a name="credits"></a>
+<h2>Credits</h2>
+
+<p>Thanks to all the too-numerous-to-list people who reported bugs and provided
+patches.  Also thanks to Ian Bicking, for persuading me that a
+<code>UserAgent</code> class would be useful, and to Ronald Tschalar for advice
+on Netscape cookies.
+
+<p>A lot of credit must go to Gisle Aas, who wrote libwww-perl, from which
+large parts of mechanize originally derived, and Andy Lester for the original,
+<a href="http://search.cpan.org/dist/WWW-Mechanize/"><code>WWW::Mechanize</code>
+</a>.  Finally, thanks to the (coincidentally-named) Johnny Lee for the MSIE
+CookieJar Perl code from which mechanize's support for that is derived.
+
+
+<a name="todo"></a>
+<h2>To do</h2>
+
+<p>Contributions welcome!
+
+<p>The documentation to-do list has moved to the new "docs-in-progress"
+directory in SVN.
+
+<p><em>This is <strong>very</strong> roughly in order of priority</em>
+
+<ul>
+  <li>Test <code>.any_response()</code> two handlers case: ordering.
+  <li>Test referer bugs (frags and don't add in redirect unless orig
+    req had Referer)
+  <li>Remove use of urlparse from _auth.py.
+  <li>Proper XHTML support!
+  <li>Fix BeautifulSoup support to use a single BeautifulSoup instance
+    per page.
+  <li>Test BeautifulSoup support better / fix encoding issue.
+  <li>Support BeautifulSoup 3.
+  <li>Add another History implementation or two and finalise interface.
+  <li>History cache expiration.
+  <li>Investigate possible leak further (see Balazs Ree's list posting).
+  <li>Make <code>EncodingFinder</code> public, I guess (but probably
+    improve it first).  (For example: support Mark Pilgrim's universal
+    encoding detector?)
+  <li>Add two-way links between BeautifulSoup &amp; ClientForm object
+    models.
+  <li>In 0.2: switch to Python unicode strings everywhere appropriate
+    (HTTP level should still use byte strings, of course).
+  <li><code>clean_url()</code>: test browser behaviour.  I <em>think</em>
+    this is correct...
+  <li>Use a nicer RFC 3986 join / split / unsplit implementation.
+  <li>Figure out the Right Thing (if such a thing exists) for %-encoding.
+  <li>How do IRIs fit into the world?
+  <li>IDNA -- must read about security stuff first.
+  <li>Unicode support in general.
+  <li>Provide per-connection access to timeouts.
+  <li>Keep-alive / connection caching.
+  <li>Pipelining??
+  <li>Content negotiation.
+  <li>gzip transfer encoding (there's already a handler for this in
+    mechanize, but it's poorly implemented ATM).
+  <li>proxy.pac parsing (I don't think this needs JS interpretation)
+  <li>Topological sort for handlers, instead of .handler_order
+    attribute.  Ordering and other dependencies (where unavoidable)
+    should be defined separate from handlers themselves.  Add new
+    build_opener and deprecate the old one?  Actually, _useragent is
+    probably not far off what I'd have in mind (would just need a
+    method or two and a base class adding I think), and it's not a high
+    priority since I guess most people will just use the UserAgent and
+    Browser classes.
+
+ </ul>
+
+
+<a name="download"></a>
+<h2>Getting mechanize</h2>
+
+<p>You can install the <a href="./#source">old-fashioned way</a>, or using <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a>.  I
+recommend the latter even though EasyInstall is still in alpha, because it will
+automatically ensure you have the necessary dependencies, downloading if
+necessary.
+
+<p><a href="./#svn">Subversion (SVN) access</a> is also available.
+
+<p>Since EasyInstall is new, I include some instructions below, but mechanize
+follows standard EasyInstall / <code>setuptools</code> conventions, so you
+should refer to the <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a> and
+<a href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a>
+documentation if you need more detailed or up-to-date instructions.
+
+<h2>EasyInstall / setuptools</h2>
+
+<p>The benefit of EasyInstall and the new <code>setuptools</code>-supporting
+<code>setup.py</code> is that they grab all dependencies for you.  Also, using
+EasyInstall is a one-liner for the common case, to be compared with the usual
+download-unpack-install cycle with <code>setup.py</code>.
+
+<h3>Using EasyInstall to download and install mechanize</h3>
+
+<ol>
+  <li><a href="http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install">
+Install easy_install</a>
+  <li><code>easy_install mechanize</code>
+</ol>
+
+<p>If you're on a Unix-like OS, you may need root permissions for that last
+step (or see the <a
+href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall
+documentation</a> for other installation options).
+
+<p>If you already have mechanize installed as a <a
+href="http://peak.telecommunity.com/DevCenter/PythonEggs">Python Egg</a> (as
+you do if you installed using EasyInstall, or using <code>setup.py
+install</code> from mechanize 0.0.10a or newer), you can upgrade to the latest
+version using:
+
+<pre>easy_install --upgrade mechanize</pre>
+
+<p>You may want to read up on the <code>-m</code> option to
+<code>easy_install</code>, which lets you install multiple versions of a
+package.
+
+<a name="svnhead"></a>
+<h3>Using EasyInstall to download and install the latest in-development (SVN HEAD) version of mechanize</h3>
+
+<pre>easy_install "mechanize==dev"</pre>
+
+<p>Note that that will not necessarily grab the SVN versions of dependencies,
+such as ClientForm: It will use SVN to fetch dependencies if and only if the
+SVN HEAD version of mechanize declares itself to depend on the SVN versions of
+those dependencies; even then, those declared dependencies won't necessarily be
+on SVN HEAD, but rather a particular revision.  If you want SVN HEAD for a
+dependency project, you should ask for it explicitly by running
+<code>easy_install "projectname=dev"</code> for that project.
+
+<p>Note also that you can still carry on using a plain old SVN checkout as
+usual if you like.
+
+<h3>Using setup.py from a .tar.gz, .zip or an SVN checkout to download and install mechanize</h3>
+
+<p><code>setup.py</code> should correctly resolve and download dependencies:
+
+<pre>python setup.py install</pre>
+
+<p>Or, to get access to the same options that <code>easy_install</code>
+accepts, use the <code>easy_install</code> distutils command instead of
+<code>install</code> (see <code>python setup.py --help easy_install</code>):
+
+<pre>python setup.py easy_install mechanize</pre>
+
+
+<a name="source"></a>
+<h2>Download</h2>
+<p>All documentation (including this web page) is included in the distribution.
+
+<p>This is a stable release.
+
+<p><em>Development release.</em>
+
+<ul>
+@{version = "0.1.10"}
+<li><a href="./src/mechanize-@(version).tar.gz">mechanize-@(version).tar.gz</a>
+<li><a href="./src/mechanize-@(version).zip">mechanize-@(version).zip</a>
+<li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
+<li><a href="./src/">Older versions.</a>
+</ul>
+
+<p>For old-style installation instructions, see the INSTALL file included in
+the distribution.  Better, <a href="./#download">use EasyInstall</a>.
+
+
+<a name="svn"></a>
+<h2>Subversion</h2>
+
+<p>The <a href="http://subversion.tigris.org/">Subversion (SVN)</a> trunk is <a href="http://codespeak.net/svn/wwwsearch/mechanize/trunk#egg=mechanize-dev">http://codespeak.net/svn/wwwsearch/mechanize/trunk</a>, so to check out the source:
+
+<pre>
+svn co http://codespeak.net/svn/wwwsearch/mechanize/trunk mechanize
+</pre>
+
+<a name="tests"></a>
+<h2>Tests and examples</h2>
+
+<h3>Examples</h3>
+
+<p>The <code>examples</code> directory in the <a href="./#source">source
+packages</a> contains a couple of silly, but working, scripts to demonstrate
+basic use of the module.  Note that it's in the nature of web scraping for such
+scripts to break, so don't be too surprised if that happens &#8211; do let me
+know, though!
+
+<p>It's worth knowing also that the examples on the <a
+href="../ClientForm/">ClientForm web page</a> are useful for mechanize users,
+and are now real run-able scripts rather than just documentation.
+
+<h3>Functional tests</h3>
+
+<p>To run the functional tests (which <strong>do</strong> access the network),
+run the following command:
+
+<pre>python functional_tests.py</pre>
+
+<h3>Unit tests</h3>
+
+<p>Note that ClientForm (a dependency of mechanize) has its own unit tests,
+which must be run separately.
+
+<p>To run the unit tests (none of which access the network), run the following
+command:
+
+<pre>python test.py</pre>
+
+<p>This runs the tests against the source files extracted from the
+package.  For help on command line options:
+
+<pre>python test.py --help</pre>
+
+
+<h2>See also</h2>
+
+<p>There are several wrappers around mechanize designed for functional testing
+of web applications:
+
+<ul>
+
+  <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
+    <code>zope.testbrowser</code></a> (or
+    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
+    <code>ZopeTestBrowser</code></a>, the standalone version).
+  <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
+</ul>
+
+<p>Richard Jones' <a href="http://mechanicalcat.net/tech/webunit/">webunit</a>
+(this is not the same as Steven Purcell's <a
+href="http://webunit.sourceforge.net/">code of the same name</a>).  webunit and
+mechanize are quite similar.  On the minus side, webunit is missing things like
+browser history, high-level forms and links handling, thorough cookie handling,
+refresh redirection, adding of the Referer header, observance of robots.txt and
+easy extensibility.  On the plus side, webunit has a bunch of utility functions
+bound up in its WebFetcher class, which look useful for writing tests (though
+they'd be easy to duplicate using mechanize).  In general, webunit has more of
+a frameworky emphasis, with aims limited to writing tests, where mechanize and
+the modules it depends on try hard to be general-purpose libraries.
+
+<p>There are many related links in the <a
+href="../bits/GeneralFAQ.html">General FAQ</a> page, too.
+
+
+<a name="faq"></a>
+<h2>FAQs - pre install</h2>
+<ul>
+  <li>Which version of Python do I need?
+  <p>Python 2.4, 2.5 or 2.6.
+  <li>What else do I need?
+  <p>mechanize depends on <a href="../ClientForm/">ClientForm</a>.
+  <li>Does mechanize depend on BeautifulSoup?
+  <p>No.  mechanize offers a few (still rather experimental) classes that make
+     use of BeautifulSoup, but these classes are not required to use mechanize.
+     mechanize bundles BeautifulSoup version 2, so that module is no longer
+     required.  A future version of mechanize will support BeautifulSoup
+     version 3, at which point mechanize will likely no longer bundle the
+     module.
+  <p>The versions of those required modules are listed in the
+     <code>setup.py</code> for mechanize (included with the download).  The
+     dependencies are automatically fetched by <a
+     href="http://peak.telecommunity.com/DevCenter/EasyInstall">EasyInstall</a>
+     (or by <a href="./#source">downloading</a> a mechanize source package and
+     running <code>python setup.py install</code>).  If you like you can fetch
+     and install them manually, instead &#8211; see the <code>INSTALL.txt</code>
+     file (included with the distribution).
+  <li>Which license?
+  <p>mechanize is dual-licensed: you may pick either the
+     <a href="http://www.opensource.org/licenses/bsd-license.php">BSD license</a>,
+     or the <a href="http://www.zope.org/Resources/ZPL">ZPL 2.1</a> (both are
+     included in the distribution).
+</ul>
+
+<a name="usagefaq"></a>
+<h2>FAQs - usage</h2>
+<ul>
+  <li>I'm not getting the HTML page I expected to see.
+    <ul>
+      <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
+      <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
+     </ul>
+  <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
+     <code>mechanize.Browser</code> think otherwise?
+@{colorize("""
+b = mechanize.Browser(
+    # mechanize's XHTML support needs work, so is currently switched off.  If
+    # we want to get our work done, we have to turn it on by supplying a
+    # mechanize.Factory (with XHTML support turned on):
+    factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
+    )
+""")}
+</ul>
+
+<p>I prefer questions and comments to be sent to the <a
+href="http://lists.sourceforge.net/lists/listinfo/wwwsearch-general">
+mailing list</a> rather than direct to me.
+
+<p><a href="mailto:jjl@@pobox.com">John J. Lee</a>,
+@(time.strftime("%B %Y", last_modified)).
+
+<hr>
+
+</div>
+
+<div id="Menu">
+
+@(release.navbar('mechanize'))
+
+<br>
+
+<a href="./#examples">Examples</a><br>
+<a href="./#compatnotes">Compatibility</a><br>
+<a href="./#docs">Documentation</a><br>
+<a href="./#todo">To-do</a><br>
+<a href="./#download">Download</a><br>
+<a href="./#svn">Subversion</a><br>
+<a href="./#tests">More examples</a><br>
+<a href="./#faq">FAQs</a><br>
+
+</div>
+
+
+</body>
+</html>

Added: mechanize/tags/0.1.10/README.txt
===================================================================
--- mechanize/tags/0.1.10/README.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/README.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,515 @@
+   [1]SourceForge.net Logo
+
+                                   mechanize
+
+   Stateful programmatic web browsing in Python, after Andy Lester's Perl
+   module [2]WWW::Mechanize .
+     * mechanize.Browser is a subclass of mechanize.UserAgentBase, which
+       is, in turn, a subclass of urllib2.OpenerDirector (in fact, of
+       mechanize.OpenerDirector), so:
+          + any URL can be opened, not just http:
+          + mechanize.UserAgentBase offers easy dynamic configuration of
+            user-agent features like protocol, cookie, redirection and
+            robots.txt handling, without having to make a new
+            OpenerDirector each time, e.g. by calling build_opener().
+     * Easy HTML form filling, using [3]ClientForm interface.
+     * Convenient link parsing and following.
+     * Browser history (.back() and .reload() methods).
+     * The Referer HTTP header is added properly (optional).
+     * Automatic observance of [4]robots.txt.
+     * Automatic handling of HTTP-Equiv and Refresh.
+
+Examples
+
+   This documentation is in need of reorganisation and extension!
+
+   The two below are just to give the gist. There are also some [5]actual
+   working examples.
+import re
+from mechanize import Browser
+
+br = Browser()
+br.open("http://www.example.com/")
+# follow second link with element text matching regular expression
+response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)
+assert br.viewing_html()
+print br.title()
+print response1.geturl()
+print response1.info()  # headers
+print response1.read()  # body
+response1.close()  # (shown for clarity; in fact Browser does this for you)
+
+br.select_form(name="order")
+# Browser passes through unknown attributes (including methods)
+# to the selected HTMLForm (from ClientForm).
+br["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__)
+response2 = br.submit()  # submit current form
+
+# print currently selected form (don't call .submit() on this, use br.submit())
+print br.form
+
+response3 = br.back()  # back to cheese shop (same data as response1)
+# the history mechanism returns cached response objects
+# we can still use the response, even though we closed it:
+response3.seek(0)
+response3.read()
+response4 = br.reload()  # fetches from server
+
+for form in br.forms():
+    print form
+# .links() optionally accepts the keyword args of .follow_/.find_link()
+for link in br.links(url_regex="python.org"):
+    print link
+    br.follow_link(link)  # takes EITHER Link instance OR keyword args
+    br.back()
+
+   You may control the browser's policy by using the methods of
+   mechanize.Browser's base class, mechanize.UserAgentBase. For example:
+br = Browser()
+# Explicitly configure proxies (Browser will attempt to set good defaults).
+# Note the userinfo ("joe:password@") and port number (":3128") are optional.
+br.set_proxies({"http": "joe:password at myproxy.example.com:3128",
+                "ftp": "proxy.example.com",
+                })
+# Add HTTP Basic/Digest auth username and password for HTTP proxy access.
+# (equivalent to using "joe:password at ..." form above)
+br.add_proxy_password("joe", "password")
+# Add HTTP Basic/Digest auth username and password for website access.
+br.add_password("http://example.com/protected/", "joe", "password")
+# Don't handle HTTP-EQUIV headers (HTTP headers embedded in HTML).
+br.set_handle_equiv(False)
+# Ignore robots.txt.  Do not do this without thought and consideration.
+br.set_handle_robots(False)
+# Don't add Referer (sic) header
+br.set_handle_referer(False)
+# Don't handle Refresh redirections
+br.set_handle_refresh(False)
+# Don't handle cookies
+br.set_cookiejar()
+# Supply your own mechanize.CookieJar (NOTE: cookie handling is ON by
+# default: no need to do this unless you have some reason to use a
+# particular cookiejar)
+br.set_cookiejar(cj)
+# Log information about HTTP redirects and Refreshes.
+br.set_debug_redirects(True)
+# Log HTTP response bodies (ie. the HTML, most of the time).
+br.set_debug_responses(True)
+# Print HTTP headers.
+br.set_debug_http(True)
+
+# To make sure you're seeing all debug output:
+logger = logging.getLogger("mechanize")
+logger.addHandler(logging.StreamHandler(sys.stdout))
+logger.setLevel(logging.INFO)
+
+# Sometimes it's useful to process bad headers or bad HTML:
+response = br.response()  # this is a copy of response
+headers = response.info()  # currently, this is a mimetools.Message
+headers["Content-type"] = "text/html; charset=utf-8"
+response.set_data(response.get_data().replace("<!---", "<!--"))
+br.set_response(response)
+
+   mechanize exports the complete interface of urllib2:
+import mechanize
+response = mechanize.urlopen("http://www.example.com/")
+print response.read()
+
+   so anything you would normally import from urllib2 can (and should, by
+   preference, to insulate you from future changes) be imported from
+   mechanize instead. In many cases if you import an object from mechanize
+   it will be the very same object you would get if you imported from
+   urllib2. In many other cases, though, the implementation comes from
+   mechanize, either because bug fixes have been applied or the
+   functionality of urllib2 has been extended in some way.
+
+UserAgent vs UserAgentBase
+
+   mechanize.UserAgent is a trivial subclass of mechanize.UserAgentBase,
+   adding just one method, .set_seekable_responses() (see the
+   [6]documentation on seekable responses).
+
+   The reason for the extra class is that mechanize.Browser depends on
+   seekable response objects (because response objects are used to
+   implement the browser history).
+
+Compatibility
+
+   These notes explain the relationship between mechanize, ClientCookie,
+   cookielib and urllib2, and which to use when. If you're just using
+   mechanize, and not any of those other libraries, you can ignore this
+   section.
+    1. mechanize works with Python 2.4, Python 2.5, and Python 2.6.
+    2. ClientCookie is no longer maintained as a separate package. The
+       code is now part of mechanize, and its interface is now exported
+       through module mechanize (since mechanize 0.1.0). Old code can
+       simply be changed to import mechanize as ClientCookie and should
+       continue to work.
+    3. The cookie handling parts of mechanize are in Python 2.4 standard
+       library as module cookielib and extensions to module urllib2.
+
+   IMPORTANT: The following are the ONLY cases where mechanize and urllib2
+   code are intended to work together. For all other code, use mechanize
+   exclusively: do NOT mix use of mechanize and urllib2!
+    1. Handler classes that are missing from 2.4's urllib2 (e.g.
+       HTTPRefreshProcessor, HTTPEquivProcessor, HTTPRobotRulesProcessor)
+       may be used with the urllib2 of Python 2.4 or newer. There are not
+       currently any functional tests for this in mechanize, however, so
+       this feature may be broken.
+    2. If you want to use mechanize.RefreshProcessor with Python >= 2.4's
+       urllib2, you must also use mechanize.HTTPRedirectHandler.
+    3. mechanize.HTTPRefererProcessor requires special support from
+       mechanize.Browser, so cannot be used with vanilla urllib2.
+    4. mechanize.HTTPRequestUpgradeProcessor and
+       mechanize.ResponseUpgradeProcessor are not useful outside of
+       mechanize.
+    5. Request and response objects from code based on urllib2 work with
+       mechanize, and vice-versa.
+    6. The classes and functions exported by mechanize in its public
+       interface that come straight from urllib2 (e.g. FTPHandler, at the
+       time of writing) do work with mechanize (duh ;-). Exactly which of
+       these classes and functions come straight from urllib2 without
+       extension or modification will change over time, though, so don't
+       rely on it; instead, just import everything you need from
+       mechanize, never from urllib2. The exception is usage as described
+       in the first item in this list, which is explicitly OK (though not
+       well tested ATM), subject to the other restrictions in the list
+       above.
+
+Documentation
+
+   Full documentation is in the docstrings.
+
+   The documentation in the web pages is in need of reorganisation at the
+   moment, after the merge of ClientCookie into mechanize.
+
+Credits
+
+   Thanks to all the too-numerous-to-list people who reported bugs and
+   provided patches. Also thanks to Ian Bicking, for persuading me that a
+   UserAgent class would be useful, and to Ronald Tschalar for advice on
+   Netscape cookies.
+
+   A lot of credit must go to Gisle Aas, who wrote libwww-perl, from which
+   large parts of mechanize originally derived, and Andy Lester for the
+   original, [7]WWW::Mechanize . Finally, thanks to the
+   (coincidentally-named) Johnny Lee for the MSIE CookieJar Perl code from
+   which mechanize's support for that is derived.
+
+To do
+
+   Contributions welcome!
+
+   The documentation to-do list has moved to the new "docs-in-progress"
+   directory in SVN.
+
+   This is very roughly in order of priority
+     * Test .any_response() two handlers case: ordering.
+     * Test referer bugs (frags and don't add in redirect unless orig req
+       had Referer)
+     * Remove use of urlparse from _auth.py.
+     * Proper XHTML support!
+     * Fix BeautifulSoup support to use a single BeautifulSoup instance
+       per page.
+     * Test BeautifulSoup support better / fix encoding issue.
+     * Support BeautifulSoup 3.
+     * Add another History implementation or two and finalise interface.
+     * History cache expiration.
+     * Investigate possible leak further (see Balazs Ree's list posting).
+     * Make EncodingFinder public, I guess (but probably improve it
+       first). (For example: support Mark Pilgrim's universal encoding
+       detector?)
+     * Add two-way links between BeautifulSoup & ClientForm object models.
+     * In 0.2: switch to Python unicode strings everywhere appropriate
+       (HTTP level should still use byte strings, of course).
+     * clean_url(): test browser behaviour. I think this is correct...
+     * Use a nicer RFC 3986 join / split / unsplit implementation.
+     * Figure out the Right Thing (if such a thing exists) for %-encoding.
+     * How do IRIs fit into the world?
+     * IDNA -- must read about security stuff first.
+     * Unicode support in general.
+     * Provide per-connection access to timeouts.
+     * Keep-alive / connection caching.
+     * Pipelining??
+     * Content negotiation.
+     * gzip transfer encoding (there's already a handler for this in
+       mechanize, but it's poorly implemented ATM).
+     * proxy.pac parsing (I don't think this needs JS interpretation)
+     * Topological sort for handlers, instead of .handler_order attribute.
+       Ordering and other dependencies (where unavoidable) should be
+       defined separate from handlers themselves. Add new build_opener and
+       deprecate the old one? Actually, _useragent is probably not far off
+       what I'd have in mind (would just need a method or two and a base
+       class adding I think), and it's not a high priority since I guess
+       most people will just use the UserAgent and Browser classes.
+
+Getting mechanize
+
+   You can install the [8]old-fashioned way, or using [9]EasyInstall. I
+   recommend the latter even though EasyInstall is still in alpha, because
+   it will automatically ensure you have the necessary dependencies,
+   downloading if necessary.
+
+   [10]Subversion (SVN) access is also available.
+
+   Since EasyInstall is new, I include some instructions below, but
+   mechanize follows standard EasyInstall / setuptools conventions, so you
+   should refer to the [11]EasyInstall and [12]setuptools documentation if
+   you need more detailed or up-to-date instructions.
+
+EasyInstall / setuptools
+
+   The benefit of EasyInstall and the new setuptools-supporting setup.py
+   is that they grab all dependencies for you. Also, using EasyInstall is
+   a one-liner for the common case, to be compared with the usual
+   download-unpack-install cycle with setup.py.
+
+Using EasyInstall to download and install mechanize
+
+    1. [13]Install easy_install
+    2. easy_install mechanize
+
+   If you're on a Unix-like OS, you may need root permissions for that
+   last step (or see the [14]EasyInstall documentation for other
+   installation options).
+
+   If you already have mechanize installed as a [15]Python Egg (as you do
+   if you installed using EasyInstall, or using setup.py install from
+   mechanize 0.0.10a or newer), you can upgrade to the latest version
+   using:
+easy_install --upgrade mechanize
+
+   You may want to read up on the -m option to easy_install, which lets
+   you install multiple versions of a package.
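+
+   For example (a sketch - substitute whichever version you want), the
+   following uses the multi-version option to install a pinned version
+   without making it the active default:
+easy_install -m "mechanize==0.1.10"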
+
+Using EasyInstall to download and install the latest in-development (SVN
+HEAD) version of mechanize
+
+easy_install "mechanize==dev"
+
+   Note that that will not necessarily grab the SVN versions of
+   dependencies, such as ClientForm: It will use SVN to fetch dependencies
+   if and only if the SVN HEAD version of mechanize declares itself to
+   depend on the SVN versions of those dependencies; even then, those
+   declared dependencies won't necessarily be on SVN HEAD, but rather a
+   particular revision. If you want SVN HEAD for a dependency project, you
+   should ask for it explicitly by running easy_install "projectname==dev"
+   for that project.
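+
+   For example, to explicitly ask for the in-development version of the
+   ClientForm dependency (assuming ClientForm publishes a "dev" version in
+   the same way):
+easy_install "ClientForm==dev"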
+
+   Note also that you can still carry on using a plain old SVN checkout as
+   usual if you like.
+
+Using setup.py from a .tar.gz, .zip or an SVN checkout to download and
+install mechanize
+
+   setup.py should correctly resolve and download dependencies:
+python setup.py install
+
+   Or, to get access to the same options that easy_install accepts, use
+   the easy_install distutils command instead of install (see python
+   setup.py --help easy_install)
+python setup.py easy_install mechanize
+
+Download
+
+   All documentation (including this web page) is included in the
+   distribution.
+
+   This is a stable release.
+
+   Development release.
+     * [16]mechanize-0.1.10.tar.gz
+     * [17]mechanize-0.1.10.zip
+     * [18]Change Log (included in distribution)
+     * [19]Older versions.
+
+   For old-style installation instructions, see the INSTALL file included
+   in the distribution. Better, [20]use EasyInstall.
+
+Subversion
+
+   The [21]Subversion (SVN) trunk is
+   [22]http://codespeak.net/svn/wwwsearch/mechanize/trunk, so to check out
+   the source:
+svn co http://codespeak.net/svn/wwwsearch/mechanize/trunk mechanize
+
+Tests and examples
+
+Examples
+
+   The examples directory in the [23]source packages contains a couple of
+   silly, but working, scripts to demonstrate basic use of the module.
+   Note that it's in the nature of web scraping for such scripts to break,
+   so don't be too surprised if that happens - do let me know, though!
+
+   It's worth knowing also that the examples on the [24]ClientForm web
+   page are useful for mechanize users, and are now real runnable scripts
+   rather than just documentation.
+
+Functional tests
+
+   To run the functional tests (which do access the network), run the
+   following command:
+python functional_tests.py
+
+Unit tests
+
+   Note that ClientForm (a dependency of mechanize) has its own unit
+   tests, which must be run separately.
+
+   To run the unit tests (none of which access the network), run the
+   following command:
+python test.py
+
+   This runs the tests against the source files extracted from the
+   package. For help on command line options:
+python test.py --help
+
+See also
+
+   There are several wrappers around mechanize designed for functional
+   testing of web applications:
+     * [25]zope.testbrowser (or [26]ZopeTestBrowser, the standalone
+       version).
+     * [27]twill.
+
+   There is also Richard Jones' [28]webunit (this is not the same as
+   Steven Purcell's [29]code of the same name). webunit and mechanize are
+   quite similar. On
+   the minus side, webunit is missing things like browser history,
+   high-level forms and links handling, thorough cookie handling, refresh
+   redirection, adding of the Referer header, observance of robots.txt and
+   easy extensibility. On the plus side, webunit has a bunch of utility
+   functions bound up in its WebFetcher class, which look useful for
+   writing tests (though they'd be easy to duplicate using mechanize). In
+   general, webunit has more of a frameworky emphasis, with aims limited
+   to writing tests, where mechanize and the modules it depends on try
+   hard to be general-purpose libraries.
+
+   There are many related links in the [30]General FAQ page, too.
+
+FAQs - pre install
+
+     * Which version of Python do I need?
+       Python 2.4, 2.5 or 2.6.
+     * What else do I need?
+       mechanize depends on [31]ClientForm.
+     * Does mechanize depend on BeautifulSoup? No. mechanize offers a few
+       (still rather experimental) classes that make use of BeautifulSoup,
+       but these classes are not required to use mechanize. mechanize
+       bundles BeautifulSoup version 2, so that module is no longer
+       required. A future version of mechanize will support BeautifulSoup
+       version 3, at which point mechanize will likely no longer bundle
+       the module.
+       The versions of those required modules are listed in the setup.py
+       for mechanize (included with the download). The dependencies are
+       automatically fetched by [32]EasyInstall (or by [33]downloading a
+       mechanize source package and running python setup.py install). If
+       you like you can fetch and install them manually, instead - see the
+       INSTALL.txt file (included with the distribution).
+     * Which license?
+       mechanize is dual-licensed: you may pick either the [34]BSD
+       license, or the [35]ZPL 2.1 (both are included in the
+       distribution).
+
+FAQs - usage
+
+     * I'm not getting the HTML page I expected to see.
+          + [36]Debugging tips
+          + [37]More tips
+     * I'm sure this page is HTML, why does mechanize.Browser think
+       otherwise?
+b = mechanize.Browser(
+    # mechanize's XHTML support needs work, so is currently switched off.  If
+    # we want to get our work done, we have to turn it on by supplying a
+    # mechanize.Factory (with XHTML support turned on):
+    factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
+    )
+
+   I prefer questions and comments to be sent to the [38]mailing list
+   rather than direct to me.
+
+   [39]John J. Lee, December 2008.
+     __________________________________________________________________
+
+   [40]Home
+   [41]General FAQs
+   mechanize
+   [42]mechanize docs
+   [43]ClientForm
+   [44]ClientCookie
+   [45]ClientCookie docs
+   [46]pullparser
+   [47]DOMForm
+   [48]python-spidermonkey
+   [49]ClientTable
+   [50]1.5.2 urllib2.py
+   [51]1.5.2 urllib.py
+   [52]Examples
+   [53]Compatibility
+   [54]Documentation
+   [55]To-do
+   [56]Download
+   [57]Subversion
+   [58]More examples
+   [59]FAQs
+
+References
+
+   1. http://sourceforge.net/
+   2. http://search.cpan.org/dist/WWW-Mechanize/
+   3. file://localhost/tmp/ClientForm/
+   4. http://www.robotstxt.org/wc/norobots.html
+   5. file://localhost/tmp/tmpg4nYYM/#tests
+   6. file://localhost/tmp/tmpg4nYYM/doc.html#seekable
+   7. http://search.cpan.org/dist/WWW-Mechanize/
+   8. file://localhost/tmp/tmpg4nYYM/#source
+   9. http://peak.telecommunity.com/DevCenter/EasyInstall
+  10. file://localhost/tmp/tmpg4nYYM/#svn
+  11. http://peak.telecommunity.com/DevCenter/EasyInstall
+  12. http://peak.telecommunity.com/DevCenter/setuptools
+  13. http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install
+  14. http://peak.telecommunity.com/DevCenter/EasyInstall
+  15. http://peak.telecommunity.com/DevCenter/PythonEggs
+  16. file://localhost/tmp/tmpg4nYYM/src/mechanize-0.1.10.tar.gz
+  17. file://localhost/tmp/tmpg4nYYM/src/mechanize-0.1.10.zip
+  18. file://localhost/tmp/tmpg4nYYM/src/ChangeLog.txt
+  19. file://localhost/tmp/tmpg4nYYM/src/
+  20. file://localhost/tmp/tmpg4nYYM/#download
+  21. http://subversion.tigris.org/
+  22. http://codespeak.net/svn/wwwsearch/mechanize/trunk#egg=mechanize-dev
+  23. file://localhost/tmp/tmpg4nYYM/#source
+  24. file://localhost/tmp/ClientForm/
+  25. http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser
+  26. http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser
+  27. http://www.idyll.org/~t/www-tools/twill.html
+  28. http://mechanicalcat.net/tech/webunit/
+  29. http://webunit.sourceforge.net/
+  30. file://localhost/tmp/bits/GeneralFAQ.html
+  31. file://localhost/tmp/ClientForm/
+  32. http://peak.telecommunity.com/DevCenter/EasyInstall
+  33. file://localhost/tmp/tmpg4nYYM/#source
+  34. http://www.opensource.org/licenses/bsd-license.php
+  35. http://www.zope.org/Resources/ZPL
+  36. http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging
+  37. http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html
+  38. http://lists.sourceforge.net/lists/listinfo/wwwsearch-general
+  39. mailto:jjl at pobox.com
+  40. file://localhost/tmp
+  41. file://localhost/tmp/bits/GeneralFAQ.html
+  42. file://localhost/tmp/mechanize/doc.html
+  43. file://localhost/tmp/ClientForm/
+  44. file://localhost/tmp/ClientCookie/
+  45. file://localhost/tmp/ClientCookie/doc.html
+  46. file://localhost/tmp/pullparser/
+  47. file://localhost/tmp/DOMForm/
+  48. file://localhost/tmp/python-spidermonkey/
+  49. file://localhost/tmp/ClientTable/
+  50. file://localhost/tmp/bits/urllib2_152.py
+  51. file://localhost/tmp/bits/urllib_152.py
+  52. file://localhost/tmp/tmpg4nYYM/#examples
+  53. file://localhost/tmp/tmpg4nYYM/#compatnotes
+  54. file://localhost/tmp/tmpg4nYYM/#docs
+  55. file://localhost/tmp/tmpg4nYYM/#todo
+  56. file://localhost/tmp/tmpg4nYYM/#download
+  57. file://localhost/tmp/tmpg4nYYM/#svn
+  58. file://localhost/tmp/tmpg4nYYM/#tests
+  59. file://localhost/tmp/tmpg4nYYM/#faq

Added: mechanize/tags/0.1.10/attic/BSDDBCookieJar.py
===================================================================
--- mechanize/tags/0.1.10/attic/BSDDBCookieJar.py	                        (rev 0)
+++ mechanize/tags/0.1.10/attic/BSDDBCookieJar.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,179 @@
+"""Persistent CookieJar based on bsddb standard library module.
+
+Copyright 2003-2006 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+**********************************************************************
+THIS WAS NEVER FULLY TESTED, AND IS NOT MAINTAINED!
+**********************************************************************
+
+"""
+
+from mechanize import CookieJar
+
+# this is importing from a private module: don't do this in your own code!
+from _clientcookie import MappingIterator
+
+import bsddb
+import cPickle as pickle
+
+import logging
+debug = logging.getLogger("mechanize").debug
+
+
+def CreateBSDDBCookieJar(filename, policy=None):
+    """Return a BSDDBCookieJar given a BSDDB filename.
+
+    Use this rather than directly using the BSDDBCookieJar constructor,
+    unless you know what you're doing.
+
+    filename: filename for sleepycat BSDDB database; if the file doesn't exist,
+     it will be created; otherwise, it will be opened
+
+    **********************************************************************
+    BSDDBCookieJar IS NOT FULLY TESTED!
+    **********************************************************************
+
+    """
+    db = bsddb.db.DB()
+    db.open(filename, bsddb.db.DB_HASH, bsddb.db.DB_CREATE, 0666)
+    return BSDDBCookieJar(policy, db)
+
+class BSDDBIterator:
+    # XXXX should this use thread lock?
+    def __init__(self, cursor):
+        iterator = None
+        self._c = cursor
+        self._i = iterator
+    def __iter__(self): return self
+    def close(self):
+        if self._c is not None:
+            self._c.close()
+        self._c = self._i = self.next = self.__iter__ = None
+    def next(self):
+        while 1:
+            if self._i is None:
+                item = self._c.next()
+                if item is None:
+                    self.close()
+                    raise StopIteration()
+                domain, data = item
+                self._i = MappingIterator(pickle.loads(data))
+            try:
+                return self._i.next()
+            except StopIteration:
+                self._i = None
+                continue
+    def __del__(self):
+        # XXXX will this work?
+        self.close()
+
+class BSDDBCookieJar(CookieJar):
+    """CookieJar based on a BSDDB database, using the standard bsddb module.
+
+    You should use CreateBSDDBCookieJar instead of the constructor, unless you
+    know what you're doing.
+
+    Note that session cookies ARE stored in the database (marked as session
+    cookies), and will be written to disk if the database is file-based.  In
+    order to clear session cookies at the end of a session, you must call
+    .clear_session_cookies().
+
+    Call the .close() method after you've finished using an instance of this
+    class.
+
+    **********************************************************************
+    THIS IS NOT FULLY TESTED!
+    **********************************************************************
+
+    """
+    # XXX
+    # use transactions to make multiple reader processes possible
+    def __init__(self, policy=None, db=None):
+        CookieJar.__init__(self, policy)
+        del self._cookies
+        if db is None:
+            db = bsddb.db.DB()
+        self._db = db
+    def close(self):
+        self._db.close()
+    def __del__(self):
+        # XXXX will this work?
+        self.close()
+    def clear(self, domain=None, path=None, name=None):
+        if name is not None:
+            if (domain is None) or (path is None):
+                raise ValueError(
+                    "domain and path must be given to remove a cookie by name")
+        elif path is not None:
+            if domain is None:
+                raise ValueError(
+                    "domain must be given to remove cookies by path")
+
+        db = self._db
+        self._cookies_lock.acquire()
+        try:
+            if domain is not None:
+                data = db.get(domain)
+                if data is not None:
+                    if path is name is None:
+                        db.delete(domain)
+                    else:
+                        c2 = pickle.loads(data)
+                        if name is None:
+                            del c2[path]
+                        else:
+                            del c2[path][name]
+                        # write the modified mapping back to the database
+                        db.put(domain, pickle.dumps(c2))
+                else:
+                    raise KeyError("no domain '%s'" % domain)
+        finally:
+            self._cookies_lock.release()
+    def set_cookie(self, cookie):
+        db = self._db
+        self._cookies_lock.acquire()
+        try:
+            # store 2-level dict under domain, like {path: {name: value}}
+            data = db.get(cookie.domain)
+            if data is None:
+                c2 = {}
+            else:
+                c2 = pickle.loads(data)
+            if not c2.has_key(cookie.path): c2[cookie.path] = {}
+            c3 = c2[cookie.path]
+            c3[cookie.name] = cookie
+            db.put(cookie.domain, pickle.dumps(c2))
+        finally:
+            self._cookies_lock.release()
+    def __iter__(self):
+        return BSDDBIterator(self._db.cursor())
+    def _cookies_for_request(self, request):
+        """Return a list of cookies to be returned to server."""
+        cookies = []
+        for domain in self._db.keys():
+            cookies.extend(self._cookies_for_domain(domain, request))
+        return cookies
+    def _cookies_for_domain(self, domain, request):
+        debug("Checking %s for cookies to return", domain)
+        if not self._policy.domain_return_ok(domain, request):
+            return []
+
+        data = self._db.get(domain)
+        if data is None:
+            return []
+        cookies_by_path = pickle.loads(data)
+
+        cookies = []
+        for path in cookies_by_path.keys():
+            if not self._policy.path_return_ok(path, request):
+                continue
+            for name, cookie in cookies_by_path[path].items():
+                if not self._policy.return_ok(cookie, request):
+                    debug("   not returning cookie")
+                    continue
+                debug("   it's a match")
+                cookies.append(cookie)
+
+        return cookies

Added: mechanize/tags/0.1.10/attic/MSIEDBCookieJar.py
===================================================================
--- mechanize/tags/0.1.10/attic/MSIEDBCookieJar.py	                        (rev 0)
+++ mechanize/tags/0.1.10/attic/MSIEDBCookieJar.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,140 @@
+"""Persistent CookieJar based on MS Internet Explorer cookie database.
+
+Copyright 2003-2006 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+**********************************************************************
+THIS DOESN'T WORK!
+
+It's just a sketch, to check the base class is OK.
+
+**********************************************************************
+
+"""
+
+from ClientCookie import MSIEBase, CookieJar
+from _util import time2netscape
+
+import logging
+# debug logger used below (as in BSDDBCookieJar.py)
+debug = logging.getLogger("mechanize").debug
+
+def set_cookie_hdr_from_cookie(cookie):
+    params = []
+    if cookie.value is not None:
+        params.append("%s=%s" % (cookie.name, cookie.value))
+    else:
+        params.append(cookie.name)
+    if cookie.expires:
+        params.append("expires=" % time2netscape(cookie.expires))
+    if cookie.domain_specified:
+        params.append("Domain=%s" % cookie.domain)
+    if cookie.path_specified:
+        params.append("path=%s" % cookie.path)
+    if cookie.port_specified:
+        if cookie.port is None:
+            params.append("Port")
+        else:
+            params.append("Port=%s" % cookie.port)
+    if cookie.secure:
+        params.append("secure")
+##     if cookie.comment:
+##         params.append("Comment=%s" % cookie.comment)
+##     if cookie.comment_url:
+##         params.append("CommentURL=%s" % cookie.comment_url)
+    return "; ".join(params)
+
+class MSIEDBCookieJar(MSIEBase, CookieJar):
+    """A CookieJar that relies on MS Internet Explorer's cookie database.
+
+    XXX Require ctypes or write C extension?  win32all probably requires
+    latter.
+
+    **********************************************************************
+    THIS DOESN'T WORK!
+
+    It's just a sketch, to check the base class is OK.
+
+    **********************************************************************
+
+    MSIEDBCookieJar, unlike MSIECookieJar, keeps no state for itself, but
+    relies on the MS Internet Explorer's cookie database.  It uses the win32
+    API functions InternetGetCookie() and InternetSetCookie(), from the wininet
+    library.
+
+    Note that MSIE itself may impose additional conditions on cookie processing
+    on top of that done by CookiePolicy.  For cookie setting, the class tries
+    to foil that by providing the request details and Set-Cookie header it
+    thinks MSIE wants to see.  For returning cookies to the server, it's up to
+    MSIE.
+
+    Note that session cookies ARE NOT written to disk and won't be accessible
+    from other processes.  .clear_session_cookies() has no effect.
+
+    .clear_expired_cookies() has no effect: MSIE is responsible for this.
+
+    .clear() will raise NotImplementedError unless all three arguments are
+    given.
+
+    """
+    def __init__(self, policy=None):
+        MSIEBase.__init__(self)
+        CookieJar.__init__(self, policy)
+    def clear_session_cookies(self): pass
+    def clear_expired_cookies(self): pass
+    def clear(self, domain=None, path=None, name=None):
+        if None in [domain, path, name]:
+            raise NotImplementedError()
+        # XXXX
+        url = self._fake_url(domain, path)
+        hdr = "%s=; domain=%s; path=%s; max-age=0" % (name, domain, path)
+        r = windll.InternetSetCookie(url, None, hdr)
+        # XXX return value of InternetSetCookie?
+    def _fake_url(self, domain, path):
+        # to convince MSIE that Set-Cookie is OK
+        return "http://%s%s" % (domain, path)
+    def set_cookie(self, cookie):
+        # XXXX
+        url = self._fake_url(cookie.domain, cookie.path)
+        r = windll.InternetSetCookie(
+            url, None, set_cookie_hdr_from_cookie(cookie))
+        # XXX return value of InternetSetCookie?
+    def add_cookie_header(self, request, unverifiable=False):
+        # XXXX
+        cookie_header = windll.InternetGetCookie(request.get_full_url())
+        # XXX return value of InternetGetCookie?
+        request.add_unredirected_header("Cookie", cookie_header)
+    def __iter__(self):
+        self._load_index_dat()
+        return CookieJar.__iter__(self)
+    def _cookies_for_request(self, request):
+        raise NotImplementedError()  # XXXX
+    def _cookies_for_domain(self, domain, request):
+        #raise NotImplementedError()  # XXXX
+        debug("Checking %s for cookies to return", domain)
+        if not self._policy.domain_return_ok(domain, request):
+            return []
+
+        # XXXX separate out actual loading of cookie data, so only index.dat is
+        #  read in ._load_index_dat(), and ._really_load() calls that, then
+        #  ._delayload_domain for all domains if not self.delayload.
+        #  We then just call ._load_index_dat()
+        self._delayload = False
+        self._really_load()
+
+        cookies_by_path = self._cookies.get(domain)
+        if cookies_by_path is None:
+            return []
+
+        cookies = []
+        for path in cookies_by_path.keys():
+            if not self._policy.path_return_ok(path, request):
+                continue
+            for name, cookie in cookies_by_path[path].items():
+                if not self._policy.return_ok(cookie, request):
+                    debug("   not returning cookie")
+                    continue
+                debug("   it's a match")
+                cookies.append(cookie)
+
+        return cookies
+

Added: mechanize/tags/0.1.10/doc.html
===================================================================
--- mechanize/tags/0.1.10/doc.html	                        (rev 0)
+++ mechanize/tags/0.1.10/doc.html	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,928 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+        "http://www.w3.org/TR/html4/strict.dtd">
+
+<html>
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+  <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
+  <meta name="date" content="2008-12-02">
+  <title>mechanize documentation</title>
+  <style type="text/css" media="screen">@import "../styles/style.css";</style>
+  
+</head>
+<body>
+
+<div id="sf"><a href="http://sourceforge.net">
+<img src="http://sourceforge.net/sflogo.php?group_id=48205&amp;type=2"
+ width="125" height="37" alt="SourceForge.net Logo"></a></div>
+
+<h1>mechanize handlers</h1>
+
+<div id="Content">
+
+<p class="docwarning">This documentation is in need of reorganisation!</p>
+
+<p>This page is the old ClientCookie documentation.  It deals with operation on
+the level of urllib2 Handler objects, and also with adding headers, debugging,
+and cookie handling.  Documentation for the higher-level browser-style
+interface is <a href="./mechanize">elsewhere</a>.
+
+
+<a name="examples"></a>
+<h2>Examples</h2>
+
+<pre>
+<span class="pykw">import</span> mechanize
+response = mechanize.urlopen(<span class="pystr">"http://foo.bar.com/"</span>)</pre>
+
+
+<p>This function behaves identically to <code>urllib2.urlopen()</code>, except
+that it deals with cookies automatically.
+
+<p>Here is a more complicated example, involving <code>Request</code> objects
+(useful if you want to pass <code>Request</code>s around, add headers to them,
+etc.):
+
+<pre>
+<span class="pykw">import</span> mechanize
+request = mechanize.Request(<span class="pystr">"http://www.acme.com/"</span>)
+<span class="pycmt"># note we're using the urlopen from mechanize, not urllib2
+</span>response = mechanize.urlopen(request)
+<span class="pycmt"># let's say this next request requires a cookie that was set in response
+</span>request2 = mechanize.Request(<span class="pystr">"http://www.acme.com/flying_machines.html"</span>)
+response2 = mechanize.urlopen(request2)
+
+<span class="pykw">print</span> response2.geturl()
+<span class="pykw">print</span> response2.info()  <span class="pycmt"># headers</span>
+<span class="pykw">print</span> response2.read()  <span class="pycmt"># body (readline and readlines work too)</span></pre>
+
+
+<p>(The above example would also work with <code>urllib2.Request</code>
+objects, since <code>mechanize.HTTPRequestUpgradeProcessor</code> knows about
+that class, but avoid that if you can, because this is an obscure hack
+provided for compatibility purposes only.)
+
+<p>In these examples, the workings are hidden inside the
+<code>mechanize.urlopen()</code> function, which is an extension of
+<code>urllib2.urlopen()</code>.  Redirects, proxies and cookies are handled
+automatically by this function (note that you may need a bit of configuration
+to get your proxies correctly set up: see <code>urllib2</code> documentation).
+
+<p>Cookie processing (etc.) is handled by processor objects, which are an
+extension of <code>urllib2</code>'s handlers: <code>HTTPCookieProcessor</code>,
+<code>HTTPRefererProcessor</code> etc.  They are used like any other handler.
+There is quite a bit of other <code>urllib2</code>-workalike code, too.  Note:
+This duplication has gone away in Python 2.4, since 2.4's <code>urllib2</code>
+contains the processor extensions from mechanize, so you can simply use
+mechanize's processor classes direct with 2.4's <code>urllib2</code>; also,
+mechanize's cookie functionality is included in Python 2.4 as module
+<code>cookielib</code> and <code>urllib2.HTTPCookieProcessor</code>.
+
+<p>There is also a <code>urlretrieve()</code> function, which works like
+<code>urllib.urlretrieve()</code>.
+
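+<p>For example (a minimal sketch; if no filename argument is supplied, the
+data is saved to a temporary file whose name is returned):
+
+<pre>
+<span class="pykw">import</span> mechanize
+<span class="pycmt"># returns the filename the data was saved to, plus the response headers
+</span>filename, headers = mechanize.urlretrieve(<span class="pystr">"http://example.com/"</span>)</pre>
+
+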
+<p>An example at a slightly lower level shows more clearly how the module
+processes cookies:
+
+<pre>
+<span class="pycmt"># Don't copy this blindly!  You probably want to follow the examples
+</span><span class="pycmt"># above, not this one.
+</span><span class="pykw">import</span> mechanize
+
+<span class="pycmt"># Build an opener that *doesn't* automatically call .add_cookie_header()
+</span><span class="pycmt"># and .extract_cookies(), so we can do it manually without interference.
+</span><span class="pykw">class</span> NullCookieProcessor(mechanize.HTTPCookieProcessor):
+    <span class="pykw">def</span> http_request(self, request): <span class="pykw">return</span> request
+    <span class="pykw">def</span> http_response(self, request, response): <span class="pykw">return</span> response
+opener = mechanize.build_opener(NullCookieProcessor)
+
+request = mechanize.Request(<span class="pystr">"http://www.acme.com/"</span>)
+response = mechanize.urlopen(request)
+cj = mechanize.CookieJar()
+cj.extract_cookies(response, request)
+<span class="pycmt"># let's say this next request requires a cookie that was set in response
+</span>request2 = mechanize.Request(<span class="pystr">"http://www.acme.com/flying_machines.html"</span>)
+cj.add_cookie_header(request2)
+response2 = mechanize.urlopen(request2)</pre>
+
+
+<p>The <code>CookieJar</code> class does all the work.  There are essentially
+two operations: <code>.extract_cookies()</code> extracts HTTP cookies from
+<code>Set-Cookie</code> (the original <a
+href="http://www.netscape.com/newsref/std/cookie_spec.html">Netscape cookie
+standard</a>) and <code>Set-Cookie2</code> (<a
+href="http://www.ietf.org/rfc/rfc2965.txt">RFC 2965</a>) headers from a
+response if and only if they should be set given the request, and
+<code>.add_cookie_header()</code> adds <code>Cookie</code> headers if and only
+if they are appropriate for a particular HTTP request.  Incoming cookies are
+checked for acceptability based on the host name, etc.  Cookies are only set on
+outgoing requests if they match the request's host name, path, etc.
+
+<p><strong>Note that if you're using <code>mechanize.urlopen()</code> (or if
+you're using <code>mechanize.HTTPCookieProcessor</code> by some other
+means), you don't need to call <code>.extract_cookies()</code> or
+<code>.add_cookie_header()</code> yourself</strong>.  If, on the other hand,
+you don't want to use <code>urllib2</code>, you will need to use this pair of
+methods.  You can make your own <code>request</code> and <code>response</code>
+objects, which must support the interfaces described in the docstrings of
+<code>.extract_cookies()</code> and <code>.add_cookie_header()</code>.
+
+<p>There are also some <code>CookieJar</code> subclasses which can store
+cookies in files and databases.  <code>FileCookieJar</code> is the abstract
+class for <code>CookieJar</code>s that can store cookies in disk files.
+<code>LWPCookieJar</code> saves cookies in a format compatible with the
+libwww-perl library.  This class is convenient if you want to store cookies in
+a human-readable file:
+
+<pre>
+<span class="pykw">import</span> mechanize
+cj = mechanize.LWPCookieJar()
+cj.revert(<span class="pystr">"cookie3.txt"</span>)
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cj))
+r = opener.open(<span class="pystr">"http://foobar.com/"</span>)
+cj.save(<span class="pystr">"cookie3.txt"</span>)</pre>
+
+
+<p>The <code>.revert()</code> method discards all existing cookies held by the
+<code>CookieJar</code> (it won't lose any existing cookies if the load fails).
+The <code>.load()</code> method, on the other hand, adds the loaded cookies to
+existing cookies held in the <code>CookieJar</code> (old cookies are kept
+unless overwritten by newly loaded ones).
+
+<p><code>MozillaCookieJar</code> can load and save to the
+Mozilla/Netscape/lynx-compatible <code>'cookies.txt'</code> format.  This
+format loses some information (unusual and nonstandard cookie attributes such
+as comment, and also information specific to RFC 2965 cookies).  The subclass
+<code>MSIECookieJar</code> can load (but not save, yet) from Microsoft Internet
+Explorer's cookie files (on Windows).  <code>BSDDBCookieJar</code> (NOT FULLY
+TESTED!) saves to a BSDDB database using the standard library's
+<code>bsddb</code> module.  There's an unfinished <code>MSIEDBCookieJar</code>,
+which uses (reads and writes) the Windows MSIE cookie database directly, rather
+than storing copies of cookies as <code>MSIECookieJar</code> does.
+
+<h2>Important note</h2>
+
+<p>Only use names you can import directly from the <code>mechanize</code>
+package, and that don't start with a single underscore.  Everything else is
+subject to change or disappearance without notice.
+
+<a name="browsers"></a>
+<h2>Cooperating with Mozilla/Netscape, lynx and Internet Explorer</h2>
+
+<p>The subclass <code>MozillaCookieJar</code> differs from
+<code>CookieJar</code> only in storing cookies using a different,
+Mozilla/Netscape-compatible, file format.  The lynx browser also uses this
+format.  This file format can't store RFC 2965 cookies, so they are downgraded
+to Netscape cookies on saving.  <code>LWPCookieJar</code> itself uses a
+libwww-perl specific format (`Set-Cookie3') - see the example above.  Python
+and your browser should be able to share a cookies file (note that the file
+location here will differ on non-unix OSes):
+
+<p><strong>WARNING:</strong> you may want to backup your browser's cookies file
+if you use <code>MozillaCookieJar</code> to save cookies.  I <em>think</em> it
+works, but there have been bugs in the past!
+
+<pre>
+<span class="pykw">import</span> os, mechanize
+cookies = mechanize.MozillaCookieJar()
+cookies.load(os.path.join(os.environ[<span class="pystr">"HOME"</span>], <span class="pystr">".netscape/cookies.txt"</span>))
+<span class="pycmt"># see also the save and revert methods</span></pre>
+
+
+<p>Note that cookies saved while Mozilla is running will get clobbered by
+Mozilla - see <code>MozillaCookieJar.__doc__</code>.
+
+<p><code>MSIECookieJar</code> does the same for Microsoft Internet Explorer
+(MSIE) 5.x and 6.x on Windows, but does not allow saving cookies in this
+format.  In future, the Windows API calls might be used to load and save
+(though the index has to be read directly, since there is no API for that,
+AFAIK; there's also an unfinished <code>MSIEDBCookieJar</code>, which uses
+(reads and writes) the Windows MSIE cookie database directly, rather than
+storing copies of cookies as <code>MSIECookieJar</code> does).
+
+<pre>
+<span class="pykw">import</span> mechanize
+cj = mechanize.MSIECookieJar(delayload=True)
+cj.load_from_registry()  <span class="pycmt"># finds cookie index file from registry</span></pre>
+
+
+<p>A true <code>delayload</code> argument speeds things up.
+
+<p>On Windows 9x (win 95, win 98, win ME), you need to supply a username to the
+<code>.load_from_registry()</code> method:
+
+<pre>
+cj.load_from_registry(username=<span class="pystr">"jbloggs"</span>)</pre>
+
+
+<p>Konqueror/Safari and Opera use different file formats, which aren't yet
+supported.
+
+<a name="file"></a>
+<h2>Saving cookies in a file</h2>
+
+<p>If you have no need to co-operate with a browser, the most convenient way to
+save cookies on disk between sessions in human-readable form is to use
+<code>LWPCookieJar</code>.  This class uses a libwww-perl specific format
+(`Set-Cookie3').  Unlike <code>MozillaCookieJar</code>, this file format
+doesn't lose information.
+
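+<p>For example (a minimal sketch; the filename is illustrative):
+
+<pre>
+<span class="pykw">import</span> mechanize
+cj = mechanize.LWPCookieJar(<span class="pystr">"cookies.lwp"</span>)
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cj))
+r = opener.open(<span class="pystr">"http://example.com/"</span>)
+<span class="pycmt"># keep session cookies too, so they come back in the next session
+</span>cj.save(ignore_discard=True)
+<span class="pycmt"># ...and in a later session, before making requests:
+</span>cj.load(ignore_discard=True)</pre>
+
+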
+<a name="cookiejar"></a>
+<h2>Using your own CookieJar instance</h2>
+
+<p>You might want to do this to <a href="./doc.html#browsers">use your
+browser's cookies</a>, to customize <code>CookieJar</code>'s behaviour by
+passing constructor arguments, or to be able to get at the cookies it will hold
+(for example, for saving cookies between sessions and for debugging).
+
+<p>If you're using the higher-level <code>urllib2</code>-like interface
+(<code>urlopen()</code>, etc), you'll have to let it know what
+<code>CookieJar</code> it should use:
+
+<pre>
+<span class="pykw">import</span> mechanize
+cookies = mechanize.CookieJar()
+<span class="pycmt"># build_opener() adds standard handlers (such as HTTPHandler and
+</span><span class="pycmt"># HTTPCookieProcessor) by default.  The cookie processor we supply
+</span><span class="pycmt"># will replace the default one.
+</span>opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cookies))
+
+r = opener.open(<span class="pystr">"http://acme.com/"</span>)  <span class="pycmt"># GET</span>
+r = opener.open(<span class="pystr">"http://acme.com/"</span>, data)  <span class="pycmt"># POST</span></pre>
+
+
+<p>The <code>urlopen()</code> function uses a global
+<code>OpenerDirector</code> instance to do its work, so if you want to use
+<code>urlopen()</code> with your own <code>CookieJar</code>, install the
+<code>OpenerDirector</code> you built with <code>build_opener()</code> using
+the <code>mechanize.install_opener()</code> function, then proceed as usual:
+
+<pre>
+mechanize.install_opener(opener)
+r = mechanize.urlopen(<span class="pystr">"http://www.acme.com/"</span>)</pre>
+
+
+<p>Of course, everyone using <code>urlopen</code> is using the same global
+<code>CookieJar</code> instance!
+
+<a name="policy"></a>
+
+<p>You can set a policy object (must satisfy the interface defined by
+<code>mechanize.CookiePolicy</code>), which determines which cookies are
+allowed to be set and returned.  Use the policy argument to the
+<code>CookieJar</code> constructor, or use the .set_policy() method.  The
+default implementation has some useful switches:
+
+<pre>
+<span class="pykw">from</span> mechanize <span class="pykw">import</span> CookieJar, DefaultCookiePolicy <span class="pykw">as</span> Policy
+cookies = CookieJar()
+<span class="pycmt"># turn on RFC 2965 cookies, be more strict about domains when setting and
+</span><span class="pycmt"># returning Netscape cookies, and block some domains from setting cookies
+</span><span class="pycmt"># or having them returned (read the DefaultCookiePolicy docstring for the
+</span><span class="pycmt"># domain matching rules here)
+</span>policy = Policy(rfc2965=True, strict_ns_domain=Policy.DomainStrict,
+                blocked_domains=[<span class="pystr">"ads.net"</span>, <span class="pystr">".ads.net"</span>])
+cookies.set_policy(policy)</pre>
+
+
+
+<a name="extras"></a>
+<h2>Optional extras: robots.txt, HTTP-EQUIV, Refresh, Referer</h2>
+
+<p>These are implemented as processor classes.  Processors are an extension of
+<code>urllib2</code>'s handlers (now a standard part of urllib2 in Python 2.4):
+you just pass them to <code>build_opener()</code> (example code below).
+
+<dl>
+
+<dt><code>HTTPRobotRulesProcessor</code>
+
+<dd><p>WWW Robots (also called wanderers or spiders) are programs that traverse
+many pages in the World Wide Web by recursively retrieving linked pages.  This
+kind of program can place significant loads on web servers, so there is a <a
+href="http://www.robotstxt.org/wc/norobots.html">standard</a> for a <code>
+robots.txt</code> file by which web site operators can request robots to keep
+out of their site, or out of particular areas of it.  This processor uses the
+standard Python library's <code>robotparser</code> module.  It raises
+<code>mechanize.RobotExclusionError</code> (subclass of
+<code>urllib2.HTTPError</code>) if an attempt is made to open a URL prohibited
+by <code>robots.txt</code>.  XXX ATM, this makes use of code in the
+<code>robotparser</code> module that uses <code>urllib</code> - this will
+likely change in future to use <code>urllib2</code>.
+
+<dt><code>HTTPEquivProcessor</code>
+
+<dd><p>The <code>&lt;META HTTP-EQUIV&gt;</code> tag is a way of including data
+in HTML to be treated as if it were part of the HTTP headers.  mechanize can
+automatically read these tags and add the <code>HTTP-EQUIV</code> headers to
+the response object's real HTTP headers.  The HTML is left unchanged.
+
+<dt><code>HTTPRefreshProcessor</code>
+
+<dd><p>The <code>Refresh</code> HTTP header is a non-standard header which is
+widely used.  It requests that the user-agent follow a URL after a specified
+time delay.  mechanize can treat these headers (which may have been set in
+<code>&lt;META HTTP-EQUIV&gt;</code> tags) as if they were 302 redirections.
+Exactly when and how <code>Refresh</code> headers are handled is configurable
+using the constructor arguments.
+
+<dt><code>HTTPRefererProcessor</code>
+
+<dd><p>The <code>Referer</code> HTTP header lets the server know which URL
+you've just visited.  Some servers use this header as state information, and
+don't like it if this is not present.  It's a chore to add this header by hand
+every time you make a request.  This adds it automatically.
+<strong>NOTE</strong>: this only makes sense if you use each processor for a
+single chain of HTTP requests (so, for example, if you use a single
+HTTPRefererProcessor to fetch a series of URLs extracted from a single page,
+<strong>this will break</strong>).  <a
+href="../mechanize/">mechanize.Browser</a> does this properly.</p>
+
+</dl>
+
+<pre>
+<span class="pykw">import</span> mechanize
+cookies = mechanize.CookieJar()
+
+opener = mechanize.build_opener(mechanize.HTTPRefererProcessor,
+                                mechanize.HTTPEquivProcessor,
+                                mechanize.HTTPRefreshProcessor,
+                                )
+opener.open(<span class="pystr">"http://www.rhubarb.com/"</span>)</pre>
+
+
+
+
+<a name="seekable"></a>
+<h2>Seekable responses</h2>
+
+<p>Response objects returned from (or raised as exceptions by)
+<code>mechanize.SeekableResponseOpener</code>, <code>mechanize.UserAgent</code>
+(if <code>.set_seekable_responses(True)</code> has been called) and
+<code>mechanize.Browser()</code> have <code>.seek()</code>,
+<code>.get_data()</code> and <code>.set_data()</code> methods:
+
+<pre>
+<span class="pykw">import</span> mechanize
+opener = mechanize.OpenerFactory(mechanize.SeekableResponseOpener).build_opener()
+response = opener.open(<span class="pystr">"http://example.com/"</span>)
+<span class="pycmt"># same return value as .read(), but without affecting seek position
+</span>total_nr_bytes = len(response.get_data())
+<span class="pykw">assert</span> len(response.read()) == total_nr_bytes
+<span class="pykw">assert</span> len(response.read()) == 0  <span class="pycmt"># we've already read the data</span>
+response.seek(0)
+<span class="pykw">assert</span> len(response.read()) == total_nr_bytes
+response.set_data(<span class="pystr">"blah\n"</span>)
+<span class="pykw">assert</span> response.get_data() == <span class="pystr">"blah\n"</span>
+...</pre>
+
+
+<p>This caching behaviour can be avoided by using
+<code>mechanize.OpenerDirector</code> (as long as
+<code>SeekableProcessor</code>, <code>HTTPEquivProcessor</code> and
+<code>HTTPResponseDebugProcessor</code> are not used).  It can also be avoided
+with <code>mechanize.UserAgent</code>:
+
+<pre>
+<span class="pykw">import</span> mechanize
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)
+ua.set_debug_responses(False)</pre>
+
+
+<p>Note that if you turn on features that use seekable responses (currently:
+HTTP-EQUIV handling and response body debug printing), returned responses
+<em>may</em> be seekable as a side-effect of these features.  However, this is
+not guaranteed (currently, in these cases, returned response objects are
+seekable, but raised response objects &#8212; <code>mechanize.HTTPError</code>
+instances &#8212; are not seekable).  This applies regardless of whether you
+use <code>mechanize.UserAgent</code> or <code>mechanize.OpenerDirector</code>.
+If you explicitly request seekable responses by calling
+<code>.set_seekable_responses(True)</code> on a
+<code>mechanize.UserAgent</code> instance, or by using
+<code>mechanize.Browser</code> or
+<code>mechanize.SeekableResponseOpener</code>, which always return seekable
+responses, then both returned and raised responses are guaranteed to be
+seekable.
+
+<p>Handlers should call <code>response =
+mechanize.seek_wrapped_response(response)</code> if they require the
+<code>.seek()</code>, <code>.get_data()</code> or <code>.set_data()</code>
+methods.
+
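+<p>For example, here is a minimal sketch of a response processor that needs
+<code>.get_data()</code> (the class name is illustrative only):
+
+<pre>
+<span class="pykw">import</span> mechanize
+<span class="pykw">class</span> BodyLengthProcessor(mechanize.BaseHandler):
+    <span class="pykw">def</span> http_response(self, request, response):
+        <span class="pycmt"># ensure the response supports .seek() / .get_data()
+</span>        response = mechanize.seek_wrapped_response(response)
+        <span class="pykw">print</span> <span class="pystr">"%d bytes from %s"</span> % (len(response.get_data()), response.geturl())
+        <span class="pykw">return</span> response
+    https_response = http_response</pre>
+
+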
+<p>Note that <code>SeekableProcessor</code> (and
+<code>ResponseUpgradeProcessor</code>) are deprecated since mechanize 0.1.6b.
+The reason for the deprecation is that these were really abuses of the response
+processing chain (the <code>.process_response()</code> support documented by
+urllib2).  The response processing chain is sensibly used only for processing
+response headers and data, not for processing response <em>objects</em>,
+because the same data may occur as different Python objects (this can occur for
+example when <code>HTTPError</code> is raised by
+<code>HTTPDefaultErrorHandler</code>), but should only get processed once
+(during <code>.open()</code>).
+
+
+
+<a name="requests"></a>
+<h2>Confusing fact about headers and Requests</h2>
+
+<p>mechanize automatically upgrades <code>urllib2.Request</code> objects to
+<code>mechanize.Request</code>, as a backwards-compatibility hack.  This
+means that you won't see any headers that are added to Request objects by
+handlers unless you use <code>mechanize.Request</code> in the first place.
+Sorry about that.
+
+<p>Note also that handlers may create new <code>Request</code> instances (for
+example when performing redirects) rather than adding headers to existing
+<code>Request</code> objects.
+
+
+<a name="headers"></a>
+<h2>Adding headers</h2>
+
+<p>Adding headers is done like so:
+
+<pre>
+<span class="pykw">import</span> mechanize, urllib2
+req = urllib2.Request(<span class="pystr">"http://foobar.com/"</span>)
+req.add_header(<span class="pystr">"Referer"</span>, <span class="pystr">"http://wwwsearch.sourceforge.net/mechanize/"</span>)
+r = mechanize.urlopen(req)</pre>
+
+
+<p>You can also use the headers argument to the <code>urllib2.Request</code>
+constructor.
+
+<p><code>urllib2</code> (in fact, mechanize takes over this task from
+<code>urllib2</code>) adds some headers to <code>Request</code> objects
+automatically - see the next section for details.
+
+
+<h2>Changing the automatically-added headers (User-Agent)</h2>
+
+<p><code>OpenerDirector</code> automatically adds a <code>User-Agent</code>
+header to every <code>Request</code>.
+
+<p>To change this and/or add similar headers, use your own
+<code>OpenerDirector</code>:
+
+<pre>
+<span class="pykw">import</span> mechanize
+cookies = mechanize.CookieJar()
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cookies))
+opener.addheaders = [(<span class="pystr">"User-agent"</span>, <span class="pystr">"Mozilla/5.0 (compatible; MyProgram/0.1)"</span>),
+                     (<span class="pystr">"From"</span>, <span class="pystr">"responsible.person at example.com"</span>)]</pre>
+
+
+<p>Again, to use <code>urlopen()</code>, install your
+<code>OpenerDirector</code> globally:
+
+<pre>
+mechanize.install_opener(opener)
+r = mechanize.urlopen(<span class="pystr">"http://acme.com/"</span>)</pre>
+
+
+<p>Also, a few standard headers (<code>Content-Length</code>,
+<code>Content-Type</code> and <code>Host</code>) are added when the
+<code>Request</code> is passed to <code>urlopen()</code> (or
+<code>OpenerDirector.open()</code>).  You shouldn't need to change these
+headers, but since this is done by <code>AbstractHTTPHandler</code>, you can
+change the way it works by passing a subclass of that handler to
+<code>build_opener()</code> (or, as always, by constructing an opener yourself
+and calling .add_handler()).
+
+
+<a name="unverifiable"></a>
+<h2>Initiating unverifiable transactions</h2>
+
+<p>This section is only of interest for correct handling of third-party HTTP
+cookies.  See <a href="./doc.html#standards">below</a> for an explanation of
+'third-party'.
+
+<p>First, some terminology.
+
+<p>An <em>unverifiable request</em> (defined fully by RFC 2965) is one whose
+URL the user did not have the option to approve.  For example, a transaction is
+unverifiable if the request is for an image in an HTML document, and the user
+had no option to approve the fetching of the image from a particular URL.
+
+<p>The <em>request-host of the origin transaction</em> (defined fully by RFC
+2965) is the host name or IP address of the original request that was initiated
+by the user.  For example, if the request is for an image in an HTML document,
+this is the request-host of the request for the page containing the image.
+
+<p><strong>mechanize knows that redirected transactions are unverifiable,
+and will handle that on its own (ie. you don't need to think about the origin
+request-host or verifiability yourself).</strong>
+
+<p>If you want to initiate an unverifiable transaction yourself (which you
+should if, for example, you're downloading the images from a page, and 'the
+user' hasn't explicitly OKed those URLs):
+
+<pre>
+request = Request(origin_req_host=<span class="pystr">"www.example.com"</span>, unverifiable=True)</pre>
+
+
+
+<a name="rfc2965"></a>
+<h2>RFC 2965 handling</h2>
+
+<p>RFC 2965 handling is switched off by default, because few browsers implement
+it, so the RFC 2965 protocol is essentially never seen on the internet.  To
+switch it on, see <a href="./doc.html#policy">here</a>.
+
+
+<a name="debugging"></a>
+<h2>Debugging</h2>
+
+<!--XXX move as much as poss. to General page-->
+
+<p>First, a few common problems.  The most frequent mistake people seem to make
+is to use <code>mechanize.urlopen()</code>, <em>and</em> the
+<code>.extract_cookies()</code> and <code>.add_cookie_header()</code> methods
+on a cookie object themselves.  If you use <code>mechanize.urlopen()</code>
+(or <code>OpenerDirector.open()</code>), the module handles extraction and
+adding of cookies by itself, so you should not call
+<code>.extract_cookies()</code> or <code>.add_cookie_header()</code>.
+
+<p>Are you sure the server is sending you any cookies in the first place?
+Maybe the server is keeping track of state in some other way
+(<code>HIDDEN</code> HTML form entries (possibly in a separate page referenced
+by a frame), URL-encoded session keys, IP address, HTTP <code>Referer</code>
+headers)?  Perhaps some embedded script in the HTML is setting cookies (see
+below)?  Maybe you messed up your request, and the server is sending you some
+standard failure page (even if the page doesn't appear to indicate any
+failure).  Sometimes, a server wants particular headers set to the values it
+expects, or it won't play nicely.  The most frequent offenders here are the
+<code>Referer</code> [<em>sic</em>] and / or <code>User-Agent</code> HTTP
+headers (<a href="./doc.html#headers">see above</a> for how to set these).  The
+<code>User-Agent</code> header may need to be set to a value like that of a
+popular browser.  The <code>Referer</code> header may need to be set to the URL
+that the server expects you to have followed a link from.  Occasionally, it may
+even be that operators deliberately configure a server to insist on precisely
+the headers that the popular browsers (MS Internet Explorer, Mozilla/Netscape,
+Opera, Konqueror/Safari) generate, but remember that incompetence (possibly on
+your part) is more probable than deliberate sabotage (and if a site owner is
+that keen to stop robots, you probably shouldn't be scraping it anyway).
+
+<p>When you <code>.save()</code> to or
+<code>.load()</code>/<code>.revert()</code> from a file, single-session cookies
+will expire unless you explicitly request otherwise with the
+<code>ignore_discard</code> argument.  This may be your problem if you find
+cookies are going away after saving and loading.
+
+<pre>
+<span class="pykw">import</span> mechanize
+cj = mechanize.LWPCookieJar()
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cj))
+mechanize.install_opener(opener)
+r = mechanize.urlopen(<span class="pystr">"http://foobar.com/"</span>)
+cj.save(<span class="pystr">"/some/file"</span>, ignore_discard=True, ignore_expires=True)</pre>
+
+
+<p>If none of the advice above solves your problem quickly, try comparing the
+headers and data that you are sending out with those that a browser emits.
+Often this will give you the clue you need.  Of course, you'll want to check
+that the browser is able to do manually what you're trying to achieve
+programmatically before minutely examining the headers.  Make sure that what you
+do manually is <em>exactly</em> the same as what you're trying to do from
+Python - you may simply be hitting a server bug that only gets revealed if you
+view pages in a particular order, for example.  In order to see what your
+browser is sending to the server (even if HTTPS is in use), see <a
+href="../clientx.html">the General FAQ page</a>.  If nothing is obviously wrong
+with the requests your program is sending and you're out of ideas, you can try
+the last resort of good old brute force binary-search debugging.  Temporarily
+switch to sending HTTP headers (with <code>httplib</code>).  Start by copying
+Netscape/Mozilla or IE slavishly (apart from session IDs, etc., of course),
+then begin the tedious process of mutating your headers and data until they
+match what your higher-level code was sending.  This will at least reliably
+find your problem.
+
+<p>You can turn on display of HTTP headers:
+
+<pre>
+<span class="pykw">import</span> mechanize
+hh = mechanize.HTTPHandler()  <span class="pycmt"># you might want HTTPSHandler, too</span>
+hh.set_http_debuglevel(1)
+opener = mechanize.build_opener(hh)
+response = opener.open(url)</pre>
+
+
+<p>Alternatively, you can examine your individual request and response
+objects to see what's going on.  Note, though, that mechanize upgrades
+<code>urllib2.Request</code> objects to <code>mechanize.Request</code>, so you
+won't see any headers that are added to requests by handlers unless you use
+<code>mechanize.Request</code> in the first place.  In addition, requests may
+involve "sub-requests" in cases such as redirection, in which case you will
+also not see everything that's going on just by examining the original request
+and final response.  mechanize's responses can be made to
+have <code>.seek()</code> and <code>.get_data()</code> methods.  It's often
+useful to use the <code>.get_data()</code> method during debugging
+(see <a href="./doc.html#seekable">above</a>).
+
+<p>Also, note <code>HTTPRedirectDebugProcessor</code> (which prints information
+about redirections) and <code>HTTPResponseDebugProcessor</code> (which prints
+out all response bodies, including those that are read during redirections).
+<strong>NOTE</strong>: as well as having these processors in your
+<code>OpenerDirector</code> (for example, by passing them to
+<code>build_opener()</code>) you have to turn on logging at the
+<code>INFO</code> level or lower in order to see any output.
+
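+<p>For example (a sketch; remember to configure logging as shown below,
+otherwise you won't see any output):
+
+<pre>
+<span class="pykw">import</span> mechanize
+opener = mechanize.build_opener(mechanize.HTTPRedirectDebugProcessor,
+                                mechanize.HTTPResponseDebugProcessor)
+response = opener.open(url)</pre>
+
+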
+<p>If you would like to see what is going on in mechanize's tiny mind, do
+this:
+
+<pre>
+<span class="pykw">import</span> sys, logging
+<span class="pycmt"># logging.DEBUG covers masses of debugging information,
+</span><span class="pycmt"># logging.INFO just shows the output from HTTPRedirectDebugProcessor,
+</span>logger = logging.getLogger(<span class="pystr">"mechanize"</span>)
+logger.addHandler(logging.StreamHandler(sys.stdout))
+logger.setLevel(logging.DEBUG)</pre>
+
+
+<p>The <code>DEBUG</code> level (as opposed to the <code>INFO</code> level) can
+actually be quite useful, as it explains why particular cookies are accepted or
+rejected and why they are or are not returned.
+
+<p>One final thing to note is that there are some catch-all bare
+<code>except:</code> statements in the module, which are there to handle
+unexpected bad input without crashing your program.  If this happens, it's a
+bug in mechanize, so please mail me the warning text.
+
+
+<a name="script"></a>
+<h2>Embedded script that sets cookies</h2>
+
+<p>It is possible to embed script in HTML pages (sandwiched between
+<code>&lt;SCRIPT&gt;here&lt;/SCRIPT&gt;</code> tags, and in
+<code>javascript:</code> URLs) - JavaScript / ECMAScript, VBScript, or even
+Python - that causes cookies to be set in a browser.  See the <a
+href="../bits/clientx.html">General FAQs</a> page for what to do about this.
+
+
+<a name="dates"></a>
+<h2>Parsing HTTP date strings</h2>
+
+<p>A function named <code>str2time</code> is provided by the package,
+which may be useful for parsing dates in HTTP headers.
+<code>str2time</code> is intended to be liberal, since HTTP date/time
+formats are poorly standardised in practice.  There is no need to use this
+function in normal operations: <code>CookieJar</code> instances keep track
+of cookie lifetimes automatically.  This function will stay around in some
+form, though the supported date/time formats may change.
+
+
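+<p>For example (a minimal sketch; the return value should be a floating point
+number of seconds since the epoch, comparable with <code>time.time()</code>):
+
+<pre>
+<span class="pykw">import</span> mechanize
+t = mechanize.str2time(<span class="pystr">"Wed, 09 Feb 1994 22:23:32 GMT"</span>)</pre>
+
+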
+<a name="badhtml"></a>
+<h2>Dealing with bad HTML</h2>
+
+<p>XXX Intro
+
+<p>XXX Test me
+
+<pre><span class="pykw">import</span> copy
+<span class="pykw">import</span> mechanize
+<span class="pykw">class</span> CommentCleanProcessor(mechanize.BaseProcessor):
+      <span class="pykw">def</span> http_response(self, request, response):
+          <span class="pykw">if</span> <span class="pykw">not</span> hasattr(response, <span class="pystr">"seek"</span>):
+              response = mechanize.response_seek_wrapper(response)
+          response.seek(0)
+          new_response = copy.copy(response)
+          new_response.set_data(
+              re.sub(<span class="pystr">r"&lt;!-([^-]*)-&gt;"</span>, <span class="pystr">r"&lt;!--\1--&gt;"</span>, response.read()))
+          <span class="pykw">return</span> new_response
+      https_response = http_response</pre>
+
+
+<p>XXX TidyProcessor: mxTidy?  tidylib?  tidy?
+
+
+<a name="standards"></a>
+<h2>Note about cookie standards</h2>
+
+<p>The various cookie standards and their history form a case study of the
+terrible things that can happen to a protocol.  The long-suffering David
+Kristol has written a <a
+href="http://arxiv.org/abs/cs.SE/0105018">paper</a> about it, if you
+want to know the gory details.
+
+<p>Here is a summary.
+
+<p>The <a href="http://www.netscape.com/newsref/std/cookie_spec.html">Netscape
+protocol</a> (cookie_spec.html) is still the only standard supported by most
+browsers (including Internet Explorer and Netscape).  Be aware that
+cookie_spec.html is not, and never was, actually followed to the letter (or
+anything close) by anyone (including Netscape, IE and mechanize): the
+Netscape protocol standard is really defined by the behaviour of Netscape (and
+now IE).  Netscape cookies are also known as V0 cookies, to distinguish them
+from RFC 2109 or RFC 2965 cookies, which have a version cookie-attribute with a
+value of 1.
+
+<p><a href="http://www.ietf.org/rfcs/rfc2109.txt">RFC 2109</a> was introduced
+to fix some problems identified with the Netscape protocol, while still keeping
+the same HTTP headers (<code>Cookie</code> and <code>Set-Cookie</code>).  The
+most prominent of these problems is the 'third-party' cookie issue, which was
+an accidental feature of the Netscape protocol.  When one visits www.bland.org,
+one doesn't expect to get a cookie from www.lurid.com, a site one has never
+visited.  Depending on browser configuration, this can still happen, because
+the unreconstructed Netscape protocol is happy to accept cookies from, say, an
+image in a webpage (www.bland.org) that's included by linking to an
+advertiser's server (www.lurid.com).  This kind of event, where your browser
+talks to a server that you haven't explicitly okayed by some means, is what the
+RFCs call an 'unverifiable transaction'.  In addition to the potential for
+embarrassment caused by the presence of lurid.com's cookies on one's machine,
+this may also be used to track your movements on the web, because advertising
+agencies like doubleclick.net place ads on many sites.  RFC 2109 tried to
+change this by requiring cookies to be turned off during unverifiable
+transactions with third-party servers - unless the user explicitly asks them to
+be turned on.  This clashed with the business model of advertisers like
+doubleclick.net, who had started to take advantage of the third-party cookies
+'bug'.  Since the browser vendors were more interested in the advertisers'
+concerns than those of the browser users, this arguably doomed both RFC 2109
+and its successor, RFC 2965, from the start.  Other problems than the
+third-party cookie issue were also fixed by 2109.  However, even ignoring the
+advertising issue, 2109 was stillborn, because Internet Explorer and Netscape
+behaved differently in response to its extended <code>Set-Cookie</code>
+headers.  This was not really RFC 2109's fault: it worked the way it did to
+keep compatibility with the Netscape protocol as implemented by Netscape.
+Microsoft Internet Explorer (MSIE) was very new when the standard was designed,
+but was starting to be very popular when the standard was finalised.  XXX P3P,
+and MSIE &amp; Mozilla options
+
+<p>XXX Apparently MSIE implements bits of RFC 2109 - but not very compliantly
+(surprise).  Presumably other browsers do too, as a result.  mechanize
+already does allow Netscape cookies to have <code>max-age</code> and
+<code>port</code> cookie-attributes, and as far as I know that's the extent of
+the support present in MSIE.  I haven't tested, though!
+
+<p><a href="http://www.ietf.org/rfcs/rfc2965.txt">RFC 2965</a> attempted to fix
+the compatibility problem by introducing two new headers,
+<code>Set-Cookie2</code> and <code>Cookie2</code>.  Unlike the
+<code>Cookie</code> header, <code>Cookie2</code> does <em>not</em> carry
+cookies to the server - rather, it simply advertises to the server that RFC
+2965 is understood.  <code>Set-Cookie2</code> <em>does</em> carry cookies, from
+server to client: the new header means that both IE and Netscape completely
+ignore these cookies.  This prevents breakage, but introduces a chicken-egg
+problem that means 2965 may never be widely adopted, especially since Microsoft
+shows no interest in it.  XXX Rumour has it that the European Union is unhappy
+with P3P, and might introduce legislation that requires something better,
+forming a gap that RFC 2965 might fill - any truth in this?  Opera is the only
+browser I know of that supports the standard.  On the server side, Apache's
+<code>mod_usertrack</code> supports it.  One confusing point to note about RFC
+2965 is that it uses the same value (1) of the Version attribute in HTTP
+headers as does RFC 2109.
+
+<p>Most recently, it was discovered that RFC 2965 does not fully take account
+of issues arising when 2965 and Netscape cookies coexist, and errata were
+discussed on the W3C http-state mailing list, but the list traffic died and it
+seems RFC 2965 is dead as an internet protocol (but still a useful basis for
+implementing the de-facto standards, and perhaps as an intranet protocol).
+
+<p>Because Netscape cookies are so poorly specified, the general philosophy
+of the module's Netscape cookie implementation is to start with RFC 2965
+and open holes where required for Netscape protocol-compatibility.  RFC
+2965 cookies are <em>always</em> treated as RFC 2965 requires, of course!
+
+
+<a name="faq_pre"></a>
+<h2>FAQs - pre install</h2>
+<ul>
+  <li>Doesn't the standard Python library module, <code>Cookie</code>, do
+     this?
+  <p>No: Cookie.py does the server end of the job.  It doesn't know when to
+     accept cookies from a server or when to pass them back.
+  <li>Is urllib2.py required?
+  <p>No.  You probably want it, though.
+  <li>Where can I find out more about the HTTP cookie protocol?
+  <p>There is more than one protocol, in fact (see the <a href="./doc.html">docs</a>
+     for a brief explanation of the history):
+  <ul>
+    <li>The original <a href="http://www.netscape.com/newsref/std/cookie_spec.html">
+        Netscape cookie protocol</a> - the standard still in use today, in
+        theory (in reality, the protocol implemented by all the major browsers
+        only bears a passing resemblance to the protocol sketched out in this
+        document).
+    <li><a href="http://www.ietf.org/rfcs/rfc2109.txt">RFC 2109</a> - obsoleted
+        by RFC 2965.
+     <li><a href="http://www.ietf.org/rfcs/rfc2965.txt">RFC 2965</a> - the
+        Netscape protocol with the bugs fixed (not widely used - the Netscape
+        protocol still dominates, and seems likely to remain dominant
+        indefinitely, at least on the Internet).
+        <a href="http://www.ietf.org/rfcs/rfc2964.txt">RFC 2964</a> discusses use
+        of the protocol.
+        <a href="http://kristol.org/cookie/errata.html">Errata</a> to RFC 2965
+        were discussed on the
+        <a href="http://lists.bell-labs.com/mailman/listinfo/http-state">
+        http-state mailing list</a>, but the list traffic died months ago and
+        hasn't revived.
+    <li>A <a href="http://doi.acm.org/10.1145/502152.502153">paper</a> by David
+        Kristol setting out the history of the cookie standards in exhausting
+        detail.
+    <li>HTTP cookies <a href="http://www.cookiecentral.com/">FAQ</a>.
+  </ul>
+  <li>Which protocols does mechanize support?
+     <p>Netscape and RFC 2965.  RFC 2965 handling is switched off by default.
+  <li>What about RFC 2109?
+     <p>RFC 2109 cookies are currently parsed as Netscape cookies, and treated
+     by default as RFC 2965 cookies thereafter if RFC 2965 handling is enabled,
+     or as Netscape cookies otherwise.  RFC 2109 is officially obsoleted by RFC
+     2965.  Browsers do use a few RFC 2109 features in their Netscape cookie
+     implementations (<code>port</code> and <code>max-age</code>), and
+     mechanize knows about that, too.
+</ul>
+
+
+<a name="faq_use"></a>
+<h2>FAQs - usage</h2>
+<ul>
+  <li>Why don't I have any cookies?
+  <p>Read the <a href="./doc.html#debugging">debugging section</a> of this page.
+  <li>My response claims to be empty, but I know it's not!
+  <p>Did you call <code>response.read()</code> (e.g., in a debug statement),
+     then forget that all the data has already been read?  In that case, you
+     may want to use <code>mechanize.response_seek_wrapper</code>.
+  <li>How do I download only part of a response body?
+  <p>Just call <code>.read()</code> or <code>.readline()</code> methods on your
+     response object as many times as you need.  The <code>.seek()</code>
+     method (which is not always present, see <a
+     href="./doc.html#seekable">above</a>) still works, because mechanize
+     caches read data.
+  <li>What's the difference between the <code>.load()</code> and
+      <code>.revert()</code> methods of <code>CookieJar</code>?
+  <p><code>.load()</code> <em>appends</em> cookies from a file.
+     <code>.revert()</code> discards all existing cookies held by the
+     <code>CookieJar</code> first (but it won't lose any existing cookies if
+     the loading fails).
+  <li>Is it threadsafe?
+  <p>No.  <em>Tested</em> patches welcome.  Clarification: As far as I know,
+     it's perfectly possible to use mechanize in threaded code, but it
+     provides no synchronisation: you have to provide that yourself.
+  <li>How do I do &lt;X&gt;?
+  <p>The module docstrings are worth reading if you want to do something
+     unusual.
+  <li>What's this &quot;processor&quot; business about?  I knew
+      <code>urllib2</code> used &quot;handlers&quot;, but not these
+      &quot;processors&quot;.
+  <p>This Python library <a href="http://www.python.org/sf/852995">patch</a>
+     contains an explanation.  Processors are now a standard part of urllib2
+     in Python 2.4.
+  <li>How do I use it without urllib2.py?
+  <pre>
+<span class="pykw">from</span> mechanize <span class="pykw">import</span> CookieJar
+<span class="pykw">print</span> CookieJar.extract_cookies.__doc__
+<span class="pykw">print</span> CookieJar.add_cookie_header.__doc__</pre>
+
+</ul>
+
+<p>I prefer questions and comments to be sent to the <a
+href="http://lists.sourceforge.net/lists/listinfo/wwwsearch-general">
+mailing list</a> rather than direct to me.
+
+<p><a href="mailto:jjl@pobox.com">John J. Lee</a>,
+December 2008.
+
+<hr>
+
+</div>
+
+<div id="Menu">
+
+<a href="..">Home</a><br>
+<br>
+<a href="../bits/GeneralFAQ.html">General FAQs</a><br>
+<br>
+<a href="../mechanize/">mechanize</a><br>
+<span class="thispage"><span class="subpage">mechanize docs</span></span><br>
+<a href="../ClientForm/">ClientForm</a><br>
+<br>
+<a href="../ClientCookie/">ClientCookie</a><br>
+<span class="thispage"><span class="subpage">ClientCookie docs</span></span><br>
+<a href="../pullparser/">pullparser</a><br>
+<a href="../DOMForm/">DOMForm</a><br>
+<a href="../python-spidermonkey/">python-spidermonkey</a><br>
+<a href="../ClientTable/">ClientTable</a><br>
+<a href="../bits/urllib2_152.py">1.5.2 urllib2.py</a><br>
+<a href="../bits/urllib_152.py">1.5.2 urllib.py</a><br>
+
+<br>
+
+<a href="./doc.html#examples">Examples</a><br>
+<a href="./doc.html#browsers">Mozilla &amp; MSIE</a><br>
+<a href="./doc.html#file">Cookies in a file</a><br>
+<a href="./doc.html#cookiejar">Using a <code>CookieJar</code></a><br>
+<a href="./doc.html#extras">Processors</a><br>
+<a href="./doc.html#seekable">Seekable responses</a><br>
+<a href="./doc.html#requests">Request confusion</a><br>
+<a href="./doc.html#headers">Adding headers</a><br>
+<a href="./doc.html#unverifiable">Verifiability</a><br>
+<a href="./doc.html#rfc2965">RFC 2965</a><br>
+<a href="./doc.html#debugging">Debugging</a><br>
+<a href="./doc.html#script">Embedded scripts</a><br>
+<a href="./doc.html#dates">HTTP date parsing</a><br>
+<a href="./doc.html#standards">Standards</a><br>
+<a href="./doc.html#faq_use">FAQs - usage</a><br>
+
+</div>
+
+</body>
+
+</html>

Added: mechanize/tags/0.1.10/doc.html.in
===================================================================
--- mechanize/tags/0.1.10/doc.html.in	                        (rev 0)
+++ mechanize/tags/0.1.10/doc.html.in	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,925 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+        "http://www.w3.org/TR/html4/strict.dtd">
+@# This file is processed by EmPy to colorize Python source code
+@# http://wwwsearch.sf.net/bits/colorize.py
+@{
+from colorize import colorize
+import time
+import release
+last_modified = release.svn_id_to_time("$Id: doc.html.in 60274 2008-12-02 00:18:37Z jjlee $")
+try:
+    base
+except NameError:
+    base = False
+}
+<html>
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+  <meta name="author" content="John J. Lee &lt;jjl@@pobox.com&gt;">
+  <meta name="date" content="@(time.strftime("%Y-%m-%d", last_modified))">
+  <title>mechanize documentation</title>
+  <style type="text/css" media="screen">@@import "../styles/style.css";</style>
+  @[if base]<base href="http://wwwsearch.sourceforge.net/mechanize/">@[end if]
+</head>
+<body>
+
+<div id="sf"><a href="http://sourceforge.net">
+<img src="http://sourceforge.net/sflogo.php?group_id=48205&amp;type=2"
+ width="125" height="37" alt="SourceForge.net Logo"></a></div>
+
+<h1>mechanize handlers</h1>
+
+<div id="Content">
+
+<p class="docwarning">This documentation is in need of reorganisation!</p>
+
+<p>This page is the old ClientCookie documentation.  It deals with operation on
+the level of urllib2 Handler objects, and also with adding headers, debugging,
+and cookie handling.  Documentation for the higher-level browser-style
+interface is <a href="./mechanize">elsewhere</a>.
+
+
+<a name="examples"></a>
+<h2>Examples</h2>
+
+@{colorize(r"""
+import mechanize
+response = mechanize.urlopen("http://foo.bar.com/")
+""")}
+
+<p>This function behaves identically to <code>urllib2.urlopen()</code>, except
+that it deals with cookies automatically.
+
+<p>Here is a more complicated example, involving <code>Request</code> objects
+(useful if you want to pass <code>Request</code>s around, add headers to them,
+etc.):
+
+@{colorize(r"""
+import mechanize
+request = mechanize.Request("http://www.acme.com/")
+# note we're using the urlopen from mechanize, not urllib2
+response = mechanize.urlopen(request)
+# let's say this next request requires a cookie that was set in response
+request2 = mechanize.Request("http://www.acme.com/flying_machines.html")
+response2 = mechanize.urlopen(request2)
+
+print response2.geturl()
+print response2.info()  # headers
+print response2.read()  # body (readline and readlines work too)
+""")}
+
+<p>(The above example would also work with <code>urllib2.Request</code>
+objects, since <code>mechanize.HTTPRequestUpgradeProcessor</code> knows about
+that class, but avoid relying on this if you can, because it is an obscure hack
+for compatibility purposes only.)
+
+<p>In these examples, the workings are hidden inside the
+<code>mechanize.urlopen()</code> function, which is an extension of
+<code>urllib2.urlopen()</code>.  Redirects, proxies and cookies are handled
+automatically by this function (note that you may need a bit of configuration
+to get your proxies correctly set up: see <code>urllib2</code> documentation).
+
+<p>Cookie processing (etc.) is handled by processor objects, which are an
+extension of <code>urllib2</code>'s handlers: <code>HTTPCookieProcessor</code>,
+<code>HTTPRefererProcessor</code> etc.  They are used like any other handler.
+There is quite a bit of other <code>urllib2</code>-workalike code, too.  Note:
+This duplication has gone away in Python 2.4, since 2.4's <code>urllib2</code>
+contains the processor extensions from mechanize, so you can simply use
+mechanize's processor classes directly with 2.4's <code>urllib2</code>; also,
+mechanize's cookie functionality is included in Python 2.4 as module
+<code>cookielib</code> and <code>urllib2.HTTPCookieProcessor</code>.
+
+<p>There is also a <code>urlretrieve()</code> function, which works like
+<code>urllib.urlretrieve()</code>.
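+
+<p>A minimal sketch (the URL and local filename here are made up):
+
+@{colorize(r"""
+import mechanize
+# fetch the URL and write the response body to a local file, in the same
+# way as urllib.urlretrieve()
+mechanize.urlretrieve("http://example.com/", "example.html")
+""")}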
+
+<p>An example at a slightly lower level shows how the module processes
+cookies more clearly:
+
+@{colorize(r"""
+# Don't copy this blindly!  You probably want to follow the examples
+# above, not this one.
+import mechanize
+
+# Build an opener that *doesn't* automatically call .add_cookie_header()
+# and .extract_cookies(), so we can do it manually without interference.
+class NullCookieProcessor(mechanize.HTTPCookieProcessor):
+    def http_request(self, request): return request
+    def http_response(self, request, response): return response
+opener = mechanize.build_opener(NullCookieProcessor)
+
+request = mechanize.Request("http://www.acme.com/")
+response = mechanize.urlopen(request)
+cj = mechanize.CookieJar()
+cj.extract_cookies(response, request)
+# let's say this next request requires a cookie that was set in response
+request2 = mechanize.Request("http://www.acme.com/flying_machines.html")
+cj.add_cookie_header(request2)
+response2 = mechanize.urlopen(request2)
+""")}
+
+<p>The <code>CookieJar</code> class does all the work.  There are essentially
+two operations: <code>.extract_cookies()</code> extracts HTTP cookies from
+<code>Set-Cookie</code> (the original <a
+href="http://www.netscape.com/newsref/std/cookie_spec.html">Netscape cookie
+standard</a>) and <code>Set-Cookie2</code> (<a
+href="http://www.ietf.org/rfc/rfc2965.txt">RFC 2965</a>) headers from a
+response if and only if they should be set given the request, and
+<code>.add_cookie_header()</code> adds <code>Cookie</code> headers if and only
+if they are appropriate for a particular HTTP request.  Incoming cookies are
+checked for acceptability based on the host name, etc.  Cookies are only set on
+outgoing requests if they match the request's host name, path, etc.
+
+<p><strong>Note that if you're using <code>mechanize.urlopen()</code> (or if
+you're using <code>mechanize.HTTPCookieProcessor</code> by some other
+means), you don't need to call <code>.extract_cookies()</code> or
+<code>.add_cookie_header()</code> yourself</strong>.  If, on the other hand,
+you don't want to use <code>urllib2</code>, you will need to use this pair of
+methods.  You can make your own <code>request</code> and <code>response</code>
+objects, which must support the interfaces described in the docstrings of
+<code>.extract_cookies()</code> and <code>.add_cookie_header()</code>.
+
+<p>There are also some <code>CookieJar</code> subclasses which can store
+cookies in files and databases.  <code>FileCookieJar</code> is the abstract
+class for <code>CookieJar</code>s that can store cookies in disk files.
+<code>LWPCookieJar</code> saves cookies in a format compatible with the
+libwww-perl library.  This class is convenient if you want to store cookies in
+a human-readable file:
+
+@{colorize(r"""
+import mechanize
+cj = mechanize.LWPCookieJar()
+cj.revert("cookie3.txt")
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cj))
+r = opener.open("http://foobar.com/")
+cj.save("cookie3.txt")
+""")}
+
+<p>The <code>.revert()</code> method discards all existing cookies held by the
+<code>CookieJar</code> (it won't lose any existing cookies if the load fails).
+The <code>.load()</code> method, on the other hand, adds the loaded cookies to
+existing cookies held in the <code>CookieJar</code> (old cookies are kept
+unless overwritten by newly loaded ones).
+
+<p><code>MozillaCookieJar</code> can load and save to the
+Mozilla/Netscape/lynx-compatible <code>'cookies.txt'</code> format.  This
+format loses some information (unusual and nonstandard cookie attributes such
+as comment, and also information specific to RFC 2965 cookies).  The subclass
+<code>MSIECookieJar</code> can load (but not save, yet) from Microsoft Internet
+Explorer's cookie files (on Windows).  <code>BSDDBCookieJar</code> (NOT FULLY
+TESTED!) saves to a BSDDB database using the standard library's
+<code>bsddb</code> module.  There's an unfinished <code>MSIEDBCookieJar</code>,
+which uses (reads and writes) the Windows MSIE cookie database directly, rather
+than storing copies of cookies as <code>MSIECookieJar</code> does.
+
+<h2>Important note</h2>
+
+<p>Only use names you can import directly from the <code>mechanize</code>
+package, and that don't start with a single underscore.  Everything else is
+subject to change or disappearance without notice.
+
+<a name="browsers"></a>
+<h2>Cooperating with Mozilla/Netscape, lynx and Internet Explorer</h2>
+
+<p>The subclass <code>MozillaCookieJar</code> differs from
+<code>CookieJar</code> only in storing cookies using a different,
+Mozilla/Netscape-compatible, file format.  The lynx browser also uses this
+format.  This file format can't store RFC 2965 cookies, so they are downgraded
+to Netscape cookies on saving.  <code>LWPCookieJar</code> itself uses a
+libwww-perl specific format (`Set-Cookie3') - see the example above.  Python
+and your browser should be able to share a cookies file (note that the file
+location here will differ on non-unix OSes):
+
+<p><strong>WARNING:</strong> you may want to backup your browser's cookies file
+if you use <code>MozillaCookieJar</code> to save cookies.  I <em>think</em> it
+works, but there have been bugs in the past!
+
+@{colorize(r"""
+import os, mechanize
+cookies = mechanize.MozillaCookieJar()
+cookies.load(os.path.join(os.environ["HOME"], ".netscape/cookies.txt"))
+# see also the save and revert methods
+""")}
+
+<p>Note that cookies saved while Mozilla is running will get clobbered by
+Mozilla - see <code>MozillaCookieJar.__doc__</code>.
+
+<p><code>MSIECookieJar</code> does the same for Microsoft Internet Explorer
+(MSIE) 5.x and 6.x on Windows, but does not allow saving cookies in this
+format.  In future, the Windows API calls might be used to load and save
+(though the index has to be read directly, since there is no API for that,
+AFAIK; there's also an unfinished <code>MSIEDBCookieJar</code>, which uses
+(reads and writes) the Windows MSIE cookie database directly, rather than
+storing copies of cookies as <code>MSIECookieJar</code> does).
+
+@{colorize(r"""
+import mechanize
+cj = mechanize.MSIECookieJar(delayload=True)
+cj.load_from_registry()  # finds cookie index file from registry
+""")}
+
+<p>A true <code>delayload</code> argument speeds things up.
+
+<p>On Windows 9x (win 95, win 98, win ME), you need to supply a username to the
+<code>.load_from_registry()</code> method:
+
+@{colorize(r"""
+cj.load_from_registry(username="jbloggs")
+""")}
+
+<p>Konqueror/Safari and Opera use different file formats, which aren't yet
+supported.
+
+<a name="file"></a>
+<h2>Saving cookies in a file</h2>
+
+<p>If you have no need to co-operate with a browser, the most convenient way to
+save cookies on disk between sessions in human-readable form is to use
+<code>LWPCookieJar</code>.  This class uses a libwww-perl specific format
+(`Set-Cookie3').  Unlike <code>MozillaCookieJar</code>, this file format
+doesn't lose information.
+
+<a name="cookiejar"></a>
+<h2>Using your own CookieJar instance</h2>
+
+<p>You might want to do this to <a href="./doc.html#browsers">use your
+browser's cookies</a>, to customize <code>CookieJar</code>'s behaviour by
+passing constructor arguments, or to be able to get at the cookies it will hold
+(for example, for saving cookies between sessions and for debugging).
+
+<p>If you're using the higher-level <code>urllib2</code>-like interface
+(<code>urlopen()</code>, etc), you'll have to let it know what
+<code>CookieJar</code> it should use:
+
+@{colorize(r"""
+import mechanize
+cookies = mechanize.CookieJar()
+# build_opener() adds standard handlers (such as HTTPHandler and
+# HTTPCookieProcessor) by default.  The cookie processor we supply
+# will replace the default one.
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cookies))
+
+r = opener.open("http://acme.com/")  # GET
+r = opener.open("http://acme.com/", data)  # POST
+""")}
+
+<p>The <code>urlopen()</code> function uses a global
+<code>OpenerDirector</code> instance to do its work, so if you want to use
+<code>urlopen()</code> with your own <code>CookieJar</code>, install the
+<code>OpenerDirector</code> you built with <code>build_opener()</code> using
+the <code>mechanize.install_opener()</code> function, then proceed as usual:
+
+@{colorize(r"""
+mechanize.install_opener(opener)
+r = mechanize.urlopen("http://www.acme.com/")
+""")}
+
+<p>Of course, everyone using <code>urlopen</code> is using the same global
+<code>CookieJar</code> instance!
+
+<a name="policy"></a>
+
+<p>You can set a policy object (must satisfy the interface defined by
+<code>mechanize.CookiePolicy</code>), which determines which cookies are
+allowed to be set and returned.  Use the policy argument to the
+<code>CookieJar</code> constructor, or use the .set_policy() method.  The
+default implementation has some useful switches:
+
+@{colorize(r"""
+from mechanize import CookieJar, DefaultCookiePolicy as Policy
+cookies = CookieJar()
+# turn on RFC 2965 cookies, be more strict about domains when setting and
+# returning Netscape cookies, and block some domains from setting cookies
+# or having them returned (read the DefaultCookiePolicy docstring for the
+# domain matching rules here)
+policy = Policy(rfc2965=True, strict_ns_domain=Policy.DomainStrict,
+                blocked_domains=["ads.net", ".ads.net"])
+cookies.set_policy(policy)
+""")}
+
+
+<a name="extras"></a>
+<h2>Optional extras: robots.txt, HTTP-EQUIV, Refresh, Referer</h2>
+
+<p>These are implemented as processor classes.  Processors are an extension of
+<code>urllib2</code>'s handlers (now a standard part of urllib2 in Python 2.4):
+you just pass them to <code>build_opener()</code> (example code below).
+
+<dl>
+
+<dt><code>HTTPRobotRulesProcessor</code>
+
+<dd><p>WWW Robots (also called wanderers or spiders) are programs that traverse
+many pages in the World Wide Web by recursively retrieving linked pages.  This
+kind of program can place significant loads on web servers, so there is a <a
+href="http://www.robotstxt.org/wc/norobots.html">standard</a> for a <code>
+robots.txt</code> file by which web site operators can request robots to keep
+out of their site, or out of particular areas of it.  This processor uses the
+standard Python library's <code>robotparser</code> module.  It raises
+<code>mechanize.RobotExclusionError</code> (subclass of
+<code>urllib2.HTTPError</code>) if an attempt is made to open a URL prohibited
+by <code>robots.txt</code>.  XXX ATM, this makes use of code in the
+<code>robotparser</code> module that uses <code>urllib</code> - this will
+likely change in future to use <code>urllib2</code>.
+
+<dt><code>HTTPEquivProcessor</code>
+
+<dd><p>The <code>&lt;META HTTP-EQUIV&gt;</code> tag is a way of including data
+in HTML to be treated as if it were part of the HTTP headers.  mechanize can
+automatically read these tags and add the <code>HTTP-EQUIV</code> headers to
+the response object's real HTTP headers.  The HTML is left unchanged.
+
+<dt><code>HTTPRefreshProcessor</code>
+
+<dd><p>The <code>Refresh</code> HTTP header is a non-standard header which is
+widely used.  It requests that the user-agent follow a URL after a specified
+time delay.  mechanize can treat these headers (which may have been set in
+<code>&lt;META HTTP-EQUIV&gt;</code> tags) as if they were 302 redirections.
+Exactly when and how <code>Refresh</code> headers are handled is configurable
+using the constructor arguments.
+
+<dt><code>HTTPRefererProcessor</code>
+
+<dd><p>The <code>Referer</code> HTTP header lets the server know which URL
+you've just visited.  Some servers use this header as state information, and
+don't like it if this is not present.  It's a chore to add this header by hand
+every time you make a request.  This adds it automatically.
+<strong>NOTE</strong>: this only makes sense if you use each processor for a
+single chain of HTTP requests (so, for example, if you use a single
+HTTPRefererProcessor to fetch a series of URLs extracted from a single page,
+<strong>this will break</strong>).  <a
+href="../mechanize/">mechanize.Browser</a> does this properly.</p>
+
+</dl>
+
+@{colorize(r"""
+import mechanize
+cookies = mechanize.CookieJar()
+
+opener = mechanize.build_opener(mechanize.HTTPRefererProcessor,
+                                mechanize.HTTPEquivProcessor,
+                                mechanize.HTTPRefreshProcessor,
+                                )
+opener.open("http://www.rhubarb.com/")
+""")}
+
+
+
+<a name="seekable"></a>
+<h2>Seekable responses</h2>
+
+<p>Response objects returned from (or raised as exceptions by)
+<code>mechanize.SeekableResponseOpener</code>, <code>mechanize.UserAgent</code>
+(if <code>.set_seekable_responses(True)</code> has been called) and
+<code>mechanize.Browser()</code> have <code>.seek()</code>,
+<code>.get_data()</code> and <code>.set_data()</code> methods:
+
+@{colorize(r"""
+import mechanize
+opener = mechanize.OpenerFactory(mechanize.SeekableResponseOpener).build_opener()
+response = opener.open("http://example.com/")
+# same return value as .read(), but without affecting seek position
+total_nr_bytes = len(response.get_data())
+assert len(response.read()) == total_nr_bytes
+assert len(response.read()) == 0  # we've already read the data
+response.seek(0)
+assert len(response.read()) == total_nr_bytes
+response.set_data("blah\n")
+assert response.get_data() == "blah\n"
+...
+""")}
+
+<p>This caching behaviour can be avoided by using
+<code>mechanize.OpenerDirector</code> (as long as
+<code>SeekableProcessor</code>, <code>HTTPEquivProcessor</code> and
+<code>HTTPResponseDebugProcessor</code> are not used).  It can also be avoided
+with <code>mechanize.UserAgent</code>:
+
+@{colorize(r"""
+import mechanize
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)
+ua.set_debug_responses(False)
+""")}
+
+<p>Note that if you turn on features that use seekable responses (currently:
+HTTP-EQUIV handling and response body debug printing), returned responses
+<em>may</em> be seekable as a side-effect of these features.  However, this is
+not guaranteed (currently, in these cases, returned response objects are
+seekable, but raised response objects &#8212; <code>mechanize.HTTPError</code>
+instances &#8212; are not seekable).  This applies regardless of whether you
+use <code>mechanize.UserAgent</code> or <code>mechanize.OpenerDirector</code>.
+If you explicitly request seekable responses by calling
+<code>.set_seekable_responses(True)</code> on a
+<code>mechanize.UserAgent</code> instance, or by using
+<code>mechanize.Browser</code> or
+<code>mechanize.SeekableResponseOpener</code>, which always return seekable
+responses, then both returned and raised responses are guaranteed to be
+seekable.
+
+<p>Handlers should call <code>response =
+mechanize.seek_wrapped_response(response)</code> if they require the
+<code>.seek()</code>, <code>.get_data()</code> or <code>.set_data()</code>
+methods.
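+
+<p>For example, here is a minimal sketch of a response processor that wants to
+look at the body without consuming it (the class name and the printing are just
+illustrative; pass it to <code>build_opener()</code> like any other processor):
+
+@{colorize(r"""
+import mechanize
+
+class BodyLengthProcessor(mechanize.BaseProcessor):
+    def http_response(self, request, response):
+        # make sure .seek() and .get_data() are available
+        response = mechanize.seek_wrapped_response(response)
+        # .get_data() doesn't disturb the seek position, so later handlers
+        # (and your own code) can still .read() the body
+        print "%s: %d bytes" % (request.get_full_url(), len(response.get_data()))
+        return response
+    https_response = http_response
+""")}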
+
+<p>Note that <code>SeekableProcessor</code> (and
+<code>ResponseUpgradeProcessor</code>) are deprecated since mechanize 0.1.6b.
+The reason for the deprecation is that these were really abuses of the response
+processing chain (the <code>.process_response()</code> support documented by
+urllib2).  The response processing chain is sensibly used only for processing
+response headers and data, not for processing response <em>objects</em>,
+because the same data may occur as different Python objects (this can occur for
+example when <code>HTTPError</code> is raised by
+<code>HTTPDefaultErrorHandler</code>), but should only get processed once
+(during <code>.open()</code>).
+
+
+
+<a name="requests"></a>
+<h2>Confusing fact about headers and Requests</h2>
+
+<p>mechanize automatically upgrades <code>urllib2.Request</code> objects to
+<code>mechanize.Request</code>, as a backwards-compatibility hack.  This
+means that you won't see any headers that are added to Request objects by
+handlers unless you use <code>mechanize.Request</code> in the first place.
+Sorry about that.
+
+<p>Note also that handlers may create new <code>Request</code> instances (for
+example when performing redirects) rather than adding headers to existing
+<code>Request objects</code>.
+
+
+<a name="headers"></a>
+<h2>Adding headers</h2>
+
+<p>Adding headers is done like so:
+
+@{colorize(r"""
+import mechanize, urllib2
+req = urllib2.Request("http://foobar.com/")
+req.add_header("Referer", "http://wwwsearch.sourceforge.net/mechanize/")
+r = mechanize.urlopen(req)
+""")}
+
+<p>You can also use the headers argument to the <code>urllib2.Request</code>
+constructor.
+
+<p><code>urllib2</code> (in fact, mechanize takes over this task from
+<code>urllib2</code>) adds some headers to <code>Request</code> objects
+automatically - see the next section for details.
+
+
+<h2>Changing the automatically-added headers (User-Agent)</h2>
+
+<p><code>OpenerDirector</code> automatically adds a <code>User-Agent</code>
+header to every <code>Request</code>.
+
+<p>To change this and/or add similar headers, use your own
+<code>OpenerDirector</code>:
+
+@{colorize(r"""
+import mechanize
+cookies = mechanize.CookieJar()
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cookies))
+opener.addheaders = [("User-agent", "Mozilla/5.0 (compatible; MyProgram/0.1)"),
+                     ("From", "responsible.person@example.com")]
+""")}
+
+<p>Again, to use <code>urlopen()</code>, install your
+<code>OpenerDirector</code> globally:
+
+@{colorize(r"""
+mechanize.install_opener(opener)
+r = mechanize.urlopen("http://acme.com/")
+""")}
+
+<p>Also, a few standard headers (<code>Content-Length</code>,
+<code>Content-Type</code> and <code>Host</code>) are added when the
+<code>Request</code> is passed to <code>urlopen()</code> (or
+<code>OpenerDirector.open()</code>).  You shouldn't need to change these
+headers, but since this is done by <code>AbstractHTTPHandler</code>, you can
+change the way it works by passing a subclass of that handler to
+<code>build_opener()</code> (or, as always, by constructing an opener yourself
+and calling .add_handler()).
+
+
+<a name="unverifiable"></a>
+<h2>Initiating unverifiable transactions</h2>
+
+<p>This section is only of interest for correct handling of third-party HTTP
+cookies.  See <a href="./doc.html#standards">below</a> for an explanation of
+'third-party'.
+
+<p>First, some terminology.
+
+<p>An <em>unverifiable request</em> (defined fully by RFC 2965) is one whose
+URL the user did not have the option to approve.  For example, a transaction is
+unverifiable if the request is for an image in an HTML document, and the user
+had no option to approve the fetching of the image from a particular URL.
+
+<p>The <em>request-host of the origin transaction</em> (defined fully by RFC
+2965) is the host name or IP address of the original request that was initiated
+by the user.  For example, if the request is for an image in an HTML document,
+this is the request-host of the request for the page containing the image.
+
+<p><strong>mechanize knows that redirected transactions are unverifiable,
+and will handle that on its own (ie. you don't need to think about the origin
+request-host or verifiability yourself).</strong>
+
+<p>If you want to initiate an unverifiable transaction yourself (which you
+should if, for example, you're downloading the images from a page, and 'the
+user' hasn't explicitly OKed those URLs):
+
+@{colorize(r"""
+request = Request(origin_req_host="www.example.com", unverifiable=True)
+""")}
+
+
+<a name="rfc2965"></a>
+<h2>RFC 2965 handling</h2>
+
+<p>RFC 2965 handling is switched off by default, because few browsers implement
+it, so the RFC 2965 protocol is essentially never seen on the internet.  To
+switch it on, see <a href="./doc.html#policy">here</a>.
+
+
+<a name="debugging"></a>
+<h2>Debugging</h2>
+
+<!--XXX move as much as poss. to General page-->
+
+<p>First, a few common problems.  The most frequent mistake people seem to make
+is to use <code>mechanize.urlopen()</code>, <em>and</em> the
+<code>.extract_cookies()</code> and <code>.add_cookie_header()</code> methods
+on a <code>CookieJar</code> object themselves.  If you use <code>mechanize.urlopen()</code>
+(or <code>OpenerDirector.open()</code>), the module handles extraction and
+adding of cookies by itself, so you should not call
+<code>.extract_cookies()</code> or <code>.add_cookie_header()</code>.
+
+<p>Are you sure the server is sending you any cookies in the first place?
+Maybe the server is keeping track of state in some other way
+(<code>HIDDEN</code> HTML form entries (possibly in a separate page referenced
+by a frame), URL-encoded session keys, IP address, HTTP <code>Referer</code>
+headers)?  Perhaps some embedded script in the HTML is setting cookies (see
+below)?  Maybe you messed up your request, and the server is sending you some
+standard failure page (even if the page doesn't appear to indicate any
+failure).  Sometimes, a server wants particular headers set to the values it
+expects, or it won't play nicely.  The most frequent offenders here are the
+<code>Referer</code> [<em>sic</em>] and / or <code>User-Agent</code> HTTP
+headers (<a href="./doc.html#headers">see above</a> for how to set these).  The
+<code>User-Agent</code> header may need to be set to a value like that of a
+popular browser.  The <code>Referer</code> header may need to be set to the URL
+that the server expects you to have followed a link from.  Occasionally, it may
+even be that operators deliberately configure a server to insist on precisely
+the headers that the popular browsers (MS Internet Explorer, Mozilla/Netscape,
+Opera, Konqueror/Safari) generate, but remember that incompetence (possibly on
+your part) is more probable than deliberate sabotage (and if a site owner is
+that keen to stop robots, you probably shouldn't be scraping it anyway).
+
+<p>When you <code>.save()</code> to or
+<code>.load()</code>/<code>.revert()</code> from a file, single-session cookies
+will expire unless you explicitly request otherwise with the
+<code>ignore_discard</code> argument.  This may be your problem if you find
+cookies are going away after saving and loading.
+
+@{colorize(r"""
+import mechanize
+cj = mechanize.LWPCookieJar()
+opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cj))
+mechanize.install_opener(opener)
+r = mechanize.urlopen("http://foobar.com/")
+cj.save("/some/file", ignore_discard=True, ignore_expires=True)
+""")}
+
+<p>If none of the advice above solves your problem quickly, try comparing the
+headers and data that you are sending out with those that a browser emits.
+Often this will give you the clue you need.  Of course, you'll want to check
+that the browser is able to do manually what you're trying to achieve
+programmatically before minutely examining the headers.  Make sure that what you
+do manually is <em>exactly</em> the same as what you're trying to do from
+Python - you may simply be hitting a server bug that only gets revealed if you
+view pages in a particular order, for example.  In order to see what your
+browser is sending to the server (even if HTTPS is in use), see <a
+href="../clientx.html">the General FAQ page</a>.  If nothing is obviously wrong
+with the requests your program is sending and you're out of ideas, you can try
+the last resort of good old brute force binary-search debugging.  Temporarily
+switch to sending HTTP headers (with <code>httplib</code>).  Start by copying
+Netscape/Mozilla or IE slavishly (apart from session IDs, etc., of course),
+then begin the tedious process of mutating your headers and data until they
+match what your higher-level code was sending.  This will at least reliably
+find your problem.
+
+<p>You can turn on display of HTTP headers:
+
+@{colorize(r"""
+import mechanize
+hh = mechanize.HTTPHandler()  # you might want HTTPSHandler, too
+hh.set_http_debuglevel(1)
+opener = mechanize.build_opener(hh)
+response = opener.open(url)
+""")}
+
+<p>Alternatively, you can examine your individual request and response
+objects to see what's going on.  Note, though, that mechanize upgrades
+<code>urllib2.Request</code> objects to <code>mechanize.Request</code>, so you
+won't see any headers that are added to requests by handlers unless you use
+<code>mechanize.Request</code> in the first place.  In addition, requests may
+involve "sub-requests" in cases such as redirection, in which case you will
+also not see everything that's going on just by examining the original request
+and final response.  mechanize's responses can be made to
+have <code>.seek()</code> and <code>.get_data()</code> methods.  It's often
+useful to use the <code>.get_data()</code> method during debugging
+(see <a href="./doc.html#seekable">above</a>).
+
+<p>Also, note <code>HTTPRedirectDebugProcessor</code> (which prints information
+about redirections) and <code>HTTPResponseDebugProcessor</code> (which prints
+out all response bodies, including those that are read during redirections).
+<strong>NOTE</strong>: as well as having these processors in your
+<code>OpenerDirector</code> (for example, by passing them to
+<code>build_opener()</code>) you have to turn on logging at the
+<code>INFO</code> level or lower in order to see any output.
+
+<p>If you would like to see what is going on in mechanize's tiny mind, do
+this:
+
+@{colorize(r"""
+import sys, logging
+# logging.DEBUG covers masses of debugging information,
+# logging.INFO just shows the output from HTTPRedirectDebugProcessor,
+logger = logging.getLogger("mechanize")
+logger.addHandler(logging.StreamHandler(sys.stdout))
+logger.setLevel(logging.DEBUG)
+""")}
+
+<p>The <code>DEBUG</code> level (as opposed to the <code>INFO</code> level) can
+actually be quite useful, as it explains why particular cookies are accepted or
+rejected and why they are or are not returned.
+
+<p>One final thing to note is that there are some catch-all bare
+<code>except:</code> statements in the module, which are there to handle
+unexpected bad input without crashing your program.  If this happens, it's a
+bug in mechanize, so please mail me the warning text.
+
+
+<a name="script"></a>
+<h2>Embedded script that sets cookies</h2>
+
+<p>It is possible to embed script in HTML pages (sandwiched between
+<code>&lt;SCRIPT&gt;here&lt;/SCRIPT&gt;</code> tags, and in
+<code>javascript:</code> URLs) - JavaScript / ECMAScript, VBScript, or even
+Python - that causes cookies to be set in a browser.  See the <a
+href="../bits/clientx.html">General FAQs</a> page for what to do about this.
+
+
+<a name="dates"></a>
+<h2>Parsing HTTP date strings</h2>
+
+<p>A function named <code>str2time</code> is provided by the package,
+which may be useful for parsing dates in HTTP headers.
+<code>str2time</code> is intended to be liberal, since HTTP date/time
+formats are poorly standardised in practice.  There is no need to use this
+function in normal operations: <code>CookieJar</code> instances keep track
+of cookie lifetimes automatically.  This function will stay around in some
+form, though the supported date/time formats may change.
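+
+<p>For example (a minimal sketch; the date string is just an illustration, and
+the return value is a time in seconds since the epoch, suitable for comparison
+with <code>time.time()</code>):
+
+@{colorize(r"""
+from mechanize import str2time
+t = str2time("Wed, 09 Feb 1994 22:23:32 GMT")
+""")}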
+
+
+<a name="badhtml"></a>
+<h2>Dealing with bad HTML</h2>
+
+<p>XXX Intro
+
+<p>XXX Test me
+
+@{colorize("""\
+import copy
+import re
+import mechanize
+class CommentCleanProcessor(mechanize.BaseProcessor):
+      def http_response(self, request, response):
+          if not hasattr(response, "seek"):
+              response = mechanize.response_seek_wrapper(response)
+          response.seek(0)
+          new_response = copy.copy(response)
+          new_response.set_data(
+              re.sub("<!-([^-]*)->", r"<!--\\1-->", response.read()))
+          return new_response
+      https_response = http_response
+""")}
+
+<p>XXX TidyProcessor: mxTidy?  tidylib?  tidy?
+
+
+<a name="standards"></a>
+<h2>Note about cookie standards</h2>
+
+<p>The various cookie standards and their history form a case study of the
+terrible things that can happen to a protocol.  The long-suffering David
+Kristol has written a <a
+href="http://arxiv.org/abs/cs.SE/0105018">paper</a> about it, if you
+want to know the gory details.
+
+<p>Here is a summary.
+
+<p>The <a href="http://www.netscape.com/newsref/std/cookie_spec.html">Netscape
+protocol</a> (cookie_spec.html) is still the only standard supported by most
+browsers (including Internet Explorer and Netscape).  Be aware that
+cookie_spec.html is not, and never was, actually followed to the letter (or
+anything close) by anyone (including Netscape, IE and mechanize): the
+Netscape protocol standard is really defined by the behaviour of Netscape (and
+now IE).  Netscape cookies are also known as V0 cookies, to distinguish them
+from RFC 2109 or RFC 2965 cookies, which have a version cookie-attribute with a
+value of 1.
+
+<p><a href="http://www.ietf.org/rfcs/rfc2109.txt">RFC 2109</a> was introduced
+to fix some problems identified with the Netscape protocol, while still keeping
+the same HTTP headers (<code>Cookie</code> and <code>Set-Cookie</code>).  The
+most prominent of these problems is the 'third-party' cookie issue, which was
+an accidental feature of the Netscape protocol.  When one visits www.bland.org,
+one doesn't expect to get a cookie from www.lurid.com, a site one has never
+visited.  Depending on browser configuration, this can still happen, because
+the unreconstructed Netscape protocol is happy to accept cookies from, say, an
+image in a webpage (www.bland.org) that's included by linking to an
+advertiser's server (www.lurid.com).  This kind of event, where your browser
+talks to a server that you haven't explicitly okayed by some means, is what the
+RFCs call an 'unverifiable transaction'.  In addition to the potential for
+embarrassment caused by the presence of lurid.com's cookies on one's machine,
+this may also be used to track your movements on the web, because advertising
+agencies like doubleclick.net place ads on many sites.  RFC 2109 tried to
+change this by requiring cookies to be turned off during unverifiable
+transactions with third-party servers - unless the user explicitly asks them to
+be turned on.  This clashed with the business model of advertisers like
+doubleclick.net, who had started to take advantage of the third-party cookies
+'bug'.  Since the browser vendors were more interested in the advertisers'
+concerns than those of the browser users, this arguably doomed both RFC 2109
+and its successor, RFC 2965, from the start.  Other problems than the
+third-party cookie issue were also fixed by 2109.  However, even ignoring the
+advertising issue, 2109 was stillborn, because Internet Explorer and Netscape
+behaved differently in response to its extended <code>Set-Cookie</code>
+headers.  This was not really RFC 2109's fault: it worked the way it did to
+keep compatibility with the Netscape protocol as implemented by Netscape.
+Microsoft Internet Explorer (MSIE) was very new when the standard was designed,
+but was starting to be very popular when the standard was finalised.  XXX P3P,
+and MSIE &amp; Mozilla options
+
+<p>XXX Apparently MSIE implements bits of RFC 2109 - but not very compliantly
+(surprise).  Presumably other browsers do too, as a result.  mechanize
+already does allow Netscape cookies to have <code>max-age</code> and
+<code>port</code> cookie-attributes, and as far as I know that's the extent of
+the support present in MSIE.  I haven't tested, though!
+
+<p><a href="http://www.ietf.org/rfcs/rfc2965.txt">RFC 2965</a> attempted to fix
+the compatibility problem by introducing two new headers,
+<code>Set-Cookie2</code> and <code>Cookie2</code>.  Unlike the
+<code>Cookie</code> header, <code>Cookie2</code> does <em>not</em> carry
+cookies to the server - rather, it simply advertises to the server that RFC
+2965 is understood.  <code>Set-Cookie2</code> <em>does</em> carry cookies, from
+server to client: the new header means that both IE and Netscape completely
+ignore these cookies.  This prevents breakage, but introduces a chicken-egg
+problem that means 2965 may never be widely adopted, especially since Microsoft
+shows no interest in it.  XXX Rumour has it that the European Union is unhappy
+with P3P, and might introduce legislation that requires something better,
+forming a gap that RFC 2965 might fill - any truth in this?  Opera is the only
+browser I know of that supports the standard.  On the server side, Apache's
+<code>mod_usertrack</code> supports it.  One confusing point to note about RFC
+2965 is that it uses the same value (1) of the Version attribute in HTTP
+headers as does RFC 2109.
+
+<p>Most recently, it was discovered that RFC 2965 does not fully take account
+of issues arising when 2965 and Netscape cookies coexist, and errata were
+discussed on the W3C http-state mailing list, but the list traffic died and it
+seems RFC 2965 is dead as an internet protocol (but still a useful basis for
+implementing the de-facto standards, and perhaps as an intranet protocol).
+
+<p>Because Netscape cookies are so poorly specified, the general philosophy
+of the module's Netscape cookie implementation is to start with RFC 2965
+and open holes where required for Netscape protocol-compatibility.  RFC
+2965 cookies are <em>always</em> treated as RFC 2965 requires, of course!
+
+
+<a name="faq_pre"></a>
+<h2>FAQs - pre install</h2>
+<ul>
+  <li>Doesn't the standard Python library module, <code>Cookie</code>, do
+     this?
+  <p>No: Cookie.py does the server end of the job.  It doesn't know when to
+     accept cookies from a server or when to pass them back.
+  <li>Is urllib2.py required?
+  <p>No.  You probably want it, though.
+  <li>Where can I find out more about the HTTP cookie protocol?
+  <p>There is more than one protocol, in fact (see the <a href="./doc.html">docs</a>
+     for a brief explanation of the history):
+  <ul>
+    <li>The original <a href="http://www.netscape.com/newsref/std/cookie_spec.html">
+        Netscape cookie protocol</a> - the standard still in use today, in
+        theory (in reality, the protocol implemented by all the major browsers
+        only bears a passing resemblance to the protocol sketched out in this
+        document).
+    <li><a href="http://www.ietf.org/rfcs/rfc2109.txt">RFC 2109</a> - obsoleted
+        by RFC 2965.
+     <li><a href="http://www.ietf.org/rfcs/rfc2965.txt">RFC 2965</a> - the
+        Netscape protocol with the bugs fixed (not widely used - the Netscape
+        protocol still dominates, and seems likely to remain dominant
+        indefinitely, at least on the Internet).
+        <a href="http://www.ietf.org/rfcs/rfc2964.txt">RFC 2964</a> discusses use
+        of the protocol.
+        <a href="http://kristol.org/cookie/errata.html">Errata</a> to RFC 2965
+        were discussed on the
+        <a href="http://lists.bell-labs.com/mailman/listinfo/http-state">
+        http-state mailing list</a>, but the list traffic died months ago and
+        hasn't revived.
+    <li>A <a href="http://doi.acm.org/10.1145/502152.502153">paper</a> by David
+        Kristol setting out the history of the cookie standards in exhausting
+        detail.
+    <li>HTTP cookies <a href="http://www.cookiecentral.com/">FAQ</a>.
+  </ul>
+  <li>Which protocols does mechanize support?
+     <p>Netscape and RFC 2965.  RFC 2965 handling is switched off by default.
+  <li>What about RFC 2109?
+     <p>RFC 2109 cookies are currently parsed as Netscape cookies, and treated
+     by default as RFC 2965 cookies thereafter if RFC 2965 handling is enabled,
+     or as Netscape cookies otherwise.  RFC 2109 is officially obsoleted by RFC
+     2965.  Browsers do use a few RFC 2109 features in their Netscape cookie
+     implementations (<code>port</code> and <code>max-age</code>), and
+     mechanize knows about that, too.
+</ul>
+
+
+<a name="faq_use"></a>
+<h2>FAQs - usage</h2>
+<ul>
+  <li>Why don't I have any cookies?
+  <p>Read the <a href="./doc.html#debugging">debugging section</a> of this page.
+  <li>My response claims to be empty, but I know it's not!
+  <p>Did you call <code>response.read()</code> (e.g., in a debug statement),
+     then forget that all the data has already been read?  In that case, you
+     may want to use <code>mechanize.response_seek_wrapper</code>.
+  <li>How do I download only part of a response body?
+  <p>Just call <code>.read()</code> or <code>.readline()</code> methods on your
+     response object as many times as you need.  The <code>.seek()</code>
+     method (which is not always present, see <a
+     href="./doc.html#seekable">above</a>) still works, because mechanize
+     caches read data.
+  <li>What's the difference between the <code>.load()</code> and
+      <code>.revert()</code> methods of <code>CookieJar</code>?
+  <p><code>.load()</code> <em>appends</em> cookies from a file.
+     <code>.revert()</code> discards all existing cookies held by the
+     <code>CookieJar</code> first (but it won't lose any existing cookies if
+     the loading fails).
+  <li>Is it threadsafe?
+  <p>No.  <em>Tested</em> patches welcome.  Clarification: As far as I know,
+     it's perfectly possible to use mechanize in threaded code, but it
+     provides no synchronisation: you have to provide that yourself.
+  <li>How do I do &lt;X&gt;?
+  <p>The module docstrings are worth reading if you want to do something
+     unusual.
+  <li>What's this &quot;processor&quot; business about?  I knew
+      <code>urllib2</code> used &quot;handlers&quot;, but not these
+      &quot;processors&quot;.
+  <p>This Python library <a href="http://www.python.org/sf/852995">patch</a>
+     contains an explanation.  Processors are now a standard part of urllib2
+     in Python 2.4.
+  <li>How do I use it without urllib2.py?
+  @{colorize(r"""
+from mechanize import CookieJar
+print CookieJar.extract_cookies.__doc__
+print CookieJar.add_cookie_header.__doc__
+""")}
+</ul>
+
+<p>I prefer questions and comments to be sent to the <a
+href="http://lists.sourceforge.net/lists/listinfo/wwwsearch-general">
+mailing list</a> rather than direct to me.
+
+<p><a href="mailto:jjl@@pobox.com">John J. Lee</a>,
+@(time.strftime("%B %Y", last_modified)).
+
+<hr>
+
+</div>
+
+<div id="Menu">
+
+@(release.navbar('ccdocs'))
+
+<br>
+
+<a href="./doc.html#examples">Examples</a><br>
+<a href="./doc.html#browsers">Mozilla &amp; MSIE</a><br>
+<a href="./doc.html#file">Cookies in a file</a><br>
+<a href="./doc.html#cookiejar">Using a <code>CookieJar</code></a><br>
+<a href="./doc.html#extras">Processors</a><br>
+<a href="./doc.html#seekable">Seekable responses</a><br>
+<a href="./doc.html#requests">Request confusion</a><br>
+<a href="./doc.html#headers">Adding headers</a><br>
+<a href="./doc.html#unverifiable">Verifiability</a><br>
+<a href="./doc.html#rfc2965">RFC 2965</a><br>
+<a href="./doc.html#debugging">Debugging</a><br>
+<a href="./doc.html#script">Embedded scripts</a><br>
+<a href="./doc.html#dates">HTTP date parsing</a><br>
+<a href="./doc.html#standards">Standards</a><br>
+<a href="./doc.html#faq_use">FAQs - usage</a><br>
+
+</div>
+
+</body>
+
+</html>

Added: mechanize/tags/0.1.10/examples/hack21.py
===================================================================
--- mechanize/tags/0.1.10/examples/hack21.py	                        (rev 0)
+++ mechanize/tags/0.1.10/examples/hack21.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,57 @@
+#!/usr/bin/env python
+
+# Port of Hack 21 from the O'Reilly book "Spidering Hacks" by Tara
+# Calishain and Kevin Hemenway.  Of course, there's no need to explicitly
+# catch exceptions in Python, unlike checking error return values in Perl,
+# but I've left those in for the sake of a direct port.
+
+import sys, os, re
+from urllib2 import HTTPError
+
+import mechanize
+assert mechanize.__version__ >= (0, 0, 6, "a")
+
+mech = mechanize.Browser()
+# Addition 2005-01-05: Be naughty, since robots.txt asks not to
+# access /search now.  We're not madly searching for everything, so
+# I don't feel too guilty.
+mech.set_handle_robots(False)
+#mech.set_debug_http(True)
+
+# Get the starting search page
+try:
+    mech.open("http://search.cpan.org")
+except HTTPError, e:
+    sys.exit("%d: %s" % (e.code, e.msg))
+
+# Select the form, fill the fields, and submit
+mech.select_form(nr=0)
+mech["query"] = "Lester"
+mech["mode"] = ["author"]
+try:
+    mech.submit()
+except HTTPError, e:
+    sys.exit("post failed: %d: %s" % (e.code, e.msg))
+
+# Find the link for "Andy"
+try:
+    mech.follow_link(text_regex=re.compile("Andy"))
+except HTTPError, e:
+    sys.exit("post failed: %d: %s" % (e.code, e.msg))
+
+# Get all the tarballs
+urls = [link.absolute_url for link in
+        mech.links(url_regex=re.compile(r"\.tar\.gz$"))]
+print "Found", len(urls), "tarballs to download"
+
+for url in urls:
+    filename = os.path.basename(url)
+    f = open(filename, "wb")
+    print "%s -->" % filename,
+    r = mech.open(url)
+    while 1:
+        data = r.read(1024)
+        if not data: break
+        f.write(data)
+    f.close()
+    print os.stat(filename).st_size, "bytes"

Added: mechanize/tags/0.1.10/examples/pypi.py
===================================================================
--- mechanize/tags/0.1.10/examples/pypi.py	                        (rev 0)
+++ mechanize/tags/0.1.10/examples/pypi.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,68 @@
+#!/usr/bin/env python
+
+# ------------------------------------------------------------------------
+# THIS SCRIPT IS CURRENTLY NOT WORKING, SINCE PYPI's SEARCH FEATURE HAS
+# BEEN REMOVED!
+# ------------------------------------------------------------------------
+
+# Search PyPI, the Python Package Index, and retrieve latest mechanize
+# tarball.
+
+# This is just to demonstrate mechanize: You should use EasyInstall to
+# do this, not this silly script.
+
+import sys, os, re
+
+import mechanize
+
+b = mechanize.Browser(
+    # mechanize's XHTML support needs work, so is currently switched off.  If
+    # we want to get our work done, we have to turn it on by supplying a
+    # mechanize.Factory (with XHTML support turned on):
+    factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
+    )
+# Addition 2005-06-13: Be naughty, since robots.txt asks not to
+# access /pypi now.  We're not madly searching for everything, so
+# I don't feel too guilty.
+b.set_handle_robots(False)
+
+# search PyPI
+b.open("http://www.python.org/pypi")
+b.follow_link(text="Search", nr=1)
+b.select_form(nr=0)
+b["name"] = "mechanize"
+b.submit()
+
+# 2005-05-20 no longer necessary, only one version there, so PyPI takes
+# us direct to PKG-INFO page
+## # find latest release
+## VERSION_RE = re.compile(r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<bugfix>\d+)"
+##                         r"(?P<state>[ab])?(?:-pre)?(?P<pre>\d+)?$")
+## def parse_version(text):
+##     m = VERSION_RE.match(text)
+##     if m is None:
+##         raise ValueError
+##     return tuple([m.groupdict()[part] for part in
+##                   ("major", "minor", "bugfix", "state", "pre")])
+## MECHANIZE_RE = re.compile(r"mechanize-?(.*)")
+## links = b.links(text_regex=MECHANIZE_RE)
+## versions = []
+## for link in links:
+##     m = MECHANIZE_RE.search(link.text)
+##     version_string = m.group(1).strip(' \t\xa0')
+##     tup = parse_version(version_string)[:3]
+##     versions.append(tup)
+## latest = links[versions.index(max(versions))]
+
+# get tarball
+## b.follow_link(latest)  # to PKG-INFO page
+r = b.follow_link(text_regex=re.compile(r"\.tar\.gz"))
+filename = os.path.basename(b.geturl())
+if os.path.exists(filename):
+    sys.exit("%s already exists, not grabbing" % filename)
+f = file(filename, "wb")
+while 1:
+    data = r.read(1024)
+    if not data: break
+    f.write(data)
+f.close()

Added: mechanize/tags/0.1.10/ez_setup.py
===================================================================
--- mechanize/tags/0.1.10/ez_setup.py	                        (rev 0)
+++ mechanize/tags/0.1.10/ez_setup.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,222 @@
+#!python
+"""Bootstrap setuptools installation
+
+If you want to use setuptools in your package's setup.py, just include this
+file in the same directory with it, and add this to the top of your setup.py::
+
+    from ez_setup import use_setuptools
+    use_setuptools()
+
+If you want to require a specific version of setuptools, set a download
+mirror, or use an alternate download directory, you can do so by supplying
+the appropriate options to ``use_setuptools()``.
+
+This file can also be run as a script to install or upgrade setuptools.
+"""
+import sys
+DEFAULT_VERSION = "0.6c3"
+DEFAULT_URL     = "http://cheeseshop.python.org/packages/%s/s/setuptools/" % sys.version[:3]
+
+md5_data = {
+    'setuptools-0.6b1-py2.3.egg': '8822caf901250d848b996b7f25c6e6ca',
+    'setuptools-0.6b1-py2.4.egg': 'b79a8a403e4502fbb85ee3f1941735cb',
+    'setuptools-0.6b2-py2.3.egg': '5657759d8a6d8fc44070a9d07272d99b',
+    'setuptools-0.6b2-py2.4.egg': '4996a8d169d2be661fa32a6e52e4f82a',
+    'setuptools-0.6b3-py2.3.egg': 'bb31c0fc7399a63579975cad9f5a0618',
+    'setuptools-0.6b3-py2.4.egg': '38a8c6b3d6ecd22247f179f7da669fac',
+    'setuptools-0.6b4-py2.3.egg': '62045a24ed4e1ebc77fe039aa4e6f7e5',
+    'setuptools-0.6b4-py2.4.egg': '4cb2a185d228dacffb2d17f103b3b1c4',
+    'setuptools-0.6c1-py2.3.egg': 'b3f2b5539d65cb7f74ad79127f1a908c',
+    'setuptools-0.6c1-py2.4.egg': 'b45adeda0667d2d2ffe14009364f2a4b',
+    'setuptools-0.6c2-py2.3.egg': 'f0064bf6aa2b7d0f3ba0b43f20817c27',
+    'setuptools-0.6c2-py2.4.egg': '616192eec35f47e8ea16cd6a122b7277',
+    'setuptools-0.6c3-py2.3.egg': 'f181fa125dfe85a259c9cd6f1d7b78fa',
+    'setuptools-0.6c3-py2.4.egg': 'e0ed74682c998bfb73bf803a50e7b71e',
+    'setuptools-0.6c3-py2.5.egg': 'abef16fdd61955514841c7c6bd98965e',
+}
+
+import sys, os
+
+def _validate_md5(egg_name, data):
+    if egg_name in md5_data:
+        from md5 import md5
+        digest = md5(data).hexdigest()
+        if digest != md5_data[egg_name]:
+            print >>sys.stderr, (
+                "md5 validation of %s failed!  (Possible download problem?)"
+                % egg_name
+            )
+            sys.exit(2)
+    return data
+
+
+def use_setuptools(
+    version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir,
+    download_delay=15
+):
+    """Automatically find/download setuptools and make it available on sys.path
+
+    `version` should be a valid setuptools version number that is available
+    as an egg for download under the `download_base` URL (which should end with
+    a '/').  `to_dir` is the directory where setuptools will be downloaded, if
+    it is not already available.  If `download_delay` is specified, it should
+    be the number of seconds that will be paused before initiating a download,
+    should one be required.  If an older version of setuptools is installed,
+    this routine will print a message to ``sys.stderr`` and raise SystemExit in
+    an attempt to abort the calling script.
+    """
+    try:
+        import setuptools
+        if setuptools.__version__ == '0.0.1':
+            print >>sys.stderr, (
+            "You have an obsolete version of setuptools installed.  Please\n"
+            "remove it from your system entirely before rerunning this script."
+            )
+            sys.exit(2)
+    except ImportError:
+        egg = download_setuptools(version, download_base, to_dir, download_delay)
+        sys.path.insert(0, egg)
+        import setuptools; setuptools.bootstrap_install_from = egg
+
+    import pkg_resources
+    try:
+        pkg_resources.require("setuptools>="+version)
+
+    except pkg_resources.VersionConflict, e:
+        # XXX could we install in a subprocess here?
+        print >>sys.stderr, (
+            "The required version of setuptools (>=%s) is not available, and\n"
+            "can't be installed while this script is running. Please install\n"
+            " a more recent version first.\n\n(Currently using %r)"
+        ) % (version, e.args[0])
+        sys.exit(2)
+
+def download_setuptools(
+    version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir,
+    delay = 15
+):
+    """Download setuptools from a specified location and return its filename
+
+    `version` should be a valid setuptools version number that is available
+    as an egg for download under the `download_base` URL (which should end
+    with a '/'). `to_dir` is the directory where the egg will be downloaded.
+    `delay` is the number of seconds to pause before an actual download attempt.
+    """
+    import urllib2, shutil
+    egg_name = "setuptools-%s-py%s.egg" % (version,sys.version[:3])
+    url = download_base + egg_name
+    saveto = os.path.join(to_dir, egg_name)
+    src = dst = None
+    if not os.path.exists(saveto):  # Avoid repeated downloads
+        try:
+            from distutils import log
+            if delay:
+                log.warn("""
+---------------------------------------------------------------------------
+This script requires setuptools version %s to run (even to display
+help).  I will attempt to download it for you (from
+%s), but
+you may need to enable firewall access for this script first.
+I will start the download in %d seconds.
+
+(Note: if this machine does not have network access, please obtain the file
+
+   %s
+
+and place it in this directory before rerunning this script.)
+---------------------------------------------------------------------------""",
+                    version, download_base, delay, url
+                ); from time import sleep; sleep(delay)
+            log.warn("Downloading %s", url)
+            src = urllib2.urlopen(url)
+            # Read/write all in one block, so we don't create a corrupt file
+            # if the download is interrupted.
+            data = _validate_md5(egg_name, src.read())
+            dst = open(saveto,"wb"); dst.write(data)
+        finally:
+            if src: src.close()
+            if dst: dst.close()
+    return os.path.realpath(saveto)
+
+def main(argv, version=DEFAULT_VERSION):
+    """Install or upgrade setuptools and EasyInstall"""
+
+    try:
+        import setuptools
+    except ImportError:
+        egg = None
+        try:
+            egg = download_setuptools(version, delay=0)
+            sys.path.insert(0,egg)
+            from setuptools.command.easy_install import main
+            return main(list(argv)+[egg])   # we're done here
+        finally:
+            if egg and os.path.exists(egg):
+                os.unlink(egg)
+    else:
+        if setuptools.__version__ == '0.0.1':
+            # tell the user to uninstall obsolete version
+            use_setuptools(version)
+
+    req = "setuptools>="+version
+    import pkg_resources
+    try:
+        pkg_resources.require(req)
+    except pkg_resources.VersionConflict:
+        try:
+            from setuptools.command.easy_install import main
+        except ImportError:
+            from easy_install import main
+        main(list(argv)+[download_setuptools(delay=0)])
+        sys.exit(0) # try to force an exit
+    else:
+        if argv:
+            from setuptools.command.easy_install import main
+            main(argv)
+        else:
+            print "Setuptools version",version,"or greater has been installed."
+            print '(Run "ez_setup.py -U setuptools" to reinstall or upgrade.)'
+
+
+
+def update_md5(filenames):
+    """Update our built-in md5 registry"""
+
+    import re
+    from md5 import md5
+
+    for name in filenames:
+        base = os.path.basename(name)
+        f = open(name,'rb')
+        md5_data[base] = md5(f.read()).hexdigest()
+        f.close()
+
+    data = ["    %r: %r,\n" % it for it in md5_data.items()]
+    data.sort()
+    repl = "".join(data)
+
+    import inspect
+    srcfile = inspect.getsourcefile(sys.modules[__name__])
+    f = open(srcfile, 'rb'); src = f.read(); f.close()
+
+    match = re.search("\nmd5_data = {\n([^}]+)}", src)
+    if not match:
+        print >>sys.stderr, "Internal error!"
+        sys.exit(2)
+
+    src = src[:match.start(1)] + repl + src[match.end(1):]
+    f = open(srcfile,'w')
+    f.write(src)
+    f.close()
+
+
+if __name__=='__main__':
+    if len(sys.argv)>2 and sys.argv[1]=='--md5update':
+        update_md5(sys.argv[2:])
+    else:
+        main(sys.argv[1:])
+
+
+
+
+
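The module docstring above spells out the bootstrap pattern ez_setup.py is meant for. As a minimal sketch (the package name and version are invented for illustration and are not part of this import), a setup.py using it might look like:

    # Hypothetical setup.py bootstrapping setuptools via ez_setup.py
    from ez_setup import use_setuptools
    use_setuptools()   # fetches setuptools 0.6c3 if no suitable version is installed

    from setuptools import setup, find_packages

    setup(
        name="examplepkg",   # invented project name
        version="0.1",
        packages=find_packages(),
        )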

Added: mechanize/tags/0.1.10/functional_tests.py
===================================================================
--- mechanize/tags/0.1.10/functional_tests.py	                        (rev 0)
+++ mechanize/tags/0.1.10/functional_tests.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,644 @@
+#!/usr/bin/env python
+
+# These tests access the network.
+
+# thanks Moof (aka Giles Antonio Radford) for some of these
+
+import errno
+import os
+import socket
+import sys
+import tempfile
+import urllib
+
+import mechanize
+from mechanize import build_opener, install_opener, urlopen, urlretrieve
+from mechanize import CookieJar, HTTPCookieProcessor, \
+     HTTPHandler, HTTPRefreshProcessor, \
+     HTTPEquivProcessor, HTTPRedirectHandler, \
+     HTTPRedirectDebugProcessor, HTTPResponseDebugProcessor
+from mechanize._rfc3986 import urljoin
+from mechanize._util import hide_experimental_warnings, \
+    reset_experimental_warnings
+import mechanize._sockettimeout
+from mechanize._testcase import TestCase
+
+#from cookielib import CookieJar
+#from urllib2 import build_opener, install_opener, urlopen
+#from urllib2 import HTTPCookieProcessor, HTTPHandler
+
+#from mechanize import CreateBSDDBCookieJar
+
+## import logging
+## logger = logging.getLogger("mechanize")
+## logger.addHandler(logging.StreamHandler(sys.stdout))
+## #logger.setLevel(logging.DEBUG)
+## logger.setLevel(logging.INFO)
+
+
+def sanepathname2url(path):
+    import urllib
+    urlpath = urllib.pathname2url(path)
+    if os.name == "nt" and urlpath.startswith("///"):
+        urlpath = urlpath[2:]
+    # XXX don't ask me about the mac...
+    return urlpath
+
+
+def read_file(filename):
+    fh = open(filename)
+    try:
+        return fh.read()
+    finally:
+        fh.close()
+
+
+class SocketTimeoutTest(TestCase):
+
+    # the timeout tests in this module aren't full functional tests: in order
+    # to speed things up, don't actually call .settimeout on the socket.  XXX
+    # allow running the tests against a slow server with a real timeout
+
+    def _monkey_patch_socket(self):
+        class Delegator(object):
+            def __init__(self, delegate):
+                self._delegate = delegate
+            def __getattr__(self, name):
+                return getattr(self._delegate, name)
+
+        assertEquals = self.assertEquals
+
+        class TimeoutLog(object):
+            AnyValue = object()
+            def __init__(self):
+                self._nr_sockets = 0
+                self._timeouts = []
+                self.start()
+            def start(self):
+                self._monitoring = True
+            def stop(self):
+                self._monitoring = False
+            def socket_created(self):
+                if self._monitoring:
+                    self._nr_sockets += 1
+            def settimeout_called(self, timeout):
+                if self._monitoring:
+                    self._timeouts.append(timeout)
+            def verify(self, value=AnyValue):
+                if sys.version_info[:2] < (2, 6):
+                    # per-connection timeout not supported in Python 2.5
+                    self.verify_default()
+                else:
+                    assertEquals(len(self._timeouts), self._nr_sockets)
+                    if value is not self.AnyValue:
+                        for timeout in self._timeouts:
+                            assertEquals(timeout, value)
+            def verify_default(self):
+                assertEquals(len(self._timeouts), 0)
+
+        log = TimeoutLog()
+        def settimeout(timeout):
+            log.settimeout_called(timeout)
+        orig_socket = socket.socket
+        def make_socket(*args, **kwds):
+            sock = Delegator(orig_socket(*args, **kwds))
+            log.socket_created()
+            sock.settimeout = settimeout
+            return sock
+        self.monkey_patch(socket, "socket", make_socket)
+        return log
+
+
+class SimpleTests(SocketTimeoutTest):
+    # thanks Moof (aka Giles Antonio Radford)
+
+    def setUp(self):
+        super(SimpleTests, self).setUp()
+        self.browser = mechanize.Browser()
+
+    def test_simple(self):
+        self.browser.open(self.uri)
+        self.assertEqual(self.browser.title(), 'Python bits')
+        # relative URL
+        self.browser.open('/mechanize/')
+        self.assertEqual(self.browser.title(), 'mechanize')
+
+    def test_basic_auth(self):
+        uri = urljoin(self.uri, "basic_auth")
+        self.assertRaises(mechanize.URLError, self.browser.open, uri)
+        self.browser.add_password(uri, "john", "john")
+        self.browser.open(uri)
+        self.assertEqual(self.browser.title(), 'Basic Auth Protected Area')
+
+    def test_digest_auth(self):
+        uri = urljoin(self.uri, "digest_auth")
+        self.assertRaises(mechanize.URLError, self.browser.open, uri)
+        self.browser.add_password(uri, "digestuser", "digestuser")
+        self.browser.open(uri)
+        self.assertEqual(self.browser.title(), 'Digest Auth Protected Area')
+
+    def test_open_with_default_timeout(self):
+        timeout_log = self._monkey_patch_socket()
+        self.browser.open(self.uri)
+        self.assertEqual(self.browser.title(), 'Python bits')
+        timeout_log.verify_default()
+
+    def test_open_with_timeout(self):
+        timeout_log = self._monkey_patch_socket()
+        timeout = 10.
+        self.browser.open(self.uri, timeout=timeout)
+        self.assertEqual(self.browser.title(), 'Python bits')
+        timeout_log.verify(timeout)
+
+    def test_urlopen_with_default_timeout(self):
+        timeout_log = self._monkey_patch_socket()
+        response = mechanize.urlopen(self.uri)
+        self.assert_contains(response.read(), "Python bits")
+        timeout_log.verify_default()
+
+    def test_urlopen_with_timeout(self):
+        timeout_log = self._monkey_patch_socket()
+        timeout = 10.
+        response = mechanize.urlopen(self.uri, timeout=timeout)
+        self.assert_contains(response.read(), "Python bits")
+        timeout_log.verify(timeout)
+
+    def test_302_and_404(self):
+        # the combination of 302 and 404 (/redirected is configured to redirect
+        # to a non-existent URL /nonexistent) has caused problems in the past
+        # due to accidental double-wrapping of the error response
+        import urllib2
+        self.assertRaises(
+            urllib2.HTTPError,
+            self.browser.open, urljoin(self.uri, "/redirected"),
+            )
+
+    def test_reread(self):
+        # closing response shouldn't stop methods working (this happens also to
+        # be true for e.g. mechanize.OpenerDirector when mechanize's own
+        # handlers are in use, but is guaranteed to be true for
+        # mechanize.Browser)
+        r = self.browser.open(self.uri)
+        data = r.read()
+        r.close()
+        r.seek(0)
+        self.assertEqual(r.read(), data)
+        self.assertEqual(self.browser.response().read(), data)
+
+    def test_error_recovery(self):
+        self.assertRaises(mechanize.URLError, self.browser.open,
+                          'file:///c|thisnoexistyiufheiurgbueirgbue')
+        self.browser.open(self.uri)
+        self.assertEqual(self.browser.title(), 'Python bits')
+
+    def test_redirect(self):
+        # 301 redirect due to missing final '/'
+        r = self.browser.open(urljoin(self.uri, "bits"))
+        self.assertEqual(r.code, 200)
+        self.assert_("GeneralFAQ.html" in r.read(2048))
+
+    def test_refresh(self):
+        def refresh_request(seconds):
+            uri = urljoin(self.uri, "/cgi-bin/cookietest.cgi")
+            val = urllib.quote_plus('%d; url="%s"' % (seconds, self.uri))
+            return uri + ("?refresh=%s" % val)
+        self.browser.set_handle_refresh(True, honor_time=False)
+        r = self.browser.open(refresh_request(5))
+        self.assertEqual(r.geturl(), self.uri)
+        # Set a maximum refresh time of 30 seconds (these long refreshes tend
+        # to be there only because the website owner wants you to see the
+        # latest news, or whatever -- they're not essential to the operation of
+        # the site, and not really useful or appropriate when scraping).
+        refresh_uri = refresh_request(60)
+        self.browser.set_handle_refresh(True, max_time=30., honor_time=True)
+        r = self.browser.open(refresh_uri)
+        self.assertEqual(r.geturl(), refresh_uri)
+        # allow long refreshes (but don't actually wait 60 seconds)
+        self.browser.set_handle_refresh(True, max_time=None, honor_time=False)
+        r = self.browser.open(refresh_request(60))
+        self.assertEqual(r.geturl(), self.uri)
+
+    def test_file_url(self):
+        url = "file://%s" % sanepathname2url(
+            os.path.abspath('functional_tests.py'))
+        r = self.browser.open(url)
+        self.assert_("this string appears in this file ;-)" in r.read())
+
+    def test_open_local_file(self):
+        # Since the file: URL scheme is not well standardised, Browser has a
+        # special method to open files by name, for convenience:
+        br = mechanize.Browser()
+        response = br.open_local_file("mechanize/_mechanize.py")
+        self.assert_("def open_local_file(self, filename):" in
+                     response.get_data())
+
+    def test_open_novisit(self):
+        def test_state(br):
+            self.assert_(br.request is None)
+            self.assert_(br.response() is None)
+            self.assertRaises(mechanize.BrowserStateError, br.back)
+        test_state(self.browser)
+        uri = urljoin(self.uri, "bits")
+        # note this involves a redirect, which should itself be non-visiting
+        r = self.browser.open_novisit(uri)
+        test_state(self.browser)
+        self.assert_("GeneralFAQ.html" in r.read(2048))
+
+        # Request argument instead of URL
+        r = self.browser.open_novisit(mechanize.Request(uri))
+        test_state(self.browser)
+        self.assert_("GeneralFAQ.html" in r.read(2048))
+
+    def test_non_seekable(self):
+        # check everything still works without response_seek_wrapper and
+        # the .seek() method on response objects
+        ua = mechanize.UserAgent()
+        ua.set_seekable_responses(False)
+        ua.set_handle_equiv(False)
+        response = ua.open(self.uri)
+        self.failIf(hasattr(response, "seek"))
+        data = response.read()
+        self.assert_("Python bits" in data)
+
+
+class ResponseTests(TestCase):
+
+    def test_seek(self):
+        br = mechanize.Browser()
+        r = br.open(self.uri)
+        html = r.read()
+        r.seek(0)
+        self.assertEqual(r.read(), html)
+
+    def test_seekable_response_opener(self):
+        opener = mechanize.OpenerFactory(
+            mechanize.SeekableResponseOpener).build_opener()
+        r = opener.open(urljoin(self.uri, "bits/cctest2.txt"))
+        r.read()
+        r.seek(0)
+        self.assertEqual(r.read(),
+                         r.get_data(),
+                         "Hello ClientCookie functional test suite.\n")
+
+    def test_seek_wrapper_class_name(self):
+        opener = mechanize.UserAgent()
+        opener.set_seekable_responses(True)
+        try:
+            opener.open(urljoin(self.uri, "nonexistent"))
+        except mechanize.HTTPError, exc:
+            self.assert_("HTTPError instance" in repr(exc))
+
+    def test_no_seek(self):
+        # should be possible to turn off UserAgent's .seek() functionality
+        def check_no_seek(opener):
+            r = opener.open(urljoin(self.uri, "bits/cctest2.txt"))
+            self.assert_(not hasattr(r, "seek"))
+            try:
+                opener.open(urljoin(self.uri, "nonexistent"))
+            except mechanize.HTTPError, exc:
+                self.assert_(not hasattr(exc, "seek"))
+
+        # mechanize.UserAgent
+        opener = mechanize.UserAgent()
+        opener.set_handle_equiv(False)
+        opener.set_seekable_responses(False)
+        opener.set_debug_http(False)
+        check_no_seek(opener)
+
+        # mechanize.OpenerDirector
+        opener = mechanize.build_opener()
+        check_no_seek(opener)
+
+    def test_consistent_seek(self):
+        # if we explicitly request that returned response objects have the
+        # .seek() method, then raised HTTPError exceptions should also have the
+        # .seek() method
+        def check(opener, excs_also):
+            r = opener.open(urljoin(self.uri, "bits/cctest2.txt"))
+            data = r.read()
+            r.seek(0)
+            self.assertEqual(data, r.read(), r.get_data())
+            try:
+                opener.open(urljoin(self.uri, "nonexistent"))
+            except mechanize.HTTPError, exc:
+                data = exc.read()
+                if excs_also:
+                    exc.seek(0)
+                    self.assertEqual(data, exc.read(), exc.get_data())
+            else:
+                self.assert_(False)
+
+        opener = mechanize.UserAgent()
+        opener.set_debug_http(False)
+
+        # Here, only the .set_handle_equiv() causes .seek() to be present, so
+        # exceptions don't necessarily support the .seek() method (and do not,
+        # at present).
+        opener.set_handle_equiv(True)
+        opener.set_seekable_responses(False)
+        check(opener, excs_also=False)
+
+        # Here, (only) the explicit .set_seekable_responses() causes .seek() to
+        # be present (different mechanism from .set_handle_equiv()).  Since
+        # there's an explicit request, ALL responses are seekable, even
+        # exception responses (HTTPError instances).
+        opener.set_handle_equiv(False)
+        opener.set_seekable_responses(True)
+        check(opener, excs_also=True)
+
+    def test_set_response(self):
+        br = mechanize.Browser()
+        r = br.open(self.uri)
+        html = r.read()
+        self.assertEqual(br.title(), "Python bits")
+
+        newhtml = """<html><body><a href="spam">click me</a></body></html>"""
+
+        r.set_data(newhtml)
+        self.assertEqual(r.read(), newhtml)
+        self.assertEqual(br.response().read(), html)
+        br.response().set_data(newhtml)
+        self.assertEqual(br.response().read(), html)
+        self.assertEqual(list(br.links())[0].url, 'http://sourceforge.net')
+
+        br.set_response(r)
+        self.assertEqual(br.response().read(), newhtml)
+        self.assertEqual(list(br.links())[0].url, "spam")
+
+    def test_new_response(self):
+        br = mechanize.Browser()
+        data = "<html><head><title>Test</title></head><body><p>Hello.</p></body></html>"
+        response = mechanize.make_response(
+            data,
+            [("Content-type", "text/html")],
+            "http://example.com/",
+            200,
+            "OK"
+            )
+        br.set_response(response)
+        self.assertEqual(br.response().get_data(), data)
+
+    def hidden_test_close_pickle_load(self):
+        print ("Test test_close_pickle_load is expected to fail unless Python "
+               "standard library patch http://python.org/sf/1144636 has been "
+               "applied")
+        import pickle
+
+        b = mechanize.Browser()
+        r = b.open(urljoin(self.uri, "bits/cctest2.txt"))
+        r.read()
+
+        r.close()
+        r.seek(0)
+        self.assertEqual(r.read(),
+                         "Hello ClientCookie functional test suite.\n")
+
+        HIGHEST_PROTOCOL = -1
+        p = pickle.dumps(b, HIGHEST_PROTOCOL)
+        b = pickle.loads(p)
+        r = b.response()
+        r.seek(0)
+        self.assertEqual(r.read(),
+                         "Hello ClientCookie functional test suite.\n")
+
+
+class FunctionalTests(SocketTimeoutTest):
+
+    def test_referer(self):
+        br = mechanize.Browser()
+        br.set_handle_refresh(True, honor_time=False)
+        referer = urljoin(self.uri, "bits/referertest.html")
+        info = urljoin(self.uri, "/cgi-bin/cookietest.cgi")
+        r = br.open(info)
+        self.assert_(referer not in r.get_data())
+
+        br.open(referer)
+        r = br.follow_link(text="Here")
+        self.assert_(referer in r.get_data())
+
+    def test_cookies(self):
+        import urllib2
+        # this test page depends on cookies, and an http-equiv refresh
+        #cj = CreateBSDDBCookieJar("/home/john/db.db")
+        cj = CookieJar()
+        handlers = [
+            HTTPCookieProcessor(cj),
+            HTTPRefreshProcessor(max_time=None, honor_time=False),
+            HTTPEquivProcessor(),
+
+            HTTPRedirectHandler(),  # needed for Refresh handling in 2.4.0
+#            HTTPHandler(True),
+#            HTTPRedirectDebugProcessor(),
+#            HTTPResponseDebugProcessor(),
+            ]
+
+        o = apply(build_opener, handlers)
+        try:
+            install_opener(o)
+            try:
+                r = urlopen(urljoin(self.uri, "/cgi-bin/cookietest.cgi"))
+            except urllib2.URLError, e:
+                #print e.read()
+                raise
+            data = r.read()
+            #print data
+            self.assert_(
+                data.find("Your browser supports cookies!") >= 0)
+            self.assert_(len(cj) == 1)
+
+            # test response.seek() (added by HTTPEquivProcessor)
+            r.seek(0)
+            samedata = r.read()
+            r.close()
+            self.assert_(samedata == data)
+        finally:
+            o.close()
+            install_opener(None)
+
+    def test_robots(self):
+        plain_opener = mechanize.build_opener(mechanize.HTTPRobotRulesProcessor)
+        browser = mechanize.Browser()
+        for opener in plain_opener, browser:
+            r = opener.open(urljoin(self.uri, "robots"))
+            self.assertEqual(r.code, 200)
+            self.assertRaises(
+                mechanize.RobotExclusionError,
+                opener.open, urljoin(self.uri, "norobots"))
+
+    def _check_retrieve(self, url, filename, headers):
+        from urllib import urlopen
+        self.assertEqual(headers.get('Content-Type'), 'text/html')
+        self.assertEqual(read_file(filename), urlopen(url).read())
+
+    def test_retrieve_to_named_file(self):
+        url = urljoin(self.uri, "/mechanize/")
+        test_filename = os.path.join(self.make_temp_dir(), "python.html")
+        opener = mechanize.build_opener()
+        verif = CallbackVerifier(self)
+        filename, headers = opener.retrieve(url, test_filename, verif.callback)
+        self.assertEqual(filename, test_filename)
+        self._check_retrieve(url, filename, headers)
+        self.assert_(os.path.isfile(filename))
+
+    def test_retrieve(self):
+        # not passing an explicit filename downloads to a temporary file
+        # using a Request object instead of a URL works
+        url = urljoin(self.uri, "/mechanize/")
+        opener = mechanize.build_opener()
+        verif = CallbackVerifier(self)
+        request = mechanize.Request(url)
+        filename, headers = opener.retrieve(request, reporthook=verif.callback)
+        self.assertEquals(request.visit, False)
+        self._check_retrieve(url, filename, headers)
+        opener.close()
+        # closing the opener removed the temporary file
+        self.failIf(os.path.isfile(filename))
+
+    def test_urlretrieve(self):
+        timeout_log = self._monkey_patch_socket()
+        timeout = 10.
+        url = urljoin(self.uri, "/mechanize/")
+        verif = CallbackVerifier(self)
+        filename, headers = mechanize.urlretrieve(url,
+                                                  reporthook=verif.callback,
+                                                  timeout=timeout)
+        timeout_log.stop()
+        self._check_retrieve(url, filename, headers)
+        timeout_log.verify(timeout)
+
+    def test_reload_read_incomplete(self):
+        from mechanize import Browser
+        browser = Browser()
+        r1 = browser.open(urljoin(self.uri, "bits/mechanize_reload_test.html"))
+        # if we don't do anything and go straight to another page, most of the
+        # last page's response won't be .read()...
+        r2 = browser.open(urljoin(self.uri, "mechanize"))
+        self.assert_(len(r1.get_data()) < 4097)  # we only .read() a little bit
+        # ...so if we then go back, .follow_link() for a link near the end (a
+        # few kb in, past the point that always gets read in HTML files because
+        # of HEAD parsing) will only work if it causes a .reload()...
+        r3 = browser.back()
+        browser.follow_link(text="near the end")
+        # ... good, no LinkNotFoundError, so we did reload.
+        # we have .read() the whole file
+        self.assertEqual(len(r3._seek_wrapper__cache.getvalue()), 4202)
+
+##     def test_cacheftp(self):
+##         from urllib2 import CacheFTPHandler, build_opener
+##         o = build_opener(CacheFTPHandler())
+##         r = o.open("ftp://ftp.python.org/pub/www.python.org/robots.txt")
+##         data1 = r.read()
+##         r.close()
+##         r = o.open("ftp://ftp.python.org/pub/www.python.org/2.3.2/announce.txt")
+##         data2 = r.read()
+##         r.close()
+##         self.assert_(data1 != data2)
+
+
+class CookieJarTests(TestCase):
+
+    def test_mozilla_cookiejar(self):
+        filename = tempfile.mktemp()
+        try:
+            def get_cookiejar():
+                cj = mechanize.MozillaCookieJar(filename=filename)
+                try:
+                    cj.revert()
+                except IOError, exc:
+                    if exc.errno != errno.ENOENT:
+                        raise
+                return cj
+            def commit(cj):
+                cj.save()
+            self._test_cookiejar(get_cookiejar, commit)
+        finally:
+            try:
+                os.remove(filename)
+            except OSError, exc:
+                if exc.errno != errno.ENOENT:
+                    raise
+
+    def test_firefox3_cookiejar(self):
+        try:
+            mechanize.Firefox3CookieJar
+        except AttributeError:
+            # firefox 3 cookiejar is only supported in Python 2.5 and later;
+            # also, sqlite3 must be available
+            return
+
+        filename = tempfile.mktemp()
+        try:
+            def get_cookiejar():
+                hide_experimental_warnings()
+                try:
+                    cj = mechanize.Firefox3CookieJar(filename=filename)
+                finally:
+                    reset_experimental_warnings()
+                cj.connect()
+                return cj
+            def commit(cj):
+                pass
+            self._test_cookiejar(get_cookiejar, commit)
+        finally:
+            os.remove(filename)
+
+    def _test_cookiejar(self, get_cookiejar, commit):
+        cookiejar = get_cookiejar()
+        br = mechanize.Browser()
+        br.set_cookiejar(cookiejar)
+        br.set_handle_refresh(False)
+        url = urljoin(self.uri, "/cgi-bin/cookietest.cgi")
+        # no cookie was set on the first request
+        html = br.open(url).read()
+        self.assertEquals(html.find("Your browser supports cookies!"), -1)
+        self.assertEquals(len(cookiejar), 1)
+        # ... but now we have the cookie
+        html = br.open(url).read()
+        self.assert_("Your browser supports cookies!" in html)
+        commit(cookiejar)
+
+        # should still have the cookie when we load afresh
+        cookiejar = get_cookiejar()
+        br.set_cookiejar(cookiejar)
+        html = br.open(url).read()
+        self.assert_("Your browser supports cookies!" in html)
+
+
+class CallbackVerifier:
+    # for .test_urlretrieve()
+    def __init__(self, testcase):
+        self._count = 0
+        self._testcase = testcase
+    def callback(self, block_nr, block_size, total_size):
+        self._testcase.assertEqual(block_nr, self._count)
+        self._count = self._count + 1
+
+
+if __name__ == "__main__":
+    import sys
+    sys.path.insert(0, "test-tools")
+    test_path = os.path.join(os.path.dirname(sys.argv[0]), "test")
+    sys.path.insert(0, test_path)
+    import testprogram
+    USAGE_EXAMPLES = """
+Examples:
+  %(progName)s
+                 - run all tests
+  %(progName)s functional_tests.SimpleTests
+                 - run all 'test*' test methods in class SimpleTests
+  %(progName)s functional_tests.SimpleTests.test_redirect
+                 - run SimpleTests.test_redirect
+
+  %(progName)s -l
+                 - start a local Twisted HTTP server and run the functional
+                   tests against that, rather than against SourceForge
+                   (quicker!)
+                   If this option doesn't work on Windows/Mac, somebody please
+                   tell me about it, or I'll never find out...
+"""
+    prog = testprogram.TestProgram(
+        ["functional_tests"],
+        localServerProcess=testprogram.TwistedServerProcess(),
+        usageExamples=USAGE_EXAMPLES,
+        )
+    result = prog.runTests()


Property changes on: mechanize/tags/0.1.10/functional_tests.py
___________________________________________________________________
Added: svn:executable
   + 
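Outside the test harness, the behaviour SimpleTests exercises reduces to ordinary Browser calls. A rough sketch follows; the URL and credentials are placeholders, not the values served by the test server:

    # Sketch of the Browser usage covered by SimpleTests (Python 2 era syntax).
    import mechanize

    br = mechanize.Browser()
    # placeholder URL and credentials, as in test_basic_auth
    br.add_password("http://example.com/basic_auth", "john", "secret")
    # per-connection timeout, as in test_open_with_timeout
    response = br.open("http://example.com/basic_auth", timeout=10.0)
    print br.title()
    print len(response.read()), "bytes"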

Added: mechanize/tags/0.1.10/mechanize/__init__.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/__init__.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/__init__.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,140 @@
+__all__ = [
+    'AbstractBasicAuthHandler',
+    'AbstractDigestAuthHandler',
+    'BaseHandler',
+    'Browser',
+    'BrowserStateError',
+    'CacheFTPHandler',
+    'ContentTooShortError',
+    'Cookie',
+    'CookieJar',
+    'CookiePolicy',
+    'DefaultCookiePolicy',
+    'DefaultFactory',
+    'FTPHandler',
+    'Factory',
+    'FileCookieJar',
+    'FileHandler',
+    'FormNotFoundError',
+    'FormsFactory',
+    'HTTPBasicAuthHandler',
+    'HTTPCookieProcessor',
+    'HTTPDefaultErrorHandler',
+    'HTTPDigestAuthHandler',
+    'HTTPEquivProcessor',
+    'HTTPError',
+    'HTTPErrorProcessor',
+    'HTTPHandler',
+    'HTTPPasswordMgr',
+    'HTTPPasswordMgrWithDefaultRealm',
+    'HTTPProxyPasswordMgr',
+    'HTTPRedirectDebugProcessor',
+    'HTTPRedirectHandler',
+    'HTTPRefererProcessor',
+    'HTTPRefreshProcessor',
+    'HTTPRequestUpgradeProcessor',
+    'HTTPResponseDebugProcessor',
+    'HTTPRobotRulesProcessor',
+    'HTTPSClientCertMgr',
+    'HTTPSHandler',
+    'HeadParser',
+    'History',
+    'LWPCookieJar',
+    'Link',
+    'LinkNotFoundError',
+    'LinksFactory',
+    'LoadError',
+    'MSIECookieJar',
+    'MozillaCookieJar',
+    'OpenerDirector',
+    'OpenerFactory',
+    'ParseError',
+    'ProxyBasicAuthHandler',
+    'ProxyDigestAuthHandler',
+    'ProxyHandler',
+    'Request',
+    'ResponseUpgradeProcessor',
+    'RobotExclusionError',
+    'RobustFactory',
+    'RobustFormsFactory',
+    'RobustLinksFactory',
+    'RobustTitleFactory',
+    'SeekableProcessor',
+    'SeekableResponseOpener',
+    'TitleFactory',
+    'URLError',
+    'USE_BARE_EXCEPT',
+    'UnknownHandler',
+    'UserAgent',
+    'UserAgentBase',
+    'XHTMLCompatibleHeadParser',
+    '__version__',
+    'build_opener',
+    'install_opener',
+    'lwp_cookie_str',
+    'make_response',
+    'request_host',
+    'response_seek_wrapper',  # XXX deprecate in public interface?
+    'seek_wrapped_response',  # XXX should probably use this internally in place of response_seek_wrapper()
+    'str2time',
+    'urlopen',
+    'urlretrieve']
+
+import logging
+import sys
+
+from _mechanize import __version__
+
+# high-level stateful browser-style interface
+from _mechanize import \
+     Browser, History, \
+     BrowserStateError, LinkNotFoundError, FormNotFoundError
+
+# configurable URL-opener interface
+from _useragent import UserAgentBase, UserAgent
+from _html import \
+     ParseError, \
+     Link, \
+     Factory, DefaultFactory, RobustFactory, \
+     FormsFactory, LinksFactory, TitleFactory, \
+     RobustFormsFactory, RobustLinksFactory, RobustTitleFactory
+
+# urllib2 work-alike interface (part from mechanize, part from urllib2)
+# This is a superset of the urllib2 interface.
+from _urllib2 import *
+
+# misc
+from _opener import ContentTooShortError, OpenerFactory, urlretrieve
+from _util import http2time as str2time
+from _response import \
+     response_seek_wrapper, seek_wrapped_response, make_response
+from _http import HeadParser
+try:
+    from _http import XHTMLCompatibleHeadParser
+except ImportError:
+    pass
+
+# cookies
+from _clientcookie import Cookie, CookiePolicy, DefaultCookiePolicy, \
+     CookieJar, FileCookieJar, LoadError, request_host_lc as request_host, \
+     effective_request_host
+from _lwpcookiejar import LWPCookieJar, lwp_cookie_str
+# Python 2.4 raises SyntaxError on the Firefox 3 cookiejar module (it uses try/finally inside a generator)
+if sys.version_info[:2] > (2,4):
+    try:
+        import sqlite3
+    except ImportError:
+        pass
+    else:
+        from _firefox3cookiejar import Firefox3CookieJar
+from _mozillacookiejar import MozillaCookieJar
+from _msiecookiejar import MSIECookieJar
+
+# If you hate the idea of turning bugs into warnings, do:
+# import mechanize; mechanize.USE_BARE_EXCEPT = False
+USE_BARE_EXCEPT = True
+
+logger = logging.getLogger("mechanize")
+if logger.level is logging.NOTSET:
+    logger.setLevel(logging.CRITICAL)
+del logger
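The comments above note that mechanize re-exports a urllib2 work-alike interface. A hedged sketch of that style of use, mirroring FunctionalTests.test_cookies earlier in this import (the URL is a placeholder):

    # Sketch: urllib2-style interface with cookie handling (Python 2 era syntax).
    import mechanize

    cj = mechanize.CookieJar()
    opener = mechanize.build_opener(
        mechanize.HTTPCookieProcessor(cj),
        mechanize.HTTPRefreshProcessor(max_time=None, honor_time=False),
        mechanize.HTTPEquivProcessor(),
        )
    mechanize.install_opener(opener)
    response = mechanize.urlopen("http://example.com/")   # placeholder URL
    print len(cj), "cookie(s) collected"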

Added: mechanize/tags/0.1.10/mechanize/_auth.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_auth.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_auth.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,522 @@
+"""HTTP Authentication and Proxy support.
+
+All but HTTPProxyPasswordMgr come from Python 2.5.
+
+
+Copyright 2006 John J. Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it under
+the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+import base64
+import copy
+import os
+import posixpath
+import random
+import re
+import time
+import urlparse
+
+try:
+    import hashlib
+except ImportError:
+    import md5
+    import sha
+    def sha1_digest(bytes):
+        return sha.new(bytes).hexdigest()
+    def md5_digest(bytes):
+        return md5.new(bytes).hexdigest()
+else:
+    def sha1_digest(bytes):
+        return hashlib.sha1(bytes).hexdigest()
+    def md5_digest(bytes):
+        return hashlib.md5(bytes).hexdigest()
+
+from urllib2 import BaseHandler, HTTPError, parse_keqv_list, parse_http_list
+from urllib import getproxies, unquote, splittype, splituser, splitpasswd, \
+     splitport
+
+
+def _parse_proxy(proxy):
+    """Return (scheme, user, password, host/port) given a URL or an authority.
+
+    If a URL is supplied, it must have an authority (host:port) component.
+    According to RFC 3986, having an authority component means the URL must
+    have two slashes after the scheme:
+
+    >>> _parse_proxy('file:/ftp.example.com/')
+    Traceback (most recent call last):
+    ValueError: proxy URL with no authority: 'file:/ftp.example.com/'
+
+    The first three items of the returned tuple may be None.
+
+    Examples of authority parsing:
+
+    >>> _parse_proxy('proxy.example.com')
+    (None, None, None, 'proxy.example.com')
+    >>> _parse_proxy('proxy.example.com:3128')
+    (None, None, None, 'proxy.example.com:3128')
+
+    The authority component may optionally include userinfo (assumed to be
+    username:password):
+
+    >>> _parse_proxy('joe:password at proxy.example.com')
+    (None, 'joe', 'password', 'proxy.example.com')
+    >>> _parse_proxy('joe:password at proxy.example.com:3128')
+    (None, 'joe', 'password', 'proxy.example.com:3128')
+
+    Same examples, but with URLs instead:
+
+    >>> _parse_proxy('http://proxy.example.com/')
+    ('http', None, None, 'proxy.example.com')
+    >>> _parse_proxy('http://proxy.example.com:3128/')
+    ('http', None, None, 'proxy.example.com:3128')
+    >>> _parse_proxy('http://joe:password@proxy.example.com/')
+    ('http', 'joe', 'password', 'proxy.example.com')
+    >>> _parse_proxy('http://joe:password@proxy.example.com:3128')
+    ('http', 'joe', 'password', 'proxy.example.com:3128')
+
+    Everything after the authority is ignored:
+
+    >>> _parse_proxy('ftp://joe:password@proxy.example.com/rubbish:3128')
+    ('ftp', 'joe', 'password', 'proxy.example.com')
+
+    Test for no trailing '/' case:
+
+    >>> _parse_proxy('http://joe:password@proxy.example.com')
+    ('http', 'joe', 'password', 'proxy.example.com')
+
+    """
+    scheme, r_scheme = splittype(proxy)
+    if not r_scheme.startswith("/"):
+        # authority
+        scheme = None
+        authority = proxy
+    else:
+        # URL
+        if not r_scheme.startswith("//"):
+            raise ValueError("proxy URL with no authority: %r" % proxy)
+        # We have an authority, so for RFC 3986-compliant URLs (by ss 3.2
+        # and 3.3), path is empty or starts with '/'
+        end = r_scheme.find("/", 2)
+        if end == -1:
+            end = None
+        authority = r_scheme[2:end]
+    userinfo, hostport = splituser(authority)
+    if userinfo is not None:
+        user, password = splitpasswd(userinfo)
+    else:
+        user = password = None
+    return scheme, user, password, hostport
+
+class ProxyHandler(BaseHandler):
+    # Proxies must be in front
+    handler_order = 100
+
+    def __init__(self, proxies=None):
+        if proxies is None:
+            proxies = getproxies()
+        assert hasattr(proxies, 'has_key'), "proxies must be a mapping"
+        self.proxies = proxies
+        for type, url in proxies.items():
+            setattr(self, '%s_open' % type,
+                    lambda r, proxy=url, type=type, meth=self.proxy_open: \
+                    meth(r, proxy, type))
+
+    def proxy_open(self, req, proxy, type):
+        orig_type = req.get_type()
+        proxy_type, user, password, hostport = _parse_proxy(proxy)
+        if proxy_type is None:
+            proxy_type = orig_type
+        if user and password:
+            user_pass = '%s:%s' % (unquote(user), unquote(password))
+            creds = base64.encodestring(user_pass).strip()
+            req.add_header('Proxy-authorization', 'Basic ' + creds)
+        hostport = unquote(hostport)
+        req.set_proxy(hostport, proxy_type)
+        if orig_type == proxy_type:
+            # let other handlers take care of it
+            return None
+        else:
+            # need to start over, because the other handlers don't
+            # grok the proxy's URL type
+            # e.g. if we have a constructor arg proxies like so:
+            # {'http': 'ftp://proxy.example.com'}, we may end up turning
+            # a request for http://acme.example.com/a into one for
+            # ftp://proxy.example.com/a
+            return self.parent.open(req)
+
+class HTTPPasswordMgr:
+
+    def __init__(self):
+        self.passwd = {}
+
+    def add_password(self, realm, uri, user, passwd):
+        # uri could be a single URI or a sequence
+        if isinstance(uri, basestring):
+            uri = [uri]
+        if not realm in self.passwd:
+            self.passwd[realm] = {}
+        for default_port in True, False:
+            reduced_uri = tuple(
+                [self.reduce_uri(u, default_port) for u in uri])
+            self.passwd[realm][reduced_uri] = (user, passwd)
+
+    def find_user_password(self, realm, authuri):
+        domains = self.passwd.get(realm, {})
+        for default_port in True, False:
+            reduced_authuri = self.reduce_uri(authuri, default_port)
+            for uris, authinfo in domains.iteritems():
+                for uri in uris:
+                    if self.is_suburi(uri, reduced_authuri):
+                        return authinfo
+        return None, None
+
+    def reduce_uri(self, uri, default_port=True):
+        """Accept authority or URI and extract only the authority and path."""
+        # note HTTP URLs do not have a userinfo component
+        parts = urlparse.urlsplit(uri)
+        if parts[1]:
+            # URI
+            scheme = parts[0]
+            authority = parts[1]
+            path = parts[2] or '/'
+        else:
+            # host or host:port
+            scheme = None
+            authority = uri
+            path = '/'
+        host, port = splitport(authority)
+        if default_port and port is None and scheme is not None:
+            dport = {"http": 80,
+                     "https": 443,
+                     }.get(scheme)
+            if dport is not None:
+                authority = "%s:%d" % (host, dport)
+        return authority, path
+
+    def is_suburi(self, base, test):
+        """Check if test is below base in a URI tree
+
+        Both args must be URIs in reduced form.
+        """
+        if base == test:
+            return True
+        if base[0] != test[0]:
+            return False
+        common = posixpath.commonprefix((base[1], test[1]))
+        if len(common) == len(base[1]):
+            return True
+        return False
+
+
+class HTTPPasswordMgrWithDefaultRealm(HTTPPasswordMgr):
+
+    def find_user_password(self, realm, authuri):
+        user, password = HTTPPasswordMgr.find_user_password(self, realm,
+                                                            authuri)
+        if user is not None:
+            return user, password
+        return HTTPPasswordMgr.find_user_password(self, None, authuri)
+
+
+class AbstractBasicAuthHandler:
+
+    rx = re.compile('[ \t]*([^ \t]+)[ \t]+realm="([^"]*)"', re.I)
+
+    # XXX there can actually be multiple auth-schemes in a
+    # www-authenticate header.  should probably be a lot more careful
+    # in parsing them to extract multiple alternatives
+
+    def __init__(self, password_mgr=None):
+        if password_mgr is None:
+            password_mgr = HTTPPasswordMgr()
+        self.passwd = password_mgr
+        self.add_password = self.passwd.add_password
+
+    def http_error_auth_reqed(self, authreq, host, req, headers):
+        # host may be an authority (without userinfo) or a URL with an
+        # authority
+        # XXX could be multiple headers
+        authreq = headers.get(authreq, None)
+        if authreq:
+            mo = AbstractBasicAuthHandler.rx.search(authreq)
+            if mo:
+                scheme, realm = mo.groups()
+                if scheme.lower() == 'basic':
+                    return self.retry_http_basic_auth(host, req, realm)
+
+    def retry_http_basic_auth(self, host, req, realm):
+        user, pw = self.passwd.find_user_password(realm, host)
+        if pw is not None:
+            raw = "%s:%s" % (user, pw)
+            auth = 'Basic %s' % base64.encodestring(raw).strip()
+            if req.headers.get(self.auth_header, None) == auth:
+                return None
+            newreq = copy.copy(req)
+            newreq.add_header(self.auth_header, auth)
+            newreq.visit = False
+            return self.parent.open(newreq)
+        else:
+            return None
+
+
+class HTTPBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler):
+
+    auth_header = 'Authorization'
+
+    def http_error_401(self, req, fp, code, msg, headers):
+        url = req.get_full_url()
+        return self.http_error_auth_reqed('www-authenticate',
+                                          url, req, headers)
+
+
+class ProxyBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler):
+
+    auth_header = 'Proxy-authorization'
+
+    def http_error_407(self, req, fp, code, msg, headers):
+        # http_error_auth_reqed requires that there is no userinfo component in
+        # authority.  Assume there isn't one, since urllib2 does not (and
+        # should not, RFC 3986 s. 3.2.1) support requests for URLs containing
+        # userinfo.
+        authority = req.get_host()
+        return self.http_error_auth_reqed('proxy-authenticate',
+                                          authority, req, headers)
+
+
+def randombytes(n):
+    """Return n random bytes."""
+    # Use /dev/urandom if it is available.  Fall back to random module
+    # if not.  It might be worthwhile to extend this function to use
+    # other platform-specific mechanisms for getting random bytes.
+    if os.path.exists("/dev/urandom"):
+        f = open("/dev/urandom")
+        s = f.read(n)
+        f.close()
+        return s
+    else:
+        L = [chr(random.randrange(0, 256)) for i in range(n)]
+        return "".join(L)
+
+class AbstractDigestAuthHandler:
+    # Digest authentication is specified in RFC 2617.
+
+    # XXX The client does not inspect the Authentication-Info header
+    # in a successful response.
+
+    # XXX It should be possible to test this implementation against
+    # a mock server that just generates a static set of challenges.
+
+    # XXX qop="auth-int" supports is shaky
+
+    def __init__(self, passwd=None):
+        if passwd is None:
+            passwd = HTTPPasswordMgr()
+        self.passwd = passwd
+        self.add_password = self.passwd.add_password
+        self.retried = 0
+        self.nonce_count = 0
+
+    def reset_retry_count(self):
+        self.retried = 0
+
+    def http_error_auth_reqed(self, auth_header, host, req, headers):
+        authreq = headers.get(auth_header, None)
+        if self.retried > 5:
+            # Don't fail endlessly - if we failed once, we'll probably
+            # fail a second time. Hm. Unless the Password Manager is
+            # prompting for the information. Crap. This isn't great
+            # but it's better than the current 'repeat until recursion
+            # depth exceeded' approach <wink>
+            raise HTTPError(req.get_full_url(), 401, "digest auth failed",
+                            headers, None)
+        else:
+            self.retried += 1
+        if authreq:
+            scheme = authreq.split()[0]
+            if scheme.lower() == 'digest':
+                return self.retry_http_digest_auth(req, authreq)
+
+    def retry_http_digest_auth(self, req, auth):
+        token, challenge = auth.split(' ', 1)
+        chal = parse_keqv_list(parse_http_list(challenge))
+        auth = self.get_authorization(req, chal)
+        if auth:
+            auth_val = 'Digest %s' % auth
+            if req.headers.get(self.auth_header, None) == auth_val:
+                return None
+            newreq = copy.copy(req)
+            newreq.add_unredirected_header(self.auth_header, auth_val)
+            newreq.visit = False
+            return self.parent.open(newreq)
+
+    def get_cnonce(self, nonce):
+        # The cnonce-value is an opaque
+        # quoted string value provided by the client and used by both client
+        # and server to avoid chosen plaintext attacks, to provide mutual
+        # authentication, and to provide some message integrity protection.
+        # This isn't a fabulous effort, but it's probably Good Enough.
+        dig = sha1_digest("%s:%s:%s:%s" % (self.nonce_count, nonce,
+                                           time.ctime(), randombytes(8)))
+        return dig[:16]
+
+    def get_authorization(self, req, chal):
+        try:
+            realm = chal['realm']
+            nonce = chal['nonce']
+            qop = chal.get('qop')
+            algorithm = chal.get('algorithm', 'MD5')
+            # mod_digest doesn't send an opaque, even though it isn't
+            # supposed to be optional
+            opaque = chal.get('opaque', None)
+        except KeyError:
+            return None
+
+        H, KD = self.get_algorithm_impls(algorithm)
+        if H is None:
+            return None
+
+        user, pw = self.passwd.find_user_password(realm, req.get_full_url())
+        if user is None:
+            return None
+
+        # XXX not implemented yet
+        if req.has_data():
+            entdig = self.get_entity_digest(req.get_data(), chal)
+        else:
+            entdig = None
+
+        A1 = "%s:%s:%s" % (user, realm, pw)
+        A2 = "%s:%s" % (req.get_method(),
+                        # XXX selector: what about proxies and full urls
+                        req.get_selector())
+        if qop == 'auth':
+            self.nonce_count += 1
+            ncvalue = '%08x' % self.nonce_count
+            cnonce = self.get_cnonce(nonce)
+            noncebit = "%s:%s:%s:%s:%s" % (nonce, ncvalue, cnonce, qop, H(A2))
+            respdig = KD(H(A1), noncebit)
+        elif qop is None:
+            respdig = KD(H(A1), "%s:%s" % (nonce, H(A2)))
+        else:
+            # XXX handle auth-int.
+            pass
+
+        # XXX should the partial digests be encoded too?
+
+        base = 'username="%s", realm="%s", nonce="%s", uri="%s", ' \
+               'response="%s"' % (user, realm, nonce, req.get_selector(),
+                                  respdig)
+        if opaque:
+            base += ', opaque="%s"' % opaque
+        if entdig:
+            base += ', digest="%s"' % entdig
+        base += ', algorithm="%s"' % algorithm
+        if qop:
+            base += ', qop=auth, nc=%s, cnonce="%s"' % (ncvalue, cnonce)
+        return base
+
+    def get_algorithm_impls(self, algorithm):
+        # lambdas assume digest modules are imported at the top level
+        if algorithm == 'MD5':
+            H = md5_digest
+        elif algorithm == 'SHA':
+            H = sha1_digest
+        # XXX MD5-sess
+        KD = lambda s, d: H("%s:%s" % (s, d))
+        return H, KD
+
+    def get_entity_digest(self, data, chal):
+        # XXX not implemented yet
+        return None
+
+
+class HTTPDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler):
+    """An authentication protocol defined by RFC 2069
+
+    Digest authentication improves on basic authentication because it
+    does not transmit passwords in the clear.
+    """
+
+    auth_header = 'Authorization'
+    handler_order = 490
+
+    def http_error_401(self, req, fp, code, msg, headers):
+        host = urlparse.urlparse(req.get_full_url())[1]
+        retry = self.http_error_auth_reqed('www-authenticate',
+                                           host, req, headers)
+        self.reset_retry_count()
+        return retry
+
+
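+# Hypothetical usage sketch, not part of the vendored module.  It assumes the
+# urllib2-style opener interface that mechanize mirrors; the URL, realm and
+# credentials are made up.
+def _sketch_digest_handler_usage():
+    import mechanize
+    handler = mechanize.HTTPDigestAuthHandler()
+    handler.add_password("example realm", "http://example.com/protected/",
+                         "user", "password")
+    opener = mechanize.build_opener(handler)
+    return opener.open("http://example.com/protected/")
+
+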
+class ProxyDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler):
+
+    auth_header = 'Proxy-Authorization'
+    handler_order = 490
+
+    def http_error_407(self, req, fp, code, msg, headers):
+        host = req.get_host()
+        retry = self.http_error_auth_reqed('proxy-authenticate',
+                                           host, req, headers)
+        self.reset_retry_count()
+        return retry
+
+
+# XXX ugly implementation, should probably not bother deriving
+class HTTPProxyPasswordMgr(HTTPPasswordMgr):
+    # has default realm and host/port
+    def add_password(self, realm, uri, user, passwd):
+        # uri could be a single URI or a sequence
+        if uri is None or isinstance(uri, basestring):
+            uris = [uri]
+        else:
+            uris = uri
+        passwd_by_domain = self.passwd.setdefault(realm, {})
+        for uri in uris:
+            for default_port in True, False:
+                reduced_uri = self.reduce_uri(uri, default_port)
+                passwd_by_domain[reduced_uri] = (user, passwd)
+
+    def find_user_password(self, realm, authuri):
+        attempts = [(realm, authuri), (None, authuri)]
+        # bleh, want default realm to take precedence over default
+        # URI/authority, hence this outer loop
+        for default_uri in False, True:
+            for realm, authuri in attempts:
+                authinfo_by_domain = self.passwd.get(realm, {})
+                for default_port in True, False:
+                    reduced_authuri = self.reduce_uri(authuri, default_port)
+                    for uri, authinfo in authinfo_by_domain.iteritems():
+                        if uri is None and not default_uri:
+                            continue
+                        if self.is_suburi(uri, reduced_authuri):
+                            return authinfo
+                    user, password = None, None
+
+                    if user is not None:
+                        break
+        return user, password
+
+    def reduce_uri(self, uri, default_port=True):
+        if uri is None:
+            return None
+        return HTTPPasswordMgr.reduce_uri(self, uri, default_port)
+
+    def is_suburi(self, base, test):
+        if base is None:
+            # default to the proxy's host/port
+            hostport, path = test
+            base = (hostport, "/")
+        return HTTPPasswordMgr.is_suburi(self, base, test)
+
+
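+# Hypothetical sketch, not part of the vendored module, of the "default realm
+# and host/port" behaviour noted above.  Host and credentials are made up.
+def _sketch_proxy_password_defaults():
+    mgr = HTTPProxyPasswordMgr()
+    # Passing None for realm and URI registers a catch-all entry.
+    mgr.add_password(None, None, "proxyuser", "proxypass")
+    # A lookup for any realm/proxy then falls back to that default entry.
+    return mgr.find_user_password("some realm", "proxy.example.com:3128")
+
+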
+class HTTPSClientCertMgr(HTTPPasswordMgr):
+    # implementation inheritance: this is not a proper subclass
+    def add_key_cert(self, uri, key_file, cert_file):
+        self.add_password(None, uri, key_file, cert_file)
+    def find_key_cert(self, authuri):
+        return HTTPPasswordMgr.find_user_password(self, None, authuri)
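+
+
+# Hypothetical usage sketch, not part of the vendored module; file names and
+# URLs are made up.
+def _sketch_client_cert_usage():
+    mgr = HTTPSClientCertMgr()
+    mgr.add_key_cert("https://secure.example.com/", "client.key", "client.crt")
+    # URIs under the registered prefix resolve to this key/certificate pair.
+    key_file, cert_file = mgr.find_key_cert("https://secure.example.com/api")
+    return key_file, cert_file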

Added: mechanize/tags/0.1.10/mechanize/_beautifulsoup.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_beautifulsoup.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_beautifulsoup.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,1080 @@
+"""Beautiful Soup
+Elixir and Tonic
+"The Screen-Scraper's Friend"
+v2.1.1
+http://www.crummy.com/software/BeautifulSoup/
+
+Beautiful Soup parses arbitrarily invalid XML- or HTML-like substance
+into a tree representation. It provides methods and Pythonic idioms
+that make it easy to search and modify the tree.
+
+A well-formed XML/HTML document will yield a well-formed data
+structure. An ill-formed XML/HTML document will yield a
+correspondingly ill-formed data structure. If your document is only
+locally well-formed, you can use this library to find and process the
+well-formed part of it. The BeautifulSoup class has heuristics for
+obtaining a sensible parse tree in the face of common HTML errors.
+
+Beautiful Soup has no external dependencies. It works with Python 2.2
+and up.
+
+Beautiful Soup defines classes for four different parsing strategies:
+
+ * BeautifulStoneSoup, for parsing XML, SGML, or your domain-specific
+   language that kind of looks like XML.
+
+ * BeautifulSoup, for parsing run-of-the-mill HTML code, be it valid
+   or invalid.
+
+ * ICantBelieveItsBeautifulSoup, for parsing valid but bizarre HTML
+   that trips up BeautifulSoup.
+
+ * BeautifulSOAP, for making it easier to parse XML documents that use
+   lots of subelements containing a single string, where you'd prefer
+   they put that string into an attribute (such as SOAP messages).
+
+You can subclass BeautifulStoneSoup or BeautifulSoup to create a
+parsing strategy specific to an XML schema or a particular bizarre
+HTML document. Typically your subclass would just override
+SELF_CLOSING_TAGS and/or NESTABLE_TAGS.
+""" #"
+from __future__ import generators
+
+__author__ = "Leonard Richardson (leonardr at segfault.org)"
+__version__ = "2.1.1"
+__date__ = "$Date: 2004/10/18 00:14:20 $"
+__copyright__ = "Copyright (c) 2004-2005 Leonard Richardson"
+__license__ = "PSF"
+
+from sgmllib import SGMLParser, SGMLParseError
+import types
+import re
+import sgmllib
+
+#This code makes Beautiful Soup able to parse XML with namespaces
+sgmllib.tagfind = re.compile('[a-zA-Z][-_.:a-zA-Z0-9]*')
+
+class NullType(object):
+
+    """Similar to NoneType with a corresponding singleton instance
+    'Null' that, unlike None, accepts any message and returns itself.
+
+    Examples:
+    >>> Null("send", "a", "message")("and one more",
+    ...      "and what you get still") is Null
+    True
+    """
+
+    def __new__(cls):                    return Null
+    def __call__(self, *args, **kwargs): return Null
+##    def __getstate__(self, *args):       return Null
+    def __getattr__(self, attr):         return Null
+    def __getitem__(self, item):         return Null
+    def __setattr__(self, attr, value):  pass
+    def __setitem__(self, item, value):  pass
+    def __len__(self):                   return 0
+    # FIXME: is this a python bug? otherwise ``for x in Null: pass``
+    #        never terminates...
+    def __iter__(self):                  return iter([])
+    def __contains__(self, item):        return False
+    def __repr__(self):                  return "Null"
+Null = object.__new__(NullType)
+
+class PageElement:
+    """Contains the navigational information for some part of the page
+    (either a tag or a piece of text)"""
+
+    def setup(self, parent=Null, previous=Null):
+        """Sets up the initial relations between this element and
+        other elements."""
+        self.parent = parent
+        self.previous = previous
+        self.next = Null
+        self.previousSibling = Null
+        self.nextSibling = Null
+        if self.parent and self.parent.contents:
+            self.previousSibling = self.parent.contents[-1]
+            self.previousSibling.nextSibling = self
+
+    def findNext(self, name=None, attrs={}, text=None):
+        """Returns the first item that matches the given criteria and
+        appears after this Tag in the document."""
+        return self._first(self.fetchNext, name, attrs, text)
+    firstNext = findNext
+
+    def fetchNext(self, name=None, attrs={}, text=None, limit=None):
+        """Returns all items that match the given criteria and appear
+        after this Tag in the document."""
+        return self._fetch(name, attrs, text, limit, self.nextGenerator)
+
+    def findNextSibling(self, name=None, attrs={}, text=None):
+        """Returns the closest sibling to this Tag that matches the
+        given criteria and appears after this Tag in the document."""
+        return self._first(self.fetchNextSiblings, name, attrs, text)
+    firstNextSibling = findNextSibling
+
+    def fetchNextSiblings(self, name=None, attrs={}, text=None, limit=None):
+        """Returns the siblings of this Tag that match the given
+        criteria and appear after this Tag in the document."""
+        return self._fetch(name, attrs, text, limit, self.nextSiblingGenerator)
+
+    def findPrevious(self, name=None, attrs={}, text=None):
+        """Returns the first item that matches the given criteria and
+        appears before this Tag in the document."""
+        return self._first(self.fetchPrevious, name, attrs, text)
+
+    def fetchPrevious(self, name=None, attrs={}, text=None, limit=None):
+        """Returns all items that match the given criteria and appear
+        before this Tag in the document."""
+        return self._fetch(name, attrs, text, limit, self.previousGenerator)
+    firstPrevious = findPrevious
+
+    def findPreviousSibling(self, name=None, attrs={}, text=None):
+        """Returns the closest sibling to this Tag that matches the
+        given criteria and appears before this Tag in the document."""
+        return self._first(self.fetchPreviousSiblings, name, attrs, text)
+    firstPreviousSibling = findPreviousSibling
+
+    def fetchPreviousSiblings(self, name=None, attrs={}, text=None,
+                              limit=None):
+        """Returns the siblings of this Tag that match the given
+        criteria and appear before this Tag in the document."""
+        return self._fetch(name, attrs, text, limit,
+                           self.previousSiblingGenerator)
+
+    def findParent(self, name=None, attrs={}):
+        """Returns the closest parent of this Tag that matches the given
+        criteria."""
+        r = Null
+        l = self.fetchParents(name, attrs, 1)
+        if l:
+            r = l[0]
+        return r
+    firstParent = findParent
+
+    def fetchParents(self, name=None, attrs={}, limit=None):
+        """Returns the parents of this Tag that match the given
+        criteria."""
+        return self._fetch(name, attrs, None, limit, self.parentGenerator)
+
+    #These methods do the real heavy lifting.
+
+    def _first(self, method, name, attrs, text):
+        r = Null
+        l = method(name, attrs, text, 1)
+        if l:
+            r = l[0]
+        return r
+    
+    def _fetch(self, name, attrs, text, limit, generator):
+        "Iterates over a generator looking for things that match."
+        if not hasattr(attrs, 'items'):
+            attrs = {'class' : attrs}
+
+        results = []
+        g = generator()
+        while True:
+            try:
+                i = g.next()
+            except StopIteration:
+                break
+            found = None
+            if isinstance(i, Tag):
+                if not text:
+                    if not name or self._matches(i, name):
+                        match = True
+                        for attr, matchAgainst in attrs.items():
+                            check = i.get(attr)
+                            if not self._matches(check, matchAgainst):
+                                match = False
+                                break
+                        if match:
+                            found = i
+            elif text:
+                if self._matches(i, text):
+                    found = i                    
+            if found:
+                results.append(found)
+                if limit and len(results) >= limit:
+                    break
+        return results
+
+    #Generators that can be used to navigate starting from both
+    #NavigableTexts and Tags.                
+    def nextGenerator(self):
+        i = self
+        while i:
+            i = i.next
+            yield i
+
+    def nextSiblingGenerator(self):
+        i = self
+        while i:
+            i = i.nextSibling
+            yield i
+
+    def previousGenerator(self):
+        i = self
+        while i:
+            i = i.previous
+            yield i
+
+    def previousSiblingGenerator(self):
+        i = self
+        while i:
+            i = i.previousSibling
+            yield i
+
+    def parentGenerator(self):
+        i = self
+        while i:
+            i = i.parent
+            yield i
+
+    def _matches(self, chunk, howToMatch):
+        #print 'looking for %s in %s' % (howToMatch, chunk)
+        #
+        # If given a list of items, return true if the list contains a
+        # text element that matches.
+        if isList(chunk) and not isinstance(chunk, Tag):
+            for tag in chunk:
+                if isinstance(tag, NavigableText) and self._matches(tag, howToMatch):
+                    return True
+            return False
+        if callable(howToMatch):
+            return howToMatch(chunk)
+        if isinstance(chunk, Tag):
+            #Custom match methods take the tag as an argument, but all other
+            #ways of matching match the tag name as a string
+            chunk = chunk.name
+        #Now we know that chunk is a string
+        if not isinstance(chunk, basestring):
+            chunk = str(chunk)
+        if hasattr(howToMatch, 'match'):
+            # It's a regexp object.
+            return howToMatch.search(chunk)
+        if isList(howToMatch):
+            return chunk in howToMatch
+        if hasattr(howToMatch, 'items'):
+            return howToMatch.has_key(chunk)
+        #It's just a string
+        return str(howToMatch) == chunk
+
+class NavigableText(PageElement):
+
+    def __getattr__(self, attr):
+        "For backwards compatibility, text.string gives you text"
+        if attr == 'string':
+            return self
+        else:
+            raise AttributeError, "'%s' object has no attribute '%s'" % (self.__class__.__name__, attr)
+        
+class NavigableString(str, NavigableText):
+    pass
+
+class NavigableUnicodeString(unicode, NavigableText):
+    pass
+
+class Tag(PageElement):
+
+    """Represents a found HTML tag with its attributes and contents."""
+
+    def __init__(self, name, attrs=None, parent=Null, previous=Null):
+        "Basic constructor."
+        self.name = name
+        if attrs == None:
+            attrs = []
+        self.attrs = attrs
+        self.contents = []
+        self.setup(parent, previous)
+        self.hidden = False
+
+    def get(self, key, default=None):
+        """Returns the value of the 'key' attribute for the tag, or
+        the value given for 'default' if it doesn't have that
+        attribute."""
+        return self._getAttrMap().get(key, default)    
+
+    def __getitem__(self, key):
+        """tag[key] returns the value of the 'key' attribute for the tag,
+        and throws an exception if it's not there."""
+        return self._getAttrMap()[key]
+
+    def __iter__(self):
+        "Iterating over a tag iterates over its contents."
+        return iter(self.contents)
+
+    def __len__(self):
+        "The length of a tag is the length of its list of contents."
+        return len(self.contents)
+
+    def __contains__(self, x):
+        return x in self.contents
+
+    def __nonzero__(self):
+        "A tag is non-None even if it has no contents."
+        return True
+
+    def __setitem__(self, key, value):        
+        """Setting tag[key] sets the value of the 'key' attribute for the
+        tag."""
+        self._getAttrMap()
+        self.attrMap[key] = value
+        found = False
+        for i in range(0, len(self.attrs)):
+            if self.attrs[i][0] == key:
+                self.attrs[i] = (key, value)
+                found = True
+        if not found:
+            self.attrs.append((key, value))
+        self._getAttrMap()[key] = value
+
+    def __delitem__(self, key):
+        "Deleting tag[key] deletes all 'key' attributes for the tag."
+        for item in self.attrs:
+            if item[0] == key:
+                self.attrs.remove(item)
+                #We don't break because bad HTML can define the same
+                #attribute multiple times.
+            self._getAttrMap()
+            if self.attrMap.has_key(key):
+                del self.attrMap[key]
+
+    def __call__(self, *args, **kwargs):
+        """Calling a tag like a function is the same as calling its
+        fetch() method. Eg. tag('a') returns a list of all the A tags
+        found within this tag."""
+        return apply(self.fetch, args, kwargs)
+
+    def __getattr__(self, tag):
+        if len(tag) > 3 and tag.rfind('Tag') == len(tag)-3:
+            return self.first(tag[:-3])
+        elif tag.find('__') != 0:
+            return self.first(tag)
+
+    def __eq__(self, other):
+        """Returns true iff this tag has the same name, the same attributes,
+        and the same contents (recursively) as the given tag.
+
+        NOTE: right now this will return false if two tags have the
+        same attributes in a different order. Should this be fixed?"""
+        if not hasattr(other, 'name') or not hasattr(other, 'attrs') or not hasattr(other, 'contents') or self.name != other.name or self.attrs != other.attrs or len(self) != len(other):
+            return False
+        for i in range(0, len(self.contents)):
+            if self.contents[i] != other.contents[i]:
+                return False
+        return True
+
+    def __ne__(self, other):
+        """Returns true iff this tag is not identical to the other tag,
+        as defined in __eq__."""
+        return not self == other
+
+    def __repr__(self):
+        """Renders this tag as a string."""
+        return str(self)
+
+    def __unicode__(self):
+        return self.__str__(1)
+
+    def __str__(self, needUnicode=None, showStructureIndent=None):
+        """Returns a string or Unicode representation of this tag and
+        its contents.
+
+        NOTE: since Python's HTML parser consumes whitespace, this
+        method is not certain to reproduce the whitespace present in
+        the original string."""
+        
+        attrs = []
+        if self.attrs:
+            for key, val in self.attrs:
+                attrs.append('%s="%s"' % (key, val))
+        close = ''
+        closeTag = ''
+        if self.isSelfClosing():
+            close = ' /'
+        else:
+            closeTag = '</%s>' % self.name
+        indentIncrement = None        
+        if showStructureIndent != None:
+            indentIncrement = showStructureIndent
+            if not self.hidden:
+                indentIncrement += 1
+        contents = self.renderContents(indentIncrement, needUnicode=needUnicode)        
+        if showStructureIndent:
+            space = '\n%s' % (' ' * showStructureIndent)
+        if self.hidden:
+            s = contents
+        else:
+            s = []
+            attributeString = ''
+            if attrs:
+                attributeString = ' ' + ' '.join(attrs)            
+            if showStructureIndent:
+                s.append(space)
+            s.append('<%s%s%s>' % (self.name, attributeString, close))
+            s.append(contents)
+            if closeTag and showStructureIndent != None:
+                s.append(space)
+            s.append(closeTag)
+            s = ''.join(s)
+        isUnicode = type(s) == types.UnicodeType
+        if needUnicode and not isUnicode:
+            s = unicode(s)
+        elif isUnicode and needUnicode==False:
+            s = str(s)
+        return s
+
+    def prettify(self, needUnicode=None):
+        return self.__str__(needUnicode, showStructureIndent=True)
+
+    def renderContents(self, showStructureIndent=None, needUnicode=None):
+        """Renders the contents of this tag as a (possibly Unicode) 
+        string."""
+        s=[]
+        for c in self:
+            text = None
+            if isinstance(c, NavigableUnicodeString) or type(c) == types.UnicodeType:
+                text = unicode(c)
+            elif isinstance(c, Tag):
+                s.append(c.__str__(needUnicode, showStructureIndent))
+            elif needUnicode:
+                text = unicode(c)
+            else:
+                text = str(c)
+            if text:
+                if showStructureIndent != None:
+                    if text[-1] == '\n':
+                        text = text[:-1]
+                s.append(text)
+        return ''.join(s)    
+
+    #Soup methods
+
+    def firstText(self, text, recursive=True):
+        """Convenience method to retrieve the first piece of text matching the
+        given criteria. 'text' can be a string, a regular expression object,
+        a callable that takes a string and returns whether or not the
+        string 'matches', etc."""
+        return self.first(recursive=recursive, text=text)
+
+    def fetchText(self, text, recursive=True, limit=None):
+        """Convenience method to retrieve all pieces of text matching the
+        given criteria. 'text' can be a string, a regular expression object,
+        a callable that takes a string and returns whether or not the
+        string 'matches', etc."""
+        return self.fetch(recursive=recursive, text=text, limit=limit)
+
+    def first(self, name=None, attrs={}, recursive=True, text=None):
+        """Return only the first child of this
+        Tag matching the given criteria."""
+        r = Null
+        l = self.fetch(name, attrs, recursive, text, 1)
+        if l:
+            r = l[0]
+        return r
+    findChild = first
+
+    def fetch(self, name=None, attrs={}, recursive=True, text=None,
+              limit=None):
+        """Extracts a list of Tag objects that match the given
+        criteria.  You can specify the name of the Tag and any
+        attributes you want the Tag to have.
+
+        The value of a key-value pair in the 'attrs' map can be a
+        string, a list of strings, a regular expression object, or a
+        callable that takes a string and returns whether or not the
+        string matches for some custom definition of 'matches'. The
+        same is true of the tag name."""
+        generator = self.recursiveChildGenerator
+        if not recursive:
+            generator = self.childGenerator
+        return self._fetch(name, attrs, text, limit, generator)
+    fetchChildren = fetch
+    
+    #Utility methods
+
+    def isSelfClosing(self):
+        """Returns true iff this is a self-closing tag as defined in the HTML
+        standard.
+
+        TODO: This is specific to BeautifulSoup and its subclasses, but it's
+        used by __str__"""
+        return self.name in BeautifulSoup.SELF_CLOSING_TAGS
+
+    def append(self, tag):
+        """Appends the given tag to the contents of this tag."""
+        self.contents.append(tag)
+
+    #Private methods
+
+    def _getAttrMap(self):
+        """Initializes a map representation of this tag's attributes,
+        if not already initialized."""
+        if not getattr(self, 'attrMap'):
+            self.attrMap = {}
+            for (key, value) in self.attrs:
+                self.attrMap[key] = value 
+        return self.attrMap
+
+    #Generator methods
+    def childGenerator(self):
+        for i in range(0, len(self.contents)):
+            yield self.contents[i]
+        raise StopIteration
+    
+    def recursiveChildGenerator(self):
+        stack = [(self, 0)]
+        while stack:
+            tag, start = stack.pop()
+            if isinstance(tag, Tag):            
+                for i in range(start, len(tag.contents)):
+                    a = tag.contents[i]
+                    yield a
+                    if isinstance(a, Tag) and tag.contents:
+                        if i < len(tag.contents) - 1:
+                            stack.append((tag, i+1))
+                        stack.append((a, 0))
+                        break
+        raise StopIteration
+
+
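+# Illustrative sketch, not part of the vendored module, of the first()/fetch()
+# matching rules described above.  The markup is made up, and BeautifulSoup is
+# defined further down in this file.
+def _sketch_fetch_usage():
+    import re
+    soup = BeautifulSoup('<p class="intro">Hi</p><p>Bye</p><a href="/x">x</a>')
+    first_para = soup.first('p')                          # first <p> tag
+    intro_paras = soup.fetch('p', {'class': 'intro'})     # attribute match
+    links = soup.fetch('a', {'href': re.compile('^/')})   # regexp match
+    return first_para, intro_paras, links
+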
+def isList(l):
+    """Convenience method that works with all 2.x versions of Python
+    to determine whether or not something is listlike."""
+    return hasattr(l, '__iter__') \
+           or (type(l) in (types.ListType, types.TupleType))
+
+def buildTagMap(default, *args):
+    """Turns a list of maps, lists, or scalars into a single map.
+    Used to build the SELF_CLOSING_TAGS and NESTABLE_TAGS maps out
+    of lists and partial maps."""
+    built = {}
+    for portion in args:
+        if hasattr(portion, 'items'):
+            #It's a map. Merge it.
+            for k,v in portion.items():
+                built[k] = v
+        elif isList(portion):
+            #It's a list. Map each item to the default.
+            for k in portion:
+                built[k] = default
+        else:
+            #It's a scalar. Map it to the default.
+            built[portion] = default
+    return built
+
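+# Tiny illustration, not part of the vendored module, of how buildTagMap()
+# merges lists and partial maps.
+def _sketch_build_tag_map():
+    # Lists map each item to the default; dicts are merged as-is, so this
+    # returns {'br': None, 'hr': None, 'table': []}.
+    return buildTagMap(None, ['br', 'hr'], {'table': []})
+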
+class BeautifulStoneSoup(Tag, SGMLParser):
+
+    """This class contains the basic parser and fetch code. It defines
+    a parser that knows nothing about tag behavior except for the
+    following:
+   
+      You can't close a tag without closing all the tags it encloses.
+      That is, "<foo><bar></foo>" actually means
+      "<foo><bar></bar></foo>".
+
+    [Another possible explanation is "<foo><bar /></foo>", but since
+    this class defines no SELF_CLOSING_TAGS, it will never use that
+    explanation.]
+
+    This class is useful for parsing XML or made-up markup languages,
+    or when BeautifulSoup makes an assumption counter to what you were
+    expecting."""
+
+    SELF_CLOSING_TAGS = {}
+    NESTABLE_TAGS = {}
+    RESET_NESTING_TAGS = {}
+    QUOTE_TAGS = {}
+
+    #As a public service we will by default silently replace MS smart quotes
+    #and similar characters with their HTML or ASCII equivalents.
+    MS_CHARS = { '\x80' : '&euro;',
+                 '\x81' : ' ',
+                 '\x82' : '&sbquo;',
+                 '\x83' : '&fnof;',
+                 '\x84' : '&bdquo;',
+                 '\x85' : '&hellip;',
+                 '\x86' : '&dagger;',
+                 '\x87' : '&Dagger;',
+                 '\x88' : '&caret;',
+                 '\x89' : '%',
+                 '\x8A' : '&Scaron;',
+                 '\x8B' : '&lt;',
+                 '\x8C' : '&OElig;',
+                 '\x8D' : '?',
+                 '\x8E' : 'Z',
+                 '\x8F' : '?',
+                 '\x90' : '?',
+                 '\x91' : '&lsquo;',
+                 '\x92' : '&rsquo;',
+                 '\x93' : '&ldquo;',
+                 '\x94' : '&rdquo;',
+                 '\x95' : '&bull;',
+                 '\x96' : '&ndash;',
+                 '\x97' : '&mdash;',
+                 '\x98' : '&tilde;',
+                 '\x99' : '&trade;',
+                 '\x9a' : '&scaron;',
+                 '\x9b' : '&gt;',
+                 '\x9c' : '&oelig;',
+                 '\x9d' : '?',
+                 '\x9e' : 'z',
+                 '\x9f' : '&Yuml;',}
+
+    PARSER_MASSAGE = [(re.compile('(<[^<>]*)/>'),
+                       lambda(x):x.group(1) + ' />'),
+                      (re.compile('<!\s+([^<>]*)>'),
+                       lambda(x):'<!' + x.group(1) + '>'),
+                      (re.compile("([\x80-\x9f])"),
+                       lambda(x): BeautifulStoneSoup.MS_CHARS.get(x.group(1)))
+                      ]
+
+    ROOT_TAG_NAME = '[document]'
+
+    def __init__(self, text=None, avoidParserProblems=True,
+                 initialTextIsEverything=True):
+        """Initialize this as the 'root tag' and feed in any text to
+        the parser.
+
+        NOTE about avoidParserProblems: sgmllib will process most bad
+        HTML, and BeautifulSoup has tricks for dealing with some HTML
+        that kills sgmllib, but Beautiful Soup can nonetheless choke
+        or lose data if your data uses self-closing tags or
+        declarations incorrectly. By default, Beautiful Soup sanitizes
+        its input to avoid the vast majority of these problems. The
+        problems are relatively rare, even in bad HTML, so feel free
+        to pass in False to avoidParserProblems if they don't apply to
+        you, and you'll get better performance. The only reason I have
+        this turned on by default is so I don't get so many tech
+        support questions.
+
+        The two most common instances of invalid HTML that will choke
+        sgmllib are fixed by the default parser massage techniques:
+
+         <br/> (No space between name of closing tag and tag close)
+         <! --Comment--> (Extraneous whitespace in declaration)
+
+        You can pass in a custom list of (RE object, replace method)
+        tuples to get Beautiful Soup to scrub your input the way you
+        want."""
+        Tag.__init__(self, self.ROOT_TAG_NAME)
+        if avoidParserProblems \
+           and not isList(avoidParserProblems):
+            avoidParserProblems = self.PARSER_MASSAGE            
+        self.avoidParserProblems = avoidParserProblems
+        SGMLParser.__init__(self)
+        self.quoteStack = []
+        self.hidden = 1
+        self.reset()
+        if hasattr(text, 'read'):
+            #It's a file-type object.
+            text = text.read()
+        if text:
+            self.feed(text)
+        if initialTextIsEverything:
+            self.done()
+
+    def __getattr__(self, methodName):
+        """This method routes method call requests to either the SGMLParser
+        superclass or the Tag superclass, depending on the method name."""
+        if methodName.find('start_') == 0 or methodName.find('end_') == 0 \
+               or methodName.find('do_') == 0:
+            return SGMLParser.__getattr__(self, methodName)
+        elif methodName.find('__') != 0:
+            return Tag.__getattr__(self, methodName)
+        else:
+            raise AttributeError
+
+    def feed(self, text):
+        if self.avoidParserProblems:
+            for fix, m in self.avoidParserProblems:
+                text = fix.sub(m, text)
+        SGMLParser.feed(self, text)
+
+    def done(self):
+        """Called when you're done parsing, so that the unclosed tags can be
+        correctly processed."""
+        self.endData() #NEW
+        while self.currentTag.name != self.ROOT_TAG_NAME:
+            self.popTag()
+            
+    def reset(self):
+        SGMLParser.reset(self)
+        self.currentData = []
+        self.currentTag = None
+        self.tagStack = []
+        self.pushTag(self)        
+    
+    def popTag(self):
+        tag = self.tagStack.pop()
+        # Tags with just one string-owning child get the child as a
+        # 'string' property, so that soup.tag.string is shorthand for
+        # soup.tag.contents[0]
+        if len(self.currentTag.contents) == 1 and \
+           isinstance(self.currentTag.contents[0], NavigableText):
+            self.currentTag.string = self.currentTag.contents[0]
+
+        #print "Pop", tag.name
+        if self.tagStack:
+            self.currentTag = self.tagStack[-1]
+        return self.currentTag
+
+    def pushTag(self, tag):
+        #print "Push", tag.name
+        if self.currentTag:
+            self.currentTag.append(tag)
+        self.tagStack.append(tag)
+        self.currentTag = self.tagStack[-1]
+
+    def endData(self):
+        currentData = ''.join(self.currentData)
+        if currentData:
+            if not currentData.strip():
+                if '\n' in currentData:
+                    currentData = '\n'
+                else:
+                    currentData = ' '
+            c = NavigableString
+            if type(currentData) == types.UnicodeType:
+                c = NavigableUnicodeString
+            o = c(currentData)
+            o.setup(self.currentTag, self.previous)
+            if self.previous:
+                self.previous.next = o
+            self.previous = o
+            self.currentTag.contents.append(o)
+        self.currentData = []
+
+    def _popToTag(self, name, inclusivePop=True):
+        """Pops the tag stack up to and including the most recent
+        instance of the given tag. If inclusivePop is false, pops the tag
+        stack up to but *not* including the most recent instance of
+        the given tag."""
+        if name == self.ROOT_TAG_NAME:
+            return            
+
+        numPops = 0
+        mostRecentTag = None
+        for i in range(len(self.tagStack)-1, 0, -1):
+            if name == self.tagStack[i].name:
+                numPops = len(self.tagStack)-i
+                break
+        if not inclusivePop:
+            numPops = numPops - 1
+
+        for i in range(0, numPops):
+            mostRecentTag = self.popTag()
+        return mostRecentTag    
+
+    def _smartPop(self, name):
+
+        """We need to pop up to the previous tag of this type, unless
+        one of this tag's nesting reset triggers comes between this
+        tag and the previous tag of this type, OR unless this tag is a
+        generic nesting trigger and another generic nesting trigger
+        comes between this tag and the previous tag of this type.
+
+        Examples:
+         <p>Foo<b>Bar<p> should pop to 'p', not 'b'.
+         <p>Foo<table>Bar<p> should pop to 'table', not 'p'.
+         <p>Foo<table><tr>Bar<p> should pop to 'tr', not 'p'.
+         <p>Foo<b>Bar<p> should pop to 'p', not 'b'.
+
+         <li><ul><li> *<li>* should pop to 'ul', not the first 'li'.
+         <tr><table><tr> *<tr>* should pop to 'table', not the first 'tr'
+         <td><tr><td> *<td>* should pop to 'tr', not the first 'td'
+        """
+
+        nestingResetTriggers = self.NESTABLE_TAGS.get(name)
+        isNestable = nestingResetTriggers != None
+        isResetNesting = self.RESET_NESTING_TAGS.has_key(name)
+        popTo = None
+        inclusive = True
+        for i in range(len(self.tagStack)-1, 0, -1):
+            p = self.tagStack[i]
+            if (not p or p.name == name) and not isNestable:
+                #Non-nestable tags get popped to the top or to their
+                #last occurrence.
+                popTo = name
+                break
+            if (nestingResetTriggers != None
+                and p.name in nestingResetTriggers) \
+                or (nestingResetTriggers == None and isResetNesting
+                    and self.RESET_NESTING_TAGS.has_key(p.name)):
+                
+                #If we encounter one of the nesting reset triggers
+                #peculiar to this tag, or we encounter another tag
+                #that causes nesting to reset, pop up to but not
+                #including that tag.
+
+                popTo = p.name
+                inclusive = False
+                break
+            p = p.parent
+        if popTo:
+            self._popToTag(popTo, inclusive)
+
+    def unknown_starttag(self, name, attrs, selfClosing=0):
+        #print "Start tag %s" % name
+        if self.quoteStack:
+            #This is not a real tag.
+            #print "<%s> is not real!" % name
+            attrs = ''.join(map(lambda(x, y): ' %s="%s"' % (x, y), attrs))
+            self.handle_data('<%s%s>' % (name, attrs))
+            return
+        self.endData()
+        if not name in self.SELF_CLOSING_TAGS and not selfClosing:
+            self._smartPop(name)
+        tag = Tag(name, attrs, self.currentTag, self.previous)        
+        if self.previous:
+            self.previous.next = tag
+        self.previous = tag
+        self.pushTag(tag)
+        if selfClosing or name in self.SELF_CLOSING_TAGS:
+            self.popTag()                
+        if name in self.QUOTE_TAGS:
+            #print "Beginning quote (%s)" % name
+            self.quoteStack.append(name)
+            self.literal = 1
+
+    def unknown_endtag(self, name):
+        if self.quoteStack and self.quoteStack[-1] != name:
+            #This is not a real end tag.
+            #print "</%s> is not real!" % name
+            self.handle_data('</%s>' % name)
+            return
+        self.endData()
+        self._popToTag(name)
+        if self.quoteStack and self.quoteStack[-1] == name:
+            self.quoteStack.pop()
+            self.literal = (len(self.quoteStack) > 0)
+
+    def handle_data(self, data):
+        self.currentData.append(data)
+
+    def handle_pi(self, text):
+        "Propagate processing instructions right through."
+        self.handle_data("<?%s>" % text)
+
+    def handle_comment(self, text):
+        "Propagate comments right through."
+        self.handle_data("<!--%s-->" % text)
+
+    def handle_charref(self, ref):
+        "Propagate char refs right through."
+        self.handle_data('&#%s;' % ref)
+
+    def handle_entityref(self, ref):
+        "Propagate entity refs right through."
+        self.handle_data('&%s;' % ref)
+        
+    def handle_decl(self, data):
+        "Propagate DOCTYPEs and the like right through."
+        self.handle_data('<!%s>' % data)
+
+    def parse_declaration(self, i):
+        """Treat a bogus SGML declaration as raw data. Treat a CDATA
+        declaration as regular data."""
+        j = None
+        if self.rawdata[i:i+9] == '<![CDATA[':
+             k = self.rawdata.find(']]>', i)
+             if k == -1:
+                 k = len(self.rawdata)
+             self.handle_data(self.rawdata[i+9:k])
+             j = k+3
+        else:
+            try:
+                j = SGMLParser.parse_declaration(self, i)
+            except SGMLParseError:
+                toHandle = self.rawdata[i:]
+                self.handle_data(toHandle)
+                j = i + len(toHandle)
+        return j
+
+class BeautifulSoup(BeautifulStoneSoup):
+
+    """This parser knows the following facts about HTML:
+
+    * Some tags have no closing tag and should be interpreted as being
+      closed as soon as they are encountered.
+
+    * The text inside some tags (ie. 'script') may contain tags which
+      are not really part of the document and which should be parsed
+      as text, not tags. If you want to parse the text as tags, you can
+      always fetch it and parse it explicitly.
+
+    * Tag nesting rules:
+
+      Most tags can't be nested at all. For instance, the occurrence of
+      a <p> tag should implicitly close the previous <p> tag.
+
+       <p>Para1<p>Para2
+        should be transformed into:
+       <p>Para1</p><p>Para2
+
+      Some tags can be nested arbitrarily. For instance, the occurrence
+      of a <blockquote> tag should _not_ implicitly close the previous
+      <blockquote> tag.
+
+       Alice said: <blockquote>Bob said: <blockquote>Blah
+        should NOT be transformed into:
+       Alice said: <blockquote>Bob said: </blockquote><blockquote>Blah
+
+      Some tags can be nested, but the nesting is reset by the
+      interposition of other tags. For instance, a <tr> tag should
+      implicitly close the previous <tr> tag within the same <table>,
+      but not close a <tr> tag in another table.
+
+       <table><tr>Blah<tr>Blah
+        should be transformed into:
+       <table><tr>Blah</tr><tr>Blah
+        but,
+       <tr>Blah<table><tr>Blah
+        should NOT be transformed into
+       <tr>Blah<table></tr><tr>Blah
+
+    Differing assumptions about tag nesting rules are a major source
+    of problems with the BeautifulSoup class. If BeautifulSoup is not
+    treating as nestable a tag your page author treats as nestable,
+    try ICantBelieveItsBeautifulSoup before writing your own
+    subclass."""
+
+    SELF_CLOSING_TAGS = buildTagMap(None, ['br' , 'hr', 'input', 'img', 'meta',
+                                           'spacer', 'link', 'frame', 'base'])
+
+    QUOTE_TAGS = {'script': None}
+    
+    #According to the HTML standard, each of these inline tags can
+    #contain another tag of the same type. Furthermore, it's common
+    #to actually use these tags this way.
+    NESTABLE_INLINE_TAGS = ['span', 'font', 'q', 'object', 'bdo', 'sub', 'sup',
+                            'center']
+
+    #According to the HTML standard, these block tags can contain
+    #another tag of the same type. Furthermore, it's common
+    #to actually use these tags this way.
+    NESTABLE_BLOCK_TAGS = ['blockquote', 'div', 'fieldset', 'ins', 'del']
+
+    #Lists can contain other lists, but there are restrictions.    
+    NESTABLE_LIST_TAGS = { 'ol' : [],
+                           'ul' : [],
+                           'li' : ['ul', 'ol'],
+                           'dl' : [],
+                           'dd' : ['dl'],
+                           'dt' : ['dl'] }
+
+    #Tables can contain other tables, but there are restrictions.    
+    NESTABLE_TABLE_TAGS = {'table' : [], 
+                           'tr' : ['table', 'tbody', 'tfoot', 'thead'],
+                           'td' : ['tr'],
+                           'th' : ['tr'],
+                           }
+
+    NON_NESTABLE_BLOCK_TAGS = ['address', 'form', 'p', 'pre']
+
+    #If one of these tags is encountered, all tags up to the next tag of
+    #this type are popped.
+    RESET_NESTING_TAGS = buildTagMap(None, NESTABLE_BLOCK_TAGS, 'noscript',
+                                     NON_NESTABLE_BLOCK_TAGS,
+                                     NESTABLE_LIST_TAGS,
+                                     NESTABLE_TABLE_TAGS)
+
+    NESTABLE_TAGS = buildTagMap([], NESTABLE_INLINE_TAGS, NESTABLE_BLOCK_TAGS,
+                                NESTABLE_LIST_TAGS, NESTABLE_TABLE_TAGS)
+    
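+# Minimal sketch, not part of the vendored module, of the paragraph-nesting
+# rule described in the class docstring above.
+def _sketch_p_nesting():
+    soup = BeautifulSoup('<p>Para1<p>Para2')
+    # The second <p> implicitly closes the first, so this renders roughly as
+    # '<p>Para1</p><p>Para2</p>'.
+    return str(soup)
+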
+class ICantBelieveItsBeautifulSoup(BeautifulSoup):
+
+    """The BeautifulSoup class is oriented towards skipping over
+    common HTML errors like unclosed tags. However, sometimes it makes
+    errors of its own. For instance, consider this fragment:
+
+     <b>Foo<b>Bar</b></b>
+
+    This is perfectly valid (if bizarre) HTML. However, the
+    BeautifulSoup class will implicitly close the first b tag when it
+    encounters the second 'b'. It will think the author wrote
+    "<b>Foo<b>Bar", and didn't close the first 'b' tag, because
+    there's no real-world reason to bold something that's already
+    bold. When it encounters '</b></b>' it will close two more 'b'
+    tags, for a grand total of three tags closed instead of two. This
+    can throw off the rest of your document structure. The same is
+    true of a number of other tags, listed below.
+
+    It's much more common for someone to forget to close (eg.) a 'b'
+    tag than to actually use nested 'b' tags, and the BeautifulSoup
+    class handles the common case. This class handles the
+    not-so-common case: where you can't believe someone wrote what
+    they did, but it's valid HTML and BeautifulSoup screwed up by
+    assuming it wouldn't be.
+
+    If this doesn't do what you need, try subclassing this class or
+    BeautifulSoup, and providing your own list of NESTABLE_TAGS."""
+
+    I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS = \
+     ['em', 'big', 'i', 'small', 'tt', 'abbr', 'acronym', 'strong',
+      'cite', 'code', 'dfn', 'kbd', 'samp', 'strong', 'var', 'b',
+      'big']
+
+    I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS = ['noscript']
+
+    NESTABLE_TAGS = buildTagMap([], BeautifulSoup.NESTABLE_TAGS,
+                                I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS,
+                                I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS)
+
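+# Minimal sketch, not part of the vendored module, contrasting the two parsers
+# on the nested-<b> fragment discussed above.
+def _sketch_nested_b():
+    markup = '<b>Foo<b>Bar</b></b>'
+    # BeautifulSoup assumes the author forgot to close the first <b>;
+    # ICantBelieveItsBeautifulSoup takes the nesting at face value.
+    return str(BeautifulSoup(markup)), str(ICantBelieveItsBeautifulSoup(markup))
+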
+class BeautifulSOAP(BeautifulStoneSoup):
+    """This class will push a tag with only a single string child into
+    the tag's parent as an attribute. The attribute's name is the tag
+    name, and the value is the string child. An example should give
+    the flavor of the change:
+
+    <foo><bar>baz</bar></foo>
+     =>
+    <foo bar="baz"><bar>baz</bar></foo>
+
+    You can then access fooTag['bar'] instead of fooTag.barTag.string.
+
+    This is, of course, useful for scraping structures that tend to
+    use subelements instead of attributes, such as SOAP messages. Note
+    that it modifies its input, so don't print the modified version
+    out.
+
+    I'm not sure how many people really want to use this class; let me
+    know if you do. Mainly I like the name."""
+
+    def popTag(self):
+        if len(self.tagStack) > 1:
+            tag = self.tagStack[-1]
+            parent = self.tagStack[-2]
+            parent._getAttrMap()
+            if (isinstance(tag, Tag) and len(tag.contents) == 1 and
+                isinstance(tag.contents[0], NavigableText) and 
+                not parent.attrMap.has_key(tag.name)):
+                parent[tag.name] = tag.contents[0]
+        BeautifulStoneSoup.popTag(self)
+
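+# Hypothetical sketch, not part of the vendored module, of the
+# attribute-pushing behaviour described in the BeautifulSOAP docstring.
+def _sketch_soap_usage():
+    soup = BeautifulSOAP('<foo><bar>baz</bar></foo>')
+    # The single-string child <bar> is also exposed as an attribute of <foo>,
+    # so this returns 'baz'.
+    return soup.fooTag['bar']
+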
+#Enterprise class names! It has come to our attention that some people
+#think the names of the Beautiful Soup parser classes are too silly
+#and "unprofessional" for use in enterprise screen-scraping. We feel
+#your pain! For such-minded folk, the Beautiful Soup Consortium And
+#All-Night Kosher Bakery recommends renaming this file to
+#"RobustParser.py" (or, in cases of extreme enterprisitude,
+#"RobustParserBeanInterface.class") and using the following
+#enterprise-friendly class aliases:
+class RobustXMLParser(BeautifulStoneSoup):
+    pass
+class RobustHTMLParser(BeautifulSoup):
+    pass
+class RobustWackAssHTMLParser(ICantBelieveItsBeautifulSoup):
+    pass
+class SimplifyingSOAPParser(BeautifulSOAP):
+    pass
+
+###
+
+
+#By default, act as an HTML pretty-printer.
+if __name__ == '__main__':
+    import sys
+    soup = BeautifulStoneSoup(sys.stdin.read())
+    print soup.prettify()

Added: mechanize/tags/0.1.10/mechanize/_clientcookie.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_clientcookie.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_clientcookie.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,1707 @@
+"""HTTP cookie handling for web clients.
+
+This module originally developed from my port of Gisle Aas' Perl module
+HTTP::Cookies, from the libwww-perl library.
+
+Docstrings, comments and debug strings in this code refer to the
+attributes of the HTTP cookie system as cookie-attributes, to distinguish
+them clearly from Python attributes.
+
+                        CookieJar____
+                        /     \      \
+            FileCookieJar      \      \
+             /    |   \         \      \
+ MozillaCookieJar | LWPCookieJar \      \
+                  |               |      \
+                  |   ---MSIEBase |       \
+                  |  /      |     |        \
+                  | /   MSIEDBCookieJar BSDDBCookieJar
+                  |/    
+               MSIECookieJar
+
+Comments to John J Lee <jjl at pobox.com>.
+
+
+Copyright 2002-2006 John J Lee <jjl at pobox.com>
+Copyright 1997-1999 Gisle Aas (original libwww-perl code)
+Copyright 2002-2003 Johnny Lee (original MSIE Perl code)
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import sys, re, copy, time, urllib, types, logging
+try:
+    import threading
+    _threading = threading; del threading
+except ImportError:
+    import dummy_threading
+    _threading = dummy_threading; del dummy_threading
+
+MISSING_FILENAME_TEXT = ("a filename was not supplied (nor was the CookieJar "
+                         "instance initialised with one)")
+DEFAULT_HTTP_PORT = "80"
+
+from _headersutil import split_header_words, parse_ns_headers
+from _util import isstringlike
+import _rfc3986
+
+debug = logging.getLogger("mechanize.cookies").debug
+
+
+def reraise_unmasked_exceptions(unmasked=()):
+    # There are a few catch-all except: statements in this module, for
+    # catching input that's bad in unexpected ways.
+    # This function re-raises some exceptions we don't want to trap.
+    import mechanize, warnings
+    if not mechanize.USE_BARE_EXCEPT:
+        raise
+    unmasked = unmasked + (KeyboardInterrupt, SystemExit, MemoryError)
+    etype = sys.exc_info()[0]
+    if issubclass(etype, unmasked):
+        raise
+    # swallowed an exception
+    import traceback, StringIO
+    f = StringIO.StringIO()
+    traceback.print_exc(None, f)
+    msg = f.getvalue()
+    warnings.warn("mechanize bug!\n%s" % msg, stacklevel=2)
+
+
+IPV4_RE = re.compile(r"\.\d+$")
+def is_HDN(text):
+    """Return True if text is a host domain name."""
+    # XXX
+    # This may well be wrong.  Which RFC is HDN defined in, if any (for
+    #  the purposes of RFC 2965)?
+    # For the current implementation, what about IPv6?  Remember to look
+    #  at other uses of IPV4_RE also, if change this.
+    return not (IPV4_RE.search(text) or
+                text == "" or
+                text[0] == "." or text[-1] == ".")
+
+def domain_match(A, B):
+    """Return True if domain A domain-matches domain B, according to RFC 2965.
+
+    A and B may be host domain names or IP addresses.
+
+    RFC 2965, section 1:
+
+    Host names can be specified either as an IP address or a HDN string.
+    Sometimes we compare one host name with another.  (Such comparisons SHALL
+    be case-insensitive.)  Host A's name domain-matches host B's if
+
+         *  their host name strings string-compare equal; or
+
+         * A is a HDN string and has the form NB, where N is a non-empty
+            name string, B has the form .B', and B' is a HDN string.  (So,
+            x.y.com domain-matches .Y.com but not Y.com.)
+
+    Note that domain-match is not a commutative operation: a.b.c.com
+    domain-matches .c.com, but not the reverse.
+
+    """
+    # Note that, if A or B are IP addresses, the only relevant part of the
+    # definition of the domain-match algorithm is the direct string-compare.
+    A = A.lower()
+    B = B.lower()
+    if A == B:
+        return True
+    if not is_HDN(A):
+        return False
+    i = A.rfind(B)
+    has_form_nb = not (i == -1 or i == 0)
+    return (
+        has_form_nb and
+        B.startswith(".") and
+        is_HDN(B[1:])
+        )
+
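+# Quick illustration, not part of the vendored module, of the RFC 2965
+# examples given in the domain_match() docstring above.
+def _sketch_domain_match():
+    return (domain_match("x.y.com", ".Y.com"),    # True
+            domain_match("x.y.com", "y.com"),     # False
+            domain_match("a.b.c.com", ".c.com"),  # True
+            domain_match(".c.com", "a.b.c.com"))  # False: not commutative
+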
+def liberal_is_HDN(text):
+    """Return True if text is a sort-of-like a host domain name.
+
+    For accepting/blocking domains.
+
+    """
+    return not IPV4_RE.search(text)
+
+def user_domain_match(A, B):
+    """For blocking/accepting domains.
+
+    A and B may be host domain names or IP addresses.
+
+    """
+    A = A.lower()
+    B = B.lower()
+    if not (liberal_is_HDN(A) and liberal_is_HDN(B)):
+        if A == B:
+            # equal IP addresses
+            return True
+        return False
+    initial_dot = B.startswith(".")
+    if initial_dot and A.endswith(B):
+        return True
+    if not initial_dot and A == B:
+        return True
+    return False
+
+cut_port_re = re.compile(r":\d+$")
+def request_host(request):
+    """Return request-host, as defined by RFC 2965.
+
+    Note that, unlike request_host_lc() below, the returned value is not
+    lowercased here.
+
+    """
+    url = request.get_full_url()
+    host = _rfc3986.urlsplit(url)[1]
+    if host is None:
+        host = request.get_header("Host", "")
+    # remove port, if present
+    return cut_port_re.sub("", host, 1)
+
+def request_host_lc(request):
+    return request_host(request).lower()
+
+def eff_request_host(request):
+    """Return a tuple (request-host, effective request-host name)."""
+    erhn = req_host = request_host(request)
+    if req_host.find(".") == -1 and not IPV4_RE.search(req_host):
+        erhn = req_host + ".local"
+    return req_host, erhn
+
+def eff_request_host_lc(request):
+    req_host, erhn = eff_request_host(request)
+    return req_host.lower(), erhn.lower()
+
+def effective_request_host(request):
+    """Return the effective request-host, as defined by RFC 2965."""
+    return eff_request_host(request)[1]
+
+def request_path(request):
+    """request-URI, as defined by RFC 2965."""
+    url = request.get_full_url()
+    path, query, frag = _rfc3986.urlsplit(url)[2:]
+    path = escape_path(path)
+    req_path = _rfc3986.urlunsplit((None, None, path, query, frag))
+    if not req_path.startswith("/"):
+        req_path = "/"+req_path
+    return req_path
+
+def request_port(request):
+    host = request.get_host()
+    i = host.find(':')
+    if i >= 0:
+        port = host[i+1:]
+        try:
+            int(port)
+        except ValueError:
+            debug("nonnumeric port: '%s'", port)
+            return None
+    else:
+        port = DEFAULT_HTTP_PORT
+    return port
+
+def request_is_unverifiable(request):
+    try:
+        return request.is_unverifiable()
+    except AttributeError:
+        if hasattr(request, "unverifiable"):
+            return request.unverifiable
+        else:
+            raise
+
+# Characters in addition to A-Z, a-z, 0-9, '_', '.', and '-' that don't
+# need to be escaped to form a valid HTTP URL (RFCs 2396 and 1738).
+HTTP_PATH_SAFE = "%/;:@&=+$,!~*'()"
+ESCAPED_CHAR_RE = re.compile(r"%([0-9a-fA-F][0-9a-fA-F])")
+def uppercase_escaped_char(match):
+    return "%%%s" % match.group(1).upper()
+def escape_path(path):
+    """Escape any invalid characters in HTTP URL, and uppercase all escapes."""
+    # There's no knowing what character encoding was used to create URLs
+    # containing %-escapes, but since we have to pick one to escape invalid
+    # path characters, we pick UTF-8, as recommended in the HTML 4.0
+    # specification:
+    # http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1
+    # And here, kind of: draft-fielding-uri-rfc2396bis-03
+    # (And in draft IRI specification: draft-duerst-iri-05)
+    # (And here, for new URI schemes: RFC 2718)
+    if isinstance(path, types.UnicodeType):
+        path = path.encode("utf-8")
+    path = urllib.quote(path, HTTP_PATH_SAFE)
+    path = ESCAPED_CHAR_RE.sub(uppercase_escaped_char, path)
+    return path
+
+def reach(h):
+    """Return reach of host h, as defined by RFC 2965, section 1.
+
+    The reach R of a host name H is defined as follows:
+
+       *  If
+
+          -  H is the host domain name of a host; and,
+
+          -  H has the form A.B; and
+
+          -  A has no embedded (that is, interior) dots; and
+
+          -  B has at least one embedded dot, or B is the string "local".
+             then the reach of H is .B.
+
+       *  Otherwise, the reach of H is H.
+
+    >>> reach("www.acme.com")
+    '.acme.com'
+    >>> reach("acme.com")
+    'acme.com'
+    >>> reach("acme.local")
+    '.local'
+
+    """
+    i = h.find(".")
+    if i >= 0:
+        #a = h[:i]  # this line is only here to show what a is
+        b = h[i+1:]
+        i = b.find(".")
+        if is_HDN(h) and (i >= 0 or b == "local"):
+            return "."+b
+    return h
+
+def is_third_party(request):
+    """
+
+    RFC 2965, section 3.3.6:
+
+        An unverifiable transaction is to a third-party host if its request-
+        host U does not domain-match the reach R of the request-host O in the
+        origin transaction.
+
+    """
+    req_host = request_host_lc(request)
+    # the origin request's request-host was stuffed into request by
+    # _urllib2_support.AbstractHTTPHandler
+    return not domain_match(req_host, reach(request.origin_req_host))
+
+
+class Cookie:
+    """HTTP Cookie.
+
+    This class represents both Netscape and RFC 2965 cookies.
+
+    This is deliberately a very simple class.  It just holds attributes.  It's
+    possible to construct Cookie instances that don't comply with the cookie
+    standards.  CookieJar.make_cookies is the factory function for Cookie
+    objects -- it deals with cookie parsing, supplying defaults, and
+    normalising to the representation used in this class.  CookiePolicy is
+    responsible for checking them to see whether they should be accepted from
+    and returned to the server.
+
+    version: integer;
+    name: string;
+    value: string (may be None);
+    port: string; None indicates no attribute was supplied (eg. "Port", rather
+     than eg. "Port=80"); otherwise, a port string (eg. "80") or a port list
+     string (eg. "80,8080")
+    port_specified: boolean; true if a value was supplied with the Port
+     cookie-attribute
+    domain: string;
+    domain_specified: boolean; true if Domain was explicitly set
+    domain_initial_dot: boolean; true if Domain as set in HTTP header by server
+     started with a dot (yes, this really is necessary!)
+    path: string;
+    path_specified: boolean; true if Path was explicitly set
+    secure:  boolean; true if should only be returned over secure connection
+    expires: integer; seconds since epoch (RFC 2965 cookies should calculate
+     this value from the Max-Age attribute)
+    discard: boolean, true if this is a session cookie; (if no expires value,
+     this should be true)
+    comment: string;
+    comment_url: string;
+    rfc2109: boolean; true if cookie arrived in a Set-Cookie: (not
+     Set-Cookie2:) header, but had a version cookie-attribute of 1
+    rest: mapping of other cookie-attributes
+
+    Note that the port may be present in the headers, but unspecified ("Port"
+    rather than"Port=80", for example); if this is the case, port is None.
+
+    """
+
+    def __init__(self, version, name, value,
+                 port, port_specified,
+                 domain, domain_specified, domain_initial_dot,
+                 path, path_specified,
+                 secure,
+                 expires,
+                 discard,
+                 comment,
+                 comment_url,
+                 rest,
+                 rfc2109=False,
+                 ):
+
+        if version is not None: version = int(version)
+        if expires is not None: expires = int(expires)
+        if port is None and port_specified is True:
+            raise ValueError("if port is None, port_specified must be false")
+
+        self.version = version
+        self.name = name
+        self.value = value
+        self.port = port
+        self.port_specified = port_specified
+        # normalise case, as per RFC 2965 section 3.3.3
+        self.domain = domain.lower()
+        self.domain_specified = domain_specified
+        # Sigh.  We need to know whether the domain given in the
+        # cookie-attribute had an initial dot, in order to follow RFC 2965
+        # (as clarified in draft errata).  Needed for the returned $Domain
+        # value.
+        self.domain_initial_dot = domain_initial_dot
+        self.path = path
+        self.path_specified = path_specified
+        self.secure = secure
+        self.expires = expires
+        self.discard = discard
+        self.comment = comment
+        self.comment_url = comment_url
+        self.rfc2109 = rfc2109
+
+        self._rest = copy.copy(rest)
+
+    def has_nonstandard_attr(self, name):
+        return self._rest.has_key(name)
+    def get_nonstandard_attr(self, name, default=None):
+        return self._rest.get(name, default)
+    def set_nonstandard_attr(self, name, value):
+        self._rest[name] = value
+    def nonstandard_attr_keys(self):
+        return self._rest.keys()
+
+    def is_expired(self, now=None):
+        if now is None: now = time.time()
+        return (self.expires is not None) and (self.expires <= now)
+
+    def __str__(self):
+        if self.port is None: p = ""
+        else: p = ":"+self.port
+        limit = self.domain + p + self.path
+        if self.value is not None:
+            namevalue = "%s=%s" % (self.name, self.value)
+        else:
+            namevalue = self.name
+        return "<Cookie %s for %s>" % (namevalue, limit)
+
+    def __repr__(self):
+        args = []
+        for name in ["version", "name", "value",
+                     "port", "port_specified",
+                     "domain", "domain_specified", "domain_initial_dot",
+                     "path", "path_specified",
+                     "secure", "expires", "discard", "comment", "comment_url",
+                     ]:
+            attr = getattr(self, name)
+            args.append("%s=%s" % (name, repr(attr)))
+        args.append("rest=%s" % repr(self._rest))
+        args.append("rfc2109=%s" % repr(self.rfc2109))
+        return "Cookie(%s)" % ", ".join(args)
+
+
+class CookiePolicy:
+    """Defines which cookies get accepted from and returned to server.
+
+    May also modify cookies.
+
+    The subclass DefaultCookiePolicy defines the standard rules for Netscape
+    and RFC 2965 cookies -- override that if you want a customised policy.
+
+    As well as implementing set_ok and return_ok, implementations of this
+    interface must also supply the following attributes, indicating which
+    protocols should be used, and how.  These can be read and set at any time,
+    though whether that makes complete sense from the protocol point of view is
+    doubtful.
+
+    Public attributes:
+
+    netscape: implement netscape protocol
+    rfc2965: implement RFC 2965 protocol
+    rfc2109_as_netscape:
+       WARNING: This argument will change or go away if it is not
+                accepted into the Python standard library in this form!
+     If true, treat RFC 2109 cookies as though they were Netscape cookies.  The
+     default is for this attribute to be None, which means treat 2109 cookies
+     as RFC 2965 cookies unless RFC 2965 handling is switched off (which it is,
+     by default), and as Netscape cookies otherwise.
+    hide_cookie2: don't add Cookie2 header to requests (the presence of
+     this header indicates to the server that we understand RFC 2965
+     cookies)
+
+    """
+    def set_ok(self, cookie, request):
+        """Return true if (and only if) cookie should be accepted from server.
+
+        Currently, pre-expired cookies never get this far -- the CookieJar
+        class deletes such cookies itself.
+
+        cookie: mechanize.Cookie object
+        request: object implementing the interface defined by
+         CookieJar.extract_cookies.__doc__
+
+        """
+        raise NotImplementedError()
+
+    def return_ok(self, cookie, request):
+        """Return true if (and only if) cookie should be returned to server.
+
+        cookie: mechanize.Cookie object
+        request: object implementing the interface defined by
+         CookieJar.add_cookie_header.__doc__
+
+        """
+        raise NotImplementedError()
+
+    def domain_return_ok(self, domain, request):
+        """Return false if cookies should not be returned, given cookie domain.
+
+        This is here as an optimization, to remove the need for checking every
+        cookie with a particular domain (which may involve reading many files).
+        The default implementations of domain_return_ok and path_return_ok
+        (which simply return True) leave all the work to return_ok.
+
+        If domain_return_ok returns true for the cookie domain, path_return_ok
+        is called for the cookie path.  Otherwise, path_return_ok and return_ok
+        are never called for that cookie domain.  If path_return_ok returns
+        true, return_ok is called with the Cookie object itself for a full
+        check.  Otherwise, return_ok is never called for that cookie path.
+
+        Note that domain_return_ok is called for every *cookie* domain, not
+        just for the *request* domain.  For example, the function might be
+        called with both ".acme.com" and "www.acme.com" if the request domain
+        is "www.acme.com".  The same goes for path_return_ok.
+
+        For argument documentation, see the docstring for return_ok.
+
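+        A sketch of a subclass using this hook (DefaultCookiePolicy is
+        defined later in this module; the domain names are illustrative):
+
+        >>> class NoAcmeCookiesPolicy(DefaultCookiePolicy):
+        ...     def domain_return_ok(self, domain, request):
+        ...         if domain.endswith(".acme.com") or domain == "acme.com":
+        ...             return False
+        ...         return DefaultCookiePolicy.domain_return_ok(
+        ...             self, domain, request)
+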
+        """
+        return True
+
+    def path_return_ok(self, path, request):
+        """Return false if cookies should not be returned, given cookie path.
+
+        See the docstring for domain_return_ok.
+
+        """
+        return True
+
+
+class DefaultCookiePolicy(CookiePolicy):
+    """Implements the standard rules for accepting and returning cookies.
+
+    Both RFC 2965 and Netscape cookies are covered.  RFC 2965 handling is
+    switched off by default.
+
+    The easiest way to provide your own policy is to override this class and
+    call its methods in your overridden implementations before adding your own
+    additional checks.
+
+    import mechanize
+    class MyCookiePolicy(mechanize.DefaultCookiePolicy):
+        def set_ok(self, cookie, request):
+            if not mechanize.DefaultCookiePolicy.set_ok(
+                self, cookie, request):
+                return False
+            if i_dont_want_to_store_this_cookie():
+                return False
+            return True
+
+    In addition to the features required to implement the CookiePolicy
+    interface, this class allows you to block and allow domains from setting
+    and receiving cookies.  There are also some strictness switches that allow
+    you to tighten up the rather loose Netscape protocol rules a little bit (at
+    the cost of blocking some benign cookies).
+
+    A domain blacklist and whitelist are provided (both off by default).  Only
+    domains not in the blacklist and present in the whitelist (if the whitelist
+    is active) participate in cookie setting and returning.  Use the
+    blocked_domains constructor argument, and blocked_domains and
+    set_blocked_domains methods (and the corresponding argument and methods for
+    allowed_domains).  If you set a whitelist, you can turn it off again by
+    setting it to None.
+
+    Domains in block or allow lists that do not start with a dot must
+    string-compare equal.  For example, "acme.com" matches a blacklist entry of
+    "acme.com", but "www.acme.com" does not.  Domains that do start with a dot
+    are matched by more specific domains too.  For example, both "www.acme.com"
+    and "www.munitions.acme.com" match ".acme.com" (but "acme.com" itself does
+    not).  IP addresses are an exception, and must match exactly.  For example,
+    if blocked_domains contains "192.168.1.2" and ".168.1.2", 192.168.1.2 is
+    blocked, but 193.168.1.2 is not.
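+
+    A small sketch of the block-list behaviour described above (the domain
+    names are illustrative):
+
+    >>> policy = DefaultCookiePolicy(blocked_domains=[".acme.com"])
+    >>> policy.is_blocked("www.acme.com")
+    True
+    >>> policy.is_blocked("acme.com")
+    False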
+
+    Additional Public Attributes:
+
+    General strictness switches
+
+    strict_domain: don't allow sites to set two-component domains with
+     country-code top-level domains like .co.uk, .gov.uk, .co.nz, etc.
+     This is far from perfect and isn't guaranteed to work!
+
+    RFC 2965 protocol strictness switches
+
+    strict_rfc2965_unverifiable: follow RFC 2965 rules on unverifiable
+     transactions (usually, an unverifiable transaction is one resulting from
+     a redirect or an image hosted on another site); if this is false, cookies
+     are NEVER blocked on the basis of verifiability
+
+    Netscape protocol strictness switches
+
+    strict_ns_unverifiable: apply RFC 2965 rules on unverifiable transactions
+     even to Netscape cookies
+    strict_ns_domain: flags indicating how strict to be with domain-matching
+     rules for Netscape cookies:
+      DomainStrictNoDots: when setting cookies, host prefix must not contain a
+       dot (eg. www.foo.bar.com can't set a cookie for .bar.com, because
+       www.foo contains a dot)
+      DomainStrictNonDomain: cookies that did not explicitly specify a Domain
+       cookie-attribute can only be returned to a domain that string-compares
+       equal to the domain that set the cookie (eg. rockets.acme.com won't
+       be returned cookies from acme.com that had no Domain cookie-attribute)
+      DomainRFC2965Match: when setting cookies, require a full RFC 2965
+       domain-match
+      DomainLiberal and DomainStrict are the most useful combinations of the
+       above flags, for convenience
+    strict_ns_set_initial_dollar: ignore cookies in Set-Cookie: headers that
+     have names starting with '$'
+    strict_ns_set_path: don't allow setting cookies whose path doesn't
+     path-match request URI
+
+    """
+
+    DomainStrictNoDots = 1
+    DomainStrictNonDomain = 2
+    DomainRFC2965Match = 4
+
+    DomainLiberal = 0
+    DomainStrict = DomainStrictNoDots|DomainStrictNonDomain
+
+    def __init__(self,
+                 blocked_domains=None, allowed_domains=None,
+                 netscape=True, rfc2965=False,
+                 # WARNING: this argument will change or go away if it is not
+                 # accepted into the Python standard library in this form!
+                 # default, ie. treat 2109 as netscape iff not rfc2965
+                 rfc2109_as_netscape=None,
+                 hide_cookie2=False,
+                 strict_domain=False,
+                 strict_rfc2965_unverifiable=True,
+                 strict_ns_unverifiable=False,
+                 strict_ns_domain=DomainLiberal,
+                 strict_ns_set_initial_dollar=False,
+                 strict_ns_set_path=False,
+                 ):
+        """
+        Constructor arguments should be used as keyword arguments only.
+
+        blocked_domains: sequence of domain names that we never accept cookies
+         from, nor return cookies to
+        allowed_domains: if not None, this is a sequence of the only domains
+         for which we accept and return cookies
+
+        For other arguments, see CookiePolicy.__doc__ and
+        DefaultCookiePolicy.__doc__.
+
+        """
+        self.netscape = netscape
+        self.rfc2965 = rfc2965
+        self.rfc2109_as_netscape = rfc2109_as_netscape
+        self.hide_cookie2 = hide_cookie2
+        self.strict_domain = strict_domain
+        self.strict_rfc2965_unverifiable = strict_rfc2965_unverifiable
+        self.strict_ns_unverifiable = strict_ns_unverifiable
+        self.strict_ns_domain = strict_ns_domain
+        self.strict_ns_set_initial_dollar = strict_ns_set_initial_dollar
+        self.strict_ns_set_path = strict_ns_set_path
+
+        if blocked_domains is not None:
+            self._blocked_domains = tuple(blocked_domains)
+        else:
+            self._blocked_domains = ()
+
+        if allowed_domains is not None:
+            allowed_domains = tuple(allowed_domains)
+        self._allowed_domains = allowed_domains
+
+    def blocked_domains(self):
+        """Return the sequence of blocked domains (as a tuple)."""
+        return self._blocked_domains
+    def set_blocked_domains(self, blocked_domains):
+        """Set the sequence of blocked domains."""
+        self._blocked_domains = tuple(blocked_domains)
+
+    def is_blocked(self, domain):
+        for blocked_domain in self._blocked_domains:
+            if user_domain_match(domain, blocked_domain):
+                return True
+        return False
+
+    def allowed_domains(self):
+        """Return None, or the sequence of allowed domains (as a tuple)."""
+        return self._allowed_domains
+    def set_allowed_domains(self, allowed_domains):
+        """Set the sequence of allowed domains, or None."""
+        if allowed_domains is not None:
+            allowed_domains = tuple(allowed_domains)
+        self._allowed_domains = allowed_domains
+
+    def is_not_allowed(self, domain):
+        if self._allowed_domains is None:
+            return False
+        for allowed_domain in self._allowed_domains:
+            if user_domain_match(domain, allowed_domain):
+                return False
+        return True
+
+    def set_ok(self, cookie, request):
+        """
+        If you override set_ok, be sure to call this method.  If it returns
+        false, so should your subclass (assuming your subclass wants to be more
+        strict about which cookies to accept).
+
+        """
+        debug(" - checking cookie %s", cookie)
+
+        assert cookie.name is not None
+
+        for n in "version", "verifiability", "name", "path", "domain", "port":
+            fn_name = "set_ok_"+n
+            fn = getattr(self, fn_name)
+            if not fn(cookie, request):
+                return False
+
+        return True
+
+    def set_ok_version(self, cookie, request):
+        if cookie.version is None:
+            # Version is always set to 0 by parse_ns_headers if it's a Netscape
+            # cookie, so this must be an invalid RFC 2965 cookie.
+            debug("   Set-Cookie2 without version attribute (%s)", cookie)
+            return False
+        if cookie.version > 0 and not self.rfc2965:
+            debug("   RFC 2965 cookies are switched off")
+            return False
+        elif cookie.version == 0 and not self.netscape:
+            debug("   Netscape cookies are switched off")
+            return False
+        return True
+
+    def set_ok_verifiability(self, cookie, request):
+        if request_is_unverifiable(request) and is_third_party(request):
+            if cookie.version > 0 and self.strict_rfc2965_unverifiable:
+                debug("   third-party RFC 2965 cookie during "
+                             "unverifiable transaction")
+                return False
+            elif cookie.version == 0 and self.strict_ns_unverifiable:
+                debug("   third-party Netscape cookie during "
+                             "unverifiable transaction")
+                return False
+        return True
+
+    def set_ok_name(self, cookie, request):
+        # Try and stop servers setting V0 cookies designed to hack other
+        # servers that know both V0 and V1 protocols.
+        if (cookie.version == 0 and self.strict_ns_set_initial_dollar and
+            cookie.name.startswith("$")):
+            debug("   illegal name (starts with '$'): '%s'", cookie.name)
+            return False
+        return True
+
+    def set_ok_path(self, cookie, request):
+        if cookie.path_specified:
+            req_path = request_path(request)
+            if ((cookie.version > 0 or
+                 (cookie.version == 0 and self.strict_ns_set_path)) and
+                not req_path.startswith(cookie.path)):
+                debug("   path attribute %s is not a prefix of request "
+                      "path %s", cookie.path, req_path)
+                return False
+        return True
+
+    def set_ok_countrycode_domain(self, cookie, request):
+        """Return False if explicit cookie domain is not acceptable.
+
+        Called by set_ok_domain, for convenience of overriding by
+        subclasses.
+
+        """
+        if cookie.domain_specified and self.strict_domain:
+            domain = cookie.domain
+            # since domain was specified, we know that:
+            assert domain.startswith(".")
+            if domain.count(".") == 2:
+                # domain like .foo.bar
+                i = domain.rfind(".")
+                tld = domain[i+1:]
+                sld = domain[1:i]
+                if (sld.lower() in [
+                    "co", "ac",
+                    "com", "edu", "org", "net", "gov", "mil", "int",
+                    "aero", "biz", "cat", "coop", "info", "jobs", "mobi",
+                    "museum", "name", "pro", "travel",
+                    ] and
+                    len(tld) == 2):
+                    # domain like .co.uk
+                    return False
+        return True
+
+    def set_ok_domain(self, cookie, request):
+        if self.is_blocked(cookie.domain):
+            debug("   domain %s is in user block-list", cookie.domain)
+            return False
+        if self.is_not_allowed(cookie.domain):
+            debug("   domain %s is not in user allow-list", cookie.domain)
+            return False
+        if not self.set_ok_countrycode_domain(cookie, request):
+            debug("   country-code second level domain %s", cookie.domain)
+            return False
+        if cookie.domain_specified:
+            req_host, erhn = eff_request_host_lc(request)
+            domain = cookie.domain
+            if domain.startswith("."):
+                undotted_domain = domain[1:]
+            else:
+                undotted_domain = domain
+            embedded_dots = (undotted_domain.find(".") >= 0)
+            if not embedded_dots and domain != ".local":
+                debug("   non-local domain %s contains no embedded dot",
+                      domain)
+                return False
+            if cookie.version == 0:
+                if (not erhn.endswith(domain) and
+                    (not erhn.startswith(".") and
+                     not ("."+erhn).endswith(domain))):
+                    debug("   effective request-host %s (even with added "
+                          "initial dot) does not end end with %s",
+                          erhn, domain)
+                    return False
+            if (cookie.version > 0 or
+                (self.strict_ns_domain & self.DomainRFC2965Match)):
+                if not domain_match(erhn, domain):
+                    debug("   effective request-host %s does not domain-match "
+                          "%s", erhn, domain)
+                    return False
+            if (cookie.version > 0 or
+                (self.strict_ns_domain & self.DomainStrictNoDots)):
+                host_prefix = req_host[:-len(domain)]
+                if (host_prefix.find(".") >= 0 and
+                    not IPV4_RE.search(req_host)):
+                    debug("   host prefix %s for domain %s contains a dot",
+                          host_prefix, domain)
+                    return False
+        return True
+
+    def set_ok_port(self, cookie, request):
+        if cookie.port_specified:
+            req_port = request_port(request)
+            if req_port is None:
+                req_port = "80"
+            else:
+                req_port = str(req_port)
+            for p in cookie.port.split(","):
+                try:
+                    int(p)
+                except ValueError:
+                    debug("   bad port %s (not numeric)", p)
+                    return False
+                if p == req_port:
+                    break
+            else:
+                debug("   request port (%s) not found in %s",
+                      req_port, cookie.port)
+                return False
+        return True
+
+    def return_ok(self, cookie, request):
+        """
+        If you override return_ok, be sure to call this method.  If it returns
+        false, so should your subclass (assuming your subclass wants to be more
+        strict about which cookies to return).
+
+        """
+        # Path has already been checked by path_return_ok, and domain blocking
+        # done by domain_return_ok.
+        debug(" - checking cookie %s", cookie)
+
+        for n in ("version", "verifiability", "secure", "expires", "port",
+                  "domain"):
+            fn_name = "return_ok_"+n
+            fn = getattr(self, fn_name)
+            if not fn(cookie, request):
+                return False
+        return True
+
+    def return_ok_version(self, cookie, request):
+        if cookie.version > 0 and not self.rfc2965:
+            debug("   RFC 2965 cookies are switched off")
+            return False
+        elif cookie.version == 0 and not self.netscape:
+            debug("   Netscape cookies are switched off")
+            return False
+        return True
+
+    def return_ok_verifiability(self, cookie, request):
+        if request_is_unverifiable(request) and is_third_party(request):
+            if cookie.version > 0 and self.strict_rfc2965_unverifiable:
+                debug("   third-party RFC 2965 cookie during unverifiable "
+                      "transaction")
+                return False
+            elif cookie.version == 0 and self.strict_ns_unverifiable:
+                debug("   third-party Netscape cookie during unverifiable "
+                      "transaction")
+                return False
+        return True
+
+    def return_ok_secure(self, cookie, request):
+        if cookie.secure and request.get_type() != "https":
+            debug("   secure cookie with non-secure request")
+            return False
+        return True
+
+    def return_ok_expires(self, cookie, request):
+        if cookie.is_expired(self._now):
+            debug("   cookie expired")
+            return False
+        return True
+
+    def return_ok_port(self, cookie, request):
+        if cookie.port:
+            req_port = request_port(request)
+            if req_port is None:
+                req_port = "80"
+            for p in cookie.port.split(","):
+                if p == req_port:
+                    break
+            else:
+                debug("   request port %s does not match cookie port %s",
+                      req_port, cookie.port)
+                return False
+        return True
+
+    def return_ok_domain(self, cookie, request):
+        req_host, erhn = eff_request_host_lc(request)
+        domain = cookie.domain
+
+        # strict check of non-domain cookies: Mozilla does this, MSIE5 doesn't
+        if (cookie.version == 0 and
+            (self.strict_ns_domain & self.DomainStrictNonDomain) and
+            not cookie.domain_specified and domain != erhn):
+            debug("   cookie with unspecified domain does not string-compare "
+                  "equal to request domain")
+            return False
+
+        if cookie.version > 0 and not domain_match(erhn, domain):
+            debug("   effective request-host name %s does not domain-match "
+                  "RFC 2965 cookie domain %s", erhn, domain)
+            return False
+        if cookie.version == 0 and not ("."+erhn).endswith(domain):
+            debug("   request-host %s does not match Netscape cookie domain "
+                  "%s", req_host, domain)
+            return False
+        return True
+
+    def domain_return_ok(self, domain, request):
+        # Liberal check of domain.  This is here as an optimization to avoid
+        # having to load lots of MSIE cookie files unless necessary.
+
+        # Munge req_host and erhn to always start with a dot, so as to err on
+        # the side of letting cookies through.
+        dotted_req_host, dotted_erhn = eff_request_host_lc(request)
+        if not dotted_req_host.startswith("."):
+            dotted_req_host = "."+dotted_req_host
+        if not dotted_erhn.startswith("."):
+            dotted_erhn = "."+dotted_erhn
+        if not (dotted_req_host.endswith(domain) or
+                dotted_erhn.endswith(domain)):
+            #debug("   request domain %s does not match cookie domain %s",
+            #      req_host, domain)
+            return False
+
+        if self.is_blocked(domain):
+            debug("   domain %s is in user block-list", domain)
+            return False
+        if self.is_not_allowed(domain):
+            debug("   domain %s is not in user allow-list", domain)
+            return False
+
+        return True
+
+    def path_return_ok(self, path, request):
+        debug("- checking cookie path=%s", path)
+        req_path = request_path(request)
+        if not req_path.startswith(path):
+            debug("  %s does not path-match %s", req_path, path)
+            return False
+        return True
+
+
+def vals_sorted_by_key(adict):
+    keys = adict.keys()
+    keys.sort()
+    return map(adict.get, keys)
+
+class MappingIterator:
+    """Iterates over nested mapping, depth-first, in sorted order by key."""
+    def __init__(self, mapping):
+        self._s = [(vals_sorted_by_key(mapping), 0, None)]  # LIFO stack
+
+    def __iter__(self): return self
+
+    def next(self):
+        # this is hairy because of lack of generators
+        while 1:
+            try:
+                vals, i, prev_item = self._s.pop()
+            except IndexError:
+                raise StopIteration()
+            if i < len(vals):
+                item = vals[i]
+                i = i + 1
+                self._s.append((vals, i, prev_item))
+                try:
+                    item.items
+                except AttributeError:
+                    # non-mapping
+                    break
+                else:
+                    # mapping
+                    self._s.append((vals_sorted_by_key(item), 0, item))
+                    continue
+        return item
+
+
+# Used as second parameter to dict.get method, to distinguish absent
+# dict key from one with a None value.
+class Absent: pass
+
+class CookieJar:
+    """Collection of HTTP cookies.
+
+    You may not need to know about this class: try mechanize.urlopen().
+
+    The major methods are extract_cookies and add_cookie_header; these are all
+    you are likely to need.
+
+    CookieJar supports the iterator protocol:
+
+    for cookie in cookiejar:
+        # do something with cookie
+
+    Methods:
+
+    add_cookie_header(request)
+    extract_cookies(response, request)
+    get_policy()
+    set_policy(policy)
+    cookies_for_request(request)
+    make_cookies(response, request)
+    set_cookie_if_ok(cookie, request)
+    set_cookie(cookie)
+    clear_session_cookies()
+    clear_expired_cookies()
+    clear(domain=None, path=None, name=None)
+
+    Public attributes
+
+    policy: CookiePolicy object
+
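+    A minimal sketch of direct manipulation (the Cookie values below are
+    illustrative; normally extract_cookies creates cookies for you):
+
+    >>> jar = CookieJar()
+    >>> jar.set_cookie(Cookie(0, "spam", "eggs", None, False,
+    ...                       "example.com", False, False,
+    ...                       "/", False, False, None, True,
+    ...                       None, None, {}))
+    >>> len(jar)
+    1
+    >>> jar.clear("example.com")
+    >>> len(jar)
+    0
+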
+    """
+
+    non_word_re = re.compile(r"\W")
+    quote_re = re.compile(r"([\"\\])")
+    strict_domain_re = re.compile(r"\.?[^.]*")
+    domain_re = re.compile(r"[^.]*")
+    dots_re = re.compile(r"^\.+")
+
+    def __init__(self, policy=None):
+        """
+        See CookieJar.__doc__ for argument documentation.
+
+        """
+        if policy is None:
+            policy = DefaultCookiePolicy()
+        self._policy = policy
+
+        self._cookies = {}
+
+        # for __getitem__ iteration in pre-2.2 Pythons
+        self._prev_getitem_index = 0
+
+    def get_policy(self):
+        return self._policy
+
+    def set_policy(self, policy):
+        self._policy = policy
+
+    def _cookies_for_domain(self, domain, request):
+        cookies = []
+        if not self._policy.domain_return_ok(domain, request):
+            return []
+        debug("Checking %s for cookies to return", domain)
+        cookies_by_path = self._cookies[domain]
+        for path in cookies_by_path.keys():
+            if not self._policy.path_return_ok(path, request):
+                continue
+            cookies_by_name = cookies_by_path[path]
+            for cookie in cookies_by_name.values():
+                if not self._policy.return_ok(cookie, request):
+                    debug("   not returning cookie")
+                    continue
+                debug("   it's a match")
+                cookies.append(cookie)
+        return cookies
+
+    def cookies_for_request(self, request):
+        """Return a list of cookies to be returned to server.
+
+        The returned list of cookie instances is sorted in the order they
+        should appear in the Cookie: header for return to the server.
+
+        See add_cookie_header.__doc__ for the interface required of the
+        request argument.
+
+        New in version 0.1.10
+
+        """
+        self._policy._now = self._now = int(time.time())
+        cookies = self._cookies_for_request(request)
+        # add cookies in order of most specific (i.e. longest) path first
+        def decreasing_size(a, b): return cmp(len(b.path), len(a.path))
+        cookies.sort(decreasing_size)
+        return cookies
+
+    def _cookies_for_request(self, request):
+        """Return a list of cookies to be returned to server."""
+        # this method still exists (alongside cookies_for_request) because it
+        # is part of an implied protected interface for subclasses of cookiejar
+        # XXX document that implied interface, or provide another way of
+        # implementing cookiejars than subclassing
+        cookies = []
+        for domain in self._cookies.keys():
+            cookies.extend(self._cookies_for_domain(domain, request))
+        return cookies
+
+    def _cookie_attrs(self, cookies):
+        """Return a list of cookie-attributes to be returned to server.
+
+        The $Version attribute is also added when appropriate (currently only
+        once per request).
+
+        >>> jar = CookieJar()
+        >>> ns_cookie = Cookie(0, "foo", '"bar"', None, False,
+        ...                   "example.com", False, False,
+        ...                   "/", False, False, None, True,
+        ...                   None, None, {})
+        >>> jar._cookie_attrs([ns_cookie])
+        ['foo="bar"']
+        >>> rfc2965_cookie = Cookie(1, "foo", "bar", None, False,
+        ...                         ".example.com", True, False,
+        ...                         "/", False, False, None, True,
+        ...                         None, None, {})
+        >>> jar._cookie_attrs([rfc2965_cookie])
+        ['$Version=1', 'foo=bar', '$Domain="example.com"']
+
+        """
+        version_set = False
+
+        attrs = []
+        for cookie in cookies:
+            # set version of Cookie header
+            # XXX
+            # What should it be if multiple matching Set-Cookie headers have
+            #  different versions themselves?
+            # Answer: there is no answer; was supposed to be settled by
+            #  RFC 2965 errata, but that may never appear...
+            version = cookie.version
+            if not version_set:
+                version_set = True
+                if version > 0:
+                    attrs.append("$Version=%s" % version)
+
+            # quote cookie value if necessary
+            # (not for Netscape protocol, which already has any quotes
+            #  intact, due to the poorly-specified Netscape Cookie: syntax)
+            if ((cookie.value is not None) and
+                self.non_word_re.search(cookie.value) and version > 0):
+                value = self.quote_re.sub(r"\\\1", cookie.value)
+            else:
+                value = cookie.value
+
+            # add cookie-attributes to be returned in Cookie header
+            if cookie.value is None:
+                attrs.append(cookie.name)
+            else:
+                attrs.append("%s=%s" % (cookie.name, value))
+            if version > 0:
+                if cookie.path_specified:
+                    attrs.append('$Path="%s"' % cookie.path)
+                if cookie.domain.startswith("."):
+                    domain = cookie.domain
+                    if (not cookie.domain_initial_dot and
+                        domain.startswith(".")):
+                        domain = domain[1:]
+                    attrs.append('$Domain="%s"' % domain)
+                if cookie.port is not None:
+                    p = "$Port"
+                    if cookie.port_specified:
+                        p = p + ('="%s"' % cookie.port)
+                    attrs.append(p)
+
+        return attrs
+
+    def add_cookie_header(self, request):
+        """Add correct Cookie: header to request (urllib2.Request object).
+
+        The Cookie2 header is also added unless policy.hide_cookie2 is true.
+
+        The request object (usually a urllib2.Request instance) must support
+        the methods get_full_url, get_host, is_unverifiable, get_type,
+        has_header, get_header, header_items and add_unredirected_header, as
+        documented by urllib2, and the port attribute (the port number).
+        Actually, RequestUpgradeProcessor will automatically upgrade your
+        Request object to one with has_header, get_header, header_items and
+        add_unredirected_header, if it lacks those methods, for compatibility
+        with pre-2.4 versions of urllib2.
+
+        """
+        debug("add_cookie_header")
+        cookies = self.cookies_for_request(request)
+
+        attrs = self._cookie_attrs(cookies)
+        if attrs:
+            if not request.has_header("Cookie"):
+                request.add_unredirected_header("Cookie", "; ".join(attrs))
+
+        # if necessary, advertise that we know RFC 2965
+        if self._policy.rfc2965 and not self._policy.hide_cookie2:
+            for cookie in cookies:
+                if cookie.version != 1 and not request.has_header("Cookie2"):
+                    request.add_unredirected_header("Cookie2", '$Version="1"')
+                    break
+
+        self.clear_expired_cookies()
+
+    def _normalized_cookie_tuples(self, attrs_set):
+        """Return list of tuples containing normalised cookie information.
+
+        attrs_set is the list of lists of key,value pairs extracted from
+        the Set-Cookie or Set-Cookie2 headers.
+
+        Tuples are name, value, standard, rest, where name and value are the
+        cookie name and value, standard is a dictionary containing the standard
+        cookie-attributes (discard, secure, version, expires or max-age,
+        domain, path and port) and rest is a dictionary containing the rest of
+        the cookie-attributes.
+
+        """
+        cookie_tuples = []
+
+        boolean_attrs = "discard", "secure"
+        value_attrs = ("version",
+                       "expires", "max-age",
+                       "domain", "path", "port",
+                       "comment", "commenturl")
+
+        for cookie_attrs in attrs_set:
+            name, value = cookie_attrs[0]
+
+            # Build dictionary of standard cookie-attributes (standard) and
+            # dictionary of other cookie-attributes (rest).
+
+            # Note: expiry time is normalised to seconds since epoch.  V0
+            # cookies should have the Expires cookie-attribute, and V1 cookies
+            # should have Max-Age, but since V1 includes RFC 2109 cookies (and
+            # since V0 cookies may be a mish-mash of Netscape and RFC 2109), we
+            # accept either (but prefer Max-Age).
+            max_age_set = False
+
+            bad_cookie = False
+
+            standard = {}
+            rest = {}
+            for k, v in cookie_attrs[1:]:
+                lc = k.lower()
+                # don't lose case distinction for unknown fields
+                if lc in value_attrs or lc in boolean_attrs:
+                    k = lc
+                if k in boolean_attrs and v is None:
+                    # boolean cookie-attribute is present, but has no value
+                    # (like "discard", rather than "port=80")
+                    v = True
+                if standard.has_key(k):
+                    # only first value is significant
+                    continue
+                if k == "domain":
+                    if v is None:
+                        debug("   missing value for domain attribute")
+                        bad_cookie = True
+                        break
+                    # RFC 2965 section 3.3.3
+                    v = v.lower()
+                if k == "expires":
+                    if max_age_set:
+                        # Prefer max-age to expires (like Mozilla)
+                        continue
+                    if v is None:
+                        debug("   missing or invalid value for expires "
+                              "attribute: treating as session cookie")
+                        continue
+                if k == "max-age":
+                    max_age_set = True
+                    if v is None:
+                        debug("   missing value for max-age attribute")
+                        bad_cookie = True
+                        break
+                    try:
+                        v = int(v)
+                    except ValueError:
+                        debug("   missing or invalid (non-numeric) value for "
+                              "max-age attribute")
+                        bad_cookie = True
+                        break
+                    # convert RFC 2965 Max-Age to seconds since epoch
+                    # XXX Strictly you're supposed to follow RFC 2616
+                    #   age-calculation rules.  Remember that a zero Max-Age
+                    #   is a request to discard the (old and new) cookie,
+                    #   though.
+                    k = "expires"
+                    v = self._now + v
+                if (k in value_attrs) or (k in boolean_attrs):
+                    if (v is None and
+                        k not in ["port", "comment", "commenturl"]):
+                        debug("   missing value for %s attribute" % k)
+                        bad_cookie = True
+                        break
+                    standard[k] = v
+                else:
+                    rest[k] = v
+
+            if bad_cookie:
+                continue
+
+            cookie_tuples.append((name, value, standard, rest))
+
+        return cookie_tuples
+
+    def _cookie_from_cookie_tuple(self, tup, request):
+        # standard is dict of standard cookie-attributes, rest is dict of the
+        # rest of them
+        name, value, standard, rest = tup
+
+        domain = standard.get("domain", Absent)
+        path = standard.get("path", Absent)
+        port = standard.get("port", Absent)
+        expires = standard.get("expires", Absent)
+
+        # set the easy defaults
+        version = standard.get("version", None)
+        if version is not None:
+            try:
+                version = int(version)
+            except ValueError:
+                return None  # invalid version, ignore cookie
+        secure = standard.get("secure", False)
+        # (discard is also set if expires is Absent)
+        discard = standard.get("discard", False)
+        comment = standard.get("comment", None)
+        comment_url = standard.get("commenturl", None)
+
+        # set default path
+        if path is not Absent and path != "":
+            path_specified = True
+            path = escape_path(path)
+        else:
+            path_specified = False
+            path = request_path(request)
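+            # default the cookie path to the request path up to (and, for
+            # RFC 2965 cookies, including) the rightmost "/": e.g. a request
+            # path of "/foo/bar" gives "/foo" for Netscape (v0) cookies and
+            # "/foo/" for RFC 2965 cookies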
+            i = path.rfind("/")
+            if i != -1:
+                if version == 0:
+                    # Netscape spec parts company from reality here
+                    path = path[:i]
+                else:
+                    path = path[:i+1]
+            if len(path) == 0: path = "/"
+
+        # set default domain
+        domain_specified = domain is not Absent
+        # but first we have to remember whether it starts with a dot
+        domain_initial_dot = False
+        if domain_specified:
+            domain_initial_dot = bool(domain.startswith("."))
+        if domain is Absent:
+            req_host, erhn = eff_request_host_lc(request)
+            domain = erhn
+        elif not domain.startswith("."):
+            domain = "."+domain
+
+        # set default port
+        port_specified = False
+        if port is not Absent:
+            if port is None:
+                # Port attr present, but has no value: default to request port.
+                # Cookie should then only be sent back on that port.
+                port = request_port(request)
+            else:
+                port_specified = True
+                port = re.sub(r"\s+", "", port)
+        else:
+            # No port attr present.  Cookie can be sent back on any port.
+            port = None
+
+        # set default expires and discard
+        if expires is Absent:
+            expires = None
+            discard = True
+
+        return Cookie(version,
+                      name, value,
+                      port, port_specified,
+                      domain, domain_specified, domain_initial_dot,
+                      path, path_specified,
+                      secure,
+                      expires,
+                      discard,
+                      comment,
+                      comment_url,
+                      rest)
+
+    def _cookies_from_attrs_set(self, attrs_set, request):
+        cookie_tuples = self._normalized_cookie_tuples(attrs_set)
+
+        cookies = []
+        for tup in cookie_tuples:
+            cookie = self._cookie_from_cookie_tuple(tup, request)
+            if cookie: cookies.append(cookie)
+        return cookies
+
+    def _process_rfc2109_cookies(self, cookies):
+        if self._policy.rfc2109_as_netscape is None:
+            rfc2109_as_netscape = not self._policy.rfc2965
+        else:
+            rfc2109_as_netscape = self._policy.rfc2109_as_netscape
+        for cookie in cookies:
+            if cookie.version == 1:
+                cookie.rfc2109 = True
+                if rfc2109_as_netscape:
+                    # treat 2109 cookies as Netscape cookies rather than
+                    # as RFC2965 cookies
+                    cookie.version = 0
+
+    def _make_cookies(self, response, request):
+        # get cookie-attributes for RFC 2965 and Netscape protocols
+        headers = response.info()
+        rfc2965_hdrs = headers.getheaders("Set-Cookie2")
+        ns_hdrs = headers.getheaders("Set-Cookie")
+
+        rfc2965 = self._policy.rfc2965
+        netscape = self._policy.netscape
+
+        if ((not rfc2965_hdrs and not ns_hdrs) or
+            (not ns_hdrs and not rfc2965) or
+            (not rfc2965_hdrs and not netscape) or
+            (not netscape and not rfc2965)):
+            return []  # no relevant cookie headers: quick exit
+
+        try:
+            cookies = self._cookies_from_attrs_set(
+                split_header_words(rfc2965_hdrs), request)
+        except:
+            reraise_unmasked_exceptions()
+            cookies = []
+
+        if ns_hdrs and netscape:
+            try:
+                # RFC 2109 and Netscape cookies
+                ns_cookies = self._cookies_from_attrs_set(
+                    parse_ns_headers(ns_hdrs), request)
+            except:
+                reraise_unmasked_exceptions()
+                ns_cookies = []
+            self._process_rfc2109_cookies(ns_cookies)
+
+            # Look for Netscape cookies (from Set-Cookie headers) that match
+            # corresponding RFC 2965 cookies (from Set-Cookie2 headers).
+            # For each match, keep the RFC 2965 cookie and ignore the Netscape
+            # cookie (RFC 2965 section 9.1).  Actually, RFC 2109 cookies are
+            # bundled in with the Netscape cookies for this purpose, which is
+            # reasonable behaviour.
+            if rfc2965:
+                lookup = {}
+                for cookie in cookies:
+                    lookup[(cookie.domain, cookie.path, cookie.name)] = None
+
+                def no_matching_rfc2965(ns_cookie, lookup=lookup):
+                    key = ns_cookie.domain, ns_cookie.path, ns_cookie.name
+                    return not lookup.has_key(key)
+                ns_cookies = filter(no_matching_rfc2965, ns_cookies)
+
+            if ns_cookies:
+                cookies.extend(ns_cookies)
+
+        return cookies
+
+    def make_cookies(self, response, request):
+        """Return sequence of Cookie objects extracted from response object.
+
+        See extract_cookies.__doc__ for the interface required of the
+        response and request arguments.
+
+        """
+        self._policy._now = self._now = int(time.time())
+        return [cookie for cookie in self._make_cookies(response, request)
+                if cookie.expires is None or not cookie.expires <= self._now]
+
+    def set_cookie_if_ok(self, cookie, request):
+        """Set a cookie if policy says it's OK to do so.
+
+        cookie: mechanize.Cookie instance
+        request: see extract_cookies.__doc__ for the required interface
+
+        """
+        self._policy._now = self._now = int(time.time())
+
+        if self._policy.set_ok(cookie, request):
+            self.set_cookie(cookie)
+
+    def set_cookie(self, cookie):
+        """Set a cookie, without checking whether or not it should be set.
+
+        cookie: mechanize.Cookie instance
+        """
+        c = self._cookies
+        if not c.has_key(cookie.domain): c[cookie.domain] = {}
+        c2 = c[cookie.domain]
+        if not c2.has_key(cookie.path): c2[cookie.path] = {}
+        c3 = c2[cookie.path]
+        c3[cookie.name] = cookie
+
+    def extract_cookies(self, response, request):
+        """Extract cookies from response, where allowable given the request.
+
+        Look for allowable Set-Cookie: and Set-Cookie2: headers in the response
+        object passed as argument.  Any of these headers that are found are
+        used to update the state of the object (subject to the policy.set_ok
+        method's approval).
+
+        The response object (usually the result of a call to
+        mechanize.urlopen, or similar) should support an info method, which
+        returns a mimetools.Message object (in fact, the 'mimetools.Message
+        object' may be any object that provides a getheaders method).
+
+        The request object (usually a urllib2.Request instance) must support
+        the methods get_full_url, get_type, get_host, and is_unverifiable, as
+        documented by urllib2, and the port attribute (the port number).  The
+        request is used to set default values for cookie-attributes as well as
+        for checking that the cookie is OK to be set.
+
+        """
+        debug("extract_cookies: %s", response.info())
+        self._policy._now = self._now = int(time.time())
+
+        for cookie in self._make_cookies(response, request):
+            if cookie.expires is not None and cookie.expires <= self._now:
+                # An expiry date in the past is a request to delete the
+                # cookie.  This can't be done in DefaultCookiePolicy,
+                # because cookies can't be deleted there.
+                try:
+                    self.clear(cookie.domain, cookie.path, cookie.name)
+                except KeyError:
+                    pass
+                debug("Expiring cookie, domain='%s', path='%s', name='%s'",
+                      cookie.domain, cookie.path, cookie.name)
+            elif self._policy.set_ok(cookie, request):
+                debug(" setting cookie: %s", cookie)
+                self.set_cookie(cookie)
+
+    def clear(self, domain=None, path=None, name=None):
+        """Clear some cookies.
+
+        Invoking this method without arguments will clear all cookies.  If
+        given a single argument, only cookies belonging to that domain will be
+        removed.  If given two arguments, cookies belonging to the specified
+        path within that domain are removed.  If given three arguments, then
+        the cookie with the specified name, path and domain is removed.
+
+        Raises KeyError if no matching cookie exists.
+
+        """
+        if name is not None:
+            if (domain is None) or (path is None):
+                raise ValueError(
+                    "domain and path must be given to remove a cookie by name")
+            del self._cookies[domain][path][name]
+        elif path is not None:
+            if domain is None:
+                raise ValueError(
+                    "domain must be given to remove cookies by path")
+            del self._cookies[domain][path]
+        elif domain is not None:
+            del self._cookies[domain]
+        else:
+            self._cookies = {}
+
+    def clear_session_cookies(self):
+        """Discard all session cookies.
+
+        Discards all cookies held by the object which had either no Max-Age or
+        Expires cookie-attribute or an explicit Discard cookie-attribute, or
+        which otherwise have ended up with a true discard attribute.  For
+        interactive browsers, the end of a session usually corresponds to
+        closing the browser window.
+
+        Note that the save method won't save session cookies anyway, unless you
+        ask otherwise by passing a true ignore_discard argument.
+
+        """
+        for cookie in self:
+            if cookie.discard:
+                self.clear(cookie.domain, cookie.path, cookie.name)
+
+    def clear_expired_cookies(self):
+        """Discard all expired cookies.
+
+        You probably don't need to call this method: expired cookies are never
+        sent back to the server (provided you're using DefaultCookiePolicy),
+        this method is called by CookieJar itself every so often, and the save
+        method won't save expired cookies anyway (unless you ask otherwise by
+        passing a true ignore_expires argument).
+
+        """
+        now = time.time()
+        for cookie in self:
+            if cookie.is_expired(now):
+                self.clear(cookie.domain, cookie.path, cookie.name)
+
+    def __getitem__(self, i):
+        if i == 0:
+            self._getitem_iterator = self.__iter__()
+        elif self._prev_getitem_index != i-1: raise IndexError(
+            "CookieJar.__getitem__ only supports sequential iteration")
+        self._prev_getitem_index = i
+        try:
+            return self._getitem_iterator.next()
+        except StopIteration:
+            raise IndexError()
+
+    def __iter__(self):
+        return MappingIterator(self._cookies)
+
+    def __len__(self):
+        """Return number of contained cookies."""
+        i = 0
+        for cookie in self: i = i + 1
+        return i
+
+    def __repr__(self):
+        r = []
+        for cookie in self: r.append(repr(cookie))
+        return "<%s[%s]>" % (self.__class__, ", ".join(r))
+
+    def __str__(self):
+        r = []
+        for cookie in self: r.append(str(cookie))
+        return "<%s[%s]>" % (self.__class__, ", ".join(r))
+
+
+class LoadError(Exception): pass
+
+class FileCookieJar(CookieJar):
+    """CookieJar that can be loaded from and saved to a file.
+
+    Additional methods
+
+    save(filename=None, ignore_discard=False, ignore_expires=False)
+    load(filename=None, ignore_discard=False, ignore_expires=False)
+    revert(filename=None, ignore_discard=False, ignore_expires=False)
+
+    Additional public attributes
+
+    filename: filename for loading and saving cookies
+
+    Additional public readable attributes
+
+    delayload: request that cookies are lazily loaded from disk; this is only
+     a hint, since it affects only performance, not behaviour (unless the
+     cookies on disk are changing); a CookieJar object may ignore it (in fact,
+     only MSIECookieJar lazily loads cookies at the moment)
+
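+    A typical usage sketch with a concrete subclass (LWPCookieJar is one of
+    the FileCookieJar subclasses provided by mechanize; the filename is
+    illustrative):
+
+        import mechanize
+        jar = mechanize.LWPCookieJar("cookies.lwp")
+        # jar.load()  # raises IOError if the file does not exist yet
+        # ... pass jar to an opener or browser, make some requests ...
+        jar.save(ignore_discard=True)
+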
+    """
+
+    def __init__(self, filename=None, delayload=False, policy=None):
+        """
+        See FileCookieJar.__doc__ for argument documentation.
+
+        Cookies are NOT loaded from the named file until either the load or
+        revert method is called.
+
+        """
+        CookieJar.__init__(self, policy)
+        if filename is not None and not isstringlike(filename):
+            raise ValueError("filename must be string-like")
+        self.filename = filename
+        self.delayload = bool(delayload)
+
+    def save(self, filename=None, ignore_discard=False, ignore_expires=False):
+        """Save cookies to a file.
+
+        filename: name of file in which to save cookies
+        ignore_discard: save even cookies set to be discarded
+        ignore_expires: save even cookies that have expired
+
+        The file is overwritten if it already exists, thus wiping all its
+        cookies.  Saved cookies can be restored later using the load or revert
+        methods.  If filename is not specified, self.filename is used; if
+        self.filename is None, ValueError is raised.
+
+        """
+        raise NotImplementedError()
+
+    def load(self, filename=None, ignore_discard=False, ignore_expires=False):
+        """Load cookies from a file.
+
+        Old cookies are kept unless overwritten by newly loaded ones.
+
+        Arguments are as for .save().
+
+        If filename is not specified, self.filename is used; if self.filename
+        is None, ValueError is raised.  The named file must be in the format
+        understood by the class, or LoadError will be raised.  This format will
+        be identical to that written by the save method, unless the load format
+        is not sufficiently well understood (as is the case for MSIECookieJar).
+
+        """
+        if filename is None:
+            if self.filename is not None: filename = self.filename
+            else: raise ValueError(MISSING_FILENAME_TEXT)
+
+        f = open(filename)
+        try:
+            self._really_load(f, filename, ignore_discard, ignore_expires)
+        finally:
+            f.close()
+
+    def revert(self, filename=None,
+               ignore_discard=False, ignore_expires=False):
+        """Clear all cookies and reload cookies from a saved file.
+
+        Raises LoadError (or IOError) if reversion is not successful; the
+        object's state will not be altered if this happens.
+
+        """
+        if filename is None:
+            if self.filename is not None: filename = self.filename
+            else: raise ValueError(MISSING_FILENAME_TEXT)
+
+        old_state = copy.deepcopy(self._cookies)
+        self._cookies = {}
+        try:
+            self.load(filename, ignore_discard, ignore_expires)
+        except (LoadError, IOError):
+            self._cookies = old_state
+            raise

Added: mechanize/tags/0.1.10/mechanize/_debug.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_debug.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_debug.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,28 @@
+import logging
+
+from urllib2 import BaseHandler
+from _response import response_seek_wrapper
+
+
+class HTTPResponseDebugProcessor(BaseHandler):
+    handler_order = 900  # before redirections, after everything else
+
+    def http_response(self, request, response):
+        if not hasattr(response, "seek"):
+            response = response_seek_wrapper(response)
+        info = logging.getLogger("mechanize.http_responses").info
+        try:
+            info(response.read())
+        finally:
+            response.seek(0)
+        info("*****************************************************")
+        return response
+
+    https_response = http_response
+
+class HTTPRedirectDebugProcessor(BaseHandler):
+    def http_request(self, request):
+        if hasattr(request, "redirect_dict"):
+            info = logging.getLogger("mechanize.http_redirects").info
+            info("redirecting to %s", request.get_full_url())
+        return request

Added: mechanize/tags/0.1.10/mechanize/_file.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_file.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_file.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,60 @@
+try:
+    from cStringIO import StringIO
+except ImportError:
+    from StringIO import StringIO
+import mimetools
+import os
+import socket
+import urllib
+from urllib2 import BaseHandler, URLError
+
+
+class FileHandler(BaseHandler):
+    # Use local file or FTP depending on form of URL
+    def file_open(self, req):
+        url = req.get_selector()
+        if url[:2] == '//' and url[2:3] != '/':
+            req.type = 'ftp'
+            return self.parent.open(req)
+        else:
+            return self.open_local_file(req)
+
+    # names for the localhost
+    names = None
+    def get_names(self):
+        if FileHandler.names is None:
+            try:
+                FileHandler.names = (socket.gethostbyname('localhost'),
+                                    socket.gethostbyname(socket.gethostname()))
+            except socket.gaierror:
+                FileHandler.names = (socket.gethostbyname('localhost'),)
+        return FileHandler.names
+
+    # not entirely sure what the rules are here
+    def open_local_file(self, req):
+        try:
+            import email.utils as emailutils
+        except ImportError:
+            import email.Utils as emailutils
+        import mimetypes
+        host = req.get_host()
+        file = req.get_selector()
+        localfile = urllib.url2pathname(file)
+        try:
+            stats = os.stat(localfile)
+            size = stats.st_size
+            modified = emailutils.formatdate(stats.st_mtime, usegmt=True)
+            mtype = mimetypes.guess_type(file)[0]
+            headers = mimetools.Message(StringIO(
+                'Content-type: %s\nContent-length: %d\nLast-modified: %s\n' %
+                (mtype or 'text/plain', size, modified)))
+            if host:
+                host, port = urllib.splitport(host)
+            if not host or \
+                (not port and socket.gethostbyname(host) in self.get_names()):
+                return urllib.addinfourl(open(localfile, 'rb'),
+                                  headers, 'file:'+file)
+        except OSError, msg:
+            # urllib2 users shouldn't expect OSErrors coming from urlopen()
+            raise URLError(msg)
+        raise URLError('file not on local host')

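FileHandler serves file: URLs from the local filesystem (and re-dispatches //host forms to FTP). A minimal sketch; the local path is purely illustrative:

    import mechanize

    response = mechanize.urlopen("file:///tmp/example.html")
    print response.info().getheader("Content-type")   # guessed from the extension
    print response.read()
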
Added: mechanize/tags/0.1.10/mechanize/_firefox3cookiejar.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_firefox3cookiejar.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_firefox3cookiejar.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,249 @@
+"""Firefox 3 "cookies.sqlite" cookie persistence.
+
+Copyright 2008 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import logging
+import time
+import sqlite3
+
+from _clientcookie import CookieJar, Cookie, MappingIterator
+from _util import isstringlike, experimental
+debug = logging.getLogger("mechanize.cookies").debug
+
+
+class Firefox3CookieJar(CookieJar):
+
+    """Firefox 3 cookie jar.
+
+    The cookies are stored in Firefox 3's "cookies.sqlite" format.
+
+    Constructor arguments:
+
+    filename: filename of cookies.sqlite (typically found at the top level
+     of a firefox profile directory)
+    autoconnect: as a convenience, connect to the SQLite cookies database at
+     Firefox3CookieJar construction time (default True)
+    policy: an object satisfying the mechanize.CookiePolicy interface
+
+    Note that this is NOT a FileCookieJar, and there are no .load(),
+    .save() or .revert() methods.  The database is in sync with the
+    cookiejar object's state after each public method call.
+
+    Following Firefox's own behaviour, session cookies are never saved to
+    the database.
+
+    The file is created, and an sqlite database written to it, if it does
+    not already exist. The moz_cookies database table is created if it does
+    not already exist.
+    """
+
+    # XXX
+    # handle DatabaseError exceptions
+    # add a FileCookieJar (explicit .save() / .revert() / .load() methods)
+
+    def __init__(self, filename, autoconnect=True, policy=None):
+        experimental("Firefox3CookieJar is experimental code")
+        CookieJar.__init__(self, policy)
+        if filename is not None and not isstringlike(filename):
+            raise ValueError("filename must be string-like")
+        self.filename = filename
+        self._conn = None
+        if autoconnect:
+            self.connect()
+
+    def connect(self):
+        self._conn = sqlite3.connect(self.filename)
+        self._conn.isolation_level = "DEFERRED"
+        self._create_table_if_necessary()
+
+    def close(self):
+        self._conn.close()
+
+    def _transaction(self, func):
+        try:
+            cur = self._conn.cursor()
+            try:
+                result = func(cur)
+            finally:
+                cur.close()
+        except:
+            self._conn.rollback()
+            raise
+        else:
+            self._conn.commit()
+        return result
+
+    def _execute(self, query, params=()):
+        return self._transaction(lambda cur: cur.execute(query, params))
+
+    def _query(self, query, params=()):
+        # XXX should we bother with a transaction?
+        cur = self._conn.cursor()
+        try:
+            cur.execute(query, params)
+            for row in cur.fetchall():
+                yield row
+        finally:
+            cur.close()
+
+    def _create_table_if_necessary(self):
+        self._execute("""\
+CREATE TABLE IF NOT EXISTS moz_cookies (id INTEGER PRIMARY KEY, name TEXT,
+    value TEXT, host TEXT, path TEXT,expiry INTEGER,
+    lastAccessed INTEGER, isSecure INTEGER, isHttpOnly INTEGER)""")
+
+    def _cookie_from_row(self, row):
+        (pk, name, value, domain, path, expires,
+         last_accessed, secure, http_only) = row
+
+        version = 0
+        domain = domain.encode("ascii", "ignore")
+        path = path.encode("ascii", "ignore")
+        name = name.encode("ascii", "ignore")
+        value = value.encode("ascii", "ignore")
+        secure = bool(secure)
+
+        # last_accessed isn't a cookie attribute, so isn't added to rest
+        rest = {}
+        if http_only:
+            rest["HttpOnly"] = None
+
+        if name == "":
+            name = value
+            value = None
+
+        initial_dot = domain.startswith(".")
+        domain_specified = initial_dot
+
+        discard = False
+        if expires == "":
+            expires = None
+            discard = True
+
+        return Cookie(version, name, value,
+                      None, False,
+                      domain, domain_specified, initial_dot,
+                      path, False,
+                      secure,
+                      expires,
+                      discard,
+                      None,
+                      None,
+                      rest)
+
+    def clear(self, domain=None, path=None, name=None):
+        CookieJar.clear(self, domain, path, name)
+        where_parts = []
+        sql_params = []
+        if domain is not None:
+            where_parts.append("host = ?")
+            sql_params.append(domain)
+            if path is not None:
+                where_parts.append("path = ?")
+                sql_params.append(path)
+                if name is not None:
+                    where_parts.append("name = ?")
+                    sql_params.append(name)
+        where = " AND ".join(where_parts)
+        if where:
+            where = " WHERE " + where
+        def clear(cur):
+            cur.execute("DELETE FROM moz_cookies%s" % where,
+                        tuple(sql_params))
+        self._transaction(clear)
+
+    def _row_from_cookie(self, cookie, cur):
+        expires = cookie.expires
+        if cookie.discard:
+            expires = ""
+
+        domain = unicode(cookie.domain)
+        path = unicode(cookie.path)
+        name = unicode(cookie.name)
+        value = unicode(cookie.value)
+        secure = bool(int(cookie.secure))
+
+        if value is None:
+            value = name
+            name = ""
+
+        last_accessed = int(time.time())
+        http_only = cookie.has_nonstandard_attr("HttpOnly")
+
+        query = cur.execute("""SELECT MAX(id) + 1 from moz_cookies""")
+        pk = query.fetchone()[0]
+        if pk is None:
+            pk = 1
+
+        return (pk, name, value, domain, path, expires,
+                last_accessed, secure, http_only)
+
+    def set_cookie(self, cookie):
+        if cookie.discard:
+            CookieJar.set_cookie(self, cookie)
+            return
+
+        def set_cookie(cur):
+            # XXX
+            # is this RFC 2965-correct?
+            # could this do an UPDATE instead?
+            row = self._row_from_cookie(cookie, cur)
+            name, unused, domain, path = row[1:5]
+            cur.execute("""\
+DELETE FROM moz_cookies WHERE host = ? AND path = ? AND name = ?""",
+                        (domain, path, name))
+            cur.execute("""\
+INSERT INTO moz_cookies VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
+""", row)
+        self._transaction(set_cookie)
+
+    def __iter__(self):
+        # session (non-persistent) cookies
+        for cookie in MappingIterator(self._cookies):
+            yield cookie
+        # persistent cookies
+        for row in self._query("""\
+SELECT * FROM moz_cookies ORDER BY name, path, host"""):
+            yield self._cookie_from_row(row)
+
+    def _cookies_for_request(self, request):
+        session_cookies = CookieJar._cookies_for_request(self, request)
+        def get_cookies(cur):
+            query = cur.execute("SELECT host from moz_cookies")
+            domains = [row[0] for row in query.fetchmany()]
+            cookies = []
+            for domain in domains:
+                cookies += self._persistent_cookies_for_domain(domain,
+                                                               request, cur)
+            return cookies
+        persistent_cookies = self._transaction(get_cookies)
+        return session_cookies + persistent_cookies
+
+    def _persistent_cookies_for_domain(self, domain, request, cur):
+        cookies = []
+        if not self._policy.domain_return_ok(domain, request):
+            return []
+        debug("Checking %s for cookies to return", domain)
+        query = cur.execute("""\
+SELECT * from moz_cookies WHERE host = ? ORDER BY path""",
+                            (domain,))
+        cookies = [self._cookie_from_row(row) for row in query.fetchmany()]
+        last_path = None
+        r = []
+        for cookie in cookies:
+            if (cookie.path != last_path and
+                not self._policy.path_return_ok(cookie.path, request)):
+                last_path = cookie.path
+                continue
+            if not self._policy.return_ok(cookie, request):
+                debug("   not returning cookie")
+                continue
+            debug("   it's a match")
+            r.append(cookie)
+        return r

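A sketch of pointing a Browser at an existing Firefox 3 profile database, assuming Firefox3CookieJar is re-exported from the top-level mechanize package; the profile path is an illustrative placeholder. With the default autoconnect=True, no explicit connect() call is needed:

    import mechanize

    cj = mechanize.Firefox3CookieJar("/path/to/profile/cookies.sqlite")
    br = mechanize.Browser()
    br.set_cookiejar(cj)
    br.open("http://example.com/")   # persistent cookies read from / written to sqlite
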
Added: mechanize/tags/0.1.10/mechanize/_gzip.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_gzip.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_gzip.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,103 @@
+import urllib2
+from cStringIO import StringIO
+import _response
+
+# GzipConsumer was taken from Fredrik Lundh's effbot.org-0.1-20041009 library
+class GzipConsumer:
+
+    def __init__(self, consumer):
+        self.__consumer = consumer
+        self.__decoder = None
+        self.__data = ""
+
+    def __getattr__(self, key):
+        return getattr(self.__consumer, key)
+
+    def feed(self, data):
+        if self.__decoder is None:
+            # check if we have a full gzip header
+            data = self.__data + data
+            try:
+                i = 10
+                flag = ord(data[3])
+                if flag & 4: # extra
+                    x = ord(data[i]) + 256*ord(data[i+1])
+                    i = i + 2 + x
+                if flag & 8: # filename
+                    while ord(data[i]):
+                        i = i + 1
+                    i = i + 1
+                if flag & 16: # comment
+                    while ord(data[i]):
+                        i = i + 1
+                    i = i + 1
+                if flag & 2: # crc
+                    i = i + 2
+                if len(data) < i:
+                    raise IndexError("not enough data")
+                if data[:3] != "\x1f\x8b\x08":
+                    raise IOError("invalid gzip data")
+                data = data[i:]
+            except IndexError:
+                self.__data = data
+                return # need more data
+            import zlib
+            self.__data = ""
+            self.__decoder = zlib.decompressobj(-zlib.MAX_WBITS)
+        data = self.__decoder.decompress(data)
+        if data:
+            self.__consumer.feed(data)
+
+    def close(self):
+        if self.__decoder:
+            data = self.__decoder.flush()
+            if data:
+                self.__consumer.feed(data)
+        self.__consumer.close()
+
+
+# --------------------------------------------------------------------
+
+# the rest of this module is John Lee's stupid code, not
+# Fredrik's nice code :-)
+
+class stupid_gzip_consumer:
+    def __init__(self): self.data = []
+    def feed(self, data): self.data.append(data)
+
+class stupid_gzip_wrapper(_response.closeable_response):
+    def __init__(self, response):
+        self._response = response
+
+        c = stupid_gzip_consumer()
+        gzc = GzipConsumer(c)
+        gzc.feed(response.read())
+        self.__data = StringIO("".join(c.data))
+
+    def read(self, size=-1):
+        return self.__data.read(size)
+    def readline(self, size=-1):
+        return self.__data.readline(size)
+    def readlines(self, sizehint=-1):
+        return self.__data.readlines(sizehint)
+
+    def __getattr__(self, name):
+        # delegate unknown methods/attributes
+        return getattr(self._response, name)
+
+class HTTPGzipProcessor(urllib2.BaseHandler):
+    handler_order = 200  # response processing before HTTPEquivProcessor
+
+    def http_request(self, request):
+        request.add_header("Accept-Encoding", "gzip")
+        return request
+
+    def http_response(self, request, response):
+        # post-process response
+        enc_hdrs = response.info().getheaders("Content-encoding")
+        for enc_hdr in enc_hdrs:
+            if ("gzip" in enc_hdr) or ("compress" in enc_hdr):
+                return stupid_gzip_wrapper(response)
+        return response
+
+    https_response = http_response

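HTTPGzipProcessor adds an Accept-Encoding: gzip request header and transparently unwraps gzip/compress-encoded responses. A sketch, assuming the Browser.set_handle_gzip() switch enables it (the feature is experimental in this release):

    import mechanize

    br = mechanize.Browser()
    br.set_handle_gzip(True)            # put HTTPGzipProcessor in the handler chain
    response = br.open("http://example.com/")
    data = response.read()              # decompressed if the server sent gzip
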
Added: mechanize/tags/0.1.10/mechanize/_headersutil.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_headersutil.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_headersutil.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,232 @@
+"""Utility functions for HTTP header value parsing and construction.
+
+Copyright 1997-1998, Gisle Aas
+Copyright 2002-2006, John J. Lee
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import os, re
+from types import StringType
+from types import UnicodeType
+STRING_TYPES = StringType, UnicodeType
+
+from _util import http2time
+import _rfc3986
+
+def is_html(ct_headers, url, allow_xhtml=False):
+    """
+    ct_headers: Sequence of Content-Type headers
+    url: Response URL
+    allow_xhtml: if true, also treat XHTML content types and extensions as HTML
+
+    """
+    if not ct_headers:
+        # guess
+        ext = os.path.splitext(_rfc3986.urlsplit(url)[2])[1]
+        html_exts = [".htm", ".html"]
+        if allow_xhtml:
+            html_exts += [".xhtml"]
+        return ext in html_exts
+    # use first header
+    ct = split_header_words(ct_headers)[0][0][0]
+    html_types = ["text/html"]
+    if allow_xhtml:
+        html_types += [
+            "text/xhtml", "text/xml",
+            "application/xml", "application/xhtml+xml",
+            ]
+    return ct in html_types
+
+def unmatched(match):
+    """Return unmatched part of re.Match object."""
+    start, end = match.span(0)
+    return match.string[:start]+match.string[end:]
+
+token_re =        re.compile(r"^\s*([^=\s;,]+)")
+quoted_value_re = re.compile(r"^\s*=\s*\"([^\"\\]*(?:\\.[^\"\\]*)*)\"")
+value_re =        re.compile(r"^\s*=\s*([^\s;,]*)")
+escape_re = re.compile(r"\\(.)")
+def split_header_words(header_values):
+    r"""Parse header values into a list of lists containing key,value pairs.
+
+    The function knows how to deal with ",", ";" and "=" as well as quoted
+    values after "=".  A list of space-separated tokens is parsed as if the
+    tokens were separated by ";".
+
+    If the header_values passed as argument contains multiple values, then they
+    are treated as if they were a single value separated by comma ",".
+
+    This means that this function is useful for parsing header fields that
+    follow this syntax (BNF as from the HTTP/1.1 specification, but we relax
+    the requirement for tokens).
+
+      headers           = #header
+      header            = (token | parameter) *( [";"] (token | parameter))
+
+      token             = 1*<any CHAR except CTLs or separators>
+      separators        = "(" | ")" | "<" | ">" | "@"
+                        | "," | ";" | ":" | "\" | <">
+                        | "/" | "[" | "]" | "?" | "="
+                        | "{" | "}" | SP | HT
+
+      quoted-string     = ( <"> *(qdtext | quoted-pair ) <"> )
+      qdtext            = <any TEXT except <">>
+      quoted-pair       = "\" CHAR
+
+      parameter         = attribute "=" value
+      attribute         = token
+      value             = token | quoted-string
+
+    Each header is represented by a list of key/value pairs.  The value for a
+    simple token (not part of a parameter) is None.  Syntactically incorrect
+    headers will not necessarily be parsed as you would want.
+
+    This is easier to describe with some examples:
+
+    >>> split_header_words(['foo="bar"; port="80,81"; discard, bar=baz'])
+    [[('foo', 'bar'), ('port', '80,81'), ('discard', None)], [('bar', 'baz')]]
+    >>> split_header_words(['text/html; charset="iso-8859-1"'])
+    [[('text/html', None), ('charset', 'iso-8859-1')]]
+    >>> split_header_words([r'Basic realm="\"foo\bar\""'])
+    [[('Basic', None), ('realm', '"foobar"')]]
+
+    """
+    assert type(header_values) not in STRING_TYPES
+    result = []
+    for text in header_values:
+        orig_text = text
+        pairs = []
+        while text:
+            m = token_re.search(text)
+            if m:
+                text = unmatched(m)
+                name = m.group(1)
+                m = quoted_value_re.search(text)
+                if m:  # quoted value
+                    text = unmatched(m)
+                    value = m.group(1)
+                    value = escape_re.sub(r"\1", value)
+                else:
+                    m = value_re.search(text)
+                    if m:  # unquoted value
+                        text = unmatched(m)
+                        value = m.group(1)
+                        value = value.rstrip()
+                    else:
+                        # no value, a lone token
+                        value = None
+                pairs.append((name, value))
+            elif text.lstrip().startswith(","):
+                # concatenated headers, as per RFC 2616 section 4.2
+                text = text.lstrip()[1:]
+                if pairs: result.append(pairs)
+                pairs = []
+            else:
+                # skip junk
+                non_junk, nr_junk_chars = re.subn("^[=\s;]*", "", text)
+                assert nr_junk_chars > 0, (
+                    "split_header_words bug: '%s', '%s', %s" %
+                    (orig_text, text, pairs))
+                text = non_junk
+        if pairs: result.append(pairs)
+    return result
+
+join_escape_re = re.compile(r"([\"\\])")
+def join_header_words(lists):
+    """Do the inverse of the conversion done by split_header_words.
+
+    Takes a list of lists of (key, value) pairs and produces a single header
+    value.  Attribute values are quoted if needed.
+
+    >>> join_header_words([[("text/plain", None), ("charset", "iso-8859/1")]])
+    'text/plain; charset="iso-8859/1"'
+    >>> join_header_words([[("text/plain", None)], [("charset", "iso-8859/1")]])
+    'text/plain, charset="iso-8859/1"'
+
+    """
+    headers = []
+    for pairs in lists:
+        attr = []
+        for k, v in pairs:
+            if v is not None:
+                if not re.search(r"^\w+$", v):
+                    v = join_escape_re.sub(r"\\\1", v)  # escape " and \
+                    v = '"%s"' % v
+                if k is None:  # Netscape cookies may have no name
+                    k = v
+                else:
+                    k = "%s=%s" % (k, v)
+            attr.append(k)
+        if attr: headers.append("; ".join(attr))
+    return ", ".join(headers)
+
+def strip_quotes(text):
+    if text.startswith('"'):
+        text = text[1:]
+    if text.endswith('"'):
+        text = text[:-1]
+    return text
+
+def parse_ns_headers(ns_headers):
+    """Ad-hoc parser for Netscape protocol cookie-attributes.
+
+    The old Netscape cookie format for Set-Cookie can for instance contain
+    an unquoted "," in the expires field, so we have to use this ad-hoc
+    parser instead of split_header_words.
+
+    XXX This may not make the best possible effort to parse all the crap
+    that Netscape Cookie headers contain.  Ronald Tschalar's HTTPClient
+    parser is probably better, so we could do worse than following it if
+    this ever gives any trouble.
+
+    Currently, this is also used for parsing RFC 2109 cookies.
+
+    """
+    known_attrs = ("expires", "domain", "path", "secure",
+                   # RFC 2109 attrs (may turn up in Netscape cookies, too)
+                   "version", "port", "max-age")
+
+    result = []
+    for ns_header in ns_headers:
+        pairs = []
+        version_set = False
+        params = re.split(r";\s*", ns_header)
+        for ii in range(len(params)):
+            param = params[ii]
+            param = param.rstrip()
+            if param == "": continue
+            if "=" not in param:
+                k, v = param, None
+            else:
+                k, v = re.split(r"\s*=\s*", param, 1)
+                k = k.lstrip()
+            if ii != 0:
+                lc = k.lower()
+                if lc in known_attrs:
+                    k = lc
+                if k == "version":
+                    # This is an RFC 2109 cookie.
+                    v = strip_quotes(v)
+                    version_set = True
+                if k == "expires":
+                    # convert expires date to seconds since epoch
+                    v = http2time(strip_quotes(v))  # None if invalid
+            pairs.append((k, v))
+
+        if pairs:
+            if not version_set:
+                pairs.append(("version", "0"))
+            result.append(pairs)
+
+    return result
+
+
+def _test():
+    import doctest, _headersutil
+    return doctest.testmod(_headersutil)
+
+if __name__ == "__main__":
+    _test()

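parse_ns_headers() has no doctest above, so a short illustrative call (the header value is made up; the module is private and is imported directly here only for demonstration):

    from mechanize._headersutil import parse_ns_headers

    hdrs = ['foo=bar; expires=Wednesday, 09-Nov-99 23:12:40 GMT; path=/']
    for pairs in parse_ns_headers(hdrs):
        print pairs
    # each header yields a list of (key, value) pairs; "expires" is converted
    # to seconds since the epoch, and ("version", "0") is appended because no
    # explicit version attribute was present
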
Added: mechanize/tags/0.1.10/mechanize/_html.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_html.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_html.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,631 @@
+"""HTML handling.
+
+Copyright 2003-2006 John J. Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it under
+the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+import re, copy, htmlentitydefs
+import sgmllib, ClientForm
+
+import _request
+from _headersutil import split_header_words, is_html as _is_html
+import _rfc3986
+
+DEFAULT_ENCODING = "latin-1"
+
+COMPRESS_RE = re.compile(r"\s+")
+
+
+# the base class is purely for backwards compatibility
+class ParseError(ClientForm.ParseError): pass
+
+
+class CachingGeneratorFunction(object):
+    """Caching wrapper around a no-arguments iterable."""
+
+    def __init__(self, iterable):
+        self._cache = []
+        # wrap iterable to make it non-restartable (otherwise, repeated
+        # __call__ would give incorrect results)
+        self._iterator = iter(iterable)
+
+    def __call__(self):
+        cache = self._cache
+        for item in cache:
+            yield item
+        for item in self._iterator:
+            cache.append(item)
+            yield item
+
+
+class EncodingFinder:
+    def __init__(self, default_encoding):
+        self._default_encoding = default_encoding
+    def encoding(self, response):
+        # HTTPEquivProcessor may be in use, so both HTTP and HTTP-EQUIV
+        # headers may be in the response.  HTTP-EQUIV headers come last,
+        # so try in order from first to last.
+        for ct in response.info().getheaders("content-type"):
+            for k, v in split_header_words([ct])[0]:
+                if k == "charset":
+                    return v
+        return self._default_encoding
+
+class ResponseTypeFinder:
+    def __init__(self, allow_xhtml):
+        self._allow_xhtml = allow_xhtml
+    def is_html(self, response, encoding):
+        ct_hdrs = response.info().getheaders("content-type")
+        url = response.geturl()
+        # XXX encoding
+        return _is_html(ct_hdrs, url, self._allow_xhtml)
+
+
+# idea for this argument-processing trick is from Peter Otten
+class Args:
+    def __init__(self, args_map):
+        self.dictionary = dict(args_map)
+    def __getattr__(self, key):
+        try:
+            return self.dictionary[key]
+        except KeyError:
+            return getattr(self.__class__, key)
+
+def form_parser_args(
+    select_default=False,
+    form_parser_class=None,
+    request_class=None,
+    backwards_compat=False,
+    ):
+    return Args(locals())
+
+
+class Link:
+    def __init__(self, base_url, url, text, tag, attrs):
+        assert None not in [url, tag, attrs]
+        self.base_url = base_url
+        self.absolute_url = _rfc3986.urljoin(base_url, url)
+        self.url, self.text, self.tag, self.attrs = url, text, tag, attrs
+    def __cmp__(self, other):
+        try:
+            for name in "url", "text", "tag", "attrs":
+                if getattr(self, name) != getattr(other, name):
+                    return -1
+        except AttributeError:
+            return -1
+        return 0
+    def __repr__(self):
+        return "Link(base_url=%r, url=%r, text=%r, tag=%r, attrs=%r)" % (
+            self.base_url, self.url, self.text, self.tag, self.attrs)
+
+
+class LinksFactory:
+
+    def __init__(self,
+                 link_parser_class=None,
+                 link_class=Link,
+                 urltags=None,
+                 ):
+        import _pullparser
+        if link_parser_class is None:
+            link_parser_class = _pullparser.TolerantPullParser
+        self.link_parser_class = link_parser_class
+        self.link_class = link_class
+        if urltags is None:
+            urltags = {
+                "a": "href",
+                "area": "href",
+                "frame": "src",
+                "iframe": "src",
+                }
+        self.urltags = urltags
+        self._response = None
+        self._encoding = None
+
+    def set_response(self, response, base_url, encoding):
+        self._response = response
+        self._encoding = encoding
+        self._base_url = base_url
+
+    def links(self):
+        """Return an iterator that provides links of the document."""
+        response = self._response
+        encoding = self._encoding
+        base_url = self._base_url
+        p = self.link_parser_class(response, encoding=encoding)
+
+        try:
+            for token in p.tags(*(self.urltags.keys()+["base"])):
+                if token.type == "endtag":
+                    continue
+                if token.data == "base":
+                    base_href = dict(token.attrs).get("href")
+                    if base_href is not None:
+                        base_url = base_href
+                    continue
+                attrs = dict(token.attrs)
+                tag = token.data
+                name = attrs.get("name")
+                text = None
+                # XXX use attr_encoding for ref'd doc if that doc does not
+                #  provide one by other means
+                #attr_encoding = attrs.get("charset")
+                url = attrs.get(self.urltags[tag])  # XXX is "" a valid URL?
+                if not url:
+                    # Probably an <A NAME="blah"> link or <AREA NOHREF...>.
+                    # For our purposes a link is something with a URL, so
+                    # ignore this.
+                    continue
+
+                url = _rfc3986.clean_url(url, encoding)
+                if tag == "a":
+                    if token.type != "startendtag":
+                        # hmm, this'd break if end tag is missing
+                        text = p.get_compressed_text(("endtag", tag))
+                    # but this doesn't work for eg.
+                    # <a href="blah"><b>Andy</b></a>
+                    #text = p.get_compressed_text()
+
+                yield Link(base_url, url, text, tag, token.attrs)
+        except sgmllib.SGMLParseError, exc:
+            raise ParseError(exc)
+
+class FormsFactory:
+
+    """Makes a sequence of objects satisfying ClientForm.HTMLForm interface.
+
+    After calling .forms(), the .global_form attribute is a form object
+    containing all controls not a descendant of any FORM element.
+
+    For constructor argument docs, see ClientForm.ParseResponse
+    argument docs.
+
+    """
+
+    def __init__(self,
+                 select_default=False,
+                 form_parser_class=None,
+                 request_class=None,
+                 backwards_compat=False,
+                 ):
+        import ClientForm
+        self.select_default = select_default
+        if form_parser_class is None:
+            form_parser_class = ClientForm.FormParser
+        self.form_parser_class = form_parser_class
+        if request_class is None:
+            request_class = _request.Request
+        self.request_class = request_class
+        self.backwards_compat = backwards_compat
+        self._response = None
+        self.encoding = None
+        self.global_form = None
+
+    def set_response(self, response, encoding):
+        self._response = response
+        self.encoding = encoding
+        self.global_form = None
+
+    def forms(self):
+        import ClientForm
+        encoding = self.encoding
+        try:
+            forms = ClientForm.ParseResponseEx(
+                self._response,
+                select_default=self.select_default,
+                form_parser_class=self.form_parser_class,
+                request_class=self.request_class,
+                encoding=encoding,
+                _urljoin=_rfc3986.urljoin,
+                _urlparse=_rfc3986.urlsplit,
+                _urlunparse=_rfc3986.urlunsplit,
+                )
+        except ClientForm.ParseError, exc:
+            raise ParseError(exc)
+        self.global_form = forms[0]
+        return forms[1:]
+
+class TitleFactory:
+    def __init__(self):
+        self._response = self._encoding = None
+
+    def set_response(self, response, encoding):
+        self._response = response
+        self._encoding = encoding
+
+    def _get_title_text(self, parser):
+        import _pullparser
+        text = []
+        tok = None
+        while 1:
+            try:
+                tok = parser.get_token()
+            except _pullparser.NoMoreTokensError:
+                break
+            if tok.type == "data":
+                text.append(str(tok))
+            elif tok.type == "entityref":
+                t = unescape("&%s;" % tok.data,
+                             parser._entitydefs, parser.encoding)
+                text.append(t)
+            elif tok.type == "charref":
+                t = unescape_charref(tok.data, parser.encoding)
+                text.append(t)
+            elif tok.type in ["starttag", "endtag", "startendtag"]:
+                tag_name = tok.data
+                if tok.type == "endtag" and tag_name == "title":
+                    break
+                text.append(str(tok))
+        return COMPRESS_RE.sub(" ", "".join(text).strip())
+
+    def title(self):
+        import _pullparser
+        p = _pullparser.TolerantPullParser(
+            self._response, encoding=self._encoding)
+        try:
+            try:
+                p.get_tag("title")
+            except _pullparser.NoMoreTokensError:
+                return None
+            else:
+                return self._get_title_text(p)
+        except sgmllib.SGMLParseError, exc:
+            raise ParseError(exc)
+
+
+def unescape(data, entities, encoding):
+    if data is None or "&" not in data:
+        return data
+
+    def replace_entities(match):
+        ent = match.group()
+        if ent[1] == "#":
+            return unescape_charref(ent[2:-1], encoding)
+
+        repl = entities.get(ent[1:-1])
+        if repl is not None:
+            repl = unichr(repl)
+            if type(repl) != type(""):
+                try:
+                    repl = repl.encode(encoding)
+                except UnicodeError:
+                    repl = ent
+        else:
+            repl = ent
+        return repl
+
+    return re.sub(r"&#?[A-Za-z0-9]+?;", replace_entities, data)
+
+def unescape_charref(data, encoding):
+    name, base = data, 10
+    if name.startswith("x"):
+        name, base= name[1:], 16
+    uc = unichr(int(name, base))
+    if encoding is None:
+        return uc
+    else:
+        try:
+            repl = uc.encode(encoding)
+        except UnicodeError:
+            repl = "&#%s;" % data
+        return repl
+
+
+# bizarre import gymnastics for bundled BeautifulSoup
+import _beautifulsoup
+import ClientForm
+RobustFormParser, NestingRobustFormParser = ClientForm._create_bs_classes(
+    _beautifulsoup.BeautifulSoup, _beautifulsoup.ICantBelieveItsBeautifulSoup
+    )
+# monkeypatch sgmllib to fix http://www.python.org/sf/803422 :-(
+sgmllib.charref = re.compile("&#(x?[0-9a-fA-F]+)[^0-9a-fA-F]")
+
+class MechanizeBs(_beautifulsoup.BeautifulSoup):
+    _entitydefs = htmlentitydefs.name2codepoint
+    # don't want the magic Microsoft-char workaround
+    PARSER_MASSAGE = [(re.compile('(<[^<>]*)/>'),
+                       lambda(x):x.group(1) + ' />'),
+                      (re.compile('<!\s+([^<>]*)>'),
+                       lambda(x):'<!' + x.group(1) + '>')
+                      ]
+
+    def __init__(self, encoding, text=None, avoidParserProblems=True,
+                 initialTextIsEverything=True):
+        self._encoding = encoding
+        _beautifulsoup.BeautifulSoup.__init__(
+            self, text, avoidParserProblems, initialTextIsEverything)
+
+    def handle_charref(self, ref):
+        t = unescape("&#%s;"%ref, self._entitydefs, self._encoding)
+        self.handle_data(t)
+    def handle_entityref(self, ref):
+        t = unescape("&%s;"%ref, self._entitydefs, self._encoding)
+        self.handle_data(t)
+    def unescape_attrs(self, attrs):
+        escaped_attrs = []
+        for key, val in attrs:
+            val = unescape(val, self._entitydefs, self._encoding)
+            escaped_attrs.append((key, val))
+        return escaped_attrs
+
+class RobustLinksFactory:
+
+    compress_re = COMPRESS_RE
+
+    def __init__(self,
+                 link_parser_class=None,
+                 link_class=Link,
+                 urltags=None,
+                 ):
+        if link_parser_class is None:
+            link_parser_class = MechanizeBs
+        self.link_parser_class = link_parser_class
+        self.link_class = link_class
+        if urltags is None:
+            urltags = {
+                "a": "href",
+                "area": "href",
+                "frame": "src",
+                "iframe": "src",
+                }
+        self.urltags = urltags
+        self._bs = None
+        self._encoding = None
+        self._base_url = None
+
+    def set_soup(self, soup, base_url, encoding):
+        self._bs = soup
+        self._base_url = base_url
+        self._encoding = encoding
+
+    def links(self):
+        import _beautifulsoup
+        bs = self._bs
+        base_url = self._base_url
+        encoding = self._encoding
+        gen = bs.recursiveChildGenerator()
+        for ch in bs.recursiveChildGenerator():
+            if (isinstance(ch, _beautifulsoup.Tag) and
+                ch.name in self.urltags.keys()+["base"]):
+                link = ch
+                attrs = bs.unescape_attrs(link.attrs)
+                attrs_dict = dict(attrs)
+                if link.name == "base":
+                    base_href = attrs_dict.get("href")
+                    if base_href is not None:
+                        base_url = base_href
+                    continue
+                url_attr = self.urltags[link.name]
+                url = attrs_dict.get(url_attr)
+                if not url:
+                    continue
+                url = _rfc3986.clean_url(url, encoding)
+                text = link.fetchText(lambda t: True)
+                if not text:
+                    # follow _pullparser's weird behaviour rigidly
+                    if link.name == "a":
+                        text = ""
+                    else:
+                        text = None
+                else:
+                    text = self.compress_re.sub(" ", " ".join(text).strip())
+                yield Link(base_url, url, text, link.name, attrs)
+
+
+class RobustFormsFactory(FormsFactory):
+    def __init__(self, *args, **kwds):
+        args = form_parser_args(*args, **kwds)
+        if args.form_parser_class is None:
+            args.form_parser_class = RobustFormParser
+        FormsFactory.__init__(self, **args.dictionary)
+
+    def set_response(self, response, encoding):
+        self._response = response
+        self.encoding = encoding
+
+
+class RobustTitleFactory:
+    def __init__(self):
+        self._bs = self._encoding = None
+
+    def set_soup(self, soup, encoding):
+        self._bs = soup
+        self._encoding = encoding
+
+    def title(self):
+        import _beautifulsoup
+        title = self._bs.first("title")
+        if title == _beautifulsoup.Null:
+            return None
+        else:
+            inner_html = "".join([str(node) for node in title.contents])
+            return COMPRESS_RE.sub(" ", inner_html.strip())
+
+
+class Factory:
+    """Factory for forms, links, etc.
+
+    This interface may expand in future.
+
+    Public methods:
+
+    set_request_class(request_class)
+    set_response(response)
+    forms()
+    links()
+
+    Public attributes:
+
+    Note that accessing these attributes may raise ParseError.
+
+    encoding: string specifying the encoding of response if it contains a text
+     document (this value is left unspecified for documents that do not have
+     an encoding, e.g. an image file)
+    is_html: true if response contains an HTML document (XHTML may be
+     regarded as HTML too)
+    title: page title, or None if no title or not HTML
+    global_form: form object containing all controls that are not descendants
+     of any FORM element, or None if the forms_factory does not support
+     supplying a global form
+
+    """
+
+    LAZY_ATTRS = ["encoding", "is_html", "title", "global_form"]
+
+    def __init__(self, forms_factory, links_factory, title_factory,
+                 encoding_finder=EncodingFinder(DEFAULT_ENCODING),
+                 response_type_finder=ResponseTypeFinder(allow_xhtml=False),
+                 ):
+        """
+
+        Pass keyword arguments only.
+
+        default_encoding: character encoding to use if encoding cannot be
+         determined (or guessed) from the response.  You should turn on
+         HTTP-EQUIV handling if you want the best chance of getting this right
+         without resorting to this default.  The default value of this
+         parameter (currently latin-1) may change in future.
+
+        """
+        self._forms_factory = forms_factory
+        self._links_factory = links_factory
+        self._title_factory = title_factory
+        self._encoding_finder = encoding_finder
+        self._response_type_finder = response_type_finder
+
+        self.set_response(None)
+
+    def set_request_class(self, request_class):
+        """Set urllib2.Request class.
+
+        ClientForm.HTMLForm instances returned by .forms() will return
+        instances of this class when .click()ed.
+
+        """
+        self._forms_factory.request_class = request_class
+
+    def set_response(self, response):
+        """Set response.
+
+        The response must either be None or implement the same interface as
+        objects returned by urllib2.urlopen().
+
+        """
+        self._response = response
+        self._forms_genf = self._links_genf = None
+        self._get_title = None
+        for name in self.LAZY_ATTRS:
+            try:
+                delattr(self, name)
+            except AttributeError:
+                pass
+
+    def __getattr__(self, name):
+        if name not in self.LAZY_ATTRS:
+            return getattr(self.__class__, name)
+
+        if name == "encoding":
+            self.encoding = self._encoding_finder.encoding(
+                copy.copy(self._response))
+            return self.encoding
+        elif name == "is_html":
+            self.is_html = self._response_type_finder.is_html(
+                copy.copy(self._response), self.encoding)
+            return self.is_html
+        elif name == "title":
+            if self.is_html:
+                self.title = self._title_factory.title()
+            else:
+                self.title = None
+            return self.title
+        elif name == "global_form":
+            self.forms()
+            return self.global_form
+
+    def forms(self):
+        """Return iterable over ClientForm.HTMLForm-like objects.
+
+        Raises mechanize.ParseError on failure.
+        """
+        # this implementation sets .global_form as a side-effect, for benefit
+        # of __getattr__ impl
+        if self._forms_genf is None:
+            try:
+                self._forms_genf = CachingGeneratorFunction(
+                    self._forms_factory.forms())
+            except:  # XXXX define exception!
+                self.set_response(self._response)
+                raise
+            self.global_form = getattr(
+                self._forms_factory, "global_form", None)
+        return self._forms_genf()
+
+    def links(self):
+        """Return iterable over mechanize.Link-like objects.
+
+        Raises mechanize.ParseError on failure.
+        """
+        if self._links_genf is None:
+            try:
+                self._links_genf = CachingGeneratorFunction(
+                    self._links_factory.links())
+            except:  # XXXX define exception!
+                self.set_response(self._response)
+                raise
+        return self._links_genf()
+
+class DefaultFactory(Factory):
+    """Based on sgmllib."""
+    def __init__(self, i_want_broken_xhtml_support=False):
+        Factory.__init__(
+            self,
+            forms_factory=FormsFactory(),
+            links_factory=LinksFactory(),
+            title_factory=TitleFactory(),
+            response_type_finder=ResponseTypeFinder(
+                allow_xhtml=i_want_broken_xhtml_support),
+            )
+
+    def set_response(self, response):
+        Factory.set_response(self, response)
+        if response is not None:
+            self._forms_factory.set_response(
+                copy.copy(response), self.encoding)
+            self._links_factory.set_response(
+                copy.copy(response), response.geturl(), self.encoding)
+            self._title_factory.set_response(
+                copy.copy(response), self.encoding)
+
+class RobustFactory(Factory):
+    """Based on BeautifulSoup, hopefully a bit more robust to bad HTML than is
+    DefaultFactory.
+
+    """
+    def __init__(self, i_want_broken_xhtml_support=False,
+                 soup_class=None):
+        Factory.__init__(
+            self,
+            forms_factory=RobustFormsFactory(),
+            links_factory=RobustLinksFactory(),
+            title_factory=RobustTitleFactory(),
+            response_type_finder=ResponseTypeFinder(
+                allow_xhtml=i_want_broken_xhtml_support),
+            )
+        if soup_class is None:
+            soup_class = MechanizeBs
+        self._soup_class = soup_class
+
+    def set_response(self, response):
+        Factory.set_response(self, response)
+        if response is not None:
+            data = response.read()
+            soup = self._soup_class(self.encoding, data)
+            self._forms_factory.set_response(
+                copy.copy(response), self.encoding)
+            self._links_factory.set_soup(
+                soup, response.geturl(), self.encoding)
+            self._title_factory.set_soup(soup, self.encoding)

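The factories above are normally consumed indirectly through a Browser. A minimal sketch of pulling the title and links out of a page with the BeautifulSoup-backed RobustFactory; the URL is illustrative:

    import mechanize

    br = mechanize.Browser(factory=mechanize.RobustFactory())
    br.open("http://example.com/")
    print br.title()
    for link in br.links():
        print link.url, link.text
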
Added: mechanize/tags/0.1.10/mechanize/_http.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_http.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_http.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,758 @@
+"""HTTP related handlers.
+
+Note that some other HTTP handlers live in more specific modules: _auth.py,
+_gzip.py, etc.
+
+
+Copyright 2002-2006 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import time, htmlentitydefs, logging, socket, \
+       urllib2, urllib, httplib, sgmllib
+from urllib2 import URLError, HTTPError, BaseHandler
+from cStringIO import StringIO
+
+from _clientcookie import CookieJar
+from _headersutil import is_html
+from _html import unescape, unescape_charref
+from _request import Request
+from _response import closeable_response, response_seek_wrapper
+import _rfc3986
+import _sockettimeout
+
+debug = logging.getLogger("mechanize").debug
+debug_robots = logging.getLogger("mechanize.robots").debug
+
+# monkeypatch urllib2.HTTPError to show URL
+## def urllib2_str(self):
+##     return 'HTTP Error %s: %s (%s)' % (
+##         self.code, self.msg, self.geturl())
+## urllib2.HTTPError.__str__ = urllib2_str
+
+
+CHUNK = 1024  # size of chunks fed to HTML HEAD parser, in bytes
+DEFAULT_ENCODING = 'latin-1'
+
+
+try:
+    socket._fileobject("fake socket", close=True)
+except TypeError:
+    # python <= 2.4
+    create_readline_wrapper = socket._fileobject
+else:
+    def create_readline_wrapper(fh):
+        return socket._fileobject(fh, close=True)
+
+
+# This adds "refresh" to the list of redirectables and provides a redirection
+# algorithm that doesn't go into a loop in the presence of cookies
+# (Python 2.4 has this new algorithm, 2.3 doesn't).
+class HTTPRedirectHandler(BaseHandler):
+    # maximum number of redirections to any single URL
+    # this is needed because of the state that cookies introduce
+    max_repeats = 4
+    # maximum total number of redirections (regardless of URL) before
+    # assuming we're in a loop
+    max_redirections = 10
+
+    # Implementation notes:
+
+    # To avoid the server sending us into an infinite loop, the request
+    # object needs to track what URLs we have already seen.  Do this by
+    # adding a handler-specific attribute to the Request object.  The value
+    # of the dict is used to count the number of times the same URL has
+    # been visited.  This is needed because visiting the same URL twice
+    # does not necessarily imply a loop, thanks to state introduced by
+    # cookies.
+
+    # Always unhandled redirection codes:
+    # 300 Multiple Choices: should not handle this here.
+    # 304 Not Modified: no need to handle here: only of interest to caches
+    #     that do conditional GETs
+    # 305 Use Proxy: probably not worth dealing with here
+    # 306 Unused: what was this for in previous versions of the protocol?
+
+    def redirect_request(self, newurl, req, fp, code, msg, headers):
+        """Return a Request or None in response to a redirect.
+
+        This is called by the http_error_30x methods when a redirection
+        response is received.  If a redirection should take place, return a
+        new Request to allow http_error_30x to perform the redirect;
+        otherwise, return None to indicate that an HTTPError should be
+        raised.
+
+        """
+        if code in (301, 302, 303, "refresh") or \
+               (code == 307 and not req.has_data()):
+            # Strictly (according to RFC 2616), 301 or 302 in response to
+            # a POST MUST NOT cause a redirection without confirmation
+            # from the user (of urllib2, in this case).  In practice,
+            # essentially all clients do redirect in this case, so we do
+            # the same.
+            # XXX really refresh redirections should be visiting; tricky to
+            #  fix, so this will wait until post-stable release
+            new = Request(newurl,
+                          headers=req.headers,
+                          origin_req_host=req.get_origin_req_host(),
+                          unverifiable=True,
+                          visit=False,
+                          )
+            new._origin_req = getattr(req, "_origin_req", req)
+            return new
+        else:
+            raise HTTPError(req.get_full_url(), code, msg, headers, fp)
+
+    def http_error_302(self, req, fp, code, msg, headers):
+        # Some servers (incorrectly) return multiple Location headers
+        # (so probably same goes for URI).  Use first header.
+        if headers.has_key('location'):
+            newurl = headers.getheaders('location')[0]
+        elif headers.has_key('uri'):
+            newurl = headers.getheaders('uri')[0]
+        else:
+            return
+        newurl = _rfc3986.clean_url(newurl, "latin-1")
+        newurl = _rfc3986.urljoin(req.get_full_url(), newurl)
+
+        # XXX Probably want to forget about the state of the current
+        # request, although that might interact poorly with other
+        # handlers that also use handler-specific request attributes
+        new = self.redirect_request(newurl, req, fp, code, msg, headers)
+        if new is None:
+            return
+
+        # loop detection
+        # .redirect_dict has a key url if url was previously visited.
+        if hasattr(req, 'redirect_dict'):
+            visited = new.redirect_dict = req.redirect_dict
+            if (visited.get(newurl, 0) >= self.max_repeats or
+                len(visited) >= self.max_redirections):
+                raise HTTPError(req.get_full_url(), code,
+                                self.inf_msg + msg, headers, fp)
+        else:
+            visited = new.redirect_dict = req.redirect_dict = {}
+        visited[newurl] = visited.get(newurl, 0) + 1
+
+        # Don't close the fp until we are sure that we won't use it
+        # with HTTPError.  
+        fp.read()
+        fp.close()
+
+        return self.parent.open(new)
+
+    http_error_301 = http_error_303 = http_error_307 = http_error_302
+    http_error_refresh = http_error_302
+
+    inf_msg = "The HTTP server returned a redirect error that would " \
+              "lead to an infinite loop.\n" \
+              "The last 30x error message was:\n"
+
+
+# XXX would self.reset() work, instead of raising this exception?
+class EndOfHeadError(Exception): pass
+class AbstractHeadParser:
+    # only these elements are allowed in or before HEAD of document
+    head_elems = ("html", "head",
+                  "title", "base",
+                  "script", "style", "meta", "link", "object")
+    _entitydefs = htmlentitydefs.name2codepoint
+    _encoding = DEFAULT_ENCODING
+
+    def __init__(self):
+        self.http_equiv = []
+
+    def start_meta(self, attrs):
+        http_equiv = content = None
+        for key, value in attrs:
+            if key == "http-equiv":
+                http_equiv = self.unescape_attr_if_required(value)
+            elif key == "content":
+                content = self.unescape_attr_if_required(value)
+        if http_equiv is not None and content is not None:
+            self.http_equiv.append((http_equiv, content))
+
+    def end_head(self):
+        raise EndOfHeadError()
+
+    def handle_entityref(self, name):
+        #debug("%s", name)
+        self.handle_data(unescape(
+            '&%s;' % name, self._entitydefs, self._encoding))
+
+    def handle_charref(self, name):
+        #debug("%s", name)
+        self.handle_data(unescape_charref(name, self._encoding))
+
+    def unescape_attr(self, name):
+        #debug("%s", name)
+        return unescape(name, self._entitydefs, self._encoding)
+
+    def unescape_attrs(self, attrs):
+        #debug("%s", attrs)
+        escaped_attrs = {}
+        for key, val in attrs.items():
+            escaped_attrs[key] = self.unescape_attr(val)
+        return escaped_attrs
+
+    def unknown_entityref(self, ref):
+        self.handle_data("&%s;" % ref)
+
+    def unknown_charref(self, ref):
+        self.handle_data("&#%s;" % ref)
+
+
+try:
+    import HTMLParser
+except ImportError:
+    pass
+else:
+    class XHTMLCompatibleHeadParser(AbstractHeadParser,
+                                    HTMLParser.HTMLParser):
+        def __init__(self):
+            HTMLParser.HTMLParser.__init__(self)
+            AbstractHeadParser.__init__(self)
+
+        def handle_starttag(self, tag, attrs):
+            if tag not in self.head_elems:
+                raise EndOfHeadError()
+            try:
+                method = getattr(self, 'start_' + tag)
+            except AttributeError:
+                try:
+                    method = getattr(self, 'do_' + tag)
+                except AttributeError:
+                    pass # unknown tag
+                else:
+                    method(attrs)
+            else:
+                method(attrs)
+
+        def handle_endtag(self, tag):
+            if tag not in self.head_elems:
+                raise EndOfHeadError()
+            try:
+                method = getattr(self, 'end_' + tag)
+            except AttributeError:
+                pass # unknown tag
+            else:
+                method()
+
+        def unescape(self, name):
+            # Use the entitydefs passed into constructor, not
+            # HTMLParser.HTMLParser's entitydefs.
+            return self.unescape_attr(name)
+
+        def unescape_attr_if_required(self, name):
+            return name  # HTMLParser.HTMLParser already did it
+
+class HeadParser(AbstractHeadParser, sgmllib.SGMLParser):
+
+    def _not_called(self):
+        assert False
+
+    def __init__(self):
+        sgmllib.SGMLParser.__init__(self)
+        AbstractHeadParser.__init__(self)
+
+    def handle_starttag(self, tag, method, attrs):
+        if tag not in self.head_elems:
+            raise EndOfHeadError()
+        if tag == "meta":
+            method(attrs)
+
+    def unknown_starttag(self, tag, attrs):
+        self.handle_starttag(tag, self._not_called, attrs)
+
+    def handle_endtag(self, tag, method):
+        if tag in self.head_elems:
+            method()
+        else:
+            raise EndOfHeadError()
+
+    def unescape_attr_if_required(self, name):
+        return self.unescape_attr(name)
+
+def parse_head(fileobj, parser):
+    """Return a list of key, value pairs."""
+    while 1:
+        data = fileobj.read(CHUNK)
+        try:
+            parser.feed(data)
+        except EndOfHeadError:
+            break
+        if len(data) != CHUNK:
+            # this should only happen if there is no HTML body, or if
+            # CHUNK is big
+            break
+    return parser.http_equiv
+
+class HTTPEquivProcessor(BaseHandler):
+    """Append META HTTP-EQUIV headers to regular HTTP headers."""
+
+    handler_order = 300  # before handlers that look at HTTP headers
+
+    def __init__(self, head_parser_class=HeadParser,
+                 i_want_broken_xhtml_support=False,
+                 ):
+        self.head_parser_class = head_parser_class
+        self._allow_xhtml = i_want_broken_xhtml_support
+
+    def http_response(self, request, response):
+        if not hasattr(response, "seek"):
+            response = response_seek_wrapper(response)
+        http_message = response.info()
+        url = response.geturl()
+        ct_hdrs = http_message.getheaders("content-type")
+        if is_html(ct_hdrs, url, self._allow_xhtml):
+            try:
+                try:
+                    html_headers = parse_head(response,
+                                              self.head_parser_class())
+                finally:
+                    response.seek(0)
+            except (HTMLParser.HTMLParseError,
+                    sgmllib.SGMLParseError):
+                pass
+            else:
+                for hdr, val in html_headers:
+                    # add a header
+                    http_message.dict[hdr.lower()] = val
+                    text = hdr + ": " + val
+                    for line in text.split("\n"):
+                        http_message.headers.append(line + "\n")
+        return response
+
+    https_response = http_response
+
+class HTTPCookieProcessor(BaseHandler):
+    """Handle HTTP cookies.
+
+    Public attributes:
+
+    cookiejar: CookieJar instance
+
+    """
+    def __init__(self, cookiejar=None):
+        if cookiejar is None:
+            cookiejar = CookieJar()
+        self.cookiejar = cookiejar
+
+    def http_request(self, request):
+        self.cookiejar.add_cookie_header(request)
+        return request
+
+    def http_response(self, request, response):
+        self.cookiejar.extract_cookies(response, request)
+        return response
+
+    https_request = http_request
+    https_response = http_response
+
+try:
+    import robotparser
+except ImportError:
+    pass
+else:
+    class MechanizeRobotFileParser(robotparser.RobotFileParser):
+
+        def __init__(self, url='', opener=None):
+            robotparser.RobotFileParser.__init__(self, url)
+            self._opener = opener
+            self._timeout = _sockettimeout._GLOBAL_DEFAULT_TIMEOUT
+
+        def set_opener(self, opener=None):
+            import _opener
+            if opener is None:
+                opener = _opener.OpenerDirector()
+            self._opener = opener
+
+        def set_timeout(self, timeout):
+            self._timeout = timeout
+
+        def read(self):
+            """Reads the robots.txt URL and feeds it to the parser."""
+            if self._opener is None:
+                self.set_opener()
+            req = Request(self.url, unverifiable=True, visit=False,
+                          timeout=self._timeout)
+            try:
+                f = self._opener.open(req)
+            except HTTPError, f:
+                pass
+            except (IOError, socket.error, OSError), exc:
+                debug_robots("ignoring error opening %r: %s" %
+                                   (self.url, exc))
+                return
+            lines = []
+            line = f.readline()
+            while line:
+                lines.append(line.strip())
+                line = f.readline()
+            status = f.code
+            if status == 401 or status == 403:
+                self.disallow_all = True
+                debug_robots("disallow all")
+            elif status >= 400:
+                self.allow_all = True
+                debug_robots("allow all")
+            elif status == 200 and lines:
+                debug_robots("parse lines")
+                self.parse(lines)
+
+    class RobotExclusionError(urllib2.HTTPError):
+        def __init__(self, request, *args):
+            urllib2.HTTPError.__init__(self, *args)
+            self.request = request
+
+    class HTTPRobotRulesProcessor(BaseHandler):
+        # before redirections, after everything else
+        handler_order = 800
+
+        try:
+            from httplib import HTTPMessage
+        except:
+            from mimetools import Message
+            http_response_class = Message
+        else:
+            http_response_class = HTTPMessage
+
+        def __init__(self, rfp_class=MechanizeRobotFileParser):
+            self.rfp_class = rfp_class
+            self.rfp = None
+            self._host = None
+
+        def http_request(self, request):
+            scheme = request.get_type()
+            if scheme not in ["http", "https"]:
+                # robots exclusion only applies to HTTP
+                return request
+
+            if request.get_selector() == "/robots.txt":
+                # /robots.txt is always OK to fetch
+                return request
+
+            host = request.get_host()
+
+            # robots.txt requests don't need to be allowed by robots.txt :-)
+            origin_req = getattr(request, "_origin_req", None)
+            if (origin_req is not None and
+                origin_req.get_selector() == "/robots.txt" and
+                origin_req.get_host() == host
+                ):
+                return request
+
+            if host != self._host:
+                self.rfp = self.rfp_class()
+                try:
+                    self.rfp.set_opener(self.parent)
+                except AttributeError:
+                    debug("%r instance does not support set_opener" %
+                          self.rfp.__class__)
+                self.rfp.set_url(scheme+"://"+host+"/robots.txt")
+                self.rfp.set_timeout(request.timeout)
+                self.rfp.read()
+                self._host = host
+
+            ua = request.get_header("User-agent", "")
+            if self.rfp.can_fetch(ua, request.get_full_url()):
+                return request
+            else:
+                # XXX This should really have raised URLError.  Too late now...
+                msg = "request disallowed by robots.txt"
+                raise RobotExclusionError(
+                    request,
+                    request.get_full_url(),
+                    403, msg,
+                    self.http_response_class(StringIO()), StringIO(msg))
+
+        https_request = http_request
+
+class HTTPRefererProcessor(BaseHandler):
+    """Add Referer header to requests.
+
+    This only makes sense if you use each RefererProcessor for a single
+    chain of requests only (so, for example, if you use a single
+    HTTPRefererProcessor to fetch a series of URLs extracted from a single
+    page, this will break).
+
+    There's a proper implementation of this in mechanize.Browser.
+
+    """
+    def __init__(self):
+        self.referer = None
+
+    def http_request(self, request):
+        if ((self.referer is not None) and
+            not request.has_header("Referer")):
+            request.add_unredirected_header("Referer", self.referer)
+        return request
+
+    def http_response(self, request, response):
+        self.referer = response.geturl()
+        return response
+
+    https_request = http_request
+    https_response = http_response
+
+
+def clean_refresh_url(url):
+    # e.g. Firefox 1.5 does (something like) this
+    if ((url.startswith('"') and url.endswith('"')) or
+        (url.startswith("'") and url.endswith("'"))):
+        url = url[1:-1]
+    return _rfc3986.clean_url(url, "latin-1")  # XXX encoding
+
+def parse_refresh_header(refresh):
+    """
+    >>> parse_refresh_header("1; url=http://example.com/")
+    (1.0, 'http://example.com/')
+    >>> parse_refresh_header("1; url='http://example.com/'")
+    (1.0, 'http://example.com/')
+    >>> parse_refresh_header("1")
+    (1.0, None)
+    >>> parse_refresh_header("blah")
+    Traceback (most recent call last):
+    ValueError: invalid literal for float(): blah
+
+    """
+
+    ii = refresh.find(";")
+    if ii != -1:
+        pause, newurl_spec = float(refresh[:ii]), refresh[ii+1:]
+        jj = newurl_spec.find("=")
+        key = None
+        if jj != -1:
+            key, newurl = newurl_spec[:jj], newurl_spec[jj+1:]
+            newurl = clean_refresh_url(newurl)
+        if key is None or key.strip().lower() != "url":
+            raise ValueError()
+    else:
+        pause, newurl = float(refresh), None
+    return pause, newurl
+
+class HTTPRefreshProcessor(BaseHandler):
+    """Perform HTTP Refresh redirections.
+
+    Note that if a non-200 HTTP code has occurred (for example, a 30x
+    redirect), this processor will do nothing.
+
+    By default, only zero-time Refresh headers are redirected.  Use the
+    max_time attribute / constructor argument to allow Refresh with longer
+    pauses.  Use the honor_time attribute / constructor argument to control
+    whether the requested pause is honoured (with a time.sleep()) or
+    skipped in favour of immediate redirection.
+
+    Public attributes:
+
+    max_time: see above
+    honor_time: see above
+
+    """
+    handler_order = 1000
+
+    def __init__(self, max_time=0, honor_time=True):
+        self.max_time = max_time
+        self.honor_time = honor_time
+        self._sleep = time.sleep
+
+    def http_response(self, request, response):
+        code, msg, hdrs = response.code, response.msg, response.info()
+
+        if code == 200 and hdrs.has_key("refresh"):
+            refresh = hdrs.getheaders("refresh")[0]
+            try:
+                pause, newurl = parse_refresh_header(refresh)
+            except ValueError:
+                debug("bad Refresh header: %r" % refresh)
+                return response
+
+            if newurl is None:
+                newurl = response.geturl()
+            if (self.max_time is None) or (pause <= self.max_time):
+                if pause > 1E-3 and self.honor_time:
+                    self._sleep(pause)
+                hdrs["location"] = newurl
+                # hardcoded http is NOT a bug
+                response = self.parent.error(
+                    "http", request, response,
+                    "refresh", msg, hdrs)
+            else:
+                debug("Refresh header ignored: %r" % refresh)
+
+        return response
+
+    https_response = http_response
+
+class HTTPErrorProcessor(BaseHandler):
+    """Process HTTP error responses.
+
+    The purpose of this handler is to allow other response processors a
+    look-in by removing the call to parent.error() from
+    AbstractHTTPHandler.
+
+    For non-200 error codes, this just passes the job on to the
+    Handler.<proto>_error_<code> methods, via the OpenerDirector.error
+    method.  Eventually, urllib2.HTTPDefaultErrorHandler will raise an
+    HTTPError if no other handler handles the error.
+
+    """
+    handler_order = 1000  # after all other processors
+
+    def http_response(self, request, response):
+        code, msg, hdrs = response.code, response.msg, response.info()
+
+        if code != 200:
+            # hardcoded http is NOT a bug
+            response = self.parent.error(
+                "http", request, response, code, msg, hdrs)
+
+        return response
+
+    https_response = http_response
+
+
+class HTTPDefaultErrorHandler(BaseHandler):
+    def http_error_default(self, req, fp, code, msg, hdrs):
+        # why these error methods took the code, msg, headers args in the first
+        # place rather than a response object, I don't know, but to avoid
+        # multiple wrapping, we're discarding them
+
+        if isinstance(fp, urllib2.HTTPError):
+            response = fp
+        else:
+            response = urllib2.HTTPError(
+                req.get_full_url(), code, msg, hdrs, fp)
+        assert code == response.code
+        assert msg == response.msg
+        assert hdrs == response.hdrs
+        raise response
+
+
+class AbstractHTTPHandler(BaseHandler):
+
+    def __init__(self, debuglevel=0):
+        self._debuglevel = debuglevel
+
+    def set_http_debuglevel(self, level):
+        self._debuglevel = level
+
+    def do_request_(self, request):
+        host = request.get_host()
+        if not host:
+            raise URLError('no host given')
+
+        if request.has_data():  # POST
+            data = request.get_data()
+            if not request.has_header('Content-type'):
+                request.add_unredirected_header(
+                    'Content-type',
+                    'application/x-www-form-urlencoded')
+            if not request.has_header('Content-length'):
+                request.add_unredirected_header(
+                    'Content-length', '%d' % len(data))
+
+        scheme, sel = urllib.splittype(request.get_selector())
+        sel_host, sel_path = urllib.splithost(sel)
+        if not request.has_header('Host'):
+            request.add_unredirected_header('Host', sel_host or host)
+        for name, value in self.parent.addheaders:
+            name = name.capitalize()
+            if not request.has_header(name):
+                request.add_unredirected_header(name, value)
+
+        return request
+
+    def do_open(self, http_class, req):
+        """Return an addinfourl object for the request, using http_class.
+
+        http_class must implement the HTTPConnection API from httplib.
+        The addinfourl return value is a file-like object.  It also
+        has methods and attributes including:
+            - info(): return a mimetools.Message object for the headers
+            - geturl(): return the original request URL
+            - code: HTTP status code
+        """
+        host_port = req.get_host()
+        if not host_port:
+            raise URLError('no host given')
+
+        try:
+            h = http_class(host_port, timeout=req.timeout)
+        except TypeError:
+            # Python < 2.6, no per-connection timeout support
+            h = http_class(host_port)
+        h.set_debuglevel(self._debuglevel)
+
+        headers = dict(req.headers)
+        headers.update(req.unredirected_hdrs)
+        # We want to make an HTTP/1.1 request, but the addinfourl
+        # class isn't prepared to deal with a persistent connection.
+        # It will try to read all remaining data from the socket,
+        # which will block while the server waits for the next request.
+        # So make sure the connection gets closed after the (only)
+        # request.
+        headers["Connection"] = "close"
+        headers = dict(
+            [(name.title(), val) for name, val in headers.items()])
+        try:
+            h.request(req.get_method(), req.get_selector(), req.data, headers)
+            r = h.getresponse()
+        except socket.error, err: # XXX what error?
+            raise URLError(err)
+
+        # Pick apart the HTTPResponse object to get the addinfourl
+        # object initialized properly.
+
+        # Wrap the HTTPResponse object in socket's file object adapter
+        # for Windows.  That adapter calls recv(), so delegate recv()
+        # to read().  This weird wrapping allows the returned object to
+        # have readline() and readlines() methods.
+
+        # XXX It might be better to extract the read buffering code
+        # out of socket._fileobject() and into a base class.
+
+        r.recv = r.read
+        fp = create_readline_wrapper(r)
+
+        resp = closeable_response(fp, r.msg, req.get_full_url(),
+                                  r.status, r.reason)
+        return resp
+
+
+class HTTPHandler(AbstractHTTPHandler):
+    def http_open(self, req):
+        return self.do_open(httplib.HTTPConnection, req)
+
+    http_request = AbstractHTTPHandler.do_request_
+
+if hasattr(httplib, 'HTTPS'):
+
+    class HTTPSConnectionFactory:
+        def __init__(self, key_file, cert_file):
+            self._key_file = key_file
+            self._cert_file = cert_file
+        def __call__(self, hostport):
+            return httplib.HTTPSConnection(
+                hostport,
+                key_file=self._key_file, cert_file=self._cert_file)
+
+    class HTTPSHandler(AbstractHTTPHandler):
+        def __init__(self, client_cert_manager=None):
+            AbstractHTTPHandler.__init__(self)
+            self.client_cert_manager = client_cert_manager
+
+        def https_open(self, req):
+            if self.client_cert_manager is not None:
+                key_file, cert_file = self.client_cert_manager.find_key_cert(
+                    req.get_full_url())
+                conn_factory = HTTPSConnectionFactory(key_file, cert_file)
+            else:
+                conn_factory = httplib.HTTPSConnection
+            return self.do_open(conn_factory, req)
+
+        https_request = AbstractHTTPHandler.do_request_

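For orientation, here is a minimal sketch of how the handlers defined above are
typically composed into an opener.  It assumes build_opener and these handler
classes are exported at the mechanize package level in this release, and the
URL is purely illustrative:

    import mechanize

    # Fold META HTTP-EQUIV headers into the HTTP headers, follow zero-time
    # Refresh redirects, and keep cookies across requests.
    opener = mechanize.build_opener(
        mechanize.HTTPEquivProcessor(),
        mechanize.HTTPRefreshProcessor(max_time=0, honor_time=True),
        mechanize.HTTPCookieProcessor(),
        )
    response = opener.open("http://example.com/")
    print response.geturl(), response.code
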
Added: mechanize/tags/0.1.10/mechanize/_lwpcookiejar.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_lwpcookiejar.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_lwpcookiejar.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,185 @@
+"""Load / save to libwww-perl (LWP) format files.
+
+Actually, the format is slightly extended from that used by LWP's
+(libwww-perl's) HTTP::Cookies, to avoid losing some RFC 2965 information
+not recorded by LWP.
+
+It uses the version string "2.0", though really there isn't an LWP Cookies
+2.0 format.  This indicates that there is extra information in here
+(domain_dot and port_spec) while still being compatible with libwww-perl,
+I hope.
+
+Copyright 2002-2006 John J Lee <jjl at pobox.com>
+Copyright 1997-1999 Gisle Aas (original libwww-perl code)
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import time, re, logging
+
+from _clientcookie import reraise_unmasked_exceptions, FileCookieJar, Cookie, \
+     MISSING_FILENAME_TEXT, LoadError
+from _headersutil import join_header_words, split_header_words
+from _util import iso2time, time2isoz
+
+debug = logging.getLogger("mechanize").debug
+
+
+def lwp_cookie_str(cookie):
+    """Return string representation of Cookie in an the LWP cookie file format.
+
+    Actually, the format is extended a bit -- see module docstring.
+
+    """
+    h = [(cookie.name, cookie.value),
+         ("path", cookie.path),
+         ("domain", cookie.domain)]
+    if cookie.port is not None: h.append(("port", cookie.port))
+    if cookie.path_specified: h.append(("path_spec", None))
+    if cookie.port_specified: h.append(("port_spec", None))
+    if cookie.domain_initial_dot: h.append(("domain_dot", None))
+    if cookie.secure: h.append(("secure", None))
+    if cookie.expires: h.append(("expires",
+                               time2isoz(float(cookie.expires))))
+    if cookie.discard: h.append(("discard", None))
+    if cookie.comment: h.append(("comment", cookie.comment))
+    if cookie.comment_url: h.append(("commenturl", cookie.comment_url))
+    if cookie.rfc2109: h.append(("rfc2109", None))
+
+    keys = cookie.nonstandard_attr_keys()
+    keys.sort()
+    for k in keys:
+        h.append((k, str(cookie.get_nonstandard_attr(k))))
+
+    h.append(("version", str(cookie.version)))
+
+    return join_header_words([h])
+
+class LWPCookieJar(FileCookieJar):
+    """
+    The LWPCookieJar saves a sequence of "Set-Cookie3" lines.
+    "Set-Cookie3" is the format used by the libwww-perl library, not known
+    to be compatible with any browser, but which is easy to read and
+    doesn't lose information about RFC 2965 cookies.
+
+    Additional methods
+
+    as_lwp_str(ignore_discard=True, ignore_expires=True)
+
+    """
+
+    magic_re = r"^\#LWP-Cookies-(\d+\.\d+)"
+
+    def as_lwp_str(self, ignore_discard=True, ignore_expires=True):
+        """Return cookies as a string of "\n"-separated "Set-Cookie3" headers.
+
+        ignore_discard and ignore_expires: see docstring for FileCookieJar.save
+
+        """
+        now = time.time()
+        r = []
+        for cookie in self:
+            if not ignore_discard and cookie.discard:
+                debug("   Not saving %s: marked for discard", cookie.name)
+                continue
+            if not ignore_expires and cookie.is_expired(now):
+                debug("   Not saving %s: expired", cookie.name)
+                continue
+            r.append("Set-Cookie3: %s" % lwp_cookie_str(cookie))
+        return "\n".join(r+[""])
+
+    def save(self, filename=None, ignore_discard=False, ignore_expires=False):
+        if filename is None:
+            if self.filename is not None: filename = self.filename
+            else: raise ValueError(MISSING_FILENAME_TEXT)
+
+        f = open(filename, "w")
+        try:
+            debug("Saving LWP cookies file")
+            # There really isn't an LWP Cookies 2.0 format, but this indicates
+            # that there is extra information in here (domain_dot and
+            # port_spec) while still being compatible with libwww-perl, I hope.
+            f.write("#LWP-Cookies-2.0\n")
+            f.write(self.as_lwp_str(ignore_discard, ignore_expires))
+        finally:
+            f.close()
+
+    def _really_load(self, f, filename, ignore_discard, ignore_expires):
+        magic = f.readline()
+        if not re.search(self.magic_re, magic):
+            msg = "%s does not seem to contain cookies" % filename
+            raise LoadError(msg)
+
+        now = time.time()
+
+        header = "Set-Cookie3:"
+        boolean_attrs = ("port_spec", "path_spec", "domain_dot",
+                         "secure", "discard", "rfc2109")
+        value_attrs = ("version",
+                       "port", "path", "domain",
+                       "expires",
+                       "comment", "commenturl")
+
+        try:
+            while 1:
+                line = f.readline()
+                if line == "": break
+                if not line.startswith(header):
+                    continue
+                line = line[len(header):].strip()
+
+                for data in split_header_words([line]):
+                    name, value = data[0]
+                    standard = {}
+                    rest = {}
+                    for k in boolean_attrs:
+                        standard[k] = False
+                    for k, v in data[1:]:
+                        if k is not None:
+                            lc = k.lower()
+                        else:
+                            lc = None
+                        # don't lose case distinction for unknown fields
+                        if (lc in value_attrs) or (lc in boolean_attrs):
+                            k = lc
+                        if k in boolean_attrs:
+                            if v is None: v = True
+                            standard[k] = v
+                        elif k in value_attrs:
+                            standard[k] = v
+                        else:
+                            rest[k] = v
+
+                    h = standard.get
+                    expires = h("expires")
+                    discard = h("discard")
+                    if expires is not None:
+                        expires = iso2time(expires)
+                    if expires is None:
+                        discard = True
+                    domain = h("domain")
+                    domain_specified = domain.startswith(".")
+                    c = Cookie(h("version"), name, value,
+                               h("port"), h("port_spec"),
+                               domain, domain_specified, h("domain_dot"),
+                               h("path"), h("path_spec"),
+                               h("secure"),
+                               expires,
+                               discard,
+                               h("comment"),
+                               h("commenturl"),
+                               rest,
+                               h("rfc2109"),
+                               ) 
+                    if not ignore_discard and c.discard:
+                        continue
+                    if not ignore_expires and c.is_expired(now):
+                        continue
+                    self.set_cookie(c)
+        except:
+            reraise_unmasked_exceptions((IOError,))
+            raise LoadError("invalid Set-Cookie3 format file %s" % filename)
+

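A minimal sketch of persisting cookies across runs with the class defined
above, assuming LWPCookieJar, HTTPCookieProcessor and build_opener are
exported at the mechanize package level; the filename and URL are purely
illustrative:

    import os
    import mechanize

    jar = mechanize.LWPCookieJar("cookies.lwp")
    if os.path.exists("cookies.lwp"):
        jar.load(ignore_discard=True, ignore_expires=True)
    opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(jar))
    opener.open("http://example.com/")
    # save() writes the "#LWP-Cookies-2.0" magic line followed by one
    # "Set-Cookie3:" line per cookie, as produced by as_lwp_str() above.
    jar.save(ignore_discard=True)
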
Added: mechanize/tags/0.1.10/mechanize/_mechanize.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_mechanize.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_mechanize.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,676 @@
+"""Stateful programmatic WWW navigation, after Perl's WWW::Mechanize.
+
+Copyright 2003-2006 John J. Lee <jjl at pobox.com>
+Copyright 2003 Andy Lester (original Perl code)
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+import urllib2, copy, re, os, urllib
+
+
+from _html import DefaultFactory
+import _response
+import _request
+import _rfc3986
+import _sockettimeout
+from _useragent import UserAgentBase
+
+__version__ = (0, 1, 10, None, None)  # 0.1.10
+
+class BrowserStateError(Exception): pass
+class LinkNotFoundError(Exception): pass
+class FormNotFoundError(Exception): pass
+
+
+def sanepathname2url(path):
+    urlpath = urllib.pathname2url(path)
+    if os.name == "nt" and urlpath.startswith("///"):
+        urlpath = urlpath[2:]
+    # XXX don't ask me about the mac...
+    return urlpath
+
+
+class History:
+    """
+
+    Though this will become public, the implied interface is not yet stable.
+
+    """
+    def __init__(self):
+        self._history = []  # LIFO
+    def add(self, request, response):
+        self._history.append((request, response))
+    def back(self, n, _response):
+        response = _response  # XXX move Browser._response into this class?
+        while n > 0 or response is None:
+            try:
+                request, response = self._history.pop()
+            except IndexError:
+                raise BrowserStateError("already at start of history")
+            n -= 1
+        return request, response
+    def clear(self):
+        del self._history[:]
+    def close(self):
+        for request, response in self._history:
+            if response is not None:
+                response.close()
+        del self._history[:]
+
+
+class HTTPRefererProcessor(urllib2.BaseHandler):
+    def http_request(self, request):
+        # See RFC 2616 14.36.  The only times we know the source of the
+        # request URI has a URI associated with it are redirect, and
+        # Browser.click() / Browser.submit() / Browser.follow_link().
+        # Otherwise, it's the user's job to add any Referer header before
+        # .open()ing.
+        if hasattr(request, "redirect_dict"):
+            request = self.parent._add_referer_header(
+                request, origin_request=False)
+        return request
+
+    https_request = http_request
+
+
+class Browser(UserAgentBase):
+    """Browser-like class with support for history, forms and links.
+
+    BrowserStateError is raised whenever the browser is in the wrong state to
+    complete the requested operation - e.g., when .back() is called when the
+    browser history is empty, or when .follow_link() is called when the current
+    response does not contain HTML data.
+
+    Public attributes:
+
+    request: current request (mechanize.Request or urllib2.Request)
+    form: currently selected form (see .select_form())
+
+    """
+
+    handler_classes = copy.copy(UserAgentBase.handler_classes)
+    handler_classes["_referer"] = HTTPRefererProcessor
+    default_features = copy.copy(UserAgentBase.default_features)
+    default_features.append("_referer")
+
+    def __init__(self,
+                 factory=None,
+                 history=None,
+                 request_class=None,
+                 ):
+        """
+
+        Only named arguments should be passed to this constructor.
+
+        factory: object implementing the mechanize.Factory interface.
+        history: object implementing the mechanize.History interface.  Note
+         this interface is still experimental and may change in future.
+        request_class: Request class to use.  Defaults to mechanize.Request
+         for Pythons older than 2.4, urllib2.Request otherwise.
+
+        The Factory and History objects passed in are 'owned' by the Browser,
+        so they should not be shared across Browsers.  In particular,
+        factory.set_response() should not be called except by the owning
+        Browser itself.
+
+        Note that the supplied factory's request_class is overridden by this
+        constructor, to ensure only one Request class is used.
+
+        """
+        self._handle_referer = True
+
+        if history is None:
+            history = History()
+        self._history = history
+
+        if request_class is None:
+            if not hasattr(urllib2.Request, "add_unredirected_header"):
+                request_class = _request.Request
+            else:
+                request_class = urllib2.Request  # Python >= 2.4
+
+        if factory is None:
+            factory = DefaultFactory()
+        factory.set_request_class(request_class)
+        self._factory = factory
+        self.request_class = request_class
+
+        self.request = None
+        self._set_response(None, False)
+
+        # do this last to avoid __getattr__ problems
+        UserAgentBase.__init__(self)
+
+    def close(self):
+        UserAgentBase.close(self)
+        if self._response is not None:
+            self._response.close()    
+        if self._history is not None:
+            self._history.close()
+            self._history = None
+
+        # make use after .close easy to spot
+        self.form = None
+        self.request = self._response = None
+        self.request = self.response = self.set_response = None
+        self.geturl =  self.reload = self.back = None
+        self.clear_history = self.set_cookie = self.links = self.forms = None
+        self.viewing_html = self.encoding = self.title = None
+        self.select_form = self.click = self.submit = self.click_link = None
+        self.follow_link = self.find_link = None
+
+    def set_handle_referer(self, handle):
+        """Set whether to add Referer header to each request."""
+        self._set_handler("_referer", handle)
+        self._handle_referer = bool(handle)
+
+    def _add_referer_header(self, request, origin_request=True):
+        if self.request is None:
+            return request
+        scheme = request.get_type()
+        original_scheme = self.request.get_type()
+        if scheme not in ["http", "https"]:
+            return request
+        if not origin_request and not self.request.has_header("Referer"):
+            return request
+
+        if (self._handle_referer and
+            original_scheme in ["http", "https"] and
+            not (original_scheme == "https" and scheme != "https")):
+            # strip URL fragment (RFC 2616 14.36)
+            parts = _rfc3986.urlsplit(self.request.get_full_url())
+            parts = parts[:-1]+(None,)
+            referer = _rfc3986.urlunsplit(parts)
+            request.add_unredirected_header("Referer", referer)
+        return request
+
+    def open_novisit(self, url, data=None,
+                     timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        """Open a URL without visiting it.
+
+        Browser state (including request, response, history, forms and links)
+        is left unchanged by calling this function.
+
+        The interface is the same as for .open().
+
+        This is useful for things like fetching images.
+
+        See also .retrieve().
+
+        """
+        return self._mech_open(url, data, visit=False, timeout=timeout)
+
+    def open(self, url, data=None,
+             timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        return self._mech_open(url, data, timeout=timeout)
+
+    def _mech_open(self, url, data=None, update_history=True, visit=None,
+                   timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        try:
+            url.get_full_url
+        except AttributeError:
+            # string URL -- convert to absolute URL if required
+            scheme, authority = _rfc3986.urlsplit(url)[:2]
+            if scheme is None:
+                # relative URL
+                if self._response is None:
+                    raise BrowserStateError(
+                        "can't fetch relative reference: "
+                        "not viewing any document")
+                url = _rfc3986.urljoin(self._response.geturl(), url)
+
+        request = self._request(url, data, visit, timeout)
+        visit = request.visit
+        if visit is None:
+            visit = True
+
+        if visit:
+            self._visit_request(request, update_history)
+
+        success = True
+        try:
+            response = UserAgentBase.open(self, request, data)
+        except urllib2.HTTPError, error:
+            success = False
+            if error.fp is None:  # not a response
+                raise
+            response = error
+##         except (IOError, socket.error, OSError), error:
+##             # Yes, urllib2 really does raise all these :-((
+##             # See test_urllib2.py for examples of socket.gaierror and OSError,
+##             # plus note that FTPHandler raises IOError.
+##             # XXX I don't seem to have an example of exactly socket.error being
+##             #  raised, only socket.gaierror...
+##             # I don't want to start fixing these here, though, since this is a
+##             # subclass of OpenerDirector, and it would break old code.  Even in
+##             # Python core, a fix would need some backwards-compat. hack to be
+##             # acceptable.
+##             raise
+
+        if visit:
+            self._set_response(response, False)
+            response = copy.copy(self._response)
+        elif response is not None:
+            response = _response.upgrade_response(response)
+
+        if not success:
+            raise response
+        return response
+
+    def __str__(self):
+        text = []
+        text.append("<%s " % self.__class__.__name__)
+        if self._response:
+            text.append("visiting %s" % self._response.geturl())
+        else:
+            text.append("(not visiting a URL)")
+        if self.form:
+            text.append("\n selected form:\n %s\n" % str(self.form))
+        text.append(">")
+        return "".join(text)
+
+    def response(self):
+        """Return a copy of the current response.
+
+        The returned object has the same interface as the object returned by
+        .open() (or urllib2.urlopen()).
+
+        """
+        return copy.copy(self._response)
+
+    def open_local_file(self, filename):
+        path = sanepathname2url(os.path.abspath(filename))
+        url = 'file://'+path
+        return self.open(url)
+
+    def set_response(self, response):
+        """Replace current response with (a copy of) response.
+
+        response may be None.
+
+        This is intended mostly for HTML-preprocessing.
+        """
+        self._set_response(response, True)
+
+    def _set_response(self, response, close_current):
+        # sanity check, necessary but far from sufficient
+        if not (response is None or
+                (hasattr(response, "info") and hasattr(response, "geturl") and
+                 hasattr(response, "read")
+                 )
+                ):
+            raise ValueError("not a response object")
+
+        self.form = None
+        if response is not None:
+            response = _response.upgrade_response(response)
+        if close_current and self._response is not None:
+            self._response.close()
+        self._response = response
+        self._factory.set_response(response)
+
+    def visit_response(self, response, request=None):
+        """Visit the response, as if it had been .open()ed.
+
+        Unlike .set_response(), this updates history rather than replacing the
+        current response.
+        """
+        if request is None:
+            request = _request.Request(response.geturl())
+        self._visit_request(request, True)
+        self._set_response(response, False)
+
+    def _visit_request(self, request, update_history):
+        if self._response is not None:
+            self._response.close()
+        if self.request is not None and update_history:
+            self._history.add(self.request, self._response)
+        self._response = None
+        # we want self.request to be assigned even if UserAgentBase.open
+        # fails
+        self.request = request
+
+    def geturl(self):
+        """Get URL of current document."""
+        if self._response is None:
+            raise BrowserStateError("not viewing any document")
+        return self._response.geturl()
+
+    def reload(self):
+        """Reload current document, and return response object."""
+        if self.request is None:
+            raise BrowserStateError("no URL has yet been .open()ed")
+        if self._response is not None:
+            self._response.close()
+        return self._mech_open(self.request, update_history=False)
+
+    def back(self, n=1):
+        """Go back n steps in history, and return response object.
+
+        n: go back this number of steps (default 1 step)
+
+        """
+        if self._response is not None:
+            self._response.close()
+        self.request, response = self._history.back(n, self._response)
+        self.set_response(response)
+        if not response.read_complete:
+            return self.reload()
+        return copy.copy(response)
+
+    def clear_history(self):
+        self._history.clear()
+
+    def set_cookie(self, cookie_string):
+        """Request to set a cookie.
+
+        Note that it is NOT necessary to call this method under ordinary
+        circumstances: cookie handling is normally entirely automatic.  The
+        intended use case is rather to simulate the setting of a cookie by
+        client script in a web page (e.g. JavaScript).  In that case, use of
+        this method is necessary because mechanize currently does not support
+        JavaScript, VBScript, etc.
+
+        The cookie is added in the same way as if it had arrived with the
+        current response, as a result of the current request.  This means that,
+        for example, if it is not appropriate to set the cookie based on the
+        current request, no cookie will be set.
+
+        The cookie will be returned automatically with subsequent responses
+        made by the Browser instance whenever that's appropriate.
+
+        cookie_string should be a valid value of the Set-Cookie header.
+
+        For example:
+
+        browser.set_cookie(
+            "sid=abcdef; expires=Wednesday, 09-Nov-06 23:12:40 GMT")
+
+        Currently, this method does not allow for adding RFC 2965 cookies.
+        This limitation will be lifted if anybody requests it.
+
+        """
+        if self._response is None:
+            raise BrowserStateError("not viewing any document")
+        if self.request.get_type() not in ["http", "https"]:
+            raise BrowserStateError("can't set cookie for non-HTTP/HTTPS "
+                                    "transactions")
+        cookiejar = self._ua_handlers["_cookies"].cookiejar
+        response = self.response()  # copy
+        headers = response.info()
+        headers["Set-cookie"] = cookie_string
+        cookiejar.extract_cookies(response, self.request)
+
+    def links(self, **kwds):
+        """Return iterable over links (mechanize.Link objects)."""
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        links = self._factory.links()
+        if kwds:
+            return self._filter_links(links, **kwds)
+        else:
+            return links
+
+    def forms(self):
+        """Return iterable over forms.
+
+        The returned form objects implement the ClientForm.HTMLForm interface.
+
+        """
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        return self._factory.forms()
+
+    def global_form(self):
+        """Return the global form object, or None if the factory implementation
+        did not supply one.
+
+        The "global" form object contains all controls that are not descendants
+        of any FORM element.
+
+        The returned form object implements the ClientForm.HTMLForm interface.
+
+        This is a separate method since the global form is not regarded as part
+        of the sequence of forms in the document -- mostly for
+        backwards-compatibility.
+
+        """
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        return self._factory.global_form
+
+    def viewing_html(self):
+        """Return whether the current response contains HTML data."""
+        if self._response is None:
+            raise BrowserStateError("not viewing any document")
+        return self._factory.is_html
+
+    def encoding(self):
+        if self._response is None:
+            raise BrowserStateError("not viewing any document")
+        return self._factory.encoding
+
+    def title(self):
+        r"""Return title, or None if there is no title element in the document.
+
+        Treatment of any tag children of the title element attempts to follow
+        Firefox and IE (currently, tags are preserved).
+
+        """
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        return self._factory.title
+
+    def select_form(self, name=None, predicate=None, nr=None):
+        """Select an HTML form for input.
+
+        This is a bit like giving a form the "input focus" in a browser.
+
+        If a form is selected, the Browser object supports the HTMLForm
+        interface, so you can call methods like .set_value(), .set(), and
+        .click().
+
+        Another way to select a form is to assign to the .form attribute.  The
+        form assigned should be one of the objects returned by the .forms()
+        method.
+
+        At least one of the name, predicate and nr arguments must be supplied.
+        If no matching form is found, mechanize.FormNotFoundError is raised.
+
+        If name is specified, then the form must have the indicated name.
+
+        If predicate is specified, then the form must match that function.  The
+        predicate function is passed the HTMLForm as its single argument, and
+        should return a boolean value indicating whether the form matched.
+
+        nr, if supplied, is the sequence number of the form (where 0 is the
+        first).  Note that form 0 is the first form matching all the other
+        arguments (if supplied); it is not necessarily the first form in the
+        document.  The "global form" (consisting of all form controls not
+        contained in any FORM element) is considered not to be part of this
+        sequence and to have no name, so will not be matched unless both
+        name and nr are None.
+
+        """
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        if (name is None) and (predicate is None) and (nr is None):
+            raise ValueError(
+                "at least one argument must be supplied to specify form")
+
+        global_form = self._factory.global_form
+        if nr is None and name is None and \
+               predicate is not None and predicate(global_form):
+            self.form = global_form
+            return
+
+        orig_nr = nr
+        for form in self.forms():
+            if name is not None and name != form.name:
+                continue
+            if predicate is not None and not predicate(form):
+                continue
+            if nr:
+                nr -= 1
+                continue
+            self.form = form
+            break  # success
+        else:
+            # failure
+            description = []
+            if name is not None: description.append("name '%s'" % name)
+            if predicate is not None:
+                description.append("predicate %s" % predicate)
+            if orig_nr is not None: description.append("nr %d" % orig_nr)
+            description = ", ".join(description)
+            raise FormNotFoundError("no form matching "+description)
+
+    def click(self, *args, **kwds):
+        """See ClientForm.HTMLForm.click for documentation."""
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        request = self.form.click(*args, **kwds)
+        return self._add_referer_header(request)
+
+    def submit(self, *args, **kwds):
+        """Submit current form.
+
+        Arguments are as for ClientForm.HTMLForm.click().
+
+        Return value is same as for Browser.open().
+
+        """
+        return self.open(self.click(*args, **kwds))
+
+    def click_link(self, link=None, **kwds):
+        """Find a link and return a Request object for it.
+
+        Arguments are as for .find_link(), except that a link may be supplied
+        as the first argument.
+
+        """
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        if not link:
+            link = self.find_link(**kwds)
+        else:
+            if kwds:
+                raise ValueError(
+                    "either pass a Link, or keyword arguments, not both")
+        request = self.request_class(link.absolute_url)
+        return self._add_referer_header(request)
+
+    def follow_link(self, link=None, **kwds):
+        """Find a link and .open() it.
+
+        Arguments are as for .click_link().
+
+        Return value is same as for Browser.open().
+
+        """
+        return self.open(self.click_link(link, **kwds))
+
+    def find_link(self, **kwds):
+        """Find a link in current page.
+
+        Links are returned as mechanize.Link objects.
+
+        # Return third link that .search()-matches the regexp "python"
+        # (by ".search()-matches", I mean that the regular expression method
+        # .search() is used, rather than .match()).
+        find_link(text_regex=re.compile("python"), nr=2)
+
+        # Return first http link in the current page that points to somewhere
+        # on python.org whose link text (after tags have been removed) is
+        # exactly "monty python".
+        find_link(text="monty python",
+                  url_regex=re.compile("http.*python.org"))
+
+        # Return first link with exactly three HTML attributes.
+        find_link(predicate=lambda link: len(link.attrs) == 3)
+
+        Links include anchors (<a>), image maps (<area>), and frames (<frame>,
+        <iframe>).
+
+        All arguments must be passed by keyword, not position.  Zero or more
+        arguments may be supplied.  In order to find a link, all arguments
+        supplied must match.
+
+        If a matching link is not found, mechanize.LinkNotFoundError is raised.
+
+        text: link text between link tags: e.g. <a href="blah">this bit</a> (as
+         returned by pullparser.get_compressed_text(), i.e. without tags but
+         with opening tags "textified" as per the pullparser docs) must compare
+         equal to this argument, if supplied
+        text_regex: link text between link tags (as defined above) must match
+         regular expression object or regular expression string passed as this
+         argument, if supplied
+        name, name_regex: as for text and text_regex, but matched against the
+         name HTML attribute of the link tag
+        url, url_regex: as for text and text_regex, but matched against the
+         URL of the link tag (note this matches against Link.url, which is a
+         relative or absolute URL according to how it was written in the HTML)
+        tag: element name of opening tag, e.g. "a"
+        predicate: a function taking a Link object as its single argument,
+         returning a boolean result, indicating whether the link matches
+        nr: matches the nth link that matches all other criteria (default 0)
+
+        """
+        try:
+            return self._filter_links(self._factory.links(), **kwds).next()
+        except StopIteration:
+            raise LinkNotFoundError()
+
+    def __getattr__(self, name):
+        # pass through ClientForm / DOMForm methods and attributes
+        form = self.__dict__.get("form")
+        if form is None:
+            raise AttributeError(
+                "%s instance has no attribute %s (perhaps you forgot to "
+                ".select_form()?)" % (self.__class__, name))
+        return getattr(form, name)
+
+    def _filter_links(self, links,
+                    text=None, text_regex=None,
+                    name=None, name_regex=None,
+                    url=None, url_regex=None,
+                    tag=None,
+                    predicate=None,
+                    nr=0
+                    ):
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+
+        found_links = []
+        orig_nr = nr
+
+        for link in links:
+            if url is not None and url != link.url:
+                continue
+            if url_regex is not None and not re.search(url_regex, link.url):
+                continue
+            if (text is not None and
+                (link.text is None or text != link.text)):
+                continue
+            if (text_regex is not None and
+                (link.text is None or not re.search(text_regex, link.text))):
+                continue
+            if name is not None and name != dict(link.attrs).get("name"):
+                continue
+            if name_regex is not None:
+                link_name = dict(link.attrs).get("name")
+                if link_name is None or not re.search(name_regex, link_name):
+                    continue
+            if tag is not None and tag != link.tag:
+                continue
+            if predicate is not None and not predicate(link):
+                continue
+            if nr:
+                nr -= 1
+                continue
+            yield link
+            nr = orig_nr

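A minimal sketch of driving the Browser class defined above; the URL, the form
index and the control names are purely illustrative, and error handling is
omitted:

    import re
    import mechanize

    br = mechanize.Browser()
    br.open("http://example.com/login")
    br.select_form(nr=0)                  # first form in the document
    br.form["user"] = "alice"             # ClientForm.HTMLForm item assignment
    br.form["password"] = "secret"
    response = br.submit()                # click the form, then .open() the result

    # find_link()/follow_link() accept the keyword filters documented above.
    response = br.follow_link(text_regex=re.compile("logout"), nr=0)
    br.back()                             # pop one entry off the history
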
Added: mechanize/tags/0.1.10/mechanize/_mozillacookiejar.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_mozillacookiejar.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_mozillacookiejar.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,161 @@
+"""Mozilla / Netscape cookie loading / saving.
+
+Copyright 2002-2006 John J Lee <jjl at pobox.com>
+Copyright 1997-1999 Gisle Aas (original libwww-perl code)
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import re, time, logging
+
+from _clientcookie import reraise_unmasked_exceptions, FileCookieJar, Cookie, \
+     MISSING_FILENAME_TEXT, LoadError
+debug = logging.getLogger("ClientCookie").debug
+
+
+class MozillaCookieJar(FileCookieJar):
+    """
+
+    WARNING: you may want to backup your browser's cookies file if you use
+    this class to save cookies.  I *think* it works, but there have been
+    bugs in the past!
+
+    This class differs from CookieJar only in the format it uses to save and
+    load cookies to and from a file.  This class uses the Mozilla/Netscape
+    `cookies.txt' format.  lynx uses this file format, too.
+
+    Don't expect cookies saved while the browser is running to be noticed by
+    the browser (in fact, Mozilla on unix will overwrite your saved cookies if
+    you change them on disk while it's running; on Windows, you probably can't
+    save at all while the browser is running).
+
+    Note that the Mozilla/Netscape format will downgrade RFC2965 cookies to
+    Netscape cookies on saving.
+
+    In particular, the cookie version and port number information is lost,
+    together with information about whether or not Path, Port and Discard were
+    specified by the Set-Cookie2 (or Set-Cookie) header, and whether or not the
+    domain as set in the HTTP header started with a dot (yes, I'm aware some
+    domains in Netscape files start with a dot and some don't -- trust me, you
+    really don't want to know any more about this).
+
+    Note that though Mozilla and Netscape use the same format, they use
+    slightly different headers.  The class saves cookies using the Netscape
+    header by default (Mozilla can cope with that).
+
+    """
+    magic_re = "#( Netscape)? HTTP Cookie File"
+    header = """\
+    # Netscape HTTP Cookie File
+    # http://www.netscape.com/newsref/std/cookie_spec.html
+    # This is a generated file!  Do not edit.
+
+"""
+
+    def _really_load(self, f, filename, ignore_discard, ignore_expires):
+        now = time.time()
+
+        magic = f.readline()
+        if not re.search(self.magic_re, magic):
+            f.close()
+            raise LoadError(
+                "%s does not look like a Netscape format cookies file" %
+                filename)
+
+        try:
+            while 1:
+                line = f.readline()
+                if line == "": break
+
+                # last field may be absent, so keep any trailing tab
+                if line.endswith("\n"): line = line[:-1]
+
+                # skip comments and blank lines XXX what is $ for?
+                if (line.strip().startswith("#") or
+                    line.strip().startswith("$") or
+                    line.strip() == ""):
+                    continue
+
+                domain, domain_specified, path, secure, expires, name, value = \
+                    line.split("\t", 6)
+                secure = (secure == "TRUE")
+                domain_specified = (domain_specified == "TRUE")
+                if name == "":
+                    name = value
+                    value = None
+
+                initial_dot = domain.startswith(".")
+                if domain_specified != initial_dot:
+                    raise LoadError("domain and domain specified flag don't "
+                                    "match in %s: %s" % (filename, line))
+
+                discard = False
+                if expires == "":
+                    expires = None
+                    discard = True
+
+                # assume path_specified is false
+                c = Cookie(0, name, value,
+                           None, False,
+                           domain, domain_specified, initial_dot,
+                           path, False,
+                           secure,
+                           expires,
+                           discard,
+                           None,
+                           None,
+                           {})
+                if not ignore_discard and c.discard:
+                    continue
+                if not ignore_expires and c.is_expired(now):
+                    continue
+                self.set_cookie(c)
+
+        except:
+            reraise_unmasked_exceptions((IOError, LoadError))
+            raise LoadError("invalid Netscape format file %s: %s" %
+                            (filename, line))
+
+    def save(self, filename=None, ignore_discard=False, ignore_expires=False):
+        if filename is None:
+            if self.filename is not None: filename = self.filename
+            else: raise ValueError(MISSING_FILENAME_TEXT)
+
+        f = open(filename, "w")
+        try:
+            debug("Saving Netscape cookies.txt file")
+            f.write(self.header)
+            now = time.time()
+            for cookie in self:
+                if not ignore_discard and cookie.discard:
+                    debug("   Not saving %s: marked for discard", cookie.name)
+                    continue
+                if not ignore_expires and cookie.is_expired(now):
+                    debug("   Not saving %s: expired", cookie.name)
+                    continue
+                if cookie.secure: secure = "TRUE"
+                else: secure = "FALSE"
+                if cookie.domain.startswith("."): initial_dot = "TRUE"
+                else: initial_dot = "FALSE"
+                if cookie.expires is not None:
+                    expires = str(cookie.expires)
+                else:
+                    expires = ""
+                if cookie.value is None:
+                    # cookies.txt regards 'Set-Cookie: foo' as a cookie
+                    # with no name, whereas cookielib regards it as a
+                    # cookie with no value.
+                    name = ""
+                    value = cookie.name
+                else:
+                    name = cookie.name
+                    value = cookie.value
+                f.write(
+                    "\t".join([cookie.domain, initial_dot, cookie.path,
+                               secure, expires, name, value])+
+                    "\n")
+        finally:
+            f.close()
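
MozillaCookieJar.save above emits one tab-separated line per cookie (domain,
initial-dot flag, path, secure flag, expiry, name, value) under the Netscape
cookies.txt header, and _really_load reads the same layout back.  A minimal
round-trip sketch, assuming Python 2 and an illustrative filename and URL:

    import mechanize

    cj = mechanize.MozillaCookieJar("cookies.txt")
    opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cj))
    opener.open("http://example.com/")
    # ignore_discard=True also keeps session cookies, which save() would
    # otherwise skip
    cj.save(ignore_discard=True)
    cj.load("cookies.txt", ignore_discard=True)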

Added: mechanize/tags/0.1.10/mechanize/_msiecookiejar.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_msiecookiejar.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_msiecookiejar.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,388 @@
+"""Microsoft Internet Explorer cookie loading on Windows.
+
+Copyright 2002-2003 Johnny Lee <typo_pl at hotmail.com> (MSIE Perl code)
+Copyright 2002-2006 John J Lee <jjl at pobox.com> (The Python port)
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+# XXX names and comments are not great here
+
+import os, re, time, struct, logging
+if os.name == "nt":
+    import _winreg
+
+from _clientcookie import FileCookieJar, CookieJar, Cookie, \
+     MISSING_FILENAME_TEXT, LoadError
+
+debug = logging.getLogger("mechanize").debug
+
+
+def regload(path, leaf):
+    key = _winreg.OpenKey(_winreg.HKEY_CURRENT_USER, path, 0,
+                          _winreg.KEY_ALL_ACCESS)
+    try:
+        value = _winreg.QueryValueEx(key, leaf)[0]
+    except WindowsError:
+        value = None
+    return value
+
+WIN32_EPOCH = 0x019db1ded53e8000L  # 1970 Jan 01 00:00:00 in Win32 FILETIME
+
+def epoch_time_offset_from_win32_filetime(filetime):
+    """Convert from win32 filetime to seconds-since-epoch value.
+
+    MSIE stores create and expire times as Win32 FILETIME values: 64-bit
+    counts of 100-nanosecond intervals since Jan 01 1601.
+
+    mechanize expects time as a 32-bit value expressed in seconds since the
+    epoch (Jan 01 1970).
+
+    """
+    if filetime < WIN32_EPOCH:
+        raise ValueError("filetime (%d) is before epoch (%d)" %
+                         (filetime, WIN32_EPOCH))
+
+    return divmod((filetime - WIN32_EPOCH), 10000000L)[0]
+
+def binary_to_char(c): return "%02X" % ord(c)
+def binary_to_str(d): return "".join(map(binary_to_char, list(d)))
+
+class MSIEBase:
+    magic_re = re.compile(r"Client UrlCache MMF Ver \d\.\d.*")
+    padding = "\x0d\xf0\xad\x0b"
+
+    msie_domain_re = re.compile(r"^([^/]+)(/.*)$")
+    cookie_re = re.compile("Cookie\:.+\@([\x21-\xFF]+).*?"
+                           "(.+\@[\x21-\xFF]+\.txt)")
+
+    # path under HKEY_CURRENT_USER from which to get location of index.dat
+    reg_path = r"software\microsoft\windows" \
+               r"\currentversion\explorer\shell folders"
+    reg_key = "Cookies"
+
+    def __init__(self):
+        self._delayload_domains = {}
+
+    def _delayload_domain(self, domain):
+        # if necessary, lazily load cookies for this domain
+        delayload_info = self._delayload_domains.get(domain)
+        if delayload_info is not None:
+            cookie_file, ignore_discard, ignore_expires = delayload_info
+            try:
+                self.load_cookie_data(cookie_file,
+                                      ignore_discard, ignore_expires)
+            except (LoadError, IOError):
+                debug("error reading cookie file, skipping: %s", cookie_file)
+            else:
+                del self._delayload_domains[domain]
+
+    def _load_cookies_from_file(self, filename):
+        debug("Loading MSIE cookies file: %s", filename)
+        cookies = []
+
+        cookies_fh = open(filename)
+
+        try:
+            while 1:
+                key = cookies_fh.readline()
+                if key == "": break
+
+                rl = cookies_fh.readline
+                def getlong(rl=rl): return long(rl().rstrip())
+                def getstr(rl=rl): return rl().rstrip()
+
+                key = key.rstrip()
+                value = getstr()
+                domain_path = getstr()
+                flags = getlong()  # 0x2000 bit is for secure I think
+                lo_expire = getlong()
+                hi_expire = getlong()
+                lo_create = getlong()
+                hi_create = getlong()
+                sep = getstr()
+
+                if "" in (key, value, domain_path, flags, hi_expire, lo_expire,
+                          hi_create, lo_create, sep) or (sep != "*"):
+                    break
+
+                m = self.msie_domain_re.search(domain_path)
+                if m:
+                    domain = m.group(1)
+                    path = m.group(2)
+
+                    cookies.append({"KEY": key, "VALUE": value,
+                                    "DOMAIN": domain, "PATH": path,
+                                    "FLAGS": flags, "HIXP": hi_expire,
+                                    "LOXP": lo_expire, "HICREATE": hi_create,
+                                    "LOCREATE": lo_create})
+        finally:
+            cookies_fh.close()
+
+        return cookies
+
+    def load_cookie_data(self, filename,
+                         ignore_discard=False, ignore_expires=False):
+        """Load cookies from file containing actual cookie data.
+
+        Old cookies are kept unless overwritten by newly loaded ones.
+
+        You should not call this method if the delayload attribute is set.
+
+        I think each of these files contains all cookies for one user, domain,
+        and path.
+
+        filename: file containing cookies -- usually found in a file like
+         C:\WINNT\Profiles\joe\Cookies\joe@blah[1].txt
+
+        """
+        now = int(time.time())
+
+        cookie_data = self._load_cookies_from_file(filename)
+
+        for cookie in cookie_data:
+            flags = cookie["FLAGS"]
+            secure = ((flags & 0x2000) != 0)
+            filetime = (cookie["HIXP"] << 32) + cookie["LOXP"]
+            expires = epoch_time_offset_from_win32_filetime(filetime)
+            if expires < now:
+                discard = True
+            else:
+                discard = False
+            domain = cookie["DOMAIN"]
+            initial_dot = domain.startswith(".")
+            if initial_dot:
+                domain_specified = True
+            else:
+                # MSIE 5 does not record whether the domain cookie-attribute
+                # was specified.
+                # Assuming it wasn't is conservative, because with strict
+                # domain matching this will match less frequently; with regular
+                # Netscape tail-matching, this will match at exactly the same
+                # times that domain_specified = True would.  It also means we
+                # don't have to prepend a dot to achieve consistency with our
+                # own & Mozilla's domain-munging scheme.
+                domain_specified = False
+
+            # assume path_specified is false
+            # XXX is there other stuff in here? -- eg. comment, commentURL?
+            c = Cookie(0,
+                       cookie["KEY"], cookie["VALUE"],
+                       None, False,
+                       domain, domain_specified, initial_dot,
+                       cookie["PATH"], False,
+                       secure,
+                       expires,
+                       discard,
+                       None,
+                       None,
+                       {"flags": flags})
+            if not ignore_discard and c.discard:
+                continue
+            if not ignore_expires and c.is_expired(now):
+                continue
+            CookieJar.set_cookie(self, c)
+
+    def load_from_registry(self, ignore_discard=False, ignore_expires=False,
+                           username=None):
+        """
+        username: only required on win9x
+
+        """
+        cookies_dir = regload(self.reg_path, self.reg_key)
+        filename = os.path.normpath(os.path.join(cookies_dir, "INDEX.DAT"))
+        self.load(filename, ignore_discard, ignore_expires, username)
+
+    def _really_load(self, index, filename, ignore_discard, ignore_expires,
+                     username):
+        now = int(time.time())
+
+        if username is None:
+            username = os.environ['USERNAME'].lower()
+
+        cookie_dir = os.path.dirname(filename)
+
+        data = index.read(256)
+        if len(data) != 256:
+            raise LoadError("%s file is too short" % filename)
+
+        # Cookies' index.dat file starts with 32 bytes of signature
+        # followed by an offset to the first record, stored as a little-
+        # endian DWORD.
+        sig, size, data = data[:32], data[32:36], data[36:]
+        size = struct.unpack("<L", size)[0]
+
+        # check that sig is valid
+        if not self.magic_re.match(sig) or size != 0x4000:
+            raise LoadError("%s ['%s' %s] does not seem to contain cookies" %
+                          (str(filename), sig, size))
+
+        # skip to start of first record
+        index.seek(size, 0)
+
+        sector = 128  # size of sector in bytes
+
+        while 1:
+            data = ""
+
+            # Cookies are usually in two contiguous sectors, so read in two
+            # sectors and adjust if not a Cookie.
+            to_read = 2 * sector
+            d = index.read(to_read)
+            if len(d) != to_read:
+                break
+            data = data + d
+
+            # Each record starts with a 4-byte signature and a count
+            # (little-endian DWORD) of sectors for the record.
+            sig, size, data = data[:4], data[4:8], data[8:]
+            size = struct.unpack("<L", size)[0]
+
+            to_read = (size - 2) * sector
+
+##             from urllib import quote
+##             print "data", quote(data)
+##             print "sig", quote(sig)
+##             print "size in sectors", size
+##             print "size in bytes", size*sector
+##             print "size in units of 16 bytes", (size*sector) / 16
+##             print "size to read in bytes", to_read
+##             print
+
+            if sig != "URL ":
+                assert sig in ("HASH", "LEAK", \
+                               self.padding, "\x00\x00\x00\x00"), \
+                               "unrecognized MSIE index.dat record: %s" % \
+                               binary_to_str(sig)
+                if sig == "\x00\x00\x00\x00":
+                    # assume we've got all the cookies, and stop
+                    break
+                if sig == self.padding:
+                    continue
+                # skip the rest of this record
+                assert to_read >= 0
+                if size != 2:
+                    assert to_read != 0
+                    index.seek(to_read, 1)
+                continue
+
+            # read in rest of record if necessary
+            if size > 2:
+                more_data = index.read(to_read)
+                if len(more_data) != to_read: break
+                data = data + more_data
+
+            cookie_re = ("Cookie\:%s\@([\x21-\xFF]+).*?" % username +
+                         "(%s\@[\x21-\xFF]+\.txt)" % username)
+            m = re.search(cookie_re, data, re.I)
+            if m:
+                cookie_file = os.path.join(cookie_dir, m.group(2))
+                if not self.delayload:
+                    try:
+                        self.load_cookie_data(cookie_file,
+                                              ignore_discard, ignore_expires)
+                    except (LoadError, IOError):
+                        debug("error reading cookie file, skipping: %s",
+                              cookie_file)
+                else:
+                    domain = m.group(1)
+                    i = domain.find("/")
+                    if i != -1:
+                        domain = domain[:i]
+
+                    self._delayload_domains[domain] = (
+                        cookie_file, ignore_discard, ignore_expires)
+
+
+class MSIECookieJar(MSIEBase, FileCookieJar):
+    """FileCookieJar that reads from the Windows MSIE cookies database.
+
+    MSIECookieJar can read the cookie files of Microsoft Internet Explorer
+    (MSIE) for Windows version 5 on Windows NT and version 6 on Windows XP and
+    Windows 98.  Other configurations may also work, but are untested.  Saving
+    cookies in MSIE format is NOT supported.  If you save cookies, they'll be
+    in the usual Set-Cookie3 format, which you can read back in using an
+    instance of the plain old CookieJar class.  Don't save using the same
+    filename that you loaded cookies from, because you may succeed in
+    clobbering your MSIE cookies index file!
+
+    You should be able to have LWP share Internet Explorer's cookies like
+    this (note you need to supply a username to load_from_registry if you're on
+    Windows 9x or Windows ME):
+
+    cj = MSIECookieJar(delayload=1)
+    # find cookies index file in registry and load cookies from it
+    cj.load_from_registry()
+    opener = mechanize.build_opener(mechanize.HTTPCookieProcessor(cj))
+    response = opener.open("http://example.com/")
+
+    Iterating over a delayloaded MSIECookieJar instance will not cause any
+    cookies to be read from disk.  To force reading of all cookies from disk,
+    call read_all_cookies.  Note that the following methods iterate over self:
+    clear_temporary_cookies, clear_expired_cookies, __len__, __repr__, __str__
+    and as_string.
+
+    Additional methods:
+
+    load_from_registry(ignore_discard=False, ignore_expires=False,
+                       username=None)
+    load_cookie_data(filename, ignore_discard=False, ignore_expires=False)
+    read_all_cookies()
+
+    """
+    def __init__(self, filename=None, delayload=False, policy=None):
+        MSIEBase.__init__(self)
+        FileCookieJar.__init__(self, filename, delayload, policy)
+
+    def set_cookie(self, cookie):
+        if self.delayload:
+            self._delayload_domain(cookie.domain)
+        CookieJar.set_cookie(self, cookie)
+
+    def _cookies_for_request(self, request):
+        """Return a list of cookies to be returned to server."""
+        domains = self._cookies.copy()
+        domains.update(self._delayload_domains)
+        domains = domains.keys()
+
+        cookies = []
+        for domain in domains:
+            cookies.extend(self._cookies_for_domain(domain, request))
+        return cookies
+
+    def _cookies_for_domain(self, domain, request):
+        if not self._policy.domain_return_ok(domain, request):
+            return []
+        debug("Checking %s for cookies to return", domain)
+        if self.delayload:
+            self._delayload_domain(domain)
+        return CookieJar._cookies_for_domain(self, domain, request)
+
+    def read_all_cookies(self):
+        """Eagerly read in all cookies."""
+        if self.delayload:
+            for domain in self._delayload_domains.keys():
+                self._delayload_domain(domain)
+
+    def load(self, filename, ignore_discard=False, ignore_expires=False,
+             username=None):
+        """Load cookies from an MSIE 'index.dat' cookies index file.
+
+        filename: full path to cookie index file
+        username: only required on win9x
+
+        """
+        if filename is None:
+            if self.filename is not None: filename = self.filename
+            else: raise ValueError(MISSING_FILENAME_TEXT)
+
+        index = open(filename, "rb")
+
+        try:
+            self._really_load(index, filename, ignore_discard, ignore_expires,
+                              username)
+        finally:
+            index.close()
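
The FILETIME handling above is plain arithmetic: subtract the FILETIME value
of the Unix epoch and divide by 10,000,000 (the number of 100-nanosecond
intervals in one second).  A small self-contained check of that conversion,
using an illustrative one-hour offset:

    WIN32_EPOCH = 0x019db1ded53e8000L  # Jan 01 1970 as a Win32 FILETIME

    def filetime_to_epoch(filetime):
        # 10**7 hundred-nanosecond intervals make up one second
        return (filetime - WIN32_EPOCH) // 10000000L

    one_hour_later = WIN32_EPOCH + 3600 * 10000000L
    assert filetime_to_epoch(one_hour_later) == 3600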

Added: mechanize/tags/0.1.10/mechanize/_opener.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_opener.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_opener.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,436 @@
+"""Integration with Python standard library module urllib2: OpenerDirector
+class.
+
+Copyright 2004-2006 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import os, urllib2, bisect, httplib, types, tempfile
+try:
+    import threading as _threading
+except ImportError:
+    import dummy_threading as _threading
+try:
+    set
+except NameError:
+    import sets
+    set = sets.Set
+
+import _file
+import _http
+from _request import Request
+import _response
+import _rfc3986
+import _sockettimeout
+import _upgrade
+from _util import isstringlike
+
+
+class ContentTooShortError(urllib2.URLError):
+    def __init__(self, reason, result):
+        urllib2.URLError.__init__(self, reason)
+        self.result = result
+
+
+def set_request_attr(req, name, value, default):
+    try:
+        getattr(req, name)
+    except AttributeError:
+        setattr(req, name, default)
+    if value is not default:
+        setattr(req, name, value)
+
+
+class OpenerDirector(urllib2.OpenerDirector):
+    def __init__(self):
+        urllib2.OpenerDirector.__init__(self)
+        # really none of these are (sanely) public -- the lack of initial
+        # underscore on some is just due to following urllib2
+        self.process_response = {}
+        self.process_request = {}
+        self._any_request = {}
+        self._any_response = {}
+        self._handler_index_valid = True
+        self._tempfiles = []
+
+    def add_handler(self, handler):
+        if handler in self.handlers:
+            return
+        # XXX why does self.handlers need to be sorted?
+        bisect.insort(self.handlers, handler)
+        handler.add_parent(self)
+        self._handler_index_valid = False
+
+    def _maybe_reindex_handlers(self):
+        if self._handler_index_valid:
+            return
+
+        handle_error = {}
+        handle_open = {}
+        process_request = {}
+        process_response = {}
+        any_request = set()
+        any_response = set()
+        unwanted = []
+
+        for handler in self.handlers:
+            added = False
+            for meth in dir(handler):
+                if meth in ["redirect_request", "do_open", "proxy_open"]:
+                    # oops, coincidental match
+                    continue
+
+                if meth == "any_request":
+                    any_request.add(handler)
+                    added = True
+                    continue
+                elif meth == "any_response":
+                    any_response.add(handler)
+                    added = True
+                    continue
+
+                ii = meth.find("_")
+                scheme = meth[:ii]
+                condition = meth[ii+1:]
+
+                if condition.startswith("error"):
+                    jj = meth[ii+1:].find("_") + ii + 1
+                    kind = meth[jj+1:]
+                    try:
+                        kind = int(kind)
+                    except ValueError:
+                        pass
+                    lookup = handle_error.setdefault(scheme, {})
+                elif condition == "open":
+                    kind = scheme
+                    lookup = handle_open
+                elif condition == "request":
+                    kind = scheme
+                    lookup = process_request
+                elif condition == "response":
+                    kind = scheme
+                    lookup = process_response
+                else:
+                    continue
+
+                lookup.setdefault(kind, set()).add(handler)
+                added = True
+
+            if not added:
+                unwanted.append(handler)
+
+        for handler in unwanted:
+            self.handlers.remove(handler)
+
+        # sort indexed methods
+        # XXX could be cleaned up
+        for lookup in [process_request, process_response]:
+            for scheme, handlers in lookup.iteritems():
+                lookup[scheme] = handlers
+        for scheme, lookup in handle_error.iteritems():
+            for code, handlers in lookup.iteritems():
+                handlers = list(handlers)
+                handlers.sort()
+                lookup[code] = handlers
+        for scheme, handlers in handle_open.iteritems():
+            handlers = list(handlers)
+            handlers.sort()
+            handle_open[scheme] = handlers
+
+        # cache the indexes
+        self.handle_error = handle_error
+        self.handle_open = handle_open
+        self.process_request = process_request
+        self.process_response = process_response
+        self._any_request = any_request
+        self._any_response = any_response
+
+    def _request(self, url_or_req, data, visit,
+                 timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        if isstringlike(url_or_req):
+            req = Request(url_or_req, data, visit=visit, timeout=timeout)
+        else:
+            # already a urllib2.Request or mechanize.Request instance
+            req = url_or_req
+            if data is not None:
+                req.add_data(data)
+            # XXX yuck
+            set_request_attr(req, "visit", visit, None)
+            set_request_attr(req, "timeout", timeout,
+                             _sockettimeout._GLOBAL_DEFAULT_TIMEOUT)
+        return req
+
+    def open(self, fullurl, data=None,
+             timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        req = self._request(fullurl, data, None, timeout)
+        req_scheme = req.get_type()
+
+        self._maybe_reindex_handlers()
+
+        # pre-process request
+        # XXX should we allow a Processor to change the URL scheme
+        #   of the request?
+        request_processors = set(self.process_request.get(req_scheme, []))
+        request_processors.update(self._any_request)
+        request_processors = list(request_processors)
+        request_processors.sort()
+        for processor in request_processors:
+            for meth_name in ["any_request", req_scheme+"_request"]:
+                meth = getattr(processor, meth_name, None)
+                if meth:
+                    req = meth(req)
+
+        # In Python >= 2.4, .open() supports processors already, so we must
+        # call ._open() instead.
+        urlopen = getattr(urllib2.OpenerDirector, "_open",
+                          urllib2.OpenerDirector.open)
+        response = urlopen(self, req, data)
+
+        # post-process response
+        response_processors = set(self.process_response.get(req_scheme, []))
+        response_processors.update(self._any_response)
+        response_processors = list(response_processors)
+        response_processors.sort()
+        for processor in response_processors:
+            for meth_name in ["any_response", req_scheme+"_response"]:
+                meth = getattr(processor, meth_name, None)
+                if meth:
+                    response = meth(req, response)
+
+        return response
+
+    def error(self, proto, *args):
+        if proto in ['http', 'https']:
+            # XXX http[s] protocols are special-cased
+            dict = self.handle_error['http'] # https is not different than http
+            proto = args[2]  # YUCK!
+            meth_name = 'http_error_%s' % proto
+            http_err = 1
+            orig_args = args
+        else:
+            dict = self.handle_error
+            meth_name = proto + '_error'
+            http_err = 0
+        args = (dict, proto, meth_name) + args
+        result = apply(self._call_chain, args)
+        if result:
+            return result
+
+        if http_err:
+            args = (dict, 'default', 'http_error_default') + orig_args
+            return apply(self._call_chain, args)
+
+    BLOCK_SIZE = 1024*8
+    def retrieve(self, fullurl, filename=None, reporthook=None, data=None,
+                 timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        """Returns (filename, headers).
+
+        For remote objects, the default filename will refer to a temporary
+        file.  Temporary files are removed when the OpenerDirector.close()
+        method is called.
+
+        For file: URLs, at present the returned filename is None.  This may
+        change in future.
+
+        If the actual number of bytes read is less than indicated by the
+        Content-Length header, raises ContentTooShortError (a URLError
+        subclass).  The exception's .result attribute contains the (filename,
+        headers) that would have been returned.
+
+        """
+        req = self._request(fullurl, data, False, timeout)
+        scheme = req.get_type()
+        fp = self.open(req)
+        headers = fp.info()
+        if filename is None and scheme == 'file':
+            # XXX req.get_selector() seems broken here, return None,
+            #   pending sanity :-/
+            return None, headers
+            #return urllib.url2pathname(req.get_selector()), headers
+        if filename:
+            tfp = open(filename, 'wb')
+        else:
+            path = _rfc3986.urlsplit(req.get_full_url())[2]
+            suffix = os.path.splitext(path)[1]
+            fd, filename = tempfile.mkstemp(suffix)
+            self._tempfiles.append(filename)
+            tfp = os.fdopen(fd, 'wb')
+
+        result = filename, headers
+        bs = self.BLOCK_SIZE
+        size = -1
+        read = 0
+        blocknum = 0
+        if reporthook:
+            if "content-length" in headers:
+                size = int(headers["Content-Length"])
+            reporthook(blocknum, bs, size)
+        while 1:
+            block = fp.read(bs)
+            if block == "":
+                break
+            read += len(block)
+            tfp.write(block)
+            blocknum += 1
+            if reporthook:
+                reporthook(blocknum, bs, size)
+        fp.close()
+        tfp.close()
+        del fp
+        del tfp
+
+        # raise exception if actual size does not match content-length header
+        if size >= 0 and read < size:
+            raise ContentTooShortError(
+                "retrieval incomplete: "
+                "got only %i out of %i bytes" % (read, size),
+                result
+                )
+
+        return result
+
+    def close(self):
+        urllib2.OpenerDirector.close(self)
+
+        # make it very obvious this object is no longer supposed to be used
+        self.open = self.error = self.retrieve = self.add_handler = None
+
+        if self._tempfiles:
+            for filename in self._tempfiles:
+                try:
+                    os.unlink(filename)
+                except OSError:
+                    pass
+            del self._tempfiles[:]
+
+
+def wrapped_open(urlopen, process_response_object, fullurl, data=None,
+                 timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+    success = True
+    try:
+        response = urlopen(fullurl, data, timeout)
+    except urllib2.HTTPError, error:
+        success = False
+        if error.fp is None:  # not a response
+            raise
+        response = error
+
+    if response is not None:
+        response = process_response_object(response)
+
+    if not success:
+        raise response
+    return response
+
+class ResponseProcessingOpener(OpenerDirector):
+
+    def open(self, fullurl, data=None,
+             timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        def bound_open(fullurl, data=None,
+                       timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+            return OpenerDirector.open(self, fullurl, data, timeout)
+        return wrapped_open(
+            bound_open, self.process_response_object, fullurl, data, timeout)
+
+    def process_response_object(self, response):
+        return response
+
+
+class SeekableResponseOpener(ResponseProcessingOpener):
+    def process_response_object(self, response):
+        return _response.seek_wrapped_response(response)
+
+
+class OpenerFactory:
+    """This class's interface is quite likely to change."""
+
+    default_classes = [
+        # handlers
+        urllib2.ProxyHandler,
+        urllib2.UnknownHandler,
+        _http.HTTPHandler,  # derived from new AbstractHTTPHandler
+        _http.HTTPDefaultErrorHandler,
+        _http.HTTPRedirectHandler,  # bugfixed
+        urllib2.FTPHandler,
+        _file.FileHandler,
+        # processors
+        _upgrade.HTTPRequestUpgradeProcessor,
+        _http.HTTPCookieProcessor,
+        _http.HTTPErrorProcessor,
+        ]
+    if hasattr(httplib, 'HTTPS'):
+        default_classes.append(_http.HTTPSHandler)
+    handlers = []
+    replacement_handlers = []
+
+    def __init__(self, klass=OpenerDirector):
+        self.klass = klass
+
+    def build_opener(self, *handlers):
+        """Create an opener object from a list of handlers and processors.
+
+        The opener will use several default handlers and processors, including
+        support for HTTP and FTP.
+
+        If any of the handlers passed as arguments are subclasses of the
+        default handlers, the default handlers will not be used.
+
+        """
+        opener = self.klass()
+        default_classes = list(self.default_classes)
+        skip = []
+        for klass in default_classes:
+            for check in handlers:
+                if type(check) == types.ClassType:
+                    if issubclass(check, klass):
+                        skip.append(klass)
+                elif type(check) == types.InstanceType:
+                    if isinstance(check, klass):
+                        skip.append(klass)
+        for klass in skip:
+            default_classes.remove(klass)
+
+        for klass in default_classes:
+            opener.add_handler(klass())
+        for h in handlers:
+            if type(h) == types.ClassType:
+                h = h()
+            opener.add_handler(h)
+
+        return opener
+
+
+build_opener = OpenerFactory().build_opener
+
+_opener = None
+urlopen_lock = _threading.Lock()
+def urlopen(url, data=None, timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+    global _opener
+    if _opener is None:
+        urlopen_lock.acquire()
+        try:
+            if _opener is None:
+                _opener = build_opener()
+        finally:
+            urlopen_lock.release()
+    return _opener.open(url, data, timeout)
+
+def urlretrieve(url, filename=None, reporthook=None, data=None,
+                timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+    global _opener
+    if _opener is None:
+        urlopen_lock.acquire()
+        try:
+            if _opener is None:
+                _opener = build_opener()
+        finally:
+            urlopen_lock.release()
+    return _opener.retrieve(url, filename, reporthook, data, timeout)
+
+def install_opener(opener):
+    global _opener
+    _opener = opener
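
_maybe_reindex_handlers above dispatches purely on method names:
<scheme>_open, <scheme>_request, <scheme>_response, <scheme>_error_<code>,
plus the catch-alls any_request and any_response.  A minimal sketch of a
request/response processor wired in through build_opener (the header name and
handler_order value are illustrative, not part of mechanize):

    import urllib2
    import mechanize

    class StampProcessor(urllib2.BaseHandler):
        handler_order = 900  # processors normally run after the handlers

        def http_request(self, request):
            # indexed via the "<scheme>_request" naming convention
            request.add_unredirected_header("X-Stamp", "1")
            return request

        def http_response(self, request, response):
            # "<scheme>_response" methods post-process the response
            return response

        https_request = http_request
        https_response = http_response

    opener = mechanize.build_opener(StampProcessor)
    response = opener.open("http://example.com/")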

Added: mechanize/tags/0.1.10/mechanize/_pullparser.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_pullparser.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_pullparser.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,390 @@
+"""A simple "pull API" for HTML parsing, after Perl's HTML::TokeParser.
+
+Examples
+
+This program extracts all links from a document.  It will print one
+line for each link, containing the URL and the textual description
+between the <A>...</A> tags:
+
+import pullparser, sys
+f = file(sys.argv[1])
+p = pullparser.PullParser(f)
+for token in p.tags("a"):
+    if token.type == "endtag": continue
+    url = dict(token.attrs).get("href", "-")
+    text = p.get_compressed_text(endat=("endtag", "a"))
+    print "%s\t%s" % (url, text)
+
+This program extracts the <TITLE> from the document:
+
+import pullparser, sys
+f = file(sys.argv[1])
+p = pullparser.PullParser(f)
+if p.get_tag("title"):
+    title = p.get_compressed_text()
+    print "Title: %s" % title
+
+
+Copyright 2003-2006 John J. Lee <jjl at pobox.com>
+Copyright 1998-2001 Gisle Aas (original libwww-perl code)
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses.
+
+"""
+
+import re, htmlentitydefs
+import sgmllib, HTMLParser
+from xml.sax import saxutils
+
+from _html import unescape, unescape_charref
+
+
+class NoMoreTokensError(Exception): pass
+
+class Token:
+    """Represents an HTML tag, declaration, processing instruction etc.
+
+    Behaves as a tuple-like object (ie. iterable) and also has attributes
+    .type, .data and .attrs.
+
+    >>> t = Token("starttag", "a", [("href", "http://www.python.org/")])
+    >>> t == ("starttag", "a", [("href", "http://www.python.org/")])
+    True
+    >>> (t.type, t.data) == ("starttag", "a")
+    True
+    >>> t.attrs == [("href", "http://www.python.org/")]
+    True
+
+    Public attributes
+
+    type: one of "starttag", "endtag", "startendtag", "charref", "entityref",
+     "data", "comment", "decl", "pi", after the corresponding methods of
+     HTMLParser.HTMLParser
+    data: For a tag, the tag name; otherwise, the relevant data carried by the
+     tag, as a string
+    attrs: list of (name, value) pairs representing HTML attributes
+     (or None if token does not represent an opening tag)
+
+    """
+    def __init__(self, type, data, attrs=None):
+        self.type = type
+        self.data = data
+        self.attrs = attrs
+    def __iter__(self):
+        return iter((self.type, self.data, self.attrs))
+    def __eq__(self, other):
+        type, data, attrs = other
+        if (self.type == type and
+            self.data == data and
+            self.attrs == attrs):
+            return True
+        else:
+            return False
+    def __ne__(self, other): return not self.__eq__(other)
+    def __repr__(self):
+        args = ", ".join(map(repr, [self.type, self.data, self.attrs]))
+        return self.__class__.__name__+"(%s)" % args
+
+    def __str__(self):
+        """
+        >>> print Token("starttag", "br")
+        <br>
+        >>> print Token("starttag", "a",
+        ...     [("href", "http://www.python.org/"), ("alt", '"foo"')])
+        <a href="http://www.python.org/" alt='"foo"'>
+        >>> print Token("startendtag", "br")
+        <br />
+        >>> print Token("startendtag", "br", [("spam", "eggs")])
+        <br spam="eggs" />
+        >>> print Token("endtag", "p")
+        </p>
+        >>> print Token("charref", "38")
+        &#38;
+        >>> print Token("entityref", "amp")
+        &amp;
+        >>> print Token("data", "foo\\nbar")
+        foo
+        bar
+        >>> print Token("comment", "Life is a bowl\\nof cherries.")
+        <!--Life is a bowl
+        of cherries.-->
+        >>> print Token("decl", "decl")
+        <!decl>
+        >>> print Token("pi", "pi")
+        <?pi>
+        """
+        if self.attrs is not None:
+            attrs = "".join([" %s=%s" % (k, saxutils.quoteattr(v)) for
+                             k, v in self.attrs])
+        else:
+            attrs = ""
+        if self.type == "starttag":
+            return "<%s%s>" % (self.data, attrs)
+        elif self.type == "startendtag":
+            return "<%s%s />" % (self.data, attrs)
+        elif self.type == "endtag":
+            return "</%s>" % self.data
+        elif self.type == "charref":
+            return "&#%s;" % self.data
+        elif self.type == "entityref":
+            return "&%s;" % self.data
+        elif self.type == "data":
+            return self.data
+        elif self.type == "comment":
+            return "<!--%s-->" % self.data
+        elif self.type == "decl":
+            return "<!%s>" % self.data
+        elif self.type == "pi":
+            return "<?%s>" % self.data
+        assert False
+
+
+def iter_until_exception(fn, exception, *args, **kwds):
+    while 1:
+        try:
+            yield fn(*args, **kwds)
+        except exception:
+            raise StopIteration
+
+
+class _AbstractParser:
+    chunk = 1024
+    compress_re = re.compile(r"\s+")
+    def __init__(self, fh, textify={"img": "alt", "applet": "alt"},
+                 encoding="ascii", entitydefs=None):
+        """
+        fh: file-like object (only a .read() method is required) from which to
+         read HTML to be parsed
+        textify: mapping used by .get_text() and .get_compressed_text() methods
+         to represent opening tags as text
+        encoding: encoding used to encode numeric character references by
+         .get_text() and .get_compressed_text() ("ascii" by default)
+
+        entitydefs: mapping like {"amp": "&", ...} containing HTML entity
+         definitions (a sensible default is used).  This is used to unescape
+         entities in .get_text() (and .get_compressed_text()) and attribute
+         values.  If the encoding can not represent the character, the entity
+         reference is left unescaped.  Note that entity references (both
+         numeric - e.g. &#123; or &#xabc; - and non-numeric - e.g. &amp;) are
+         unescaped in attribute values and the return value of .get_text(), but
+         not in data outside of tags.  Instead, entity references outside of
+         tags are represented as tokens.  This is a bit odd, it's true :-/
+
+        If the element name of an opening tag matches a key in the textify
+        mapping then that tag is converted to text.  The corresponding value is
+        used to specify which tag attribute to obtain the text from.  textify
+        maps from element names to either:
+
+          - an HTML attribute name, in which case the HTML attribute value is
+            used as its text value along with the element name in square
+            brackets (eg. "alt text goes here[IMG]", or, if the alt attribute
+            were missing, just "[IMG]")
+          - a callable object (eg. a function) which takes a Token and returns
+            the string to be used as its text value
+
+        If textify has no key for an element name, nothing is substituted for
+        the opening tag.
+
+        Public attributes:
+
+        encoding and textify: see above
+
+        """
+        self._fh = fh
+        self._tokenstack = []  # FIFO
+        self.textify = textify
+        self.encoding = encoding
+        if entitydefs is None:
+            entitydefs = htmlentitydefs.name2codepoint
+        self._entitydefs = entitydefs
+
+    def __iter__(self): return self
+
+    def tags(self, *names):
+        return iter_until_exception(self.get_tag, NoMoreTokensError, *names)
+
+    def tokens(self, *tokentypes):
+        return iter_until_exception(self.get_token, NoMoreTokensError,
+                                    *tokentypes)
+
+    def next(self):
+        try:
+            return self.get_token()
+        except NoMoreTokensError:
+            raise StopIteration()
+
+    def get_token(self, *tokentypes):
+        """Pop the next Token object from the stack of parsed tokens.
+
+        If arguments are given, they are taken to be token types in which the
+        caller is interested: tokens representing other elements will be
+        skipped.  Element names must be given in lower case.
+
+        Raises NoMoreTokensError.
+
+        """
+        while 1:
+            while self._tokenstack:
+                token = self._tokenstack.pop(0)
+                if tokentypes:
+                    if token.type in tokentypes:
+                        return token
+                else:
+                    return token
+            data = self._fh.read(self.chunk)
+            if not data:
+                raise NoMoreTokensError()
+            self.feed(data)
+
+    def unget_token(self, token):
+        """Push a Token back onto the stack."""
+        self._tokenstack.insert(0, token)
+
+    def get_tag(self, *names):
+        """Return the next Token that represents an opening or closing tag.
+
+        If arguments are given, they are taken to be element names in which the
+        caller is interested: tags representing other elements will be skipped.
+        Element names must be given in lower case.
+
+        Raises NoMoreTokensError.
+
+        """
+        while 1:
+            tok = self.get_token()
+            if tok.type not in ["starttag", "endtag", "startendtag"]:
+                continue
+            if names:
+                if tok.data in names:
+                    return tok
+            else:
+                return tok
+
+    def get_text(self, endat=None):
+        """Get some text.
+
+        endat: stop reading text at this tag (the tag is included in the
+         returned text); endat is a tuple (type, name) where type is
+         "starttag", "endtag" or "startendtag", and name is the element name of
+         the tag (element names must be given in lower case)
+
+        If endat is not given, .get_text() will stop at the next opening or
+        closing tag, or when there are no more tokens (no exception is raised).
+        Note that .get_text() includes the text representation (if any) of the
+        opening tag, but pushes the opening tag back onto the stack.  As a
+        result, if you want to call .get_text() again, you need to call
+        .get_tag() first (unless you want an empty string returned when you
+        next call .get_text()).
+
+        Entity references are translated using the value of the entitydefs
+        constructor argument (a mapping from names to characters like that
+        provided by the standard module htmlentitydefs).  Named entity
+        references that are not in this mapping are left unchanged.
+
+        The textify attribute is used to translate opening tags into text: see
+        the class docstring.
+
+        """
+        text = []
+        tok = None
+        while 1:
+            try:
+                tok = self.get_token()
+            except NoMoreTokensError:
+                # unget last token (not the one we just failed to get)
+                if tok: self.unget_token(tok)
+                break
+            if tok.type == "data":
+                text.append(tok.data)
+            elif tok.type == "entityref":
+                t = unescape("&%s;"%tok.data, self._entitydefs, self.encoding)
+                text.append(t)
+            elif tok.type == "charref":
+                t = unescape_charref(tok.data, self.encoding)
+                text.append(t)
+            elif tok.type in ["starttag", "endtag", "startendtag"]:
+                tag_name = tok.data
+                if tok.type in ["starttag", "startendtag"]:
+                    alt = self.textify.get(tag_name)
+                    if alt is not None:
+                        if callable(alt):
+                            text.append(alt(tok))
+                        elif tok.attrs is not None:
+                            for k, v in tok.attrs:
+                                if k == alt:
+                                    text.append(v)
+                            text.append("[%s]" % tag_name.upper())
+                if endat is None or endat == (tok.type, tag_name):
+                    self.unget_token(tok)
+                    break
+        return "".join(text)
+
+    def get_compressed_text(self, *args, **kwds):
+        """
+        As .get_text(), but collapses each group of contiguous whitespace to a
+        single space character, and removes all initial and trailing
+        whitespace.
+
+        """
+        text = self.get_text(*args, **kwds)
+        text = text.strip()
+        return self.compress_re.sub(" ", text)
+
+    def handle_startendtag(self, tag, attrs):
+        self._tokenstack.append(Token("startendtag", tag, attrs))
+    def handle_starttag(self, tag, attrs):
+        self._tokenstack.append(Token("starttag", tag, attrs))
+    def handle_endtag(self, tag):
+        self._tokenstack.append(Token("endtag", tag))
+    def handle_charref(self, name):
+        self._tokenstack.append(Token("charref", name))
+    def handle_entityref(self, name):
+        self._tokenstack.append(Token("entityref", name))
+    def handle_data(self, data):
+        self._tokenstack.append(Token("data", data))
+    def handle_comment(self, data):
+        self._tokenstack.append(Token("comment", data))
+    def handle_decl(self, decl):
+        self._tokenstack.append(Token("decl", decl))
+    def unknown_decl(self, data):
+        # XXX should this call self.error instead?
+        #self.error("unknown declaration: " + `data`)
+        self._tokenstack.append(Token("decl", data))
+    def handle_pi(self, data):
+        self._tokenstack.append(Token("pi", data))
+
+    def unescape_attr(self, name):
+        return unescape(name, self._entitydefs, self.encoding)
+    def unescape_attrs(self, attrs):
+        escaped_attrs = []
+        for key, val in attrs:
+            escaped_attrs.append((key, self.unescape_attr(val)))
+        return escaped_attrs
+
+class PullParser(_AbstractParser, HTMLParser.HTMLParser):
+    def __init__(self, *args, **kwds):
+        HTMLParser.HTMLParser.__init__(self)
+        _AbstractParser.__init__(self, *args, **kwds)
+    def unescape(self, name):
+        # Use the entitydefs passed into constructor, not
+        # HTMLParser.HTMLParser's entitydefs.
+        return self.unescape_attr(name)
+
+class TolerantPullParser(_AbstractParser, sgmllib.SGMLParser):
+    def __init__(self, *args, **kwds):
+        sgmllib.SGMLParser.__init__(self)
+        _AbstractParser.__init__(self, *args, **kwds)
+    def unknown_starttag(self, tag, attrs):
+        attrs = self.unescape_attrs(attrs)
+        self._tokenstack.append(Token("starttag", tag, attrs))
+    def unknown_endtag(self, tag):
+        self._tokenstack.append(Token("endtag", tag))
+
+
+def _test():
+   import doctest, _pullparser
+   return doctest.testmod(_pullparser)
+
+if __name__ == "__main__":
+   _test()
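
The textify mapping documented in _AbstractParser above accepts either an
attribute name or a callable per element.  A short sketch using a callable,
with an illustrative HTML snippet held in a StringIO:

    from StringIO import StringIO
    from mechanize._pullparser import PullParser

    def describe_img(token):
        # token.attrs is a list of (name, value) pairs
        attrs = dict(token.attrs or [])
        return "[image: %s]" % attrs.get("alt", "?")

    html = '<p>Before <img alt="logo"> after.</p>'
    p = PullParser(StringIO(html), textify={"img": describe_img})
    p.get_tag("p")
    print p.get_compressed_text(endat=("endtag", "p"))
    # prints: Before [image: logo] after.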

Added: mechanize/tags/0.1.10/mechanize/_request.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_request.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_request.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,87 @@
+"""Integration with Python standard library module urllib2: Request class.
+
+Copyright 2004-2006 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import urllib2, urllib, logging
+
+from _clientcookie import request_host_lc
+import _rfc3986
+import _sockettimeout
+
+warn = logging.getLogger("mechanize").warning
+
+
+class Request(urllib2.Request):
+    def __init__(self, url, data=None, headers={},
+                 origin_req_host=None, unverifiable=False, visit=None,
+                 timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        # In mechanize 0.2, the interpretation of a unicode url argument will
+        # change: A unicode url argument will be interpreted as an IRI, and a
+        # bytestring as a URI. For now, we accept unicode or bytestring.  We
+        # don't insist that the value is always a URI (specifically, must only
+        # contain characters which are legal), because that might break working
+        # code (who knows what bytes some servers want to see, especially with
+        # browser plugins for internationalised URIs).
+        if not _rfc3986.is_clean_uri(url):
+            warn("url argument is not a URI "
+                 "(contains illegal characters) %r" % url)
+        urllib2.Request.__init__(self, url, data, headers)
+        self.selector = None
+        self.unredirected_hdrs = {}
+        self.visit = visit
+        self.timeout = timeout
+
+        # All the terminology below comes from RFC 2965.
+        self.unverifiable = unverifiable
+        # Set request-host of origin transaction.
+        # The origin request-host is needed in order to decide whether
+        # unverifiable sub-requests (automatic redirects, images embedded
+        # in HTML, etc.) are to third-party hosts.  If they are, the
+        # resulting transactions might need to be conducted with cookies
+        # turned off.
+        if origin_req_host is None:
+            origin_req_host = request_host_lc(self)
+        self.origin_req_host = origin_req_host
+
+    def get_selector(self):
+        return urllib.splittag(self.__r_host)[0]
+
+    def get_origin_req_host(self):
+        return self.origin_req_host
+
+    def is_unverifiable(self):
+        return self.unverifiable
+
+    def add_unredirected_header(self, key, val):
+        """Add a header that will not be added to a redirected request."""
+        self.unredirected_hdrs[key.capitalize()] = val
+
+    def has_header(self, header_name):
+        """True iff request has named header (regular or unredirected)."""
+        return (header_name in self.headers or
+                header_name in self.unredirected_hdrs)
+
+    def get_header(self, header_name, default=None):
+        return self.headers.get(
+            header_name,
+            self.unredirected_hdrs.get(header_name, default))
+
+    def header_items(self):
+        hdrs = self.unredirected_hdrs.copy()
+        hdrs.update(self.headers)
+        return hdrs.items()
+
+    def __str__(self):
+        return "<Request for %s>" % self.get_full_url()
+
+    def get_method(self):
+        if self.has_data():
+            return "POST"
+        else:
+            return "GET"

Added: mechanize/tags/0.1.10/mechanize/_response.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_response.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_response.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,514 @@
+"""Response classes.
+
+The seek_wrapper code is not used if you're using UserAgent with
+.set_seekable_responses(False), or if you're using the urllib2-level interface
+without SeekableProcessor or HTTPEquivProcessor.  Class closeable_response is
+instantiated by some handlers (AbstractHTTPHandler), but the closeable_response
+interface is only depended upon by Browser-level code.  Function
+upgrade_response is only used if you're using Browser or
+ResponseUpgradeProcessor.
+
+
+Copyright 2006 John J. Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+import copy, mimetools
+from cStringIO import StringIO
+import urllib2
+
+# XXX Andrew Dalke kindly sent me a similar class in response to my request on
+# comp.lang.python, which I then proceeded to lose.  I wrote this class
+# instead, but I think he's released his code publicly since, could pinch the
+# tests from it, at least...
+
+# For testing seek_wrapper invariant (note that
+# test_urllib2.HandlerTest.test_seekable is expected to fail when this
+# invariant checking is turned on).  The invariant checking is done by module
+# ipdc, which is available here:
+# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/436834
+## from ipdbc import ContractBase
+## class seek_wrapper(ContractBase):
+class seek_wrapper:
+    """Adds a seek method to a file object.
+
+    This is only designed for seeking on readonly file-like objects.
+
+    Wrapped file-like object must have a read method.  The readline method is
+    only supported if that method is present on the wrapped object.  The
+    readlines method is always supported.  xreadlines and iteration are
+    supported only for Python 2.2 and above.
+
+    Public attributes:
+
+    wrapped: the wrapped file object
+    is_closed: true iff .close() has been called
+
+    WARNING: All other attributes of the wrapped object (ie. those that are not
+    one of wrapped, read, readline, readlines, xreadlines, __iter__ and next)
+    are passed through unaltered, which may or may not make sense for your
+    particular file object.
+
+    """
+    # General strategy is to check that cache is full enough, then delegate to
+    # the cache (self.__cache, which is a cStringIO.StringIO instance).  A seek
+    # position (self.__pos) is maintained independently of the cache, in order
+    # that a single cache may be shared between multiple seek_wrapper objects.
+    # Copying using module copy shares the cache in this way.
+
+    def __init__(self, wrapped):
+        self.wrapped = wrapped
+        self.__read_complete_state = [False]
+        self.__is_closed_state = [False]
+        self.__have_readline = hasattr(self.wrapped, "readline")
+        self.__cache = StringIO()
+        self.__pos = 0  # seek position
+
+    def invariant(self):
+        # The end of the cache is always at the same place as the end of the
+        # wrapped file.
+        return self.wrapped.tell() == len(self.__cache.getvalue())
+
+    def close(self):
+        self.wrapped.close()
+        self.is_closed = True
+
+    def __getattr__(self, name):
+        if name == "is_closed":
+            return self.__is_closed_state[0]
+        elif name == "read_complete":
+            return self.__read_complete_state[0]
+
+        wrapped = self.__dict__.get("wrapped")
+        if wrapped:
+            return getattr(wrapped, name)
+
+        return getattr(self.__class__, name)
+
+    def __setattr__(self, name, value):
+        if name == "is_closed":
+            self.__is_closed_state[0] = bool(value)
+        elif name == "read_complete":
+            if not self.is_closed:
+                self.__read_complete_state[0] = bool(value)
+        else:
+            self.__dict__[name] = value
+
+    def seek(self, offset, whence=0):
+        assert whence in [0,1,2]
+
+        # how much data, if any, do we need to read?
+        if whence == 2:  # 2: relative to end of *wrapped* file
+            if offset < 0: raise ValueError("negative seek offset")
+            # since we don't know yet where the end of that file is, we must
+            # read everything
+            to_read = None
+        else:
+            if whence == 0:  # 0: absolute
+                if offset < 0: raise ValueError("negative seek offset")
+                dest = offset
+            else:  # 1: relative to current position
+                pos = self.__pos
+                if pos < -offset:
+                    raise ValueError("seek to before start of file")
+                dest = pos + offset
+            end = len(self.__cache.getvalue())
+            to_read = dest - end
+            if to_read < 0:
+                to_read = 0
+
+        if to_read != 0:
+            self.__cache.seek(0, 2)
+            if to_read is None:
+                assert whence == 2
+                self.__cache.write(self.wrapped.read())
+                self.read_complete = True
+                self.__pos = self.__cache.tell() - offset
+            else:
+                data = self.wrapped.read(to_read)
+                if not data:
+                    self.read_complete = True
+                else:
+                    self.__cache.write(data)
+                # Don't raise an exception even if we've seek()ed past the end
+                # of .wrapped, since fseek() doesn't complain in that case.
+                # Also like fseek(), pretend we have seek()ed past the end,
+                # i.e. not:
+                #self.__pos = self.__cache.tell()
+                # but rather:
+                self.__pos = dest
+        else:
+            self.__pos = dest
+
+    def tell(self):
+        return self.__pos
+
+    def __copy__(self):
+        cpy = self.__class__(self.wrapped)
+        cpy.__cache = self.__cache
+        cpy.__read_complete_state = self.__read_complete_state
+        cpy.__is_closed_state = self.__is_closed_state
+        return cpy
+
+    def get_data(self):
+        pos = self.__pos
+        try:
+            self.seek(0)
+            return self.read(-1)
+        finally:
+            self.__pos = pos
+
+    def read(self, size=-1):
+        pos = self.__pos
+        end = len(self.__cache.getvalue())
+        available = end - pos
+
+        # enough data already cached?
+        if size <= available and size != -1:
+            self.__cache.seek(pos)
+            self.__pos = pos+size
+            return self.__cache.read(size)
+
+        # no, so read sufficient data from wrapped file and cache it
+        self.__cache.seek(0, 2)
+        if size == -1:
+            self.__cache.write(self.wrapped.read())
+            self.read_complete = True
+        else:
+            to_read = size - available
+            assert to_read > 0
+            data = self.wrapped.read(to_read)
+            if not data:
+                self.read_complete = True
+            else:
+                self.__cache.write(data)
+        self.__cache.seek(pos)
+
+        data = self.__cache.read(size)
+        self.__pos = self.__cache.tell()
+        assert self.__pos == pos + len(data)
+        return data
+
+    def readline(self, size=-1):
+        if not self.__have_readline:
+            raise NotImplementedError("no readline method on wrapped object")
+
+        # line we're about to read might not be complete in the cache, so
+        # read another line first
+        pos = self.__pos
+        self.__cache.seek(0, 2)
+        data = self.wrapped.readline()
+        if not data:
+            self.read_complete = True
+        else:
+            self.__cache.write(data)
+        self.__cache.seek(pos)
+
+        data = self.__cache.readline()
+        if size != -1:
+            r = data[:size]
+            self.__pos = pos + len(r)
+        else:
+            r = data
+            self.__pos = pos+len(data)
+        return r
+
+    def readlines(self, sizehint=-1):
+        pos = self.__pos
+        self.__cache.seek(0, 2)
+        self.__cache.write(self.wrapped.read())
+        self.read_complete = True
+        self.__cache.seek(pos)
+        data = self.__cache.readlines(sizehint)
+        self.__pos = self.__cache.tell()
+        return data
+
+    def __iter__(self): return self
+    def next(self):
+        line = self.readline()
+        if line == "": raise StopIteration
+        return line
+
+    xreadlines = __iter__
+
+    def __repr__(self):
+        return ("<%s at %s whose wrapped object = %r>" %
+                (self.__class__.__name__, hex(abs(id(self))), self.wrapped))
+
+
+class response_seek_wrapper(seek_wrapper):
+
+    """
+    Supports copying response objects and setting response body data.
+
+    """
+
+    def __init__(self, wrapped):
+        seek_wrapper.__init__(self, wrapped)
+        self._headers = self.wrapped.info()
+
+    def __copy__(self):
+        cpy = seek_wrapper.__copy__(self)
+        # copy headers from delegate
+        cpy._headers = copy.copy(self.info())
+        return cpy
+
+    # Note that .info() and .geturl() (the only two urllib2 response methods
+    # that are not implemented by seek_wrapper) must be defined here explicitly,
+    # rather than relying on seek_wrapper's __getattr__ delegation, so that the
+    # nasty dynamically-created HTTPError classes in get_seek_wrapper_class()
+    # get the wrapped object's implementation, and not HTTPError's.
+
+    def info(self):
+        return self._headers
+
+    def geturl(self):
+        return self.wrapped.geturl()
+
+    def set_data(self, data):
+        self.seek(0)
+        self.read()
+        self.close()
+        cache = self._seek_wrapper__cache = StringIO()
+        cache.write(data)
+        self.seek(0)
+
+
+class eoffile:
+    # file-like object that always claims to be at end-of-file...
+    def read(self, size=-1): return ""
+    def readline(self, size=-1): return ""
+    def __iter__(self): return self
+    def next(self): return ""
+    def close(self): pass
+
+class eofresponse(eoffile):
+    def __init__(self, url, headers, code, msg):
+        self._url = url
+        self._headers = headers
+        self.code = code
+        self.msg = msg
+    def geturl(self): return self._url
+    def info(self): return self._headers
+
+
+class closeable_response:
+    """Avoids unnecessarily clobbering urllib.addinfourl methods on .close().
+
+    Only supports responses returned by mechanize.HTTPHandler.
+
+    After .close(), the following methods are supported:
+
+    .read()
+    .readline()
+    .info()
+    .geturl()
+    .__iter__()
+    .next()
+    .close()
+
+    and the following attributes are supported:
+
+    .code
+    .msg
+
+    Also supports pickling (but the stdlib currently does something to prevent
+    it: http://python.org/sf/1144636).
+
+    """
+    # presence of this attribute indicates the response is usable after .close()
+    closeable_response = None
+
+    def __init__(self, fp, headers, url, code, msg):
+        self._set_fp(fp)
+        self._headers = headers
+        self._url = url
+        self.code = code
+        self.msg = msg
+
+    def _set_fp(self, fp):
+        self.fp = fp
+        self.read = self.fp.read
+        self.readline = self.fp.readline
+        if hasattr(self.fp, "readlines"): self.readlines = self.fp.readlines
+        if hasattr(self.fp, "fileno"):
+            self.fileno = self.fp.fileno
+        else:
+            self.fileno = lambda: None
+        self.__iter__ = self.fp.__iter__
+        self.next = self.fp.next
+
+    def __repr__(self):
+        return '<%s at %s whose fp = %r>' % (
+            self.__class__.__name__, hex(abs(id(self))), self.fp)
+
+    def info(self):
+        return self._headers
+
+    def geturl(self):
+        return self._url
+
+    def close(self):
+        wrapped = self.fp
+        wrapped.close()
+        new_wrapped = eofresponse(
+            self._url, self._headers, self.code, self.msg)
+        self._set_fp(new_wrapped)
+
+    def __getstate__(self):
+        # There are three obvious options here:
+        # 1. truncate
+        # 2. read to end
+        # 3. close socket, pickle state including read position, then open
+        #    again on unpickle and use Range header
+        # XXXX um, 4. refuse to pickle unless .close()d.  This is better,
+        #  actually ("errors should never pass silently").  Pickling doesn't
+        #  work anyway ATM, because of http://python.org/sf/1144636 so fix
+        #  this later
+
+        # 2 breaks pickle protocol, because one expects the original object
+        # to be left unscathed by pickling.  3 is too complicated and
+        # surprising (and too much work ;-) to happen in a sane __getstate__.
+        # So we do 1.
+
+        state = self.__dict__.copy()
+        new_wrapped = eofresponse(
+            self._url, self._headers, self.code, self.msg)
+        state["wrapped"] = new_wrapped
+        return state
+
+def test_response(data='test data', headers=[],
+                  url="http://example.com/", code=200, msg="OK"):
+    return make_response(data, headers, url, code, msg)
+
+def test_html_response(data='test data', headers=[],
+                       url="http://example.com/", code=200, msg="OK"):
+    # copy, rather than mutate, the shared default argument
+    headers = headers + [("Content-type", "text/html")]
+    return make_response(data, headers, url, code, msg)
+
+def make_response(data, headers, url, code, msg):
+    """Convenient factory for objects implementing response interface.
+
+    data: string containing response body data
+    headers: sequence of (name, value) pairs
+    url: URL of response
+    code: integer response code (e.g. 200)
+    msg: string response code message (e.g. "OK")
+
+    """
+    mime_headers = make_headers(headers)
+    r = closeable_response(StringIO(data), mime_headers, url, code, msg)
+    return response_seek_wrapper(r)
+
+
+def make_headers(headers):
+    """
+    headers: sequence of (name, value) pairs
+    """
+    hdr_text = []
+    for name_value in headers:
+        hdr_text.append("%s: %s" % name_value)
+    return mimetools.Message(StringIO("\n".join(hdr_text)))
+
+
+# Rest of this module is especially horrible, but needed, at least until we
+# fork urllib2.  Even then, we may want to preserve urllib2 compatibility.
+
+def get_seek_wrapper_class(response):
+    # in order to wrap response objects that are also exceptions, we must
+    # dynamically subclass the exception :-(((
+    if (isinstance(response, urllib2.HTTPError) and
+        not hasattr(response, "seek")):
+        if response.__class__.__module__ == "__builtin__":
+            exc_class_name = response.__class__.__name__
+        else:
+            exc_class_name = "%s.%s" % (
+                response.__class__.__module__, response.__class__.__name__)
+
+        class httperror_seek_wrapper(response_seek_wrapper, response.__class__):
+            # this only derives from HTTPError in order to be a subclass --
+            # the HTTPError behaviour comes from delegation
+
+            _exc_class_name = exc_class_name
+
+            def __init__(self, wrapped):
+                response_seek_wrapper.__init__(self, wrapped)
+                # be compatible with undocumented HTTPError attributes :-(
+                self.hdrs = wrapped.info()
+                self.filename = wrapped.geturl()
+
+            def __repr__(self):
+                return (
+                    "<%s (%s instance) at %s "
+                    "whose wrapped object = %r>" % (
+                    self.__class__.__name__, self._exc_class_name,
+                    hex(abs(id(self))), self.wrapped)
+                    )
+        wrapper_class = httperror_seek_wrapper
+    else:
+        wrapper_class = response_seek_wrapper
+    return wrapper_class
+
+def seek_wrapped_response(response):
+    """Return a copy of response that supports seekable response interface.
+
+    Accepts responses from both mechanize and urllib2 handlers.
+
+    Copes with both ordinary response instances and HTTPError instances (which
+    can't be simply wrapped due to the requirement of preserving the exception
+    base class).
+    """
+    if not hasattr(response, "seek"):
+        wrapper_class = get_seek_wrapper_class(response)
+        response = wrapper_class(response)
+    assert hasattr(response, "get_data")
+    return response
+
+def upgrade_response(response):
+    """Return a copy of response that supports Browser response interface.
+
+    Browser response interface is that of "seekable responses"
+    (response_seek_wrapper), plus the requirement that responses must be
+    useable after .close() (closeable_response).
+
+    Accepts responses from both mechanize and urllib2 handlers.
+
+    Copes with both ordinary response instances and HTTPError instances (which
+    can't be simply wrapped due to the requirement of preserving the exception
+    base class).
+    """
+    wrapper_class = get_seek_wrapper_class(response)
+    if hasattr(response, "closeable_response"):
+        if not hasattr(response, "seek"):
+            response = wrapper_class(response)
+        assert hasattr(response, "get_data")
+        return copy.copy(response)
+
+    # a urllib2 handler constructed the response, i.e. the response is an
+    # urllib.addinfourl or a urllib2.HTTPError, instead of a
+    # _response.closeable_response as returned by e.g. mechanize.HTTPHandler
+    try:
+        code = response.code
+    except AttributeError:
+        code = None
+    try:
+        msg = response.msg
+    except AttributeError:
+        msg = None
+
+    # may have already-.read() data from .seek() cache
+    data = None
+    get_data = getattr(response, "get_data", None)
+    if get_data:
+        data = get_data()
+
+    response = closeable_response(
+        response.fp, response.info(), response.geturl(), code, msg)
+    response = wrapper_class(response)
+    if data:
+        response.set_data(data)
+    return response
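
For illustration only (this paragraph and the snippet are not part of the
imported file): a minimal sketch of the response interface defined above,
using the test_response() factory from this module; the body text and header
value are made up, and it assumes the file is importable as mechanize._response:

    from mechanize._response import test_response

    r = test_response("hello, world", headers=[("Content-Type", "text/plain")])
    assert r.code == 200 and r.msg == "OK"
    assert r.geturl() == "http://example.com/"
    assert r.info()["Content-Type"] == "text/plain"

    assert r.read() == "hello, world"       # reads and caches the whole body
    r.seek(0)                               # rewinding works thanks to the cache
    assert r.read(5) == "hello"

    assert r.get_data() == "hello, world"   # leaves the read position alone
    r.set_data("replaced body")             # swap the cached body wholesale
    assert r.read() == "replaced body"

    r.close()
    r.seek(0)                               # still usable after .close()
    assert r.read() == "replaced body"
    assert r.geturl() == "http://example.com/"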

Added: mechanize/tags/0.1.10/mechanize/_rfc3986.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_rfc3986.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_rfc3986.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,241 @@
+"""RFC 3986 URI parsing and relative reference resolution / absolutization.
+
+(aka splitting and joining)
+
+Copyright 2006 John J. Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it under
+the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+# XXX Wow, this is ugly.  Overly-direct translation of the RFC ATM.
+
+import re, urllib
+
+## def chr_range(a, b):
+##     return "".join(map(chr, range(ord(a), ord(b)+1)))
+
+## UNRESERVED_URI_CHARS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+##                         "abcdefghijklmnopqrstuvwxyz"
+##                         "0123456789"
+##                         "-_.~")
+## RESERVED_URI_CHARS = "!*'();:@&=+$,/?#[]"
+## URI_CHARS = RESERVED_URI_CHARS+UNRESERVED_URI_CHARS+'%'
+# this re matches any character that's not in URI_CHARS
+BAD_URI_CHARS_RE = re.compile("[^A-Za-z0-9\-_.~!*'();:@&=+$,/?%#[\]]")
+
+
+def clean_url(url, encoding):
+    # percent-encode illegal URI characters
+    # Trying to come up with test cases for this gave me a headache, revisit
+    # when we switch to unicode.
+    # Somebody else's comments (lost the attribution):
+##     - IE will return you the url in the encoding you send it
+##     - Mozilla/Firefox will send you latin-1 if there's no non latin-1
+##     characters in your link. It will send you utf-8 however if there are...
+    if type(url) == type(""):
+        url = url.decode(encoding, "replace")
+    url = url.strip()
+    # for second param to urllib.quote(), we want URI_CHARS, minus the
+    # 'always_safe' characters that urllib.quote() never percent-encodes
+    return urllib.quote(url.encode(encoding), "!*'();:@&=+$,/?%#[]~")
+
+def is_clean_uri(uri):
+    """
+    >>> is_clean_uri("ABC!")
+    True
+    >>> is_clean_uri(u"ABC!")
+    True
+    >>> is_clean_uri("ABC|")
+    False
+    >>> is_clean_uri(u"ABC|")
+    False
+    >>> is_clean_uri("http://example.com/0")
+    True
+    >>> is_clean_uri(u"http://example.com/0")
+    True
+    """
+    # note that module re treats bytestrings as though they were decoded as
+    # latin-1, so this function accepts both unicode strings and bytestrings
+    return not bool(BAD_URI_CHARS_RE.search(uri))
+
+
+SPLIT_MATCH = re.compile(
+    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?").match
+def urlsplit(absolute_uri):
+    """Return scheme, authority, path, query, fragment."""
+    match = SPLIT_MATCH(absolute_uri)
+    if match:
+        g = match.groups()
+        return g[1], g[3], g[4], g[6], g[8]
+
+def urlunsplit(parts):
+    scheme, authority, path, query, fragment = parts
+    r = []
+    append = r.append
+    if scheme is not None:
+        append(scheme)
+        append(":")
+    if authority is not None:
+        append("//")
+        append(authority)
+    append(path)
+    if query is not None:
+        append("?")
+        append(query)
+    if fragment is not None:
+        append("#")
+        append(fragment)
+    return "".join(r)
+
+def urljoin(base_uri, uri_reference):
+    return urlunsplit(urljoin_parts(urlsplit(base_uri),
+                                    urlsplit(uri_reference)))
+
+# oops, this doesn't do the same thing as the literal translation
+# from the RFC below
+## import posixpath
+## def urljoin_parts(base_parts, reference_parts):
+##     scheme, authority, path, query, fragment = base_parts
+##     rscheme, rauthority, rpath, rquery, rfragment = reference_parts
+
+##     # compute target URI path
+##     if rpath == "":
+##         tpath = path
+##     else:
+##         tpath = rpath
+##         if not tpath.startswith("/"):
+##             tpath = merge(authority, path, tpath)
+##         tpath = posixpath.normpath(tpath)
+
+##     if rscheme is not None:
+##         return (rscheme, rauthority, tpath, rquery, rfragment)
+##     elif rauthority is not None:
+##         return (scheme, rauthority, tpath, rquery, rfragment)
+##     elif rpath == "":
+##         if rquery is not None:
+##             tquery = rquery
+##         else:
+##             tquery = query
+##         return (scheme, authority, tpath, tquery, rfragment)
+##     else:
+##         return (scheme, authority, tpath, rquery, rfragment)
+
+def urljoin_parts(base_parts, reference_parts):
+    scheme, authority, path, query, fragment = base_parts
+    rscheme, rauthority, rpath, rquery, rfragment = reference_parts
+
+    if rscheme == scheme:
+        rscheme = None
+
+    if rscheme is not None:
+        tscheme, tauthority, tpath, tquery = (
+            rscheme, rauthority, remove_dot_segments(rpath), rquery)
+    else:
+        if rauthority is not None:
+            tauthority, tpath, tquery = (
+                rauthority, remove_dot_segments(rpath), rquery)
+        else:
+            if rpath == "":
+                tpath = path
+                if rquery is not None:
+                    tquery = rquery
+                else:
+                    tquery = query
+            else:
+                if rpath.startswith("/"):
+                    tpath = remove_dot_segments(rpath)
+                else:
+                    tpath = merge(authority, path, rpath)
+                    tpath = remove_dot_segments(tpath)
+                tquery = rquery
+            tauthority = authority
+        tscheme = scheme
+    tfragment = rfragment
+    return (tscheme, tauthority, tpath, tquery, tfragment)
+
+# um, something *vaguely* like this is what I want, but I have to generate
+# lots of test cases first, if only to understand what it is that
+# remove_dot_segments really does...
+## def remove_dot_segments(path):
+##     if path == '':
+##         return ''
+##     comps = path.split('/')
+##     new_comps = []
+##     for comp in comps:
+##         if comp in ['.', '']:
+##             if not new_comps or new_comps[-1]:
+##                 new_comps.append('')
+##             continue
+##         if comp != '..':
+##             new_comps.append(comp)
+##         elif new_comps:
+##             new_comps.pop()
+##     return '/'.join(new_comps)
+
+
+def remove_dot_segments(path):
+    r = []
+    while path:
+        # A
+        if path.startswith("../"):
+            path = path[3:]
+            continue
+        if path.startswith("./"):
+            path = path[2:]
+            continue
+        # B
+        if path.startswith("/./"):
+            path = path[2:]
+            continue
+        if path == "/.":
+            path = "/"
+            continue
+        # C
+        if path.startswith("/../"):
+            path = path[3:]
+            if r:
+                r.pop()
+            continue
+        if path == "/..":
+            path = "/"
+            if r:
+                r.pop()
+            continue
+        # D
+        if path == ".":
+            path = path[1:]
+            continue
+        if path == "..":
+            path = path[2:]
+            continue
+        # E
+        start = 0
+        if path.startswith("/"):
+            start = 1
+        ii = path.find("/", start)
+        if ii < 0:
+            ii = None
+        r.append(path[:ii])
+        if ii is None:
+            break
+        path = path[ii:]
+    return "".join(r)
+
+def merge(base_authority, base_path, ref_path):
+    # XXXX Oddly, the sample Perl implementation of this by Roy Fielding
+    # doesn't even take base_authority as a parameter, despite the wording in
+    # the RFC suggesting otherwise.  Perhaps I'm missing some obvious identity.
+    #if base_authority is not None and base_path == "":
+    if base_path == "":
+        return "/" + ref_path
+    ii = base_path.rfind("/")
+    if ii >= 0:
+        return base_path[:ii+1] + ref_path
+    return ref_path
+
+if __name__ == "__main__":
+    import doctest
+    doctest.testmod()
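
For illustration only (not part of the imported file): the reference
resolution examples from RFC 3986 section 5.4, run against the functions
above; this assumes the file is importable as mechanize._rfc3986:

    from mechanize._rfc3986 import urljoin, remove_dot_segments

    base = "http://a/b/c/d;p?q"
    assert urljoin(base, "g") == "http://a/b/c/g"
    assert urljoin(base, "../g") == "http://a/b/g"
    assert urljoin(base, "//g") == "http://g"
    assert urljoin(base, "?y") == "http://a/b/c/d;p?y"
    assert urljoin(base, "#s") == "http://a/b/c/d;p?q#s"

    # remove_dot_segments() is the RFC's path normalisation step on its own
    assert remove_dot_segments("/a/b/c/./../../g") == "/a/g"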

Added: mechanize/tags/0.1.10/mechanize/_seek.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_seek.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_seek.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,16 @@
+from urllib2 import BaseHandler
+from _util import deprecation
+from _response import response_seek_wrapper
+
+
+class SeekableProcessor(BaseHandler):
+    """Deprecated: Make responses seekable."""
+
+    def __init__(self):
+        deprecation(
+            "See http://wwwsearch.sourceforge.net/mechanize/doc.html#seekable")
+
+    def any_response(self, request, response):
+        if not hasattr(response, "seek"):
+            return response_seek_wrapper(response)
+        return response

Added: mechanize/tags/0.1.10/mechanize/_sockettimeout.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_sockettimeout.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_sockettimeout.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,6 @@
+import socket
+
+try:
+    _GLOBAL_DEFAULT_TIMEOUT = socket._GLOBAL_DEFAULT_TIMEOUT
+except AttributeError:
+    _GLOBAL_DEFAULT_TIMEOUT = object()

Added: mechanize/tags/0.1.10/mechanize/_testcase.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_testcase.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_testcase.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,32 @@
+import shutil
+import tempfile
+import unittest
+
+
+class TestCase(unittest.TestCase):
+
+    def setUp(self):
+        super(TestCase, self).setUp()
+        self._on_teardown = []
+
+    def make_temp_dir(self):
+        temp_dir = tempfile.mkdtemp(prefix="tmp-%s-" % self.__class__.__name__)
+        def tear_down():
+            shutil.rmtree(temp_dir)
+        self._on_teardown.append(tear_down)
+        return temp_dir
+
+    def monkey_patch(self, obj, name, value):
+        orig_value = getattr(obj, name)
+        setattr(obj, name, value)
+        def reverse_patch():
+            setattr(obj, name, orig_value)
+        self._on_teardown.append(reverse_patch)
+
+    def assert_contains(self, container, containee):
+        self.assertTrue(containee in container, "%r not in %r" %
+                        (containee, container))
+
+    def tearDown(self):
+        for func in reversed(self._on_teardown):
+            func()
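
For illustration only (not part of the imported file): a sketch of how the
TestCase helpers above are intended to be used; ExampleTest is a hypothetical
test class, and the import assumes the file is mechanize/_testcase.py.
Temporary directories and monkey-patches are undone in tearDown(), in reverse
order of registration:

    import os
    import unittest

    from mechanize._testcase import TestCase

    class ExampleTest(TestCase):

        def test_helpers(self):
            temp_dir = self.make_temp_dir()      # removed again on tearDown
            self.assertTrue(os.path.isdir(temp_dir))
            self.monkey_patch(os, "sep", "/")    # restored on tearDown
            self.assert_contains("abc", "b")

    if __name__ == "__main__":
        unittest.main()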

Added: mechanize/tags/0.1.10/mechanize/_upgrade.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_upgrade.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_upgrade.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,40 @@
+from urllib2 import BaseHandler
+
+from _request import Request
+from _response import upgrade_response
+from _util import deprecation
+
+
+class HTTPRequestUpgradeProcessor(BaseHandler):
+    # upgrade urllib2.Request to this module's Request
+    # yuck!
+    handler_order = 0  # before anything else
+
+    def http_request(self, request):
+        if not hasattr(request, "add_unredirected_header"):
+            newrequest = Request(request.get_full_url(), request.data,
+                                 request.headers)
+            try: newrequest.origin_req_host = request.origin_req_host
+            except AttributeError: pass
+            try: newrequest.unverifiable = request.unverifiable
+            except AttributeError: pass
+            try: newrequest.visit = request.visit
+            except AttributeError: pass
+            request = newrequest
+        return request
+
+    https_request = http_request
+
+
+class ResponseUpgradeProcessor(BaseHandler):
+    # upgrade responses to be .close()able without becoming unusable
+    handler_order = 0  # before anything else
+
+    def __init__(self):
+        deprecation(
+            "See http://wwwsearch.sourceforge.net/mechanize/doc.html#seekable")
+
+    def any_response(self, request, response):
+        if not hasattr(response, 'closeable_response'):
+            response = upgrade_response(response)
+        return response

Added: mechanize/tags/0.1.10/mechanize/_urllib2.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_urllib2.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_urllib2.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,55 @@
+# urllib2 work-alike interface
+# ...from urllib2...
+from urllib2 import \
+     URLError, \
+     HTTPError, \
+     BaseHandler, \
+     UnknownHandler, \
+     FTPHandler, \
+     CacheFTPHandler
+# ...and from mechanize
+from _auth import \
+     HTTPPasswordMgr, \
+     HTTPPasswordMgrWithDefaultRealm, \
+     AbstractBasicAuthHandler, \
+     AbstractDigestAuthHandler, \
+     HTTPProxyPasswordMgr, \
+     ProxyHandler, \
+     ProxyBasicAuthHandler, \
+     ProxyDigestAuthHandler, \
+     HTTPBasicAuthHandler, \
+     HTTPDigestAuthHandler, \
+     HTTPSClientCertMgr
+from _debug import \
+     HTTPResponseDebugProcessor, \
+     HTTPRedirectDebugProcessor
+from _file import \
+     FileHandler
+# crap ATM
+## from _gzip import \
+##      HTTPGzipProcessor
+from _http import \
+     HTTPHandler, \
+     HTTPDefaultErrorHandler, \
+     HTTPRedirectHandler, \
+     HTTPEquivProcessor, \
+     HTTPCookieProcessor, \
+     HTTPRefererProcessor, \
+     HTTPRefreshProcessor, \
+     HTTPErrorProcessor, \
+     HTTPRobotRulesProcessor, \
+     RobotExclusionError
+import httplib
+if hasattr(httplib, 'HTTPS'):
+    from _http import HTTPSHandler
+del httplib
+from _opener import OpenerDirector, \
+     SeekableResponseOpener, \
+     build_opener, install_opener, urlopen
+from _request import \
+     Request
+from _seek import \
+     SeekableProcessor
+from _upgrade import \
+     HTTPRequestUpgradeProcessor, \
+     ResponseUpgradeProcessor

Added: mechanize/tags/0.1.10/mechanize/_useragent.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_useragent.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_useragent.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,352 @@
+"""Convenient HTTP UserAgent class.
+
+This is a subclass of urllib2.OpenerDirector.
+
+
+Copyright 2003-2006 John J. Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it under
+the terms of the BSD or ZPL 2.1 licenses (see the file COPYING.txt
+included with the distribution).
+
+"""
+
+import warnings
+
+import _auth
+import _gzip
+import _opener
+import _response
+import _sockettimeout
+import _urllib2
+
+
+class UserAgentBase(_opener.OpenerDirector):
+    """Convenient user-agent class.
+
+    Do not use .add_handler() to add a handler for something already dealt with
+    by this code.
+
+    The only reason at present for the distinction between UserAgent and
+    UserAgentBase is so that classes that depend on .seek()able responses
+    (e.g. mechanize.Browser) can inherit from UserAgentBase.  The subclass
+    UserAgent exposes a .set_seekable_responses() method that allows switching
+    off the adding of a .seek() method to responses.
+
+    Public attributes:
+
+    addheaders: list of (name, value) pairs specifying headers to send with
+     every request, unless they are overridden in the Request instance.
+
+     >>> ua = UserAgentBase()
+     >>> ua.addheaders = [
+     ...  ("User-agent", "Mozilla/5.0 (compatible)"),
+     ...  ("From", "responsible.person at example.com")]
+
+    """
+
+    handler_classes = {
+        # scheme handlers
+        "http": _urllib2.HTTPHandler,
+        # CacheFTPHandler is buggy, at least in 2.3, so we don't use it
+        "ftp": _urllib2.FTPHandler,
+        "file": _urllib2.FileHandler,
+
+        # other handlers
+        "_unknown": _urllib2.UnknownHandler,
+        # HTTP{S,}Handler depend on HTTPErrorProcessor too
+        "_http_error": _urllib2.HTTPErrorProcessor,
+        "_http_request_upgrade": _urllib2.HTTPRequestUpgradeProcessor,
+        "_http_default_error": _urllib2.HTTPDefaultErrorHandler,
+
+        # feature handlers
+        "_basicauth": _urllib2.HTTPBasicAuthHandler,
+        "_digestauth": _urllib2.HTTPDigestAuthHandler,
+        "_redirect": _urllib2.HTTPRedirectHandler,
+        "_cookies": _urllib2.HTTPCookieProcessor,
+        "_refresh": _urllib2.HTTPRefreshProcessor,
+        "_equiv": _urllib2.HTTPEquivProcessor,
+        "_proxy": _urllib2.ProxyHandler,
+        "_proxy_basicauth": _urllib2.ProxyBasicAuthHandler,
+        "_proxy_digestauth": _urllib2.ProxyDigestAuthHandler,
+        "_robots": _urllib2.HTTPRobotRulesProcessor,
+        "_gzip": _gzip.HTTPGzipProcessor,  # experimental!
+
+        # debug handlers
+        "_debug_redirect": _urllib2.HTTPRedirectDebugProcessor,
+        "_debug_response_body": _urllib2.HTTPResponseDebugProcessor,
+        }
+
+    default_schemes = ["http", "ftp", "file"]
+    default_others = ["_unknown", "_http_error", "_http_request_upgrade",
+                      "_http_default_error",
+                      ]
+    default_features = ["_redirect", "_cookies",
+                        "_refresh", "_equiv",
+                        "_basicauth", "_digestauth",
+                        "_proxy", "_proxy_basicauth", "_proxy_digestauth",
+                        "_robots",
+                        ]
+    if hasattr(_urllib2, 'HTTPSHandler'):
+        handler_classes["https"] = _urllib2.HTTPSHandler
+        default_schemes.append("https")
+
+    def __init__(self):
+        _opener.OpenerDirector.__init__(self)
+
+        ua_handlers = self._ua_handlers = {}
+        for scheme in (self.default_schemes+
+                       self.default_others+
+                       self.default_features):
+            klass = self.handler_classes[scheme]
+            ua_handlers[scheme] = klass()
+        for handler in ua_handlers.itervalues():
+            self.add_handler(handler)
+
+        # Yuck.
+        # Ensure correct default constructor args were passed to
+        # HTTPRefreshProcessor and HTTPEquivProcessor.
+        if "_refresh" in ua_handlers:
+            self.set_handle_refresh(True)
+        if "_equiv" in ua_handlers:
+            self.set_handle_equiv(True)
+        # Ensure default password managers are installed.
+        pm = ppm = None
+        if "_basicauth" in ua_handlers or "_digestauth" in ua_handlers:
+            pm = _urllib2.HTTPPasswordMgrWithDefaultRealm()
+        if ("_proxy_basicauth" in ua_handlers or
+            "_proxy_digestauth" in ua_handlers):
+            ppm = _auth.HTTPProxyPasswordMgr()
+        self.set_password_manager(pm)
+        self.set_proxy_password_manager(ppm)
+        # set default certificate manager
+        if "https" in ua_handlers:
+            cm = _urllib2.HTTPSClientCertMgr()
+            self.set_client_cert_manager(cm)
+
+    def close(self):
+        _opener.OpenerDirector.close(self)
+        self._ua_handlers = None
+
+    # XXX
+##     def set_timeout(self, timeout):
+##         self._timeout = timeout
+##     def set_http_connection_cache(self, conn_cache):
+##         self._http_conn_cache = conn_cache
+##     def set_ftp_connection_cache(self, conn_cache):
+##         # XXX ATM, FTP has cache as part of handler; should it be separate?
+##         self._ftp_conn_cache = conn_cache
+
+    def set_handled_schemes(self, schemes):
+        """Set sequence of URL scheme (protocol) strings.
+
+        For example: ua.set_handled_schemes(["http", "ftp"])
+
+        If this fails (with ValueError) because you've passed an unknown
+        scheme, the set of handled schemes will not be changed.
+
+        """
+        want = {}
+        for scheme in schemes:
+            if scheme.startswith("_"):
+                raise ValueError("not a scheme '%s'" % scheme)
+            if scheme not in self.handler_classes:
+                raise ValueError("unknown scheme '%s'" % scheme)
+            want[scheme] = None
+
+        # get rid of scheme handlers we don't want
+        for scheme, oldhandler in self._ua_handlers.items():
+            if scheme.startswith("_"): continue  # not a scheme handler
+            if scheme not in want:
+                self._replace_handler(scheme, None)
+            else:
+                del want[scheme]  # already got it
+        # add the scheme handlers that are missing
+        for scheme in want.keys():
+            self._set_handler(scheme, True)
+
+    def set_cookiejar(self, cookiejar):
+        """Set a mechanize.CookieJar, or None."""
+        self._set_handler("_cookies", obj=cookiejar)
+
+    # XXX could use Greg Stein's httpx for some of this instead?
+    # or httplib2??
+    def set_proxies(self, proxies):
+        """Set a dictionary mapping URL scheme to proxy specification, or None.
+
+        e.g. {"http": "joe:password at myproxy.example.com:3128",
+              "ftp": "proxy.example.com"}
+
+        """
+        self._set_handler("_proxy", obj=proxies)
+
+    def add_password(self, url, user, password, realm=None):
+        self._password_manager.add_password(realm, url, user, password)
+    def add_proxy_password(self, user, password, hostport=None, realm=None):
+        self._proxy_password_manager.add_password(
+            realm, hostport, user, password)
+
+    def add_client_certificate(self, url, key_file, cert_file):
+        """Add an SSL client certificate, for HTTPS client auth.
+
+        key_file and cert_file must be filenames of the key and certificate
+        files, in PEM format.  You can use e.g. OpenSSL to convert a p12 (PKCS
+        12) file to PEM format:
+
+        openssl pkcs12 -clcerts -nokeys -in cert.p12 -out cert.pem
+        openssl pkcs12 -nocerts -in cert.p12 -out key.pem
+
+
+        Note that client certificate password input is very inflexible ATM.  At
+        the moment this seems to be console only, which is presumably the
+        default behaviour of libopenssl.  In future mechanize may support
+        third-party libraries that (I assume) allow more options here.
+
+        """
+        self._client_cert_manager.add_key_cert(url, key_file, cert_file)
+
+    # the following are rarely useful -- use add_password / add_proxy_password
+    # instead
+    def set_password_manager(self, password_manager):
+        """Set a mechanize.HTTPPasswordMgrWithDefaultRealm, or None."""
+        self._password_manager = password_manager
+        self._set_handler("_basicauth", obj=password_manager)
+        self._set_handler("_digestauth", obj=password_manager)
+    def set_proxy_password_manager(self, password_manager):
+        """Set a mechanize.HTTPProxyPasswordMgr, or None."""
+        self._proxy_password_manager = password_manager
+        self._set_handler("_proxy_basicauth", obj=password_manager)
+        self._set_handler("_proxy_digestauth", obj=password_manager)
+    def set_client_cert_manager(self, cert_manager):
+        """Set a mechanize.HTTPClientCertMgr, or None."""
+        self._client_cert_manager = cert_manager
+        handler = self._ua_handlers["https"]
+        handler.client_cert_manager = cert_manager
+
+    # these methods all take a boolean parameter
+    def set_handle_robots(self, handle):
+        """Set whether to observe rules from robots.txt."""
+        self._set_handler("_robots", handle)
+    def set_handle_redirect(self, handle):
+        """Set whether to handle HTTP 30x redirections."""
+        self._set_handler("_redirect", handle)
+    def set_handle_refresh(self, handle, max_time=None, honor_time=True):
+        """Set whether to handle HTTP Refresh headers."""
+        self._set_handler("_refresh", handle, constructor_kwds=
+                          {"max_time": max_time, "honor_time": honor_time})
+    def set_handle_equiv(self, handle, head_parser_class=None):
+        """Set whether to treat HTML http-equiv headers like HTTP headers.
+
+        Response objects may be .seek()able if this is set (currently returned
+        responses are, raised HTTPError exception responses are not).
+
+        """
+        if head_parser_class is not None:
+            constructor_kwds = {"head_parser_class": head_parser_class}
+        else:
+            constructor_kwds={}
+        self._set_handler("_equiv", handle, constructor_kwds=constructor_kwds)
+    def set_handle_gzip(self, handle):
+        """Handle gzip transfer encoding.
+
+        """
+        if handle:
+            warnings.warn(
+                "gzip transfer encoding is experimental!", stacklevel=2)
+        self._set_handler("_gzip", handle)
+    def set_debug_redirects(self, handle):
+        """Log information about HTTP redirects (including refreshes).
+
+        Logging is performed using module logging.  The logger name is
+        "mechanize.http_redirects".  To actually print some debug output,
+        eg:
+
+        import sys, logging
+        logger = logging.getLogger("mechanize.http_redirects")
+        logger.addHandler(logging.StreamHandler(sys.stdout))
+        logger.setLevel(logging.INFO)
+
+        Other logger names relevant to this module:
+
+        "mechanize.http_responses"
+        "mechanize.cookies" (or "cookielib" if running Python 2.4)
+
+        To turn on everything:
+
+        import sys, logging
+        logger = logging.getLogger("mechanize")
+        logger.addHandler(logging.StreamHandler(sys.stdout))
+        logger.setLevel(logging.INFO)
+
+        """
+        self._set_handler("_debug_redirect", handle)
+    def set_debug_responses(self, handle):
+        """Log HTTP response bodies.
+
+        See docstring for .set_debug_redirects() for details of logging.
+
+        Response objects may be .seek()able if this is set (currently returned
+        responses are, raised HTTPError exception responses are not).
+
+        """
+        self._set_handler("_debug_response_body", handle)
+    def set_debug_http(self, handle):
+        """Print HTTP headers to sys.stdout."""
+        level = int(bool(handle))
+        for scheme in "http", "https":
+            h = self._ua_handlers.get(scheme)
+            if h is not None:
+                h.set_http_debuglevel(level)
+
+    def _set_handler(self, name, handle=None, obj=None,
+                     constructor_args=(), constructor_kwds={}):
+        if handle is None:
+            handle = obj is not None
+        if handle:
+            handler_class = self.handler_classes[name]
+            if obj is not None:
+                newhandler = handler_class(obj)
+            else:
+                newhandler = handler_class(
+                    *constructor_args, **constructor_kwds)
+        else:
+            newhandler = None
+        self._replace_handler(name, newhandler)
+
+    def _replace_handler(self, name, newhandler=None):
+        # first, if handler was previously added, remove it
+        if name is not None:
+            handler = self._ua_handlers.get(name)
+            if handler:
+                try:
+                    self.handlers.remove(handler)
+                except ValueError:
+                    pass
+        # then add the replacement, if any
+        if newhandler is not None:
+            self.add_handler(newhandler)
+            self._ua_handlers[name] = newhandler
+
+
+class UserAgent(UserAgentBase):
+
+    def __init__(self):
+        UserAgentBase.__init__(self)
+        self._seekable = False
+
+    def set_seekable_responses(self, handle):
+        """Make response objects .seek()able."""
+        self._seekable = bool(handle)
+
+    def open(self, fullurl, data=None,
+             timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+        if self._seekable:
+            def bound_open(fullurl, data=None,
+                           timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+                return UserAgentBase.open(self, fullurl, data, timeout)
+            response = _opener.wrapped_open(
+                bound_open, _response.seek_wrapped_response, fullurl, data,
+                timeout)
+        else:
+            response = UserAgentBase.open(self, fullurl, data, timeout)
+        return response
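
For illustration only (not part of the imported file): configuring a
UserAgent before opening a URL, using the calls documented above; the URL and
the credentials are made up:

    import mechanize

    ua = mechanize.UserAgent()
    ua.set_handled_schemes(["http", "ftp"])   # drop the file: handler
    ua.set_handle_robots(False)               # don't consult robots.txt
    ua.set_handle_refresh(True, max_time=10)  # follow Refresh, within reason
    ua.addheaders = [("User-agent", "Mozilla/5.0 (compatible)")]
    ua.add_password("http://example.com/protected/", "joe", "password")
    ua.set_seekable_responses(True)           # UserAgent-only switch

    response = ua.open("http://example.com/protected/")
    print response.geturl()
    print response.read()[:200]
    response.close()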

Added: mechanize/tags/0.1.10/mechanize/_util.py
===================================================================
--- mechanize/tags/0.1.10/mechanize/_util.py	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize/_util.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,291 @@
+"""Utility functions and date/time routines.
+
+ Copyright 2002-2006 John J Lee <jjl at pobox.com>
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+import re, time, warnings
+
+
+class ExperimentalWarning(UserWarning):
+    pass
+
+def experimental(message):
+    warnings.warn(message, ExperimentalWarning, stacklevel=3)
+def hide_experimental_warnings():
+    warnings.filterwarnings("ignore", category=ExperimentalWarning)
+def reset_experimental_warnings():
+    warnings.filterwarnings("default", category=ExperimentalWarning)
+
+def deprecation(message):
+    warnings.warn(message, DeprecationWarning, stacklevel=3)
+def hide_deprecations():
+    warnings.filterwarnings("ignore", category=DeprecationWarning)
+def reset_deprecations():
+    warnings.filterwarnings("default", category=DeprecationWarning)
+
+
+def isstringlike(x):
+    try: x+""
+    except: return False
+    else: return True
+
+## def caller():
+##     try:
+##         raise SyntaxError
+##     except:
+##         import sys
+##     return sys.exc_traceback.tb_frame.f_back.f_back.f_code.co_name
+
+
+from calendar import timegm
+
+# Date/time conversion routines for formats used by the HTTP protocol.
+
+EPOCH = 1970
+def my_timegm(tt):
+    year, month, mday, hour, min, sec = tt[:6]
+    if ((year >= EPOCH) and (1 <= month <= 12) and (1 <= mday <= 31) and
+        (0 <= hour <= 24) and (0 <= min <= 59) and (0 <= sec <= 61)):
+        return timegm(tt)
+    else:
+        return None
+
+days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
+months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
+          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
+months_lower = []
+for month in months: months_lower.append(month.lower())
+
+
+def time2isoz(t=None):
+    """Return a string representing time in seconds since epoch, t.
+
+    If the function is called without an argument, it will use the current
+    time.
+
+    The format of the returned string is like "YYYY-MM-DD hh:mm:ssZ",
+    representing Universal Time (UTC, aka GMT).  An example of this format is:
+
+    1994-11-24 08:49:37Z
+
+    """
+    if t is None: t = time.time()
+    year, mon, mday, hour, min, sec = time.gmtime(t)[:6]
+    return "%04d-%02d-%02d %02d:%02d:%02dZ" % (
+        year, mon, mday, hour, min, sec)
+
+def time2netscape(t=None):
+    """Return a string representing time in seconds since epoch, t.
+
+    If the function is called without an argument, it will use the current
+    time.
+
+    The format of the returned string is like this:
+
+    Wed, DD-Mon-YYYY HH:MM:SS GMT
+
+    """
+    if t is None: t = time.time()
+    year, mon, mday, hour, min, sec, wday = time.gmtime(t)[:7]
+    return "%s %02d-%s-%04d %02d:%02d:%02d GMT" % (
+        days[wday], mday, months[mon-1], year, hour, min, sec)
+
+
+UTC_ZONES = {"GMT": None, "UTC": None, "UT": None, "Z": None}
+
+timezone_re = re.compile(r"^([-+])?(\d\d?):?(\d\d)?$")
+def offset_from_tz_string(tz):
+    offset = None
+    if UTC_ZONES.has_key(tz):
+        offset = 0
+    else:
+        m = timezone_re.search(tz)
+        if m:
+            offset = 3600 * int(m.group(2))
+            if m.group(3):
+                offset = offset + 60 * int(m.group(3))
+            if m.group(1) == '-':
+                offset = -offset
+    return offset
+
+def _str2time(day, mon, yr, hr, min, sec, tz):
+    # translate month name to number
+    # month numbers start with 1 (January)
+    try:
+        mon = months_lower.index(mon.lower())+1
+    except ValueError:
+        # maybe it's already a number
+        try:
+            imon = int(mon)
+        except ValueError:
+            return None
+        if 1 <= imon <= 12:
+            mon = imon
+        else:
+            return None
+
+    # make sure clock elements are defined
+    if hr is None: hr = 0
+    if min is None: min = 0
+    if sec is None: sec = 0
+
+    yr = int(yr)
+    day = int(day)
+    hr = int(hr)
+    min = int(min)
+    sec = int(sec)
+
+    if yr < 1000:
+        # find "obvious" year
+        cur_yr = time.localtime(time.time())[0]
+        m = cur_yr % 100
+        tmp = yr
+        yr = yr + cur_yr - m
+        m = m - tmp
+        if abs(m) > 50:
+            if m > 0: yr = yr + 100
+            else: yr = yr - 100
+
+    # convert UTC time tuple to seconds since epoch (not timezone-adjusted)
+    t = my_timegm((yr, mon, day, hr, min, sec, tz))
+
+    if t is not None:
+        # adjust time using timezone string, to get absolute time since epoch
+        if tz is None:
+            tz = "UTC"
+        tz = tz.upper()
+        offset = offset_from_tz_string(tz)
+        if offset is None:
+            return None
+        t = t - offset
+
+    return t
+
+
+strict_re = re.compile(r"^[SMTWF][a-z][a-z], (\d\d) ([JFMASOND][a-z][a-z]) "
+                       r"(\d\d\d\d) (\d\d):(\d\d):(\d\d) GMT$")
+wkday_re = re.compile(
+    r"^(?:Sun|Mon|Tue|Wed|Thu|Fri|Sat)[a-z]*,?\s*", re.I)
+loose_http_re = re.compile(
+    r"""^
+    (\d\d?)            # day
+       (?:\s+|[-\/])
+    (\w+)              # month
+        (?:\s+|[-\/])
+    (\d+)              # year
+    (?:
+          (?:\s+|:)    # separator before clock
+       (\d\d?):(\d\d)  # hour:min
+       (?::(\d\d))?    # optional seconds
+    )?                 # optional clock
+       \s*
+    ([-+]?\d{2,4}|(?![APap][Mm]\b)[A-Za-z]+)? # timezone
+       \s*
+    (?:\(\w+\))?       # ASCII representation of timezone in parens.
+       \s*$""", re.X)
+def http2time(text):
+    """Returns time in seconds since epoch of time represented by a string.
+
+    Return value is an integer.
+
+    None is returned if the format of str is unrecognized, the time is outside
+    the representable range, or the timezone string is not recognized.  If the
+    string contains no timezone, UTC is assumed.
+
+    The timezone in the string may be numerical (like "-0800" or "+0100") or a
+    string timezone (like "UTC", "GMT", "BST" or "EST").  Currently, only the
+    timezone strings equivalent to UTC (zero offset) are known to the function.
+
+    The function loosely parses the following formats:
+
+    Wed, 09 Feb 1994 22:23:32 GMT       -- HTTP format
+    Tuesday, 08-Feb-94 14:15:29 GMT     -- old rfc850 HTTP format
+    Tuesday, 08-Feb-1994 14:15:29 GMT   -- broken rfc850 HTTP format
+    09 Feb 1994 22:23:32 GMT            -- HTTP format (no weekday)
+    08-Feb-94 14:15:29 GMT              -- rfc850 format (no weekday)
+    08-Feb-1994 14:15:29 GMT            -- broken rfc850 format (no weekday)
+
+    The parser ignores leading and trailing whitespace.  The time may be
+    absent.
+
+    If the year is given with only 2 digits, the function will select the
+    century that makes the year closest to the current date.
+
+    """
+    # fast exit for strictly conforming string
+    m = strict_re.search(text)
+    if m:
+        g = m.groups()
+        mon = months_lower.index(g[1].lower()) + 1
+        tt = (int(g[2]), mon, int(g[0]),
+              int(g[3]), int(g[4]), float(g[5]))
+        return my_timegm(tt)
+
+    # No, we need some messy parsing...
+
+    # clean up
+    text = text.lstrip()
+    text = wkday_re.sub("", text, 1)  # Useless weekday
+
+    # tz is time zone specifier string
+    day, mon, yr, hr, min, sec, tz = [None]*7
+
+    # loose regexp parse
+    m = loose_http_re.search(text)
+    if m is not None:
+        day, mon, yr, hr, min, sec, tz = m.groups()
+    else:
+        return None  # bad format
+
+    return _str2time(day, mon, yr, hr, min, sec, tz)
+
+
+iso_re = re.compile(
+    """^
+    (\d{4})              # year
+       [-\/]?
+    (\d\d?)              # numerical month
+       [-\/]?
+    (\d\d?)              # day
+   (?:
+         (?:\s+|[-:Tt])  # separator before clock
+      (\d\d?):?(\d\d)    # hour:min
+      (?::?(\d\d(?:\.\d*)?))?  # optional seconds (and fractional)
+   )?                    # optional clock
+      \s*
+   ([-+]?\d\d?:?(:?\d\d)?
+    |Z|z)?               # timezone  (Z is "zero meridian", i.e. GMT)
+      \s*$""", re.X)
+def iso2time(text):
+    """
+    As for http2time, but parses the ISO 8601 formats:
+
+    1994-02-03 14:15:29 -0100    -- ISO 8601 format
+    1994-02-03 14:15:29          -- zone is optional
+    1994-02-03                   -- only date
+    1994-02-03T14:15:29          -- Use T as separator
+    19940203T141529Z             -- ISO 8601 compact format
+    19940203                     -- only date
+
+    """
+    # clean up
+    text = text.lstrip()
+
+    # tz is time zone specifier string
+    day, mon, yr, hr, min, sec, tz = [None]*7
+
+    # loose regexp parse
+    m = iso_re.search(text)
+    if m is not None:
+        # XXX there's an extra bit of the timezone I'm ignoring here: is
+        #   this the right thing to do?
+        yr, mon, day, hr, min, sec, tz, _ = m.groups()
+    else:
+        return None  # bad format
+
+    return _str2time(day, mon, yr, hr, min, sec, tz)
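
For illustration only (not part of the imported file): round-tripping the
date formats documented in the docstrings above, assuming the file is
importable as mechanize._util:

    from mechanize._util import http2time, iso2time, time2isoz

    t = http2time("Wed, 09 Feb 1994 22:23:32 GMT")    # strict HTTP format
    assert time2isoz(t) == "1994-02-09 22:23:32Z"

    # the loose parser copes with rfc850-style dates and a missing weekday
    assert http2time("Tuesday, 08-Feb-1994 14:15:29 GMT") == \
           http2time("08 Feb 1994 14:15:29 GMT")

    # iso2time understands numeric timezone offsets and "Z"
    assert iso2time("1994-02-03 14:15:29 -0100") == \
           iso2time("1994-02-03 15:15:29 Z")

    # unrecognised input yields None rather than raising
    assert http2time("not a date") is None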

Added: mechanize/tags/0.1.10/mechanize.egg-info/PKG-INFO
===================================================================
--- mechanize/tags/0.1.10/mechanize.egg-info/PKG-INFO	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize.egg-info/PKG-INFO	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,57 @@
+Metadata-Version: 1.0
+Name: mechanize
+Version: 0.1.10
+Summary: Stateful programmatic web browsing.
+Home-page: http://wwwsearch.sourceforge.net/mechanize/
+Author: John J. Lee
+Author-email: jjl at pobox.com
+License: BSD
+Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.10.tar.gz
+Description: Stateful programmatic web browsing, after Andy Lester's Perl module
+        WWW::Mechanize.
+        
+        The library is layered: mechanize.Browser (stateful web browser),
+        mechanize.UserAgent (configurable URL opener), plus urllib2 handlers.
+        
+        Features include: ftp:, http: and file: URL schemes, browser history,
+        high-level hyperlink and HTML form support, HTTP cookies, HTTP-EQUIV and
+        Refresh, Referer [sic] header, robots.txt, redirections, proxies, and
+        Basic and Digest HTTP authentication.  mechanize's response objects are
+        (lazily-) .seek()able and still work after .close().
+        
+        Much of the code originally derived from Perl code by Gisle Aas
+        (libwww-perl), Johnny Lee (MSIE Cookie support) and last but not least
+        Andy Lester (WWW::Mechanize).  urllib2 was written by Jeremy Hylton.
+        
+        
+Platform: any
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: System Administrators
+Classifier: License :: OSI Approved :: BSD License
+Classifier: License :: OSI Approved :: Zope Public License
+Classifier: Natural Language :: English
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python
+Classifier: Programming Language :: Python :: 2
+Classifier: Programming Language :: Python :: 2.4
+Classifier: Programming Language :: Python :: 2.5
+Classifier: Programming Language :: Python :: 2.6
+Classifier: Topic :: Internet
+Classifier: Topic :: Internet :: File Transfer Protocol (FTP)
+Classifier: Topic :: Internet :: WWW/HTTP
+Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
+Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
+Classifier: Topic :: Internet :: WWW/HTTP :: Site Management
+Classifier: Topic :: Internet :: WWW/HTTP :: Site Management :: Link Checking
+Classifier: Topic :: Software Development :: Libraries
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: Software Development :: Testing
+Classifier: Topic :: Software Development :: Testing :: Traffic Generation
+Classifier: Topic :: System :: Archiving :: Mirroring
+Classifier: Topic :: System :: Networking :: Monitoring
+Classifier: Topic :: System :: Systems Administration
+Classifier: Topic :: Text Processing
+Classifier: Topic :: Text Processing :: Markup
+Classifier: Topic :: Text Processing :: Markup :: HTML
+Classifier: Topic :: Text Processing :: Markup :: XML

Added: mechanize/tags/0.1.10/mechanize.egg-info/SOURCES.txt
===================================================================
--- mechanize/tags/0.1.10/mechanize.egg-info/SOURCES.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize.egg-info/SOURCES.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,78 @@
+0.1-changes.txt
+COPYING.txt
+ChangeLog.txt
+GeneralFAQ.html
+INSTALL.txt
+MANIFEST.in
+README.html
+README.html.in
+README.txt
+doc.html
+doc.html.in
+ez_setup.py
+functional_tests.py
+setup.cfg
+setup.py
+test.py
+attic/BSDDBCookieJar.py
+attic/MSIEDBCookieJar.py
+examples/hack21.py
+examples/pypi.py
+mechanize/__init__.py
+mechanize/_auth.py
+mechanize/_beautifulsoup.py
+mechanize/_clientcookie.py
+mechanize/_debug.py
+mechanize/_file.py
+mechanize/_firefox3cookiejar.py
+mechanize/_gzip.py
+mechanize/_headersutil.py
+mechanize/_html.py
+mechanize/_http.py
+mechanize/_lwpcookiejar.py
+mechanize/_mechanize.py
+mechanize/_mozillacookiejar.py
+mechanize/_msiecookiejar.py
+mechanize/_opener.py
+mechanize/_pullparser.py
+mechanize/_request.py
+mechanize/_response.py
+mechanize/_rfc3986.py
+mechanize/_seek.py
+mechanize/_sockettimeout.py
+mechanize/_testcase.py
+mechanize/_upgrade.py
+mechanize/_urllib2.py
+mechanize/_useragent.py
+mechanize/_util.py
+mechanize.egg-info/PKG-INFO
+mechanize.egg-info/SOURCES.txt
+mechanize.egg-info/dependency_links.txt
+mechanize.egg-info/requires.txt
+mechanize.egg-info/top_level.txt
+mechanize.egg-info/zip-safe
+test/test_browser.doctest
+test/test_browser.py
+test/test_cookies.py
+test/test_date.py
+test/test_forms.doctest
+test/test_headers.py
+test/test_history.doctest
+test/test_html.doctest
+test/test_html.py
+test/test_opener.doctest
+test/test_opener.py
+test/test_password_manager.special_doctest
+test/test_pullparser.py
+test/test_request.doctest
+test/test_response.doctest
+test/test_response.py
+test/test_rfc3986.doctest
+test/test_robotfileparser.special_doctest
+test/test_urllib2.py
+test/test_useragent.py
+test-tools/cookietest.cgi
+test-tools/doctest.py
+test-tools/linecache_copy.py
+test-tools/testprogram.py
+test-tools/twisted-localserver.py
\ No newline at end of file

Added: mechanize/tags/0.1.10/mechanize.egg-info/dependency_links.txt
===================================================================
--- mechanize/tags/0.1.10/mechanize.egg-info/dependency_links.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize.egg-info/dependency_links.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1 @@
+

Added: mechanize/tags/0.1.10/mechanize.egg-info/requires.txt
===================================================================
--- mechanize/tags/0.1.10/mechanize.egg-info/requires.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize.egg-info/requires.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1 @@
+ClientForm>=0.2.6, ==dev
\ No newline at end of file

Added: mechanize/tags/0.1.10/mechanize.egg-info/top_level.txt
===================================================================
--- mechanize/tags/0.1.10/mechanize.egg-info/top_level.txt	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize.egg-info/top_level.txt	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1 @@
+mechanize

Added: mechanize/tags/0.1.10/mechanize.egg-info/zip-safe
===================================================================
--- mechanize/tags/0.1.10/mechanize.egg-info/zip-safe	                        (rev 0)
+++ mechanize/tags/0.1.10/mechanize.egg-info/zip-safe	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1 @@
+

Added: mechanize/tags/0.1.10/setup.cfg
===================================================================
--- mechanize/tags/0.1.10/setup.cfg	                        (rev 0)
+++ mechanize/tags/0.1.10/setup.cfg	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,5 @@
+[egg_info]
+tag_build = 
+tag_date = 0
+tag_svn_revision = 0
+

Added: mechanize/tags/0.1.10/setup.py
===================================================================
--- mechanize/tags/0.1.10/setup.py	                        (rev 0)
+++ mechanize/tags/0.1.10/setup.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,156 @@
+#!/usr/bin/env python
+"""Stateful programmatic web browsing.
+
+Stateful programmatic web browsing, after Andy Lester's Perl module
+WWW::Mechanize.
+
+The library is layered: mechanize.Browser (stateful web browser),
+mechanize.UserAgent (configurable URL opener), plus urllib2 handlers.
+
+Features include: ftp:, http: and file: URL schemes, browser history,
+high-level hyperlink and HTML form support, HTTP cookies, HTTP-EQUIV and
+Refresh, Referer [sic] header, robots.txt, redirections, proxies, and
+Basic and Digest HTTP authentication.  mechanize's response objects are
+(lazily-) .seek()able and still work after .close().
+
+Much of the code originally derived from Perl code by Gisle Aas
+(libwww-perl), Johnny Lee (MSIE Cookie support) and last but not least
+Andy Lester (WWW::Mechanize).  urllib2 was written by Jeremy Hylton.
+
+"""
+
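+# Format a version tuple such as (0, 1, 10, None, None) as a string like
+# "0.1.10", appending a state character and/or "-pre<N>" suffix when present.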
+def unparse_version(tup):
+    major, minor, bugfix, state_char, pre = tup
+    fmt = "%s.%s.%s"
+    args = [major, minor, bugfix]
+    if state_char is not None:
+        fmt += "%s"
+        args.append(state_char)
+    if pre is not None:
+        fmt += "-pre%s"
+        args.append(pre)
+    return fmt % tuple(args)
+
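+# Parse the textual form of a version tuple, e.g. "(0, 1, 10, None, None)",
+# back into a Python tuple, converting the first three elements to ints.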
+def str_to_tuple(text):
+    if text.startswith("("):
+        text = text[1:-1]
+    els = [el.strip() for el in text.split(",")]
+    newEls = []
+    for ii in range(len(els)):
+        el = els[ii]
+        if el == "None":
+            newEls.append(None)
+        elif 0 <= ii < 3:
+            newEls.append(int(el))
+        else:
+            if el.startswith("'") or el.startswith('"'):
+                el = el[1:-1]
+            newEls.append(el)
+    return tuple(newEls)
+
+import re
+## VERSION_MATCH = re.search(r'__version__ = \((.*)\)',
+##                           open("mechanize/_mechanize.py").read())
+## VERSION = unparse_version(str_to_tuple(VERSION_MATCH.group(1)))
+VERSION = "0.1.10"
+INSTALL_REQUIRES = ["ClientForm>=0.2.6, ==dev"]
+NAME = "mechanize"
+PACKAGE = True
+LICENSE = "BSD"  # or ZPL 2.1
+PLATFORMS = ["any"]
+ZIP_SAFE = True
+CLASSIFIERS = """\
+Development Status :: 5 - Production/Stable
+Intended Audience :: Developers
+Intended Audience :: System Administrators
+License :: OSI Approved :: BSD License
+License :: OSI Approved :: Zope Public License
+Natural Language :: English
+Operating System :: OS Independent
+Programming Language :: Python
+Programming Language :: Python :: 2
+Programming Language :: Python :: 2.4
+Programming Language :: Python :: 2.5
+Programming Language :: Python :: 2.6
+Topic :: Internet
+Topic :: Internet :: File Transfer Protocol (FTP)
+Topic :: Internet :: WWW/HTTP
+Topic :: Internet :: WWW/HTTP :: Browsers
+Topic :: Internet :: WWW/HTTP :: Indexing/Search
+Topic :: Internet :: WWW/HTTP :: Site Management
+Topic :: Internet :: WWW/HTTP :: Site Management :: Link Checking
+Topic :: Software Development :: Libraries
+Topic :: Software Development :: Libraries :: Python Modules
+Topic :: Software Development :: Testing
+Topic :: Software Development :: Testing :: Traffic Generation
+Topic :: System :: Archiving :: Mirroring
+Topic :: System :: Networking :: Monitoring
+Topic :: System :: Systems Administration
+Topic :: Text Processing
+Topic :: Text Processing :: Markup
+Topic :: Text Processing :: Markup :: HTML
+Topic :: Text Processing :: Markup :: XML
+"""
+
+#-------------------------------------------------------
+# the rest is constant for most of my released packages:
+
+import sys
+
+if PACKAGE:
+    packages, py_modules = [NAME], None
+else:
+    packages, py_modules = None, [NAME]
+
+doclines = __doc__.split("\n")
+
+if not hasattr(sys, "version_info") or sys.version_info < (2, 3):
+    from distutils.core import setup
+    _setup = setup
+    def setup(**kwargs):
+        for key in [
+            # distutils >= Python 2.3 args
+            # XXX probably download_url came in earlier than 2.3
+            "classifiers", "download_url",
+            # setuptools args
+            "install_requires", "zip_safe", "test_suite",
+            ]:
+            if kwargs.has_key(key):
+                del kwargs[key]
+        # Only want packages keyword if this is a package,
+        # only want py_modules keyword if this is a single-file module,
+        # so get rid of packages or py_modules keyword as appropriate.
+        if kwargs["packages"] is None:
+            del kwargs["packages"]
+        else:
+            del kwargs["py_modules"]
+        apply(_setup, (), kwargs)
+else:
+    import ez_setup
+    ez_setup.use_setuptools()
+    from setuptools import setup
+
+def main():
+    setup(
+        name = NAME,
+        version = VERSION,
+        license = LICENSE,
+        platforms = PLATFORMS,
+        classifiers = [c for c in CLASSIFIERS.split("\n") if c],
+        install_requires = INSTALL_REQUIRES,
+        zip_safe = ZIP_SAFE,
+        test_suite = "test",
+        author = "John J. Lee",
+        author_email = "jjl at pobox.com",
+        description = doclines[0],
+        long_description = "\n".join(doclines[2:]),
+        url = "http://wwwsearch.sourceforge.net/%s/" % NAME,
+        download_url = ("http://wwwsearch.sourceforge.net/%s/src/"
+                        "%s-%s.tar.gz" % (NAME, NAME, VERSION)),
+        py_modules = py_modules,
+        packages = packages,
+        )
+
+
+if __name__ == "__main__":
+    main()


Property changes on: mechanize/tags/0.1.10/setup.py
___________________________________________________________________
Added: svn:executable
   + 

Added: mechanize/tags/0.1.10/test/test_browser.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_browser.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_browser.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,296 @@
+>>> import mechanize
+>>> from mechanize._response import test_response
+>>> from test_browser import TestBrowser2, make_mock_handler
+
+
+Opening a new response should close the old one.
+
+>>> class TestHttpHandler(mechanize.BaseHandler):
+...     def http_open(self, request):
+...         return test_response(url=request.get_full_url())
+>>> class TestHttpBrowser(TestBrowser2):
+...     handler_classes = TestBrowser2.handler_classes.copy()
+...     handler_classes["http"] = TestHttpHandler
+...     default_schemes = ["http"]
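+
+The response_impl() helper below reports the class of a response's underlying
+file object: a freshly opened response reports "StringI", while one that has
+been closed reports "eofresponse".
+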
+>>> def response_impl(response):
+...     return response.wrapped.fp.__class__.__name__
+
+>>> br = TestHttpBrowser()
+>>> r = br.open("http://example.com")
+>>> print response_impl(r)
+StringI
+>>> r2 = br.open("http://example.com")
+>>> print response_impl(r2)
+StringI
+>>> print response_impl(r)
+eofresponse
+
+So should .set_response()
+
+>>> br.set_response(test_response())
+>>> print response_impl(r2)
+eofresponse
+
+
+.visit_response() works very similarly to .open()
+
+>>> br = TestHttpBrowser()
+>>> r = br.open("http://example.com")
+>>> r2 = test_response(url="http://example.com/2")
+>>> print response_impl(r2)
+StringI
+>>> br.visit_response(r2)
+>>> print response_impl(r)
+eofresponse
+>>> br.geturl() == br.request.get_full_url() == "http://example.com/2"
+True
+>>> junk = br.back()
+>>> br.geturl() == br.request.get_full_url() == "http://example.com"
+True
+
+
+.back() may reload if the complete response was not read.  If so, it
+should return the new response, not the old one
+
+>>> class ReloadCheckBrowser(TestHttpBrowser):
+...     reloaded = False
+...     def reload(self):
+...         self.reloaded = True
+...         return TestHttpBrowser.reload(self)
+>>> br = ReloadCheckBrowser()
+>>> old = br.open("http://example.com")
+>>> junk = br.open("http://example.com/2")
+>>> new = br.back()
+>>> br.reloaded
+True
+>>> new.wrapped is not old.wrapped
+True
+
+
+Warn early about some mistakes when setting a response object
+
+>>> import StringIO
+>>> br = TestBrowser2()
+>>> br.set_response("blah")
+Traceback (most recent call last):
+...
+ValueError: not a response object
+>>> br.set_response(StringIO.StringIO())
+Traceback (most recent call last):
+...
+ValueError: not a response object
+
+
+.open() without an appropriate scheme handler should fail with
+URLError
+
+>>> br = TestBrowser2()
+>>> br.open("http://example.com")
+Traceback (most recent call last):
+...
+URLError: <urlopen error unknown url type: http>
+
+Reload after failed .open() should fail due to failure to open, not
+with BrowserStateError
+
+>>> br.reload()
+Traceback (most recent call last):
+...
+URLError: <urlopen error unknown url type: http>
+
+
+.clear_history() should do what it says on the tin.  Note that the
+history does not include the current response!
+
+>>> br = TestBrowser2()
+>>> br.add_handler(make_mock_handler(test_response)([("http_open", None)]))
+
+>>> br.response() is None
+True
+>>> len(br._history._history)
+0
+
+>>> r = br.open("http://example.com/1")
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+0
+
+>>> br.clear_history()
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+0
+
+>>> r = br.open("http://example.com/2")
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+1
+
+>>> br.clear_history()
+>>> br.response() is not None
+True
+>>> len(br._history._history)
+0
+
+
+.open()ing a Request with False .visit does not affect Browser state.
+Redirections during such a non-visiting request should also be
+non-visiting.
+
+>>> from mechanize import BrowserStateError, Request, HTTPRedirectHandler
+>>> from test_urllib2 import MockHTTPHandler
+
+>>> def make_browser_with_redirect():
+...     br = TestBrowser2()
+...     hh = MockHTTPHandler(302, "Location: http://example.com/\r\n\r\n")
+...     br.add_handler(hh)
+...     br.add_handler(HTTPRedirectHandler())
+...     return br
+>>> def raises(exc_class, fn, *args, **kwds):
+...     try:
+...         fn(*args, **kwds)
+...     except exc_class, exc:
+...         return True
+...     return False
+>>> def test_state(br):
+...     return (br.request is None and
+...             br.response() is None and
+...             raises(BrowserStateError, br.back)
+...             )
+>>> br = make_browser_with_redirect()
+>>> test_state(br)
+True
+>>> req = Request("http://example.com")
+>>> req.visit = False
+>>> r = br.open(req)
+>>> test_state(br)
+True
+
+.open_novisit() mutates the request object
+
+>>> br = make_browser_with_redirect()
+>>> test_state(br)
+True
+>>> req = Request("http://example.com")
+>>> print req.visit
+None
+>>> r = br.open_novisit(req)
+>>> test_state(br)
+True
+>>> req.visit
+False
+
+
+...in fact, any redirection (but not refresh), proxy request, basic or
+digest auth request, or robots.txt request should be non-visiting,
+even if .visit is True:
+
+>>> from test_urllib2 import MockPasswordManager
+>>> def test_one_visit(handlers):
+...     br = TestBrowser2()
+...     for handler in handlers: br.add_handler(handler)
+...     req = Request("http://example.com")
+...     req.visit = True
+...     br.open(req)
+...     return br
+>>> def test_state(br):
+...     # XXX the _history._history check is needed because of the weird
+...     # throwing-away of history entries by .back() where response is
+...     # None, which makes the .back() check insufficient to tell if a
+...     # history entry was .add()ed.  I don't want to change this until
+...     # post-stable.
+...     return (
+...         br.response() and
+...         br.request and
+...         len(br._history._history) == 0 and
+...         raises(BrowserStateError, br.back))
+
+>>> hh = MockHTTPHandler(302, "Location: http://example.com/\r\n\r\n")
+>>> br = test_one_visit([hh, HTTPRedirectHandler()])
+>>> test_state(br)
+True
+
+>>> class MockPasswordManager:
+...     def add_password(self, realm, uri, user, password): pass
+...     def find_user_password(self, realm, authuri): return '', ''
+
+>>> ah = mechanize.HTTPBasicAuthHandler(MockPasswordManager())
+>>> hh = MockHTTPHandler(
+...     401, 'WWW-Authenticate: Basic realm="realm"\r\n\r\n')
+>>> test_state(test_one_visit([hh, ah]))
+True
+
+>>> ph = mechanize.ProxyHandler(dict(http="proxy.example.com:3128"))
+>>> ah = mechanize.ProxyBasicAuthHandler(MockPasswordManager())
+>>> hh = MockHTTPHandler(
+...     407, 'Proxy-Authenticate: Basic realm="realm"\r\n\r\n')
+>>> test_state(test_one_visit([ph, hh, ah]))
+True
+
+XXX Can't really fix this one properly without significant changes --
+the refresh should go onto the history *after* the call, but currently
+all redirects, including refreshes, are done by recursive .open()
+calls, which gets the history wrong in this case.  Will have to wait
+until after stable release:
+
+#>>> hh = MockHTTPHandler(
+#...     "refresh", 'Location: http://example.com/\r\n\r\n')
+#>>> br = test_one_visit([hh, HTTPRedirectHandler()])
+#>>> br.response() is not None
+#True
+#>>> br.request is not None
+#True
+#>>> r = br.back()
+
+XXX digest, robots
+
+
+.global_form() is separate from the other forms (partly for backwards-
+compatibility reasons).
+
+>>> from mechanize._response import test_response
+>>> br = TestBrowser2()
+>>> html = """\
+... <html><body>
+... <input type="text" name="a" />
+... <form><input type="text" name="b" /></form>
+... </body></html>
+... """
+>>> response = test_response(html, headers=[("Content-type", "text/html")])
+>>> br.global_form()
+Traceback (most recent call last):
+BrowserStateError: not viewing any document
+>>> br.set_response(response)
+>>> br.global_form().find_control(nr=0).name
+'a'
+>>> len(list(br.forms()))
+1
+>>> iter(br.forms()).next().find_control(nr=0).name
+'b'
+
+
+
+.select_form() works with the global form
+
+>>> import ClientForm
+>>> from mechanize._response import test_html_response
+>>> br = TestBrowser2()
+>>> br.visit_response(test_html_response("""\
+... <html><head><title></title></head><body>
+... <input type="text" name="a" value="b"></input>
+... <form>
+...     <input type="text" name="p" value="q"></input>
+... </form>
+... </body></html>"""))
+>>> def has_a(form):
+...     try:
+...         form.find_control(name="a")
+...     except ClientForm.ControlNotFoundError:
+...         return False
+...     else:
+...         return True
+>>> br.select_form(predicate=has_a)
+>>> br.form.find_control(name="a").value
+'b'

Added: mechanize/tags/0.1.10/test/test_browser.py
===================================================================
--- mechanize/tags/0.1.10/test/test_browser.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_browser.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,774 @@
+#!/usr/bin/env python
+"""Tests for mechanize.Browser."""
+
+import sys, os, random
+from unittest import TestCase
+import StringIO, re, urllib2
+
+import mechanize
+from mechanize._response import test_html_response
+FACTORY_CLASSES = [mechanize.DefaultFactory, mechanize.RobustFactory]
+
+
+# XXX these 'mock' classes are badly in need of simplification / removal
+# (note this stuff is also used by test_useragent.py and test_browser.doctest)
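+# A MockMethod is bound onto a MockHandler class as a handler method; calling
+# it delegates to the handler's handle() or process() method with the method
+# name and configured action prepended to the arguments.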
+class MockMethod:
+    def __init__(self, meth_name, action, handle):
+        self.meth_name = meth_name
+        self.handle = handle
+        self.action = action
+    def __call__(self, *args):
+        return apply(self.handle, (self.meth_name, self.action)+args)
+
+class MockHeaders(dict):
+    def getheaders(self, name):
+        name = name.lower()
+        return [v for k, v in self.iteritems() if name == k.lower()]
+
+class MockResponse:
+    closeable_response = None
+    def __init__(self, url="http://example.com/", data=None, info=None):
+        self.url = url
+        self.fp = StringIO.StringIO(data)
+        if info is None: info = {}
+        self._info = MockHeaders(info)
+    def info(self): return self._info
+    def geturl(self): return self.url
+    def read(self, size=-1): return self.fp.read(size)
+    def seek(self, whence):
+        assert whence == 0
+        self.fp.seek(0)
+    def close(self): pass
+    def get_data(self): pass
+
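+# Class factory: the returned MockHandler class grows *_open / *_request /
+# *_response methods from the (name, action) pairs given to its constructor.
+# Every call is recorded on the parent opener; *_open methods return the
+# supplied canned response (or raise it if it is an HTTPError), falling back
+# to a fresh response_class instance, while *_request / *_response methods
+# pass the request or response straight through.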
+def make_mock_handler(response_class=MockResponse):
+    class MockHandler:
+        processor_order = 500
+        handler_order = -1
+        def __init__(self, methods):
+            self._define_methods(methods)
+        def _define_methods(self, methods):
+            for name, action in methods:
+                if name.endswith("_open"):
+                    meth = MockMethod(name, action, self.handle)
+                else:
+                    meth = MockMethod(name, action, self.process)
+                setattr(self.__class__, name, meth)
+        def handle(self, fn_name, response, *args, **kwds):
+            self.parent.calls.append((self, fn_name, args, kwds))
+            if response:
+                if isinstance(response, urllib2.HTTPError):
+                    raise response
+                r = response
+                r.seek(0)
+            else:
+                r = response_class()
+            req = args[0]
+            r.url = req.get_full_url()
+            return r
+        def process(self, fn_name, action, *args, **kwds):
+            self.parent.calls.append((self, fn_name, args, kwds))
+            if fn_name.endswith("_request"):
+                return args[0]
+            else:
+                return args[1]
+        def close(self): pass
+        def add_parent(self, parent):
+            self.parent = parent
+            self.parent.calls = []
+        def __lt__(self, other):
+            if not hasattr(other, "handler_order"):
+                # Try to preserve the old behavior of having custom classes
+                # inserted after default ones (works only for custom user
+                # classes which are not aware of handler_order).
+                return True
+            return self.handler_order < other.handler_order
+    return MockHandler
+
+class TestBrowser(mechanize.Browser):
+    default_features = []
+    default_others = []
+    default_schemes = []
+
+class TestBrowser2(mechanize.Browser):
+    # XXX better name!
+    # Like TestBrowser, this is neutered so it doesn't know about protocol
+    # handling, but it still knows what to do with unknown schemes, etc., because
+    # UserAgent's default_others list is left intact, including classes like
+    # UnknownHandler
+    default_features = []
+    default_schemes = []
+
+
+class BrowserTests(TestCase):
+
+    def test_referer(self):
+        b = TestBrowser()
+        url = "http://www.example.com/"
+        r = MockResponse(url,
+"""<html>
+<head><title>Title</title></head>
+<body>
+<form name="form1">
+ <input type="hidden" name="foo" value="bar"></input>
+ <input type="submit"></input>
+ </form>
+<a href="http://example.com/foo/bar.html" name="apples"></a>
+<a href="https://example.com/spam/eggs.html" name="secure"></a>
+<a href="blah://example.com/" name="pears"></a>
+</body>
+</html>
+""", {"content-type": "text/html"})
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+
+        # Referer not added by .open()...
+        req = mechanize.Request(url)
+        b.open(req)
+        self.assert_(req.get_header("Referer") is None)
+        # ...even if we're visiting a document
+        b.open(req)
+        self.assert_(req.get_header("Referer") is None)
+        # Referer added by .click_link() and .click()
+        b.select_form("form1")
+        req2 = b.click()
+        self.assertEqual(req2.get_header("Referer"), url)
+        r2 = b.open(req2)
+        req3 = b.click_link(name="apples")
+        self.assertEqual(req3.get_header("Referer"), url+"?foo=bar")
+        # Referer not added when going from https to http URL
+        b.add_handler(make_mock_handler()([("https_open", r)]))
+        r3 = b.open(req3)
+        req4 = b.click_link(name="secure")
+        self.assertEqual(req4.get_header("Referer"),
+                         "http://example.com/foo/bar.html")
+        r4 = b.open(req4)
+        req5 = b.click_link(name="apples")
+        self.assert_(not req5.has_header("Referer"))
+        # Referer not added for non-http, non-https requests
+        b.add_handler(make_mock_handler()([("blah_open", r)]))
+        req6 = b.click_link(name="pears")
+        self.assert_(not req6.has_header("Referer"))
+        # Referer not added when going from non-http, non-https URL
+        r4 = b.open(req6)
+        req7 = b.click_link(name="apples")
+        self.assert_(not req7.has_header("Referer"))
+
+        # XXX Referer added for redirect
+
+    def test_encoding(self):
+        import mechanize
+        from StringIO import StringIO
+        import urllib, mimetools
+        # always take first encoding, since that's the one from the real HTTP
+        # headers, rather than from HTTP-EQUIV
+        b = mechanize.Browser()
+        for s, ct in [("", mechanize._html.DEFAULT_ENCODING),
+
+                      ("Foo: Bar\r\n\r\n", mechanize._html.DEFAULT_ENCODING),
+
+                      ("Content-Type: text/html; charset=UTF-8\r\n\r\n",
+                       "UTF-8"),
+
+                      ("Content-Type: text/html; charset=UTF-8\r\n"
+                       "Content-Type: text/html; charset=KOI8-R\r\n\r\n",
+                       "UTF-8"),
+                      ]:
+            msg = mimetools.Message(StringIO(s))
+            r = urllib.addinfourl(StringIO(""), msg, "http://www.example.com/")
+            b.set_response(r)
+            self.assertEqual(b.encoding(), ct)
+
+    def test_history(self):
+        import mechanize
+        from mechanize import _response
+
+        def same_response(ra, rb):
+            return ra.wrapped is rb.wrapped
+
+        class Handler(mechanize.BaseHandler):
+            def http_open(self, request):
+                r = _response.test_response(url=request.get_full_url())
+                # these tests aren't interested in auto-.reload() behaviour of
+                # .back(), so read the response to prevent that happening
+                r.get_data()
+                return r
+
+        b = TestBrowser2()
+        b.add_handler(Handler())
+        self.assertRaises(mechanize.BrowserStateError, b.back)
+        r1 = b.open("http://example.com/")
+        self.assertRaises(mechanize.BrowserStateError, b.back)
+        r2 = b.open("http://example.com/foo")
+        self.assert_(same_response(b.back(), r1))
+        r3 = b.open("http://example.com/bar")
+        r4 = b.open("http://example.com/spam")
+        self.assert_(same_response(b.back(), r3))
+        self.assert_(same_response(b.back(), r1))
+        self.assertEquals(b.geturl(), "http://example.com/")
+        self.assertRaises(mechanize.BrowserStateError, b.back)
+        # reloading does a real HTTP fetch rather than using history cache
+        r5 = b.reload()
+        self.assert_(not same_response(r5, r1))
+        # .geturl() gets fed through to b.response
+        self.assertEquals(b.geturl(), "http://example.com/")
+        # can go back n times
+        r6 = b.open("spam")
+        self.assertEquals(b.geturl(), "http://example.com/spam")
+        r7 = b.open("/spam")
+        self.assert_(same_response(b.response(), r7))
+        self.assertEquals(b.geturl(), "http://example.com/spam")
+        self.assert_(same_response(b.back(2), r5))
+        self.assertEquals(b.geturl(), "http://example.com/")
+        self.assertRaises(mechanize.BrowserStateError, b.back, 2)
+        r8 = b.open("/spam")
+
+        # even if we get an HTTPError, history, .response() and .request should
+        # still get updated
+        class Handler2(mechanize.BaseHandler):
+            def https_open(self, request):
+                r = urllib2.HTTPError(
+                    "https://example.com/bad", 503, "Oops",
+                    MockHeaders(), StringIO.StringIO())
+                return r
+        b.add_handler(Handler2())
+        self.assertRaises(urllib2.HTTPError, b.open, "https://example.com/badreq")
+        self.assertEqual(b.response().geturl(), "https://example.com/bad")
+        self.assertEqual(b.request.get_full_url(), "https://example.com/badreq")
+        self.assert_(same_response(b.back(), r8))
+
+        # .close() should make subsequent use of Browser methods and attributes
+        # complain noisily, since they should not be called after .close()
+        b.form = "blah"
+        b.close()
+        for attr in ("form open error retrieve add_handler "
+                     "request response set_response geturl reload back "
+                     "clear_history set_cookie links forms viewing_html "
+                     "encoding title select_form click submit click_link "
+                     "follow_link find_link".split()
+                     ):
+            self.assert_(getattr(b, attr) is None)
+
+    def test_reload_read_incomplete(self):
+        import mechanize
+        from mechanize._response import test_response
+        class Browser(TestBrowser):
+            def __init__(self):
+                TestBrowser.__init__(self)
+                self.reloaded = False
+            def reload(self):
+                self.reloaded = True
+                TestBrowser.reload(self)
+        br = Browser()
+        data = "<html><head><title></title></head><body>%s</body></html>"
+        data = data % ("The quick brown fox jumps over the lazy dog."*100)
+        class Handler(mechanize.BaseHandler):
+            def http_open(self, request):
+                return test_response(data, [("content-type", "text/html")])
+        br.add_handler(Handler())
+
+        # .back() should .reload() if the whole response hasn't already been read
+        # (.read_incomplete is True)
+        r = br.open("http://example.com")
+        r.read(10)
+        br.open('http://www.example.com/blah')
+        self.failIf(br.reloaded)
+        br.back()
+        self.assert_(br.reloaded)
+
+        # don't reload if already read
+        br.reloaded = False
+        br.response().read()
+        br.open('http://www.example.com/blah')
+        br.back()
+        self.failIf(br.reloaded)
+
+    def test_viewing_html(self):
+        # XXX not testing multiple Content-Type headers
+        import mechanize
+        url = "http://example.com/"
+
+        for allow_xhtml in False, True:
+            for ct, expect in [
+                (None, False),
+                ("text/plain", False),
+                ("text/html", True),
+
+                # don't try to handle XML until we can do it right!
+                ("text/xhtml", allow_xhtml),
+                ("text/xml", allow_xhtml),
+                ("application/xml", allow_xhtml),
+                ("application/xhtml+xml", allow_xhtml),
+
+                ("text/html; charset=blah", True),
+                (" text/html ; charset=ook ", True),
+                ]:
+                b = TestBrowser(mechanize.DefaultFactory(
+                    i_want_broken_xhtml_support=allow_xhtml))
+                hdrs = {}
+                if ct is not None:
+                    hdrs["Content-Type"] = ct
+                b.add_handler(make_mock_handler()([("http_open",
+                                            MockResponse(url, "", hdrs))]))
+                r = b.open(url)
+                self.assertEqual(b.viewing_html(), expect)
+
+        for allow_xhtml in False, True:
+            for ext, expect in [
+                (".htm", True),
+                (".html", True),
+
+                # don't try to handle XML until we can do it right!
+                (".xhtml", allow_xhtml),
+
+                (".html?foo=bar&a=b;whelk#kool", True),
+                (".txt", False),
+                (".xml", False),
+                ("", False),
+                ]:
+                b = TestBrowser(mechanize.DefaultFactory(
+                    i_want_broken_xhtml_support=allow_xhtml))
+                url = "http://example.com/foo"+ext
+                b.add_handler(make_mock_handler()(
+                    [("http_open", MockResponse(url, "", {}))]))
+                r = b.open(url)
+                self.assertEqual(b.viewing_html(), expect)
+
+    def test_empty(self):
+        for factory_class in FACTORY_CLASSES:
+            self._test_empty(factory_class())
+
+    def _test_empty(self, factory):
+        import mechanize
+        url = "http://example.com/"
+
+        b = TestBrowser(factory=factory)
+
+        self.assert_(b.response() is None)
+
+        # To open a relative reference (often called a "relative URL"), you
+        # have to have already opened a URL for it "to be relative to".
+        self.assertRaises(mechanize.BrowserStateError, b.open, "relative_ref")
+
+        # we can still clear the history even if we've not visited any URL
+        b.clear_history()
+
+        # most methods raise BrowserStateError...
+        def test_state_error(method_names):
+            for attr in method_names:
+                method = getattr(b, attr)
+                #print attr
+                self.assertRaises(mechanize.BrowserStateError, method)
+            self.assertRaises(mechanize.BrowserStateError, b.select_form,
+                              name="blah")
+            self.assertRaises(mechanize.BrowserStateError, b.find_link,
+                              name="blah")
+        # ...if not visiting a URL...
+        test_state_error(("geturl reload back viewing_html encoding "
+                          "click links forms title select_form".split()))
+        self.assertRaises(mechanize.BrowserStateError, b.set_cookie, "foo=bar")
+        self.assertRaises(mechanize.BrowserStateError, b.submit, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.click_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.follow_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.find_link, nr=0)
+        # ...and lots do so if visiting a non-HTML URL
+        b.add_handler(make_mock_handler()(
+            [("http_open", MockResponse(url, "", {}))]))
+        r = b.open(url)
+        self.assert_(not b.viewing_html())
+        test_state_error("click links forms title select_form".split())
+        self.assertRaises(mechanize.BrowserStateError, b.submit, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.click_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.follow_link, nr=0)
+        self.assertRaises(mechanize.BrowserStateError, b.find_link, nr=0)
+
+        b = TestBrowser()
+        r = MockResponse(url,
+"""<html>
+<head><title>Title</title></head>
+<body>
+</body>
+</html>
+""", {"content-type": "text/html"})
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+        r = b.open(url)
+        self.assertEqual(b.title(), "Title")
+        self.assertEqual(len(list(b.links())), 0)
+        self.assertEqual(len(list(b.forms())), 0)
+        self.assertRaises(ValueError, b.select_form)
+        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
+                          name="blah")
+        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
+                          predicate=lambda form: form is not b.global_form())
+        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
+                          name="blah")
+        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
+                          predicate=lambda x: True)
+
+    def test_forms(self):
+        for factory_class in FACTORY_CLASSES:
+            self._test_forms(factory_class())
+    def _test_forms(self, factory):
+        import mechanize
+        url = "http://example.com"
+
+        b = TestBrowser(factory=factory)
+        r = test_html_response(
+            url=url,
+            headers=[("content-type", "text/html")],
+            data="""\
+<html>
+<head><title>Title</title></head>
+<body>
+<form name="form1">
+ <input type="text"></input>
+ <input type="checkbox" name="cheeses" value="cheddar"></input>
+ <input type="checkbox" name="cheeses" value="edam"></input>
+ <input type="submit" name="one"></input>
+</form>
+<a href="http://example.com/foo/bar.html" name="apples">
+<form name="form2">
+ <input type="submit" name="two">
+</form>
+</body>
+</html>
+"""
+            )
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+        r = b.open(url)
+
+        forms = list(b.forms())
+        self.assertEqual(len(forms), 2)
+        for got, expect in zip([f.name for f in forms], [
+            "form1", "form2"]):
+            self.assertEqual(got, expect)
+
+        self.assertRaises(mechanize.FormNotFoundError, b.select_form, "foo")
+
+        # no form is set yet
+        self.assertRaises(AttributeError, getattr, b, "possible_items")
+        b.select_form("form1")
+        # now unknown methods are fed through to selected ClientForm.HTMLForm
+        self.assertEqual(
+            [i.name for i in b.find_control("cheeses").items],
+            ["cheddar", "edam"])
+        b["cheeses"] = ["cheddar", "edam"]
+        self.assertEqual(b.click_pairs(), [
+            ("cheeses", "cheddar"), ("cheeses", "edam"), ("one", "")])
+
+        b.select_form(nr=1)
+        self.assertEqual(b.name, "form2")
+        self.assertEqual(b.click_pairs(), [("two", "")])
+
+    def test_link_encoding(self):
+        for factory_class in FACTORY_CLASSES:
+            self._test_link_encoding(factory_class())
+    def _test_link_encoding(self, factory):
+        import urllib
+        import mechanize
+        from mechanize._rfc3986 import clean_url
+        url = "http://example.com/"
+        for encoding in ["UTF-8", "latin-1"]:
+            encoding_decl = "; charset=%s" % encoding
+            b = TestBrowser(factory=factory)
+            r = MockResponse(url, """\
+<a href="http://example.com/foo/bar&mdash;&#x2014;.html"
+   name="name0&mdash;&#x2014;">blah&mdash;&#x2014;</a>
+""", #"
+{"content-type": "text/html%s" % encoding_decl})
+            b.add_handler(make_mock_handler()([("http_open", r)]))
+            r = b.open(url)
+
+            Link = mechanize.Link
+            try:
+                mdashx2 = u"\u2014".encode(encoding)*2
+            except UnicodeError:
+                mdashx2 = '&mdash;&#x2014;'
+            qmdashx2 = clean_url(mdashx2, encoding)
+            # base_url, url, text, tag, attrs
+            exp = Link(url, "http://example.com/foo/bar%s.html" % qmdashx2,
+                       "blah"+mdashx2, "a",
+                       [("href", "http://example.com/foo/bar%s.html" % mdashx2),
+                        ("name", "name0%s" % mdashx2)])
+            # nr
+            link = b.find_link()
+##             print
+##             print exp
+##             print link
+            self.assertEqual(link, exp)
+
+    def test_link_whitespace(self):
+        from mechanize import Link
+        for factory_class in FACTORY_CLASSES:
+            base_url = "http://example.com/"
+            url = "  http://example.com/foo.html%20+ "
+            stripped_url = url.strip()
+            html = '<a href="%s"></a>' % url
+            b = TestBrowser(factory=factory_class())
+            r = MockResponse(base_url, html, {"content-type": "text/html"})
+            b.add_handler(make_mock_handler()([("http_open", r)]))
+            r = b.open(base_url)
+            link = b.find_link(nr=0)
+            self.assertEqual(
+                link,
+                Link(base_url, stripped_url, "", "a", [("href", url)])
+                )
+
+    def test_links(self):
+        for factory_class in FACTORY_CLASSES:
+            self._test_links(factory_class())
+    def _test_links(self, factory):
+        import mechanize
+        from mechanize import Link
+        url = "http://example.com/"
+
+        b = TestBrowser(factory=factory)
+        r = MockResponse(url,
+"""<html>
+<head><title>Title</title></head>
+<body>
+<a href="http://example.com/foo/bar.html" name="apples"></a>
+<a name="pears"></a>
+<a href="spam" name="pears"></a>
+<area href="blah" name="foo"></area>
+<form name="form2">
+ <input type="submit" name="two">
+</form>
+<frame name="name" href="href" src="src"></frame>
+<iframe name="name2" href="href" src="src"></iframe>
+<a name="name3" href="one">yada yada</a>
+<a name="pears" href="two" weird="stuff">rhubarb</a>
+<a></a>
+<iframe src="foo"></iframe>
+</body>
+</html>
+""", {"content-type": "text/html"})
+        b.add_handler(make_mock_handler()([("http_open", r)]))
+        r = b.open(url)
+
+        exp_links = [
+            # base_url, url, text, tag, attrs
+            Link(url, "http://example.com/foo/bar.html", "", "a",
+                 [("href", "http://example.com/foo/bar.html"),
+                  ("name", "apples")]),
+            Link(url, "spam", "", "a", [("href", "spam"), ("name", "pears")]),
+            Link(url, "blah", None, "area",
+                 [("href", "blah"), ("name", "foo")]),
+            Link(url, "src", None, "frame",
+                 [("name", "name"), ("href", "href"), ("src", "src")]),
+            Link(url, "src", None, "iframe",
+                 [("name", "name2"), ("href", "href"), ("src", "src")]),
+            Link(url, "one", "yada yada", "a",
+                 [("name", "name3"), ("href", "one")]),
+            Link(url, "two", "rhubarb", "a",
+                 [("name", "pears"), ("href", "two"), ("weird", "stuff")]),
+            Link(url, "foo", None, "iframe",
+                 [("src", "foo")]),
+            ]
+        links = list(b.links())
+        self.assertEqual(len(links), len(exp_links))
+        for got, expect in zip(links, exp_links):
+            self.assertEqual(got, expect)
+        # nr
+        l = b.find_link()
+        self.assertEqual(l.url, "http://example.com/foo/bar.html")
+        l = b.find_link(nr=1)
+        self.assertEqual(l.url, "spam")
+        # text
+        l = b.find_link(text="yada yada")
+        self.assertEqual(l.url, "one")
+        self.assertRaises(mechanize.LinkNotFoundError,
+                          b.find_link, text="da ya")
+        l = b.find_link(text_regex=re.compile("da ya"))
+        self.assertEqual(l.url, "one")
+        l = b.find_link(text_regex="da ya")
+        self.assertEqual(l.url, "one")
+        # name
+        l = b.find_link(name="name3")
+        self.assertEqual(l.url, "one")
+        l = b.find_link(name_regex=re.compile("oo"))
+        self.assertEqual(l.url, "blah")
+        l = b.find_link(name_regex="oo")
+        self.assertEqual(l.url, "blah")
+        # url
+        l = b.find_link(url="spam")
+        self.assertEqual(l.url, "spam")
+        l = b.find_link(url_regex=re.compile("pam"))
+        self.assertEqual(l.url, "spam")
+        l = b.find_link(url_regex="pam")
+        self.assertEqual(l.url, "spam")
+        # tag
+        l = b.find_link(tag="area")
+        self.assertEqual(l.url, "blah")
+        # predicate
+        l = b.find_link(predicate=
+                        lambda l: dict(l.attrs).get("weird") == "stuff")
+        self.assertEqual(l.url, "two")
+        # combinations
+        l = b.find_link(name="pears", nr=1)
+        self.assertEqual(l.text, "rhubarb")
+        l = b.find_link(url="src", nr=0, name="name2")
+        self.assertEqual(l.tag, "iframe")
+        self.assertEqual(l.url, "src")
+        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
+                          url="src", nr=1, name="name2")
+        l = b.find_link(tag="a", predicate=
+                        lambda l: dict(l.attrs).get("weird") == "stuff")
+        self.assertEqual(l.url, "two")
+
+        # .links()
+        self.assertEqual(list(b.links(url="src")), [
+            Link(url, url="src", text=None, tag="frame",
+                 attrs=[("name", "name"), ("href", "href"), ("src", "src")]),
+            Link(url, url="src", text=None, tag="iframe",
+                 attrs=[("name", "name2"), ("href", "href"), ("src", "src")]),
+            ])
+
+    def test_base_uri(self):
+        import mechanize
+        url = "http://example.com/"
+
+        for html, urls in [
+            (
+"""<base href="http://www.python.org/foo/">
+<a href="bar/baz.html"></a>
+<a href="/bar/baz.html"></a>
+<a href="http://example.com/bar %2f%2Fblah;/baz@~._-.html"></a>
+""",
+            [
+            "http://www.python.org/foo/bar/baz.html",
+            "http://www.python.org/bar/baz.html",
+            "http://example.com/bar%20%2f%2Fblah;/baz@~._-.html",
+            ]),
+            (
+"""<a href="bar/baz.html"></a>
+<a href="/bar/baz.html"></a>
+<a href="http://example.com/bar/baz.html"></a>
+""",
+            [
+            "http://example.com/bar/baz.html",
+            "http://example.com/bar/baz.html",
+            "http://example.com/bar/baz.html",
+            ]
+            ),
+            ]:
+            b = TestBrowser()
+            r = MockResponse(url, html, {"content-type": "text/html"})
+            b.add_handler(make_mock_handler()([("http_open", r)]))
+            r = b.open(url)
+            self.assertEqual([link.absolute_url for link in b.links()], urls)
+
+    def test_set_cookie(self):
+        class CookieTestBrowser(TestBrowser):
+            default_features = list(TestBrowser.default_features)+["_cookies"]
+
+        # have to be visiting HTTP/HTTPS URL
+        url = "ftp://example.com/"
+        br = CookieTestBrowser()
+        r = mechanize.make_response(
+            "<html><head><title>Title</title></head><body></body></html>",
+            [("content-type", "text/html")],
+            url,
+            200, "OK",
+            )
+        br.add_handler(make_mock_handler()([("http_open", r)]))
+        handler = br._ua_handlers["_cookies"]
+        cj = handler.cookiejar
+        self.assertRaises(mechanize.BrowserStateError,
+                          br.set_cookie, "foo=bar")
+        self.assertEqual(len(cj), 0)
+
+
+        url = "http://example.com/"
+        br = CookieTestBrowser()
+        r = mechanize.make_response(
+            "<html><head><title>Title</title></head><body></body></html>",
+            [("content-type", "text/html")],
+            url,
+            200, "OK",
+            )
+        br.add_handler(make_mock_handler()([("http_open", r)]))
+        handler = br._ua_handlers["_cookies"]
+        cj = handler.cookiejar
+
+        # have to be visiting a URL
+        self.assertRaises(mechanize.BrowserStateError,
+                          br.set_cookie, "foo=bar")
+        self.assertEqual(len(cj), 0)
+
+
+        # normal case
+        br.open(url)
+        br.set_cookie("foo=bar")
+        self.assertEqual(len(cj), 1)
+        self.assertEqual(cj._cookies["example.com"]["/"]["foo"].value, "bar")
+
+
+class ResponseTests(TestCase):
+
+    def test_set_response(self):
+        import copy
+        from mechanize import response_seek_wrapper
+
+        br = TestBrowser()
+        url = "http://example.com/"
+        html = """<html><body><a href="spam">click me</a></body></html>"""
+        headers = {"content-type": "text/html"}
+        r = response_seek_wrapper(MockResponse(url, html, headers))
+        br.add_handler(make_mock_handler()([("http_open", r)]))
+
+        r = br.open(url)
+        self.assertEqual(r.read(), html)
+        r.seek(0)
+        self.assertEqual(copy.copy(r).read(), html)
+        self.assertEqual(list(br.links())[0].url, "spam")
+
+        newhtml = """<html><body><a href="eggs">click me</a></body></html>"""
+
+        r.set_data(newhtml)
+        self.assertEqual(r.read(), newhtml)
+        self.assertEqual(br.response().read(), html)
+        br.response().set_data(newhtml)
+        self.assertEqual(br.response().read(), html)
+        self.assertEqual(list(br.links())[0].url, "spam")
+        r.seek(0)
+
+        br.set_response(r)
+        self.assertEqual(br.response().read(), newhtml)
+        self.assertEqual(list(br.links())[0].url, "eggs")
+
+    def test_str(self):
+        import mimetools
+        from mechanize import _response
+
+        br = TestBrowser()
+        self.assertEqual(
+            str(br),
+            "<TestBrowser (not visiting a URL)>"
+            )
+
+        fp = StringIO.StringIO('<html><form name="f"><input /></form></html>')
+        headers = mimetools.Message(
+            StringIO.StringIO("Content-type: text/html"))
+        response = _response.response_seek_wrapper(
+            _response.closeable_response(
+            fp, headers, "http://example.com/", 200, "OK"))
+        br.set_response(response)
+        self.assertEqual(
+            str(br),
+            "<TestBrowser visiting http://example.com/>"
+            )
+
+        br.select_form(nr=0)
+        self.assertEqual(
+            str(br),
+            """\
+<TestBrowser visiting http://example.com/
+ selected form:
+ <f GET http://example.com/ application/x-www-form-urlencoded
+  <TextControl(<None>=)>>
+>""")
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_cookies.py
===================================================================
--- mechanize/tags/0.1.10/test/test_cookies.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_cookies.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,1894 @@
+"""Tests for _ClientCookie."""
+
+import sys, urllib2, re, os, StringIO, mimetools, time, tempfile, errno, inspect
+from time import localtime
+from unittest import TestCase
+
+from mechanize._util import hide_experimental_warnings, \
+    reset_experimental_warnings
+
+
+class FakeResponse:
+    def __init__(self, headers=[], url=None):
+        """
+        headers: list of RFC822-style 'Key: value' strings
+        """
+        f = StringIO.StringIO("\n".join(headers))
+        self._headers = mimetools.Message(f)
+        self._url = url
+    def info(self): return self._headers
+    def url(self): return self._url
+
+def interact_2965(cookiejar, url, *set_cookie_hdrs):
+    return _interact(cookiejar, url, set_cookie_hdrs, "Set-Cookie2")
+
+def interact_netscape(cookiejar, url, *set_cookie_hdrs):
+    return _interact(cookiejar, url, set_cookie_hdrs, "Set-Cookie")
+
+def _interact(cookiejar, url, set_cookie_hdrs, hdr_name):
+    """Perform a single request / response cycle, returning Cookie: header."""
+    from mechanize import Request
+    req = Request(url)
+    cookiejar.add_cookie_header(req)
+    cookie_hdr = req.get_header("Cookie", "")
+    headers = []
+    for hdr in set_cookie_hdrs:
+        headers.append("%s: %s" % (hdr_name, hdr))
+    res = FakeResponse(headers, url)
+    cookiejar.extract_cookies(res, req)
+    return cookie_hdr
+
+
+class TempfileTestMixin:
+
+    def setUp(self):
+        self._tempfiles = []
+
+    def tearDown(self):
+        for fn in self._tempfiles:
+            try:
+                os.remove(fn)
+            except IOError, exc:
+                if exc.errno != errno.ENOENT:
+                    raise
+
+    def mktemp(self):
+        fn = tempfile.mktemp()
+        self._tempfiles.append(fn)
+        return fn
+
+
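+# Name of the function two frames above caller() itself; the mock request
+# classes below use this (via log_called) to record which of their methods
+# the CookieJar actually invoked.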
+def caller():
+    return sys._getframe().f_back.f_back.f_code.co_name
+
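+# Set of an object's non-dunder attribute names, used to compare against the
+# set of methods the CookieJar was observed to call.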
+def attribute_names(obj):
+    return set([spec[0] for spec in inspect.getmembers(obj)
+                if not spec[0].startswith("__")])
+
+class CookieJarInterfaceTests(TestCase):
+
+    def test_add_cookie_header(self):
+        from mechanize import CookieJar
+        # verify only these methods are used
+        class MockRequest(object):
+            def __init__(self):
+                self.added_headers = []
+                self.called = set()
+            def log_called(self):
+                self.called.add(caller())
+            def get_full_url(self):
+                self.log_called()
+                return "https://example.com:443"
+            def get_host(self):
+                self.log_called()
+                return "example.com:443"
+            def get_type(self):
+                self.log_called()
+                return "https"
+            def has_header(self, header_name):
+                self.log_called()
+                return False
+            def get_header(self, header_name, default=None):
+                self.log_called()
+                pass  # currently not called
+            def header_items(self):
+                self.log_called()
+                pass  # currently not called
+            def add_unredirected_header(self, key, val):
+                self.log_called()
+                self.added_headers.append((key, val))
+            def is_unverifiable(self):
+                self.log_called()
+                return False
+            @property
+            def port(self):
+                import traceback; traceback.print_stack()
+                self.log_called()
+                pass # currently not used, since urllib2 always sets .port to None
+        jar = CookieJar()
+        interact_netscape(jar, "https://example.com:443",
+                          "foo=bar; port=443; secure")
+        request = MockRequest()
+        jar.add_cookie_header(request)
+        expect_called = attribute_names(MockRequest) - set(
+            ["port", "get_header", "header_items", "log_called"])
+        self.assertEquals(request.called, expect_called)
+        self.assertEquals(request.added_headers, [("Cookie", "foo=bar")])
+
+    def test_extract_cookies(self):
+        from mechanize import CookieJar
+
+        # verify only these methods are used
+
+        class StubMessage(object):
+            def getheaders(self, name):
+                return ["foo=bar; port=443"]
+
+        class StubResponse(object):
+            def info(self):
+                return StubMessage()
+
+        class StubRequest(object):
+            def __init__(self):
+                self.added_headers = []
+                self.called = set()
+            def log_called(self):
+                self.called.add(caller())
+            def get_full_url(self):
+                self.log_called()
+                return "https://example.com:443"
+            def get_host(self):
+                self.log_called()
+                return "example.com:443"
+            def is_unverifiable(self):
+                self.log_called()
+                return False
+            @property
+            def port(self):
+                import traceback; traceback.print_stack()
+                self.log_called()
+                pass # currently not used, since urllib2 always sets .port to None
+        jar = CookieJar()
+        response = StubResponse()
+        request = StubRequest()
+        jar.extract_cookies(response, request)
+        expect_called = attribute_names(StubRequest) - set(
+            ["port", "log_called"])
+        self.assertEquals(request.called, expect_called)
+        self.assertEquals([(cookie.name, cookie.value) for cookie in jar],
+                          [("foo", "bar")])
+
+    def test_unverifiable(self):
+        from mechanize._clientcookie import request_is_unverifiable
+        # .unverifiable was added in mechanize, .is_unverifiable() later got
+        # added in cookielib.  XXX deprecate .unverifiable
+        class StubRequest(object):
+            def __init__(self, attrs):
+                self._attrs = attrs
+                self.accessed = set()
+            def __getattr__(self, name):
+                self.accessed.add(name)
+                try:
+                    return self._attrs[name]
+                except KeyError:
+                    raise AttributeError(name)
+
+        request = StubRequest(dict(is_unverifiable=lambda: False))
+        self.assertEquals(request_is_unverifiable(request), False)
+
+        request = StubRequest(dict(is_unverifiable=lambda: False,
+                                   unverifiable=True))
+        self.assertEquals(request_is_unverifiable(request), False)
+
+        request = StubRequest(dict(unverifiable=False))
+        self.assertEquals(request_is_unverifiable(request), False)
+
+
+class CookieTests(TestCase):
+    # XXX
+    # Get rid of string comparisons where not actually testing str / repr.
+    # .clear() etc.
+    # IP addresses like 50 (single number, no dot) and domain-matching
+    #  functions (and is_HDN)?  See draft RFC 2965 errata.
+    # Strictness switches
+    # is_third_party()
+    # unverifiability / third_party blocking
+    # Netscape cookies work the same as RFC 2965 with regard to port.
+    # Set-Cookie with negative max age.
+    # If turn RFC 2965 handling off, Set-Cookie2 cookies should not clobber
+    #  Set-Cookie cookies.
+    # Cookie2 should be sent if *any* cookies are not V1 (ie. V0 OR V2 etc.).
+    # Cookies (V1 and V0) with no expiry date should be set to be discarded.
+    # RFC 2965 Quoting:
+    #  Should accept unquoted cookie-attribute values?  check errata draft.
+    #   Which are required on the way in and out?
+    #  Should always return quoted cookie-attribute values?
+    # Proper testing of when RFC 2965 clobbers Netscape (waiting for errata).
+    # Path-match on return (same for V0 and V1).
+    # RFC 2965 acceptance and returning rules
+    #  Set-Cookie2 without version attribute is rejected.
+
+    # Netscape peculiarities list from Ronald Tschalar.
+    # The first two still need tests, the rest are covered.
+## - Quoting: only quotes around the expires value are recognized as such
+##   (and yes, some folks quote the expires value); quotes around any other
+##   value are treated as part of the value.
+## - White space: white space around names and values is ignored
+## - Default path: if no path parameter is given, the path defaults to the
+##   path in the request-uri up to, but not including, the last '/'. Note
+##   that this is entirely different from what the spec says.
+## - Commas and other delimiters: Netscape just parses until the next ';'.
+##   This means it will allow commas etc inside values (and yes, both
+##   commas and equals commonly appear in the cookie value). This also
+##   means that if you fold multiple Set-Cookie header fields into one,
+##   comma-separated list, it'll be a headache to parse (at least my head
+##   starts hurting every time I think of that code).
+## - Expires: You'll get all sorts of date formats in the expires,
+##   including empty expires attributes ("expires="). Be as flexible as you
+##   can, and certainly don't expect the weekday to be there; if you can't
+##   parse it, just ignore it and pretend it's a session cookie.
+## - Domain-matching: Netscape uses the 2-dot rule for _all_ domains, not
+##   just the 7 special TLD's listed in their spec. And folks rely on
+##   that...
+
+    def test_policy(self):
+        import mechanize
+        policy = mechanize.DefaultCookiePolicy()
+        jar = mechanize.CookieJar()
+        jar.set_policy(policy)
+        self.assertEquals(jar.get_policy(), policy)
+
+    def test_make_cookies_doesnt_change_jar_state(self):
+        from mechanize import CookieJar, Request, Cookie
+        from mechanize._util import time2netscape
+        from mechanize._response import test_response
+        cookie = Cookie(0, "spam", "eggs",
+                        "80", False,
+                        "example.com", False, False,
+                        "/", False,
+                        False,
+                        None,
+                        False,
+                        "",
+                        "",
+                        {})
+        jar = CookieJar()
+        jar._policy._now = jar._now = int(time.time())
+        jar.set_cookie(cookie)
+        self.assertEquals(len(jar), 1)
+        set_cookie = "spam=eggs; expires=%s" % time2netscape(time.time()- 1000)
+        url = "http://example.com/"
+        response = test_response(url=url, headers=[("Set-Cookie", set_cookie)])
+        jar.make_cookies(response, Request(url))
+        self.assertEquals(len(jar), 1)
+
+    def test_domain_return_ok(self):
+        # test optimization: .domain_return_ok() should filter out most
+        # domains in the CookieJar before we try to access them (because that
+        # may require disk access -- in particular, with MSIECookieJar)
+        # This is only a rough check for performance reasons, so it's not too
+        # critical as long as it's sufficiently liberal.
+        import mechanize
+        pol = mechanize.DefaultCookiePolicy()
+        for url, domain, ok in [
+            ("http://foo.bar.com/", "blah.com", False),
+            ("http://foo.bar.com/", "rhubarb.blah.com", False),
+            ("http://foo.bar.com/", "rhubarb.foo.bar.com", False),
+            ("http://foo.bar.com/", ".foo.bar.com", True),
+            ("http://foo.bar.com/", "foo.bar.com", True),
+            ("http://foo.bar.com/", ".bar.com", True),
+            ("http://foo.bar.com/", "com", True),
+            ("http://foo.com/", "rhubarb.foo.com", False),
+            ("http://foo.com/", ".foo.com", True),
+            ("http://foo.com/", "foo.com", True),
+            ("http://foo.com/", "com", True),
+            ("http://foo/", "rhubarb.foo", False),
+            ("http://foo/", ".foo", True),
+            ("http://foo/", "foo", True),
+            ("http://foo/", "foo.local", True),
+            ("http://foo/", ".local", True),
+            ]:
+            request = mechanize.Request(url)
+            r = pol.domain_return_ok(domain, request)
+            if ok: self.assert_(r)
+            else: self.assert_(not r)
+
+    def test_missing_name(self):
+        from mechanize import MozillaCookieJar, lwp_cookie_str
+
+        # missing = sign in Cookie: header is regarded by Mozilla as a missing
+        # NAME.  WE regard it as a missing VALUE.
+        filename = tempfile.mktemp()
+        c = MozillaCookieJar(filename)
+        interact_netscape(c, "http://www.acme.com/", 'eggs')
+        interact_netscape(c, "http://www.acme.com/", '"spam"; path=/foo/')
+        cookie = c._cookies["www.acme.com"]["/"]['eggs']
+        assert cookie.name == "eggs"
+        assert cookie.value is None
+        cookie = c._cookies["www.acme.com"]['/foo/']['"spam"']
+        assert cookie.name == '"spam"'
+        assert cookie.value is None
+        assert lwp_cookie_str(cookie) == (
+            r'"spam"; path="/foo/"; domain="www.acme.com"; '
+            'path_spec; discard; version=0')
+        old_str = repr(c)
+        c.save(ignore_expires=True, ignore_discard=True)
+        try:
+            c = MozillaCookieJar(filename)
+            c.revert(ignore_expires=True, ignore_discard=True)
+        finally:
+            os.unlink(c.filename)
+        # cookies unchanged apart from lost info re. whether path was specified
+        assert repr(c) == \
+               re.sub("path_specified=%s" % True, "path_specified=%s" % False,
+                      old_str)
+        assert interact_netscape(c, "http://www.acme.com/foo/") == \
+               '"spam"; eggs'
+
+    def test_rfc2109_handling(self):
+        # 2109 cookies have rfc2109 attr set correctly, and are handled
+        # as 2965 or Netscape cookies depending on policy settings
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        for policy, version in [
+            (DefaultCookiePolicy(), 0),
+            (DefaultCookiePolicy(rfc2965=True), 1),
+            (DefaultCookiePolicy(rfc2109_as_netscape=True), 0),
+            (DefaultCookiePolicy(rfc2965=True, rfc2109_as_netscape=True), 0),
+            ]:
+            c = CookieJar(policy)
+            interact_netscape(c, "http://www.example.com/", "ni=ni; Version=1")
+            cookie = c._cookies["www.example.com"]["/"]["ni"]
+            self.assert_(cookie.rfc2109)
+            self.assertEqual(cookie.version, version)
+
+    def test_ns_parser(self):
+        from mechanize import CookieJar
+        from mechanize._clientcookie import DEFAULT_HTTP_PORT
+
+        c = CookieJar()
+        interact_netscape(c, "http://www.acme.com/",
+                          'spam=eggs; DoMain=.acme.com; port; blArgh="feep"')
+        interact_netscape(c, "http://www.acme.com/", 'ni=ni; port=80,8080')
+        interact_netscape(c, "http://www.acme.com:80/", 'nini=ni')
+        interact_netscape(c, "http://www.acme.com:80/", 'foo=bar; expires=')
+        interact_netscape(c, "http://www.acme.com:80/", 'spam=eggs; '
+                          'expires="Foo Bar 25 33:22:11 3022"')
+
+        cookie = c._cookies[".acme.com"]["/"]["spam"]
+        assert cookie.domain == ".acme.com"
+        assert cookie.domain_specified
+        assert cookie.port == DEFAULT_HTTP_PORT
+        assert not cookie.port_specified
+        # case is preserved
+        assert (cookie.has_nonstandard_attr("blArgh") and
+                not cookie.has_nonstandard_attr("blargh"))
+
+        cookie = c._cookies["www.acme.com"]["/"]["ni"]
+        assert cookie.domain == "www.acme.com"
+        assert not cookie.domain_specified
+        assert cookie.port == "80,8080"
+        assert cookie.port_specified
+
+        cookie = c._cookies["www.acme.com"]["/"]["nini"]
+        assert cookie.port is None
+        assert not cookie.port_specified
+
+        # invalid expires should not cause cookie to be dropped
+        foo = c._cookies["www.acme.com"]["/"]["foo"]
+        spam = c._cookies["www.acme.com"]["/"]["spam"]
+        assert foo.expires is None
+        assert spam.expires is None
+
+    def test_ns_parser_special_names(self):
+        # names such as 'expires' are not special in first name=value pair
+        # of Set-Cookie: header
+        from mechanize import CookieJar
+
+        c = CookieJar()
+        interact_netscape(c, "http://www.acme.com/", 'expires=eggs')
+        interact_netscape(c, "http://www.acme.com/", 'version=eggs; spam=eggs')
+
+        cookies = c._cookies["www.acme.com"]["/"]
+        self.assert_(cookies.has_key('expires'))
+        self.assert_(cookies.has_key('version'))
+
+    def test_expires(self):
+        from mechanize._util import time2netscape
+        from mechanize import CookieJar
+
+        # if expires is in future, keep cookie...
+        c = CookieJar()
+        future = time2netscape(time.time()+3600)
+        interact_netscape(c, "http://www.acme.com/", 'spam="bar"; expires=%s' %
+                          future)
+        assert len(c) == 1
+        now = time2netscape(time.time()-1)
+        # ... and if in past or present, discard it
+        interact_netscape(c, "http://www.acme.com/", 'foo="eggs"; expires=%s' %
+                          now)
+        h = interact_netscape(c, "http://www.acme.com/")
+        assert len(c) == 1
+        assert h.find('spam="bar"') != -1 and h.find("foo") == -1
+
+        # max-age takes precedence over expires, and zero max-age is request to
+        # delete both new cookie and any old matching cookie
+        interact_netscape(c, "http://www.acme.com/", 'eggs="bar"; expires=%s' %
+                          future)
+        interact_netscape(c, "http://www.acme.com/", 'bar="bar"; expires=%s' %
+                          future)
+        assert len(c) == 3
+        interact_netscape(c, "http://www.acme.com/", 'eggs="bar"; '
+                          'expires=%s; max-age=0' % future)
+        interact_netscape(c, "http://www.acme.com/", 'bar="bar"; '
+                          'max-age=0; expires=%s' % future)
+        h = interact_netscape(c, "http://www.acme.com/")
+        assert len(c) == 1
+
+        # test expiry at end of session for cookies with no expires attribute
+        interact_netscape(c, "http://www.rhubarb.net/", 'whum="fizz"')
+        assert len(c) == 2
+        c.clear_session_cookies()
+        assert len(c) == 1
+        assert h.find('spam="bar"') != -1
+
+        # XXX RFC 2965 expiry rules (some apply to V0 too)
+
+    def test_default_path(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        # RFC 2965
+        pol = DefaultCookiePolicy(rfc2965=True)
+
+        c = CookieJar(pol)
+        interact_2965(c, "http://www.acme.com/", 'spam="bar"; Version="1"')
+        assert c._cookies["www.acme.com"].has_key("/")
+
+        c = CookieJar(pol)
+        interact_2965(c, "http://www.acme.com/blah", 'eggs="bar"; Version="1"')
+        assert c._cookies["www.acme.com"].has_key("/")
+  
+        c = CookieJar(pol)
+        interact_2965(c, "http://www.acme.com/blah/rhubarb",
+                      'eggs="bar"; Version="1"')
+        assert c._cookies["www.acme.com"].has_key("/blah/")
+
+        c = CookieJar(pol)
+        interact_2965(c, "http://www.acme.com/blah/rhubarb/",
+                      'eggs="bar"; Version="1"')
+        assert c._cookies["www.acme.com"].has_key("/blah/rhubarb/")
+
+        # Netscape
+
+        c = CookieJar()
+        interact_netscape(c, "http://www.acme.com/", 'spam="bar"')
+        assert c._cookies["www.acme.com"].has_key("/")
+
+        c = CookieJar()
+        interact_netscape(c, "http://www.acme.com/blah", 'eggs="bar"')
+        assert c._cookies["www.acme.com"].has_key("/")
+  
+        c = CookieJar()
+        interact_netscape(c, "http://www.acme.com/blah/rhubarb", 'eggs="bar"')
+        assert c._cookies["www.acme.com"].has_key("/blah")
+
+        c = CookieJar()
+        interact_netscape(c, "http://www.acme.com/blah/rhubarb/", 'eggs="bar"')
+        assert c._cookies["www.acme.com"].has_key("/blah/rhubarb")
+
+    def test_escape_path(self):
+        from mechanize._clientcookie import escape_path
+        cases = [
+            # quoted safe
+            ("/foo%2f/bar", "/foo%2F/bar"),
+            ("/foo%2F/bar", "/foo%2F/bar"),
+            # quoted %
+            ("/foo%%/bar", "/foo%%/bar"),
+            # quoted unsafe
+            ("/fo%19o/bar", "/fo%19o/bar"),
+            ("/fo%7do/bar", "/fo%7Do/bar"),
+            # unquoted safe
+            ("/foo/bar&", "/foo/bar&"),
+            ("/foo//bar", "/foo//bar"),
+            ("\176/foo/bar", "\176/foo/bar"),
+            # unquoted unsafe
+            ("/foo\031/bar", "/foo%19/bar"),
+            ("/\175foo/bar", "/%7Dfoo/bar"),
+            # unicode
+            (u"/foo/bar\uabcd", "/foo/bar%EA%AF%8D"),  # UTF-8 encoded
+            ]
+        for arg, result in cases:
+            self.assert_(escape_path(arg) == result)
+
+    def test_request_path(self):
+        from urllib2 import Request
+        from mechanize._clientcookie import request_path
+        # with parameters
+        req = Request("http://www.example.com/rheum/rhaponicum;"
+                      "foo=bar;sing=song?apples=pears&spam=eggs#ni")
+        self.assert_(request_path(req) == "/rheum/rhaponicum;"
+                     "foo=bar;sing=song?apples=pears&spam=eggs#ni")
+        # without parameters
+        req = Request("http://www.example.com/rheum/rhaponicum?"
+                      "apples=pears&spam=eggs#ni")
+        self.assert_(request_path(req) == "/rheum/rhaponicum?"
+                     "apples=pears&spam=eggs#ni")
+        # missing final slash
+        req = Request("http://www.example.com")
+        self.assert_(request_path(req) == "/")
+
+    def test_request_port(self):
+        from urllib2 import Request
+        from mechanize._clientcookie import request_port, DEFAULT_HTTP_PORT
+        req = Request("http://www.acme.com:1234/",
+                      headers={"Host": "www.acme.com:4321"})
+        assert request_port(req) == "1234"
+        req = Request("http://www.acme.com/",
+                      headers={"Host": "www.acme.com:4321"})
+        assert request_port(req) == DEFAULT_HTTP_PORT
+
+    def test_request_host_lc(self):
+        from mechanize import Request
+        from mechanize._clientcookie import request_host_lc
+        # this request is illegal (RFC2616, 14.2.3)
+        req = Request("http://1.1.1.1/",
+                      headers={"Host": "www.acme.com:80"})
+        # libwww-perl wants this response, but that seems wrong (RFC 2616,
+        # section 5.2, point 1., and RFC 2965 section 1, paragraph 3)
+        #assert request_host_lc(req) == "www.acme.com"
+        assert request_host_lc(req) == "1.1.1.1"
+        req = Request("http://www.acme.com/",
+                      headers={"Host": "irrelevant.com"})
+        assert request_host_lc(req) == "www.acme.com"
+        # not actually sure this is a valid Request object, so maybe we should
+        # remove the test for no host in the URL from the request_host_lc function?
+        req = Request("/resource.html",
+                      headers={"Host": "www.acme.com"})
+        assert request_host_lc(req) == "www.acme.com"
+        # port shouldn't be in request-host
+        req = Request("http://www.acme.com:2345/resource.html",
+                      headers={"Host": "www.acme.com:5432"})
+        assert request_host_lc(req) == "www.acme.com"
+        # the _lc function lower-cases the result
+        req = Request("http://EXAMPLE.com")
+        assert request_host_lc(req) == "example.com"
+
+    def test_effective_request_host(self):
+        from mechanize import Request, effective_request_host
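+        # RFC 2965: a request-host that contains no dots has ".local"
+        # appended to form the effective request-host (hence "bob" below).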
+        self.assertEquals(
+            effective_request_host(Request("http://www.EXAMPLE.com/spam")),
+            "www.EXAMPLE.com")
+        self.assertEquals(
+            effective_request_host(Request("http://bob/spam")),
+            "bob.local")
+
+    def test_is_HDN(self):
+        from mechanize._clientcookie import is_HDN
+        assert is_HDN("foo.bar.com")
+        assert is_HDN("1foo2.3bar4.5com")
+        assert not is_HDN("192.168.1.1")
+        assert not is_HDN("")
+        assert not is_HDN(".")
+        assert not is_HDN(".foo.bar.com")
+        assert not is_HDN("..foo")
+        assert not is_HDN("foo.")
+
+    def test_reach(self):
+        from mechanize._clientcookie import reach
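+        # reach (RFC 2965): for a host domain name of the form A.B, where A
+        # has no embedded dots and B has an embedded dot (or is "local"),
+        # the reach is .B; otherwise (including IP addresses) the reach is
+        # the host itself.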
+        assert reach("www.acme.com") == ".acme.com"
+        assert reach("acme.com") == "acme.com"
+        assert reach("acme.local") == ".local"
+        assert reach(".local") == ".local"
+        assert reach(".com") == ".com"
+        assert reach(".") == "."
+        assert reach("") == ""
+        assert reach("192.168.0.1") == "192.168.0.1"
+
+    def test_domain_match(self):
+        from mechanize._clientcookie import domain_match, user_domain_match
+        assert domain_match("192.168.1.1", "192.168.1.1")
+        assert not domain_match("192.168.1.1", ".168.1.1")
+        assert domain_match("x.y.com", "x.Y.com")
+        assert domain_match("x.y.com", ".Y.com")
+        assert not domain_match("x.y.com", "Y.com")
+        assert domain_match("a.b.c.com", ".c.com")
+        assert not domain_match(".c.com", "a.b.c.com")
+        assert domain_match("example.local", ".local")
+        assert not domain_match("blah.blah", "")
+        assert not domain_match("", ".rhubarb.rhubarb")
+        assert domain_match("", "")
+
+        assert user_domain_match("acme.com", "acme.com")
+        assert not user_domain_match("acme.com", ".acme.com")
+        assert user_domain_match("rhubarb.acme.com", ".acme.com")
+        assert user_domain_match("www.rhubarb.acme.com", ".acme.com")
+        assert user_domain_match("x.y.com", "x.Y.com")
+        assert user_domain_match("x.y.com", ".Y.com")
+        assert not user_domain_match("x.y.com", "Y.com")
+        assert user_domain_match("y.com", "Y.com")
+        assert not user_domain_match(".y.com", "Y.com")
+        assert user_domain_match(".y.com", ".Y.com")
+        assert user_domain_match("x.y.com", ".com")
+        assert not user_domain_match("x.y.com", "com")
+        assert not user_domain_match("x.y.com", "m")
+        assert not user_domain_match("x.y.com", ".m")
+        assert not user_domain_match("x.y.com", "")
+        assert not user_domain_match("x.y.com", ".")
+        assert user_domain_match("192.168.1.1", "192.168.1.1")
+        # not both HDNs, so must string-compare equal to match
+        assert not user_domain_match("192.168.1.1", ".168.1.1")
+        assert not user_domain_match("192.168.1.1", ".")
+        # empty string is a special case
+        assert not user_domain_match("192.168.1.1", "")
+
+    def test_wrong_domain(self):
+        """Cookies whose ERH does not domain-match the domain are rejected.
+
+        ERH = effective request-host.
+
+        """
+        # XXX far from complete
+        from mechanize import CookieJar
+        c = CookieJar()
+        interact_2965(c, "http://www.nasty.com/", 'foo=bar; domain=friendly.org; Version="1"')
+        assert len(c) == 0
+
+    def test_strict_domain(self):
+        # Cookies whose domain is a country-code tld like .co.uk should
+        # not be set if CookiePolicy.strict_domain is true.
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        cp = DefaultCookiePolicy(strict_domain=True)
+        cj = CookieJar(policy=cp)
+        interact_netscape(cj, "http://example.co.uk/", 'no=problemo')
+        interact_netscape(cj, "http://example.co.uk/",
+                          'okey=dokey; Domain=.example.co.uk')
+        self.assertEquals(len(cj), 2)
+        for pseudo_tld in [".co.uk", ".org.za", ".tx.us", ".name.us"]:
+            interact_netscape(cj, "http://example.%s/" % pseudo_tld,
+                              'spam=eggs; Domain=.co.uk')
+            self.assertEquals(len(cj), 2)
+        # XXXX This should be compared with the Konqueror (kcookiejar.cpp) and
+        # Mozilla implementations.
+
+    def test_two_component_domain_ns(self):
+        # Netscape: .www.bar.com, www.bar.com, .bar.com, bar.com, no domain should
+        #  all get accepted, as should .acme.com, acme.com and no domain for
+        #  2-component domains like acme.com.
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        c = CookieJar()
+
+        # two-component V0 domain is OK
+        interact_netscape(c, "http://foo.net/", 'ns=bar')
+        assert len(c) == 1
+        assert c._cookies["foo.net"]["/"]["ns"].value == "bar"
+        assert interact_netscape(c, "http://foo.net/") == "ns=bar"
+        # *will* be returned to any other domain (unlike RFC 2965)...
+        assert interact_netscape(c, "http://www.foo.net/") == "ns=bar"
+        # ...unless requested otherwise
+        pol = DefaultCookiePolicy(
+            strict_ns_domain=DefaultCookiePolicy.DomainStrictNonDomain)
+        c.set_policy(pol)
+        assert interact_netscape(c, "http://www.foo.net/") == ""
+
+        # unlike RFC 2965, even explicit two-component domain is OK,
+        # because .foo.net matches foo.net
+        interact_netscape(c, "http://foo.net/foo/",
+                          'spam1=eggs; domain=foo.net')
+        # even if starts with a dot -- in NS rules, .foo.net matches foo.net!
+        interact_netscape(c, "http://foo.net/foo/bar/",
+                          'spam2=eggs; domain=.foo.net')
+        assert len(c) == 3
+        assert c._cookies[".foo.net"]["/foo"]["spam1"].value == "eggs"
+        assert c._cookies[".foo.net"]["/foo/bar"]["spam2"].value == "eggs"
+        assert interact_netscape(c, "http://foo.net/foo/bar/") == \
+               "spam2=eggs; spam1=eggs; ns=bar"
+
+        # top-level domain is too general
+        interact_netscape(c, "http://foo.net/", 'nini="ni"; domain=.net')
+        assert len(c) == 3
+
+##         # Netscape protocol doesn't allow non-special top level domains (such
+##         # as co.uk) in the domain attribute unless there are at least three
+##         # dots in it.
+        # Oh yes it does!  Real implementations don't check this, and real
+        # cookies (of course) rely on that behaviour.
+        interact_netscape(c, "http://foo.co.uk", 'nasty=trick; domain=.co.uk')
+##         assert len(c) == 2
+        assert len(c) == 4
+
+    def test_two_component_domain_rfc2965(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        pol = DefaultCookiePolicy(rfc2965=True)
+        c = CookieJar(pol)
+
+        # two-component V1 domain is OK
+        interact_2965(c, "http://foo.net/", 'foo=bar; Version="1"')
+        assert len(c) == 1
+        assert c._cookies["foo.net"]["/"]["foo"].value == "bar"
+        assert interact_2965(c, "http://foo.net/") == "$Version=1; foo=bar"
+        # won't be returned to any other domain (because domain was implied)
+        assert interact_2965(c, "http://www.foo.net/") == ""
+
+        # unless domain is given explicitly, because then it must be
+        # rewritten to start with a dot: foo.net --> .foo.net, which does
+        # not domain-match foo.net
+        interact_2965(c, "http://foo.net/foo",
+                      'spam=eggs; domain=foo.net; path=/foo; Version="1"')
+        assert len(c) == 1
+        assert interact_2965(c, "http://foo.net/foo") == "$Version=1; foo=bar"
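+        ## (editorial sketch, not part of the test: the rewritten domain and
+        ## the two-component request host fail to domain-match each other, so
+        ## the cookie above is rejected)
+        ##     from mechanize._clientcookie import domain_match
+        ##     domain_match(".foo.net", "foo.net")   # -> False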
+
+        # explicit foo.net from three-component domain www.foo.net *does* get
+        # set, because .foo.net domain-matches .foo.net
+        interact_2965(c, "http://www.foo.net/foo/",
+                      'spam=eggs; domain=foo.net; Version="1"')
+        assert c._cookies[".foo.net"]["/foo/"]["spam"].value == "eggs"
+        assert len(c) == 2
+        assert interact_2965(c, "http://foo.net/foo/") == "$Version=1; foo=bar"
+        assert interact_2965(c, "http://www.foo.net/foo/") == \
+               '$Version=1; spam=eggs; $Domain="foo.net"'
+
+        # top-level domain is too general
+        interact_2965(c, "http://foo.net/",
+                      'ni="ni"; domain=".net"; Version="1"')
+        assert len(c) == 2
+
+        # RFC 2965 doesn't require blocking this
+        interact_2965(c, "http://foo.co.uk/",
+                      'nasty=trick; domain=.co.uk; Version="1"')
+        assert len(c) == 3
+
+    def test_domain_allow(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+        from mechanize import Request
+
+        c = CookieJar(policy=DefaultCookiePolicy(
+            blocked_domains=["acme.com"],
+            allowed_domains=["www.acme.com"]))
+
+        req = Request("http://acme.com/")
+        headers = ["Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/"]
+        res = FakeResponse(headers, "http://acme.com/")
+        c.extract_cookies(res, req)
+        assert len(c) == 0
+
+        req = Request("http://www.acme.com/")
+        res = FakeResponse(headers, "http://www.acme.com/")
+        c.extract_cookies(res, req)
+        assert len(c) == 1
+
+        req = Request("http://www.coyote.com/")
+        res = FakeResponse(headers, "http://www.coyote.com/")
+        c.extract_cookies(res, req)
+        assert len(c) == 1
+
+        # set a cookie with non-allowed domain...
+        req = Request("http://www.coyote.com/")
+        res = FakeResponse(headers, "http://www.coyote.com/")
+        cookies = c.make_cookies(res, req)
+        c.set_cookie(cookies[0])
+        assert len(c) == 2
+        # ... and check it doesn't get returned
+        c.add_cookie_header(req)
+        assert not req.has_header("Cookie")
+
+    def test_domain_block(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+        from mechanize import Request
+
+        #import logging; logging.getLogger("mechanize").setLevel(logging.DEBUG)
+
+        pol = DefaultCookiePolicy(
+            rfc2965=True, blocked_domains=[".acme.com"])
+        c = CookieJar(policy=pol)
+        headers = ["Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/"]
+
+        req = Request("http://www.acme.com/")
+        res = FakeResponse(headers, "http://www.acme.com/")
+        c.extract_cookies(res, req)
+        assert len(c) == 0
+
+        pol.set_blocked_domains(["acme.com"])
+        c.extract_cookies(res, req)
+        assert len(c) == 1
+
+        c.clear()
+        req = Request("http://www.roadrunner.net/")
+        res = FakeResponse(headers, "http://www.roadrunner.net/")
+        c.extract_cookies(res, req)
+        assert len(c) == 1
+        req = Request("http://www.roadrunner.net/")
+        c.add_cookie_header(req)
+        assert (req.has_header("Cookie") and
+                req.has_header("Cookie2"))
+
+        c.clear()
+        pol.set_blocked_domains([".acme.com"])
+        c.extract_cookies(res, req)
+        assert len(c) == 1
+
+        # set a cookie with blocked domain...
+        req = Request("http://www.acme.com/")
+        res = FakeResponse(headers, "http://www.acme.com/")
+        cookies = c.make_cookies(res, req)
+        c.set_cookie(cookies[0])
+        assert len(c) == 2
+        # ... and check it doesn't get returned
+        c.add_cookie_header(req)
+        assert not req.has_header("Cookie")
+
+    def test_secure(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
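+        # exercise both the Netscape and RFC 2965 code paths, with and
+        # without trailing whitespace after the cookie-attributes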
+        for ns in True, False:
+            for whitespace in " ", "":
+                c = CookieJar()
+                if ns:
+                    pol = DefaultCookiePolicy(rfc2965=False)
+                    int = interact_netscape
+                    vs = ""
+                else:
+                    pol = DefaultCookiePolicy(rfc2965=True)
+                    int = interact_2965
+                    vs = "; Version=1"
+                c.set_policy(pol)
+                url = "http://www.acme.com/"
+                int(c, url, "foo1=bar%s%s" % (vs, whitespace))
+                int(c, url, "foo2=bar%s; secure%s" %  (vs, whitespace))
+                assert not c._cookies["www.acme.com"]["/"]["foo1"].secure, \
+                       "non-secure cookie registered secure"
+                assert c._cookies["www.acme.com"]["/"]["foo2"].secure, \
+                       "secure cookie registered non-secure"
+
+    def test_quote_cookie_value(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+        c = CookieJar(policy=DefaultCookiePolicy(rfc2965=True))
+        interact_2965(c, "http://www.acme.com/", r'foo=\b"a"r; Version=1')
+        h = interact_2965(c, "http://www.acme.com/")
+        assert h == r'$Version=1; foo=\\b\"a\"r'
+
+    def test_missing_final_slash(self):
+        # Missing slash from request URL's abs_path should be assumed present.
+        from mechanize import CookieJar, Request, DefaultCookiePolicy
+        url = "http://www.acme.com"
+        c = CookieJar(DefaultCookiePolicy(rfc2965=True))
+        interact_2965(c, url, "foo=bar; Version=1")
+        req = Request(url)
+        assert len(c) == 1
+        c.add_cookie_header(req)
+        assert req.has_header("Cookie")
+
+    def test_domain_mirror(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        pol = DefaultCookiePolicy(rfc2965=True)
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, "spam=eggs; Version=1")
+        h = interact_2965(c, url)
+        assert h.find( "Domain") == -1, \
+               "absent domain returned with domain present"
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, 'spam=eggs; Version=1; Domain=.bar.com')
+        h = interact_2965(c, url)
+        assert h.find('$Domain=".bar.com"') != -1, \
+               "domain not returned"
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        # note missing initial dot in Domain
+        interact_2965(c, url, 'spam=eggs; Version=1; Domain=bar.com')
+        h = interact_2965(c, url)
+        assert h.find('$Domain="bar.com"') != -1, \
+               "domain not returned"
+
+    def test_path_mirror(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        pol = DefaultCookiePolicy(rfc2965=True)
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, "spam=eggs; Version=1")
+        h = interact_2965(c, url)
+        assert h.find("Path") == -1, \
+               "absent path returned with path present"
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, 'spam=eggs; Version=1; Path=/')
+        h = interact_2965(c, url)
+        assert h.find('$Path="/"') != -1, "path not returned"
+
+    def test_port_mirror(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        pol = DefaultCookiePolicy(rfc2965=True)
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, "spam=eggs; Version=1")
+        h = interact_2965(c, url)
+        assert h.find("Port") == -1, \
+               "absent port returned with port present"
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, "spam=eggs; Version=1; Port")
+        h = interact_2965(c, url)
+        assert re.search(r"\$Port([^=]|$)", h), \
+               "port with no value not returned with no value"
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, 'spam=eggs; Version=1; Port="80"')
+        h = interact_2965(c, url)
+        assert h.find('$Port="80"') != -1, \
+               "port with single value not returned with single value"
+
+        c = CookieJar(pol)
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, 'spam=eggs; Version=1; Port="80,8080"')
+        h = interact_2965(c, url)
+        assert h.find('$Port="80,8080"') != -1, \
+               "port with multiple values not returned with multiple values"
+
+    def test_no_return_comment(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        c = CookieJar(DefaultCookiePolicy(rfc2965=True))
+        url = "http://foo.bar.com/"
+        interact_2965(c, url, 'spam=eggs; Version=1; '
+                      'Comment="does anybody read these?"; '
+                      'CommentURL="http://foo.bar.net/comment.html"')
+        h = interact_2965(c, url)
+        assert h.find("Comment") == -1, \
+               "Comment or CommentURL cookie-attributes returned to server"
+
+# just pondering security here -- this isn't really a test (yet)
+##     def test_hack(self):
+##         from mechanize import CookieJar
+
+##         c = CookieJar()
+##         interact_netscape(c, "http://victim.mall.com/",
+##                           'prefs="foo"')
+##         interact_netscape(c, "http://cracker.mall.com/",
+##                           'prefs="bar"; Domain=.mall.com')
+##         interact_netscape(c, "http://cracker.mall.com/",
+##                           '$Version="1"; Domain=.mall.com')
+##         h = interact_netscape(c, "http://victim.mall.com/")
+##         print h
+
+    def test_Cookie_iterator(self):
+        from mechanize import CookieJar, Cookie, DefaultCookiePolicy
+
+        cs = CookieJar(DefaultCookiePolicy(rfc2965=True))
+        # add some random cookies
+        interact_2965(cs, "http://blah.spam.org/", 'foo=eggs; Version=1; '
+                      'Comment="does anybody read these?"; '
+                      'CommentURL="http://foo.bar.net/comment.html"')
+        interact_netscape(cs, "http://www.acme.com/blah/", "spam=bar; secure")
+        interact_2965(cs, "http://www.acme.com/blah/", "foo=bar; secure; Version=1")
+        interact_2965(cs, "http://www.acme.com/blah/", "foo=bar; path=/; Version=1")
+        interact_2965(cs, "http://www.sol.no",
+                      r'bang=wallop; version=1; domain=".sol.no"; '
+                      r'port="90,100, 80,8080"; '
+                      r'max-age=100; Comment = "Just kidding! (\"|\\\\) "')
+
+        versions = [1, 1, 1, 0, 1]
+        names = ["bang", "foo", "foo", "spam", "foo"]
+        domains = [".sol.no", "blah.spam.org", "www.acme.com",
+                   "www.acme.com", "www.acme.com"]
+        paths = ["/", "/", "/", "/blah", "/blah/"]
+
+        # sequential iteration
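+        # (the whole jar is iterated over several times, resetting the index
+        # i at the start of each pass, to check repeated iteration works)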
+        for i in range(4):
+            i = 0
+            for c in cs:
+                assert isinstance(c, Cookie)
+                assert c.version == versions[i]
+                assert c.name == names[i]
+                assert c.domain == domains[i]
+                assert c.path == paths[i]
+                i = i + 1
+
+        self.assertRaises(IndexError, lambda cs=cs : cs[5])
+
+        # can't skip
+        cs[0]
+        cs[1]
+        self.assertRaises(IndexError, lambda cs=cs : cs[3])
+
+        # can't go backwards
+        cs[0]
+        cs[1]
+        cs[2]
+        self.assertRaises(IndexError, lambda cs=cs : cs[1])
+
+    def test_parse_ns_headers(self):
+        from mechanize._headersutil import parse_ns_headers
+
+        # missing domain value (invalid cookie)
+        assert parse_ns_headers(["foo=bar; path=/; domain"]) == [
+            [("foo", "bar"),
+             ("path", "/"), ("domain", None), ("version", "0")]]
+        # invalid expires value
+        assert parse_ns_headers(
+            ["foo=bar; expires=Foo Bar 12 33:22:11 2000"]) == \
+            [[("foo", "bar"), ("expires", None), ("version", "0")]]
+        # missing cookie name (valid cookie)
+        assert parse_ns_headers(["foo"]) == [[("foo", None), ("version", "0")]]
+        # shouldn't add version if header is empty
+        assert parse_ns_headers([""]) == []
+
+    def test_bad_cookie_header(self):
+
+        def cookiejar_from_cookie_headers(headers):
+            from mechanize import CookieJar, Request
+            c = CookieJar()
+            req = Request("http://www.example.com/")
+            r = FakeResponse(headers, "http://www.example.com/")
+            c.extract_cookies(r, req)
+            return c
+
+        # none of these bad headers should cause an exception to be raised
+        for headers in [
+            ["Set-Cookie: "],  # actually, nothing wrong with this
+            ["Set-Cookie2: "],  # ditto
+            # missing domain value
+            ["Set-Cookie2: a=foo; path=/; Version=1; domain"],
+            # bad max-age
+            ["Set-Cookie: b=foo; max-age=oops"],
+            # bad version
+            ["Set-Cookie: b=foo; version=spam"],
+            ]:
+            c = cookiejar_from_cookie_headers(headers)
+            # these bad cookies shouldn't be set
+            assert len(c) == 0
+
+        # cookie with invalid expires is treated as session cookie
+        headers = ["Set-Cookie: c=foo; expires=Foo Bar 12 33:22:11 2000"]
+        c = cookiejar_from_cookie_headers(headers)
+        cookie = c._cookies["www.example.com"]["/"]["c"]
+        assert cookie.expires is None
+
+    def test_cookies_for_request(self):
+        from mechanize import CookieJar, Request
+
+        cj = CookieJar()
+        interact_netscape(cj, "http://example.com/", "short=path")
+        interact_netscape(cj, "http://example.com/longer/path", "longer=path")
+        for_short_path = cj.cookies_for_request(Request("http://example.com/"))
+        self.assertEquals([cookie.name for cookie in for_short_path],
+                          ["short"])
+        for_long_path = cj.cookies_for_request(Request(
+                "http://example.com/longer/path"))
+        self.assertEquals([cookie.name for cookie in for_long_path],
+                          ["longer", "short"])
+
+
+class CookieJarPersistenceTests(TempfileTestMixin, TestCase):
+
+    def _interact(self, cj):
+        year_plus_one = localtime(time.time())[0] + 1
+        interact_2965(cj, "http://www.acme.com/",
+                      "foo1=bar; max-age=100; Version=1")
+        interact_2965(cj, "http://www.acme.com/",
+                      'foo2=bar; port="80"; max-age=100; Discard; Version=1')
+        interact_2965(cj, "http://www.acme.com/", "foo3=bar; secure; Version=1")
+
+        expires = "expires=09-Nov-%d 23:12:40 GMT" % (year_plus_one,)
+        interact_netscape(cj, "http://www.foo.com/",
+                          "fooa=bar; %s" % expires)
+        interact_netscape(cj, "http://www.foo.com/",
+                          "foob=bar; Domain=.foo.com; %s" % expires)
+        interact_netscape(cj, "http://www.foo.com/",
+                          "fooc=bar; Domain=www.foo.com; %s" % expires)
+
+    def test_firefox3_cookiejar_restore(self):
+        try:
+            from mechanize import Firefox3CookieJar
+        except ImportError:
+            pass
+        else:
+            from mechanize import DefaultCookiePolicy
+            filename = self.mktemp()
+            def create_cookiejar():
+                hide_experimental_warnings()
+                try:
+                    cj = Firefox3CookieJar(
+                        filename, policy=DefaultCookiePolicy(rfc2965=True))
+                finally:
+                    reset_experimental_warnings()
+                cj.connect()
+                return cj
+            cj = create_cookiejar()
+            self._interact(cj)
+            self.assertEquals(len(cj), 6)
+            cj.close()
+            cj = create_cookiejar()
+            self.assert_("name='foo1', value='bar'" in repr(cj))
+            self.assertEquals(len(cj), 4)
+
+    def test_firefox3_cookiejar_iteration(self):
+        try:
+            from mechanize import Firefox3CookieJar
+        except ImportError:
+            pass
+        else:
+            from mechanize import DefaultCookiePolicy, Cookie
+            filename = self.mktemp()
+            hide_experimental_warnings()
+            try:
+                cj = Firefox3CookieJar(
+                    filename, policy=DefaultCookiePolicy(rfc2965=True))
+            finally:
+                reset_experimental_warnings()
+            cj.connect()
+            self._interact(cj)
+            summary = "\n".join([str(cookie) for cookie in cj])
+            self.assertEquals(summary,
+                              """\
+<Cookie foo2=bar for www.acme.com:80/>
+<Cookie foo3=bar for www.acme.com/>
+<Cookie foo1=bar for www.acme.com/>
+<Cookie fooa=bar for www.foo.com/>
+<Cookie foob=bar for .foo.com/>
+<Cookie fooc=bar for .www.foo.com/>""")
+
+    def test_firefox3_cookiejar_clear(self):
+        try:
+            from mechanize import Firefox3CookieJar
+        except ImportError:
+            pass
+        else:
+            from mechanize import DefaultCookiePolicy, Cookie
+            filename = self.mktemp()
+            hide_experimental_warnings()
+            try:
+                cj = Firefox3CookieJar(
+                    filename, policy=DefaultCookiePolicy(rfc2965=True))
+            finally:
+                reset_experimental_warnings()
+            cj.connect()
+            self._interact(cj)
+            cj.clear("www.acme.com", "/", "foo2")
+            def summary(): return "\n".join([str(cookie) for cookie in cj])
+            self.assertEquals(summary(),
+                              """\
+<Cookie foo3=bar for www.acme.com/>
+<Cookie foo1=bar for www.acme.com/>
+<Cookie fooa=bar for www.foo.com/>
+<Cookie foob=bar for .foo.com/>
+<Cookie fooc=bar for .www.foo.com/>""")
+            cj.clear("www.acme.com")
+            self.assertEquals(summary(),
+                              """\
+<Cookie fooa=bar for www.foo.com/>
+<Cookie foob=bar for .foo.com/>
+<Cookie fooc=bar for .www.foo.com/>""")
+            # if name is given, so must path and domain
+            self.assertRaises(ValueError, cj.clear, domain=".foo.com",
+                              name="foob")
+            # nonexistent domain
+            self.assertRaises(KeyError, cj.clear, domain=".spam.com")
+
+    def test_firefox3_cookiejar_add_cookie_header(self):
+        try:
+            from mechanize import Firefox3CookieJar
+        except ImportError:
+            pass
+        else:
+            from mechanize import DefaultCookiePolicy, Request
+            filename = self.mktemp()
+            hide_experimental_warnings()
+            try:
+                cj = Firefox3CookieJar(filename)
+            finally:
+                reset_experimental_warnings()
+            cj.connect()
+            # Session cookies (true .discard) and persistent cookies (false
+            # .discard) are stored differently.  Check they both get sent.
+            year_plus_one = localtime(time.time())[0] + 1
+            expires = "expires=09-Nov-%d 23:12:40 GMT" % (year_plus_one,)
+            interact_netscape(cj, "http://www.foo.com/", "fooa=bar")
+            interact_netscape(cj, "http://www.foo.com/",
+                              "foob=bar; %s" % expires)
+            ca, cb = cj
+            self.assert_(ca.discard)
+            self.assertFalse(cb.discard)
+            request = Request("http://www.foo.com/")
+            cj.add_cookie_header(request)
+            self.assertEquals(request.get_header("Cookie"),
+                              "fooa=bar; foob=bar")
+
+    def test_mozilla_cookiejar(self):
+        # Save / load Mozilla/Netscape cookie file format.
+        from mechanize import MozillaCookieJar, DefaultCookiePolicy
+        filename = tempfile.mktemp()
+        c = MozillaCookieJar(filename,
+                             policy=DefaultCookiePolicy(rfc2965=True))
+        self._interact(c)
+
+        def save_and_restore(cj, ignore_discard, filename=filename):
+            from mechanize import MozillaCookieJar, DefaultCookiePolicy
+            try:
+                cj.save(ignore_discard=ignore_discard)
+                new_c = MozillaCookieJar(filename,
+                                         DefaultCookiePolicy(rfc2965=True))
+                new_c.load(ignore_discard=ignore_discard)
+            finally:
+                try: os.unlink(filename)
+                except OSError: pass
+            return new_c
+
+        new_c = save_and_restore(c, True)
+        assert len(new_c) == 6  # none discarded
+        assert repr(new_c).find("name='foo1', value='bar'") != -1
+
+        new_c = save_and_restore(c, False)
+        assert len(new_c) == 4  # 2 of them discarded on save
+        assert repr(new_c).find("name='foo1', value='bar'") != -1
+
+    def test_mozilla_cookiejar_embedded_tab(self):
+        from mechanize import MozillaCookieJar
+        filename = tempfile.mktemp()
+        fh = open(filename, "w")
+        try:
+            fh.write(
+                MozillaCookieJar.header + "\n" +
+                "a.com\tFALSE\t/\tFALSE\t\tname\tval\tstillthevalue\n"
+                "a.com\tFALSE\t/\tFALSE\t\tname2\tvalue\n")
+            fh.close()
+            cj = MozillaCookieJar(filename)
+            cj.revert(ignore_discard=True)
+            cookies = cj._cookies["a.com"]["/"]
+            self.assertEquals(cookies["name"].value, "val\tstillthevalue")
+            self.assertEquals(cookies["name2"].value, "value")
+        finally:
+            try:
+                os.remove(filename)
+            except IOError, exc:
+                if exc.errno != errno.ENOENT:
+                    raise
+
+    def test_mozilla_cookiejar_initial_dot_violation(self):
+        from mechanize import MozillaCookieJar, LoadError
+        filename = tempfile.mktemp()
+        fh = open(filename, "w")
+        try:
+            fh.write(
+                MozillaCookieJar.header + "\n" +
+                ".a.com\tFALSE\t/\tFALSE\t\tname\tvalue\n")
+            fh.close()
+            cj = MozillaCookieJar(filename)
+            self.assertRaises(LoadError, cj.revert, ignore_discard=True)
+        finally:
+            try:
+                os.remove(filename)
+            except IOError, exc:
+                if exc.errno != errno.ENOENT:
+                    raise
+
+
+
+class LWPCookieTests(TestCase, TempfileTestMixin):
+    # Tests taken from libwww-perl, with a few modifications.
+
+    def test_netscape_example_1(self):
+        from mechanize import CookieJar, Request, DefaultCookiePolicy
+
+        #-------------------------------------------------------------------
+        # First we check that it works for the original example at
+        # http://www.netscape.com/newsref/std/cookie_spec.html
+
+        # Client requests a document, and receives in the response:
+        # 
+        #       Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT
+        # 
+        # When client requests a URL in path "/" on this server, it sends:
+        # 
+        #       Cookie: CUSTOMER=WILE_E_COYOTE
+        # 
+        # Client requests a document, and receives in the response:
+        # 
+        #       Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/
+        # 
+        # When client requests a URL in path "/" on this server, it sends:
+        # 
+        #       Cookie: CUSTOMER=WILE_E_COYOTE; PART_NUMBER=ROCKET_LAUNCHER_0001
+        # 
+        # Client receives:
+        # 
+        #       Set-Cookie: SHIPPING=FEDEX; path=/fo
+        # 
+        # When client requests a URL in path "/" on this server, it sends:
+        # 
+        #       Cookie: CUSTOMER=WILE_E_COYOTE; PART_NUMBER=ROCKET_LAUNCHER_0001
+        # 
+        # When client requests a URL in path "/foo" on this server, it sends:
+        # 
+        #       Cookie: CUSTOMER=WILE_E_COYOTE; PART_NUMBER=ROCKET_LAUNCHER_0001; SHIPPING=FEDEX
+        # 
+        # The last Cookie is buggy, because both specifications say that the
+        # most specific cookie must be sent first.  SHIPPING=FEDEX is the
+        # most specific and should thus be first.
+
+        year_plus_one = localtime(time.time())[0] + 1
+
+        headers = []
+
+        c = CookieJar(DefaultCookiePolicy(rfc2965 = True))
+
+        #req = Request("http://1.1.1.1/",
+        #              headers={"Host": "www.acme.com:80"})
+        req = Request("http://www.acme.com:80/",
+                      headers={"Host": "www.acme.com:80"})
+
+        headers.append(
+            "Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/ ; "
+            "expires=Wednesday, 09-Nov-%d 23:12:40 GMT" % year_plus_one)
+        res = FakeResponse(headers, "http://www.acme.com/")
+        c.extract_cookies(res, req)
+
+        req = Request("http://www.acme.com/")
+        c.add_cookie_header(req)
+
+        assert (req.get_header("Cookie") == "CUSTOMER=WILE_E_COYOTE" and
+                req.get_header("Cookie2") == '$Version="1"')
+
+        headers.append("Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/")
+        res = FakeResponse(headers, "http://www.acme.com/")
+        c.extract_cookies(res, req)
+
+        req = Request("http://www.acme.com/foo/bar")
+        c.add_cookie_header(req)
+
+        h = req.get_header("Cookie")
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1)
+
+
+        headers.append('Set-Cookie: SHIPPING=FEDEX; path=/foo')
+        res = FakeResponse(headers, "http://www.acme.com")
+        c.extract_cookies(res, req)
+
+        req = Request("http://www.acme.com/")
+        c.add_cookie_header(req)
+
+        h = req.get_header("Cookie")
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1 and
+                not h.find("SHIPPING=FEDEX") != -1)
+
+
+        req = Request("http://www.acme.com/foo/")
+        c.add_cookie_header(req)
+
+        h = req.get_header("Cookie")
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1 and
+                h.startswith("SHIPPING=FEDEX;"))
+
+    def test_netscape_example_2(self):
+        from mechanize import CookieJar, Request
+
+        # Second Example transaction sequence:
+        # 
+        # Assume all mappings from above have been cleared.
+        # 
+        # Client receives:
+        # 
+        #       Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/
+        # 
+        # When client requests a URL in path "/" on this server, it sends:
+        # 
+        #       Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001
+        # 
+        # Client receives:
+        # 
+        #       Set-Cookie: PART_NUMBER=RIDING_ROCKET_0023; path=/ammo
+        # 
+        # When client requests a URL in path "/ammo" on this server, it sends:
+        # 
+        #       Cookie: PART_NUMBER=RIDING_ROCKET_0023; PART_NUMBER=ROCKET_LAUNCHER_0001
+        # 
+        #       NOTE: There are two name/value pairs named "PART_NUMBER" due to
+        #       the inheritance of the "/" mapping in addition to the "/ammo" mapping. 
+
+        c = CookieJar()
+        headers = []
+
+        req = Request("http://www.acme.com/")
+        headers.append("Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/")
+        res = FakeResponse(headers, "http://www.acme.com/")
+
+        c.extract_cookies(res, req)
+
+        req = Request("http://www.acme.com/")
+        c.add_cookie_header(req)
+
+        assert (req.get_header("Cookie") == "PART_NUMBER=ROCKET_LAUNCHER_0001")
+
+        headers.append(
+            "Set-Cookie: PART_NUMBER=RIDING_ROCKET_0023; path=/ammo")
+        res = FakeResponse(headers, "http://www.acme.com/")
+        c.extract_cookies(res, req)
+
+        req = Request("http://www.acme.com/ammo")
+        c.add_cookie_header(req)
+
+        assert re.search(r"PART_NUMBER=RIDING_ROCKET_0023;\s*"
+                         "PART_NUMBER=ROCKET_LAUNCHER_0001",
+                         req.get_header("Cookie"))
+
+    def test_ietf_example_1(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+        #-------------------------------------------------------------------
+        # Then we test with the examples from draft-ietf-http-state-man-mec-03.txt
+        #
+        # 5.  EXAMPLES
+
+        c = CookieJar(DefaultCookiePolicy(rfc2965=True))
+
+        # 
+        # 5.1  Example 1
+        # 
+        # Most detail of request and response headers has been omitted.  Assume
+        # the user agent has no stored cookies.
+        # 
+        #   1.  User Agent -> Server
+        # 
+        #       POST /acme/login HTTP/1.1
+        #       [form data]
+        # 
+        #       User identifies self via a form.
+        # 
+        #   2.  Server -> User Agent
+        # 
+        #       HTTP/1.1 200 OK
+        #       Set-Cookie2: Customer="WILE_E_COYOTE"; Version="1"; Path="/acme"
+        # 
+        #       Cookie reflects user's identity.
+
+        cookie = interact_2965(
+            c, 'http://www.acme.com/acme/login',
+            'Customer="WILE_E_COYOTE"; Version="1"; Path="/acme"')
+        assert not cookie
+
+        # 
+        #   3.  User Agent -> Server
+        # 
+        #       POST /acme/pickitem HTTP/1.1
+        #       Cookie: $Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme"
+        #       [form data]
+        # 
+        #       User selects an item for ``shopping basket.''
+        # 
+        #   4.  Server -> User Agent
+        # 
+        #       HTTP/1.1 200 OK
+        #       Set-Cookie2: Part_Number="Rocket_Launcher_0001"; Version="1";
+        #               Path="/acme"
+        # 
+        #       Shopping basket contains an item.
+
+        cookie = interact_2965(c, 'http://www.acme.com/acme/pickitem',
+                               'Part_Number="Rocket_Launcher_0001"; '
+                               'Version="1"; Path="/acme"');
+        assert re.search(
+            r'^\$Version="?1"?; Customer="?WILE_E_COYOTE"?; \$Path="/acme"$',
+            cookie)
+
+        # 
+        #   5.  User Agent -> Server
+        # 
+        #       POST /acme/shipping HTTP/1.1
+        #       Cookie: $Version="1";
+        #               Customer="WILE_E_COYOTE"; $Path="/acme";
+        #               Part_Number="Rocket_Launcher_0001"; $Path="/acme"
+        #       [form data]
+        # 
+        #       User selects shipping method from form.
+        # 
+        #   6.  Server -> User Agent
+        # 
+        #       HTTP/1.1 200 OK
+        #       Set-Cookie2: Shipping="FedEx"; Version="1"; Path="/acme"
+        # 
+        #       New cookie reflects shipping method.
+
+        cookie = interact_2965(c, "http://www.acme.com/acme/shipping",
+                               'Shipping="FedEx"; Version="1"; Path="/acme"')
+
+        assert (re.search(r'^\$Version="?1"?;', cookie) and
+                re.search(r'Part_Number="?Rocket_Launcher_0001"?;'
+                          '\s*\$Path="\/acme"', cookie) and
+                re.search(r'Customer="?WILE_E_COYOTE"?;\s*\$Path="\/acme"',
+                          cookie))
+
+        # 
+        #   7.  User Agent -> Server
+        # 
+        #       POST /acme/process HTTP/1.1
+        #       Cookie: $Version="1";
+        #               Customer="WILE_E_COYOTE"; $Path="/acme";
+        #               Part_Number="Rocket_Launcher_0001"; $Path="/acme";
+        #               Shipping="FedEx"; $Path="/acme"
+        #       [form data]
+        # 
+        #       User chooses to process order.
+        # 
+        #   8.  Server -> User Agent
+        # 
+        #       HTTP/1.1 200 OK
+        # 
+        #       Transaction is complete.
+
+        cookie = interact_2965(c, "http://www.acme.com/acme/process")
+        assert (re.search(r'Shipping="?FedEx"?;\s*\$Path="\/acme"', cookie) and
+                cookie.find("WILE_E_COYOTE") != -1)
+
+        # 
+        # The user agent makes a series of requests on the origin server, after
+        # each of which it receives a new cookie.  All the cookies have the same
+        # Path attribute and (default) domain.  Because the request URLs all have
+        # /acme as a prefix, and that matches the Path attribute, each request
+        # contains all the cookies received so far.
+
+    def test_ietf_example_2(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        # 5.2  Example 2
+        # 
+        # This example illustrates the effect of the Path attribute.  All detail
+        # of request and response headers has been omitted.  Assume the user agent
+        # has no stored cookies.
+
+        c = CookieJar(DefaultCookiePolicy(rfc2965=True))
+
+        # Imagine the user agent has received, in response to earlier requests,
+        # the response headers
+        # 
+        # Set-Cookie2: Part_Number="Rocket_Launcher_0001"; Version="1";
+        #         Path="/acme"
+        # 
+        # and
+        # 
+        # Set-Cookie2: Part_Number="Riding_Rocket_0023"; Version="1";
+        #         Path="/acme/ammo"
+
+        interact_2965(
+            c, "http://www.acme.com/acme/ammo/specific",
+            'Part_Number="Rocket_Launcher_0001"; Version="1"; Path="/acme"',
+            'Part_Number="Riding_Rocket_0023"; Version="1"; Path="/acme/ammo"')
+
+        # A subsequent request by the user agent to the (same) server for URLs of
+        # the form /acme/ammo/...  would include the following request header:
+        # 
+        # Cookie: $Version="1";
+        #         Part_Number="Riding_Rocket_0023"; $Path="/acme/ammo";
+        #         Part_Number="Rocket_Launcher_0001"; $Path="/acme"
+        # 
+        # Note that the NAME=VALUE pair for the cookie with the more specific Path
+        # attribute, /acme/ammo, comes before the one with the less specific Path
+        # attribute, /acme.  Further note that the same cookie name appears more
+        # than once.
+
+        cookie = interact_2965(c, "http://www.acme.com/acme/ammo/...")
+        assert re.search(r"Riding_Rocket_0023.*Rocket_Launcher_0001", cookie)
+
+        # A subsequent request by the user agent to the (same) server for a URL of
+        # the form /acme/parts/ would include the following request header:
+        # 
+        # Cookie: $Version="1"; Part_Number="Rocket_Launcher_0001"; $Path="/acme"
+        # 
+        # Here, the second cookie's Path attribute /acme/ammo is not a prefix of
+        # the request URL, /acme/parts/, so the cookie does not get forwarded to
+        # the server.
+
+        cookie = interact_2965(c, "http://www.acme.com/acme/parts/")
+        assert (cookie.find("Rocket_Launcher_0001") != -1 and
+                not cookie.find("Riding_Rocket_0023") != -1)
+
+    def test_rejection(self):
+        # Test rejection of Set-Cookie2 responses based on domain, path, port.
+        from mechanize import LWPCookieJar, DefaultCookiePolicy
+
+        pol = DefaultCookiePolicy(rfc2965=True)
+
+        c = LWPCookieJar(policy=pol)
+
+        max_age = "max-age=3600"
+
+        # illegal domain (no embedded dots)
+        cookie = interact_2965(c, "http://www.acme.com",
+                               'foo=bar; domain=".com"; version=1')
+        assert not c
+
+        # legal domain
+        cookie = interact_2965(c, "http://www.acme.com",
+                               'ping=pong; domain="acme.com"; version=1')
+        assert len(c) == 1
+
+        # illegal domain (host prefix "www.a" contains a dot)
+        cookie = interact_2965(c, "http://www.a.acme.com",
+                               'whiz=bang; domain="acme.com"; version=1')
+        assert len(c) == 1
+
+        # legal domain
+        cookie = interact_2965(c, "http://www.a.acme.com",
+                               'wow=flutter; domain=".a.acme.com"; version=1')
+        assert len(c) == 2
+
+        # can't partially match an IP-address
+        cookie = interact_2965(c, "http://125.125.125.125",
+                               'zzzz=ping; domain="125.125.125"; version=1')
+        assert len(c) == 2
+
+        # illegal path (must be prefix of request path)
+        cookie = interact_2965(c, "http://www.sol.no",
+                               'blah=rhubarb; domain=".sol.no"; path="/foo"; '
+                               'version=1')
+        assert len(c) == 2
+
+        # legal path
+        cookie = interact_2965(c, "http://www.sol.no/foo/bar",
+                               'bing=bong; domain=".sol.no"; path="/foo"; '
+                               'version=1')
+        assert len(c) == 3
+
+        # illegal port (request-port not in list)
+        cookie = interact_2965(c, "http://www.sol.no",
+                               'whiz=ffft; domain=".sol.no"; port="90,100"; '
+                               'version=1')
+        assert len(c) == 3
+
+        # legal port
+        cookie = interact_2965(
+            c, "http://www.sol.no",
+            r'bang=wallop; version=1; domain=".sol.no"; '
+            r'port="90,100, 80,8080"; '
+            r'max-age=100; Comment = "Just kidding! (\"|\\\\) "')
+        assert len(c) == 4
+
+        # port attribute without any value (current port)
+        cookie = interact_2965(c, "http://www.sol.no",
+                               'foo9=bar; version=1; domain=".sol.no"; port; '
+                               'max-age=100;')
+        assert len(c) == 5
+
+        # encoded path
+        # LWP has this test, but unescaping allowed path characters seems
+        # like a bad idea, so I think this should fail:
+##         cookie = interact_2965(c, "http://www.sol.no/foo/",
+##                           r'foo8=bar; version=1; path="/%66oo"')
+        # but this is OK, because '<' is not an allowed HTTP URL path
+        # character:
+        cookie = interact_2965(c, "http://www.sol.no/<oo/",
+                               r'foo8=bar; version=1; path="/%3coo"')
+        assert len(c) == 6
+
+        # save and restore
+        filename = tempfile.mktemp()
+
+        try:
+            c.save(filename, ignore_discard=True)
+            old = repr(c)
+
+            c = LWPCookieJar(policy=pol)
+            c.load(filename, ignore_discard=True)
+        finally:
+            try: os.unlink(filename)
+            except OSError: pass
+
+        assert old == repr(c)
+
+    def test_url_encoding(self):
+        # Try some URL encodings of the PATHs.
+        # (the behaviour here has changed from libwww-perl)
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        c = CookieJar(DefaultCookiePolicy(rfc2965=True))
+
+        interact_2965(c, "http://www.acme.com/foo%2f%25/%3c%3c%0Anew%E5/%E5",
+                      "foo  =   bar; version    =   1")
+
+        cookie = interact_2965(
+            c, "http://www.acme.com/foo%2f%25/<<%0anew\345/\346\370\345",
+            'bar=baz; path="/foo/"; version=1');
+        version_re = re.compile(r'^\$version=\"?1\"?', re.I)
+        assert (cookie.find("foo=bar") != -1 and
+                version_re.search(cookie))
+
+        cookie = interact_2965(
+            c, "http://www.acme.com/foo/%25/<<%0anew\345/\346\370\345")
+        assert not cookie
+
+        # unicode URL doesn't raise exception, as it used to!
+        cookie = interact_2965(c, u"http://www.acme.com/\xfc")
+
+    def test_netscape_misc(self):
+        # Some additional Netscape cookies tests.
+        from mechanize import CookieJar, Request
+
+        c = CookieJar()
+        headers = []
+        req = Request("http://foo.bar.acme.com/foo")
+
+        # Netscape allows a host part that contains dots
+        headers.append("Set-Cookie: Customer=WILE_E_COYOTE; domain=.acme.com")
+        res = FakeResponse(headers, "http://www.acme.com/foo")
+        c.extract_cookies(res, req)
+
+        # Netscape also allows the domain to be the same as the request host,
+        # without a leading dot being added to the domain.  The cookie value
+        # should not be quoted even if it contains strange chars (here, a
+        # comma).
+        headers.append("Set-Cookie: PART_NUMBER=3,4; domain=foo.bar.acme.com")
+        res = FakeResponse(headers, "http://www.acme.com/foo")
+        c.extract_cookies(res, req)
+
+        req = Request("http://foo.bar.acme.com/foo")
+        c.add_cookie_header(req)
+        assert (
+            req.get_header("Cookie").find("PART_NUMBER=3,4") != -1 and
+            req.get_header("Cookie").find("Customer=WILE_E_COYOTE") != -1)
+
+    def test_intranet_domains_2965(self):
+        # Test handling of local intranet hostnames without a dot.
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        c = CookieJar(DefaultCookiePolicy(rfc2965=True))
+        interact_2965(c, "http://example/",
+                      "foo1=bar; PORT; Discard; Version=1;")
+        cookie = interact_2965(c, "http://example/",
+                               'foo2=bar; domain=".local"; Version=1')
+        assert cookie.find("foo1=bar") >= 0
+
+        interact_2965(c, "http://example/", 'foo3=bar; Version=1')
+        cookie = interact_2965(c, "http://example/")
+        assert cookie.find("foo2=bar") >= 0 and len(c) == 3
+
+    def test_intranet_domains_ns(self):
+        from mechanize import CookieJar, DefaultCookiePolicy
+
+        c = CookieJar(DefaultCookiePolicy(rfc2965 = False))
+        interact_netscape(c, "http://example/", "foo1=bar")
+        cookie = interact_netscape(c, "http://example/",
+                                   'foo2=bar; domain=.local')
+        assert len(c) == 2
+        assert cookie.find("foo1=bar") >= 0
+
+        cookie = interact_netscape(c, "http://example/")
+        assert cookie.find("foo2=bar") >= 0 and len(c) == 2
+
+    def test_empty_path(self):
+        from mechanize import CookieJar, Request, DefaultCookiePolicy
+
+        # Test for empty path
+        # The broken web server ORION/1.3.38 returns a response to the client
+        # like
+        #
+        #	Set-Cookie: JSESSIONID=ABCDERANDOM123; Path=
+        #
+        # i.e. with Path set to nothing.
+        # In this case, extract_cookies() must set the cookie path to / (root).
+        c = CookieJar(DefaultCookiePolicy(rfc2965 = True))
+        headers = []
+
+        req = Request("http://www.ants.com/")
+        headers.append("Set-Cookie: JSESSIONID=ABCDERANDOM123; Path=")
+        res = FakeResponse(headers, "http://www.ants.com/")
+        c.extract_cookies(res, req)
+
+        req = Request("http://www.ants.com/")
+        c.add_cookie_header(req)
+
+        assert (req.get_header("Cookie") == "JSESSIONID=ABCDERANDOM123" and
+                req.get_header("Cookie2") == '$Version="1"')
+
+        # missing path in the request URI
+        req = Request("http://www.ants.com:8080")
+        c.add_cookie_header(req)
+
+        assert (req.get_header("Cookie") == "JSESSIONID=ABCDERANDOM123" and
+                req.get_header("Cookie2") == '$Version="1"')
+
+# The correctness of this test is undefined, in the absence of RFC 2965 errata.
+##     def test_netscape_rfc2965_interop(self):
+##         # Test mixing of Set-Cookie and Set-Cookie2 headers.
+##         from mechanize import CookieJar
+
+##         # Example from http://www.trip.com/trs/trip/flighttracker/flight_tracker_home.xsl
+##         # which gives up these headers:
+##         #
+##         # HTTP/1.1 200 OK
+##         # Connection: close
+##         # Date: Fri, 20 Jul 2001 19:54:58 GMT
+##         # Server: Apache/1.3.19 (Unix) ApacheJServ/1.1.2
+##         # Content-Type: text/html
+##         # Content-Type: text/html; charset=iso-8859-1
+##         # Link: </trip/stylesheet.css>; rel="stylesheet"; type="text/css"
+##         # Servlet-Engine: Tomcat Web Server/3.2.1 (JSP 1.1; Servlet 2.2; Java 1.3.0; SunOS 5.8 sparc; java.vendor=Sun Microsystems Inc.)
+##         # Set-Cookie: trip.appServer=1111-0000-x-024;Domain=.trip.com;Path=/
+##         # Set-Cookie: JSESSIONID=fkumjm7nt1.JS24;Path=/trs
+##         # Set-Cookie2: JSESSIONID=fkumjm7nt1.JS24;Version=1;Discard;Path="/trs"
+##         # Title: TRIP.com Travel - FlightTRACKER
+##         # X-Meta-Description: Trip.com privacy policy
+##         # X-Meta-Keywords: privacy policy
+
+##         req = urllib2.Request(
+##             'http://www.trip.com/trs/trip/flighttracker/flight_tracker_home.xsl')
+##         headers = []
+##         headers.append("Set-Cookie: trip.appServer=1111-0000-x-024;Domain=.trip.com;Path=/")
+##         headers.append("Set-Cookie: JSESSIONID=fkumjm7nt1.JS24;Path=/trs")
+##         headers.append('Set-Cookie2: JSESSIONID=fkumjm7nt1.JS24;Version=1;Discard;Path="/trs"')
+##         res = FakeResponse(
+##             headers,
+##             'http://www.trip.com/trs/trip/flighttracker/flight_tracker_home.xsl')
+##         #print res
+
+##         c = CookieJar()
+##         c.extract_cookies(res, req)
+##         #print c
+##         print str(c)
+##         print """Set-Cookie3: trip.appServer="1111-0000-x-024"; path="/"; domain=".trip.com"; path_spec; discard; version=0
+##         Set-Cookie3: JSESSIONID="fkumjm7nt1.JS24"; path="/trs"; domain="www.trip.com"; path_spec; discard; version=1
+##         """
+##         assert c.as_lwp_str() == """Set-Cookie3: trip.appServer="1111-0000-x-024"; path="/"; domain=".trip.com"; path_spec; discard; version=0
+##         Set-Cookie3: JSESSIONID="fkumjm7nt1.JS24"; path="/trs"; domain="www.trip.com"; path_spec; discard; version=1
+##         """
+
+    def test_session_cookies(self):
+        from mechanize import CookieJar, Request
+
+        year_plus_one = localtime(time.time())[0] + 1
+
+        # Check session cookies are deleted properly by
+        # CookieJar.clear_session_cookies method
+
+        req = Request('http://www.perlmeister.com/scripts')
+        headers = []
+        headers.append("Set-Cookie: s1=session;Path=/scripts")
+        headers.append("Set-Cookie: p1=perm; Domain=.perlmeister.com;"
+                       "Path=/;expires=Fri, 02-Feb-%d 23:24:20 GMT" %
+                       year_plus_one)
+        headers.append("Set-Cookie: p2=perm;Path=/;expires=Fri, "
+                       "02-Feb-%d 23:24:20 GMT" % year_plus_one)
+        headers.append("Set-Cookie: s2=session;Path=/scripts;"
+                       "Domain=.perlmeister.com")
+        headers.append('Set-Cookie2: s3=session;Version=1;Discard;Path="/"')
+        res = FakeResponse(headers, 'http://www.perlmeister.com/scripts')
+
+        c = CookieJar()
+        c.extract_cookies(res, req)
+        # How many session/permanent cookies do we have?
+        counter = {"session_after": 0,
+                   "perm_after": 0,
+                   "session_before": 0,
+                   "perm_before": 0}
+        for cookie in c:
+            key = "%s_before" % cookie.value
+            counter[key] = counter[key] + 1
+        c.clear_session_cookies()
+        # How many now?
+        for cookie in c:
+            key = "%s_after" % cookie.value
+            counter[key] = counter[key] + 1
+
+        assert not (
+            # a permanent cookie got lost accidentally
+            counter["perm_after"] != counter["perm_before"] or
+            # a session cookie hasn't been cleared
+            counter["session_after"] != 0 or
+            # we didn't have session cookies in the first place
+            counter["session_before"] == 0)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_date.py
===================================================================
--- mechanize/tags/0.1.10/test/test_date.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_date.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,104 @@
+"""Tests for ClientCookie._HTTPDate."""
+
+import re, time
+from unittest import TestCase
+
+class DateTimeTests(TestCase):
+
+    def test_time2isoz(self):
+        from mechanize._util import time2isoz
+
+        base = 1019227000
+        day = 24*3600
+        assert time2isoz(base) == "2002-04-19 14:36:40Z"
+        assert time2isoz(base+day) == "2002-04-20 14:36:40Z"
+        assert time2isoz(base+2*day) == "2002-04-21 14:36:40Z"
+        assert time2isoz(base+3*day) == "2002-04-22 14:36:40Z"
+
+        az = time2isoz()
+        bz = time2isoz(500000)
+        for text in (az, bz):
+            assert re.search(r"^\d{4}-\d\d-\d\d \d\d:\d\d:\d\dZ$", text), \
+                   "bad time2isoz format: %s %s" % (az, bz)
+
+    def test_parse_date(self):
+        from mechanize._util import http2time
+
+        def parse_date(text, http2time=http2time):
+            return time.gmtime(http2time(text))[:6]
+
+        assert parse_date("01 Jan 2001") == (2001, 1, 1, 0, 0, 0.0)
+
+        # this test will break around year 2070
+        assert parse_date("03-Feb-20") == (2020, 2, 3, 0, 0, 0.0)
+
+        # this test will break around year 2048
+        assert parse_date("03-Feb-98") == (1998, 2, 3, 0, 0, 0.0)
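+
+        # A minimal sketch (not mechanize's own code) of the kind of
+        # two-digit-year heuristic implied by the comments above: a two-digit
+        # year more than 50 years in the past is assumed to belong to the
+        # next century, which is why the cases above break around 2070 and
+        # 2048 respectively.
+        def guess_year(two_digit_year, current_year):
+            year = 1900 + two_digit_year
+            if year < current_year - 50:
+                year += 100
+            return year
+        assert guess_year(20, 2008) == 2020
+        assert guess_year(98, 2008) == 1998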
+
+    def test_http2time_formats(self):
+        from mechanize._util import http2time, time2isoz
+
+        # test http2time for supported dates.  Test cases with 2 digit year
+        # will probably break in year 2044.
+        tests = [
+         'Thu, 03 Feb 1994 00:00:00 GMT',  # proposed new HTTP format
+         'Thursday, 03-Feb-94 00:00:00 GMT',  # old rfc850 HTTP format
+         'Thursday, 03-Feb-1994 00:00:00 GMT',  # broken rfc850 HTTP format
+
+         '03 Feb 1994 00:00:00 GMT',  # HTTP format (no weekday)
+         '03-Feb-94 00:00:00 GMT',  # old rfc850 (no weekday)
+         '03-Feb-1994 00:00:00 GMT',  # broken rfc850 (no weekday)
+         '03-Feb-1994 00:00 GMT',  # broken rfc850 (no weekday, no seconds)
+         '03-Feb-1994 00:00',  # broken rfc850 (no weekday, no seconds, no tz)
+
+         '03-Feb-94',  # old rfc850 HTTP format (no weekday, no time)
+         '03-Feb-1994',  # broken rfc850 HTTP format (no weekday, no time)
+         '03 Feb 1994',  # proposed new HTTP format (no weekday, no time)
+
+         # A few tests with extra space at various places
+         '  03   Feb   1994  0:00  ',
+         '  03-Feb-1994  ',
+        ]
+
+        test_t = 760233600  # assume broken POSIX counting of seconds
+        result = time2isoz(test_t)
+        expected = "1994-02-03 00:00:00Z"
+        assert result == expected, \
+               "%s  =>  '%s' (%s)" % (test_t, result, expected)
+
+        for s in tests:
+            t = http2time(s)
+            t2 = http2time(s.lower())
+            t3 = http2time(s.upper())
+
+            assert t == t2 == t3 == test_t, \
+                   "'%s'  =>  %s, %s, %s (%s)" % (s, t, t2, t3, test_t)
+
+    def test_http2time_garbage(self):
+        from mechanize._util import http2time
+
+        for test in [
+            '', 'Garbage',
+            'Mandag 16. September 1996',
+
+            '01-00-1980',
+            '01-13-1980',
+            '00-01-1980',
+            '32-01-1980',
+            '01-01-1980 25:00:00',
+            '01-01-1980 00:61:00',
+            '01-01-1980 00:00:62']:
+
+            bad = False
+
+            if http2time(test) is not None:
+                print "http2time(%s) is not None" % (test,)
+                print "http2time(test)", http2time(test)
+                bad = True
+
+            assert not bad
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_forms.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_forms.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_forms.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,59 @@
+Integration regression test for the case where ClientForm handled RFC 3986
+URL unparsing incorrectly (it was using "" in place of None for the
+fragment, because it continued to support the stdlib urlparse module
+as well as mechanize._rfc3986).  Fixed in ClientForm r33622.
+
+>>> import mechanize
+>>> from mechanize._response import test_response
+
+>>> def forms():
+...     forms = []
+...     for method in ["GET", "POST"]:
+...         data = ('<form action="" method="%s">'
+...         '<input type="submit" name="s"/></form>' % method
+...         )
+...         br = mechanize.Browser()
+...         response = test_response(data, [("content-type", "text/html")])
+...         br.set_response(response)
+...         br.select_form(nr=0)
+...         forms.append(br.form)
+...     return forms
+
+>>> getform, postform = forms()
+>>> getform.click().get_full_url()
+'http://example.com/?s='
+>>> postform.click().get_full_url()
+'http://example.com/'
+
+>>> data = '<form action=""><isindex /></form>'
+>>> br = mechanize.Browser()
+>>> response = test_response(data, [("content-type", "text/html")])
+>>> br.set_response(response)
+>>> br.select_form(nr=0)
+>>> br.find_control(type="isindex").value = "blah"
+>>> br.click(type="isindex").get_full_url()
+'http://example.com/?blah'
+
+
+If something (e.g. calling .forms()) triggers parsing, and parsing
+fails, the next attempt should not succeed either!  That used to happen
+because the response held by LinksFactory etc. was stale, since it had
+already been .read().  Fixed by calling Factory.set_response() on
+error.
+
+>>> import mechanize
+>>> br = mechanize.Browser()
+>>> r = mechanize._response.test_html_response("""\
+... <form>
+... <input type="text" name="foo" value="a"></input><!!!>
+... <input type="text" name="bar" value="b"></input>
+... </form>
+... """)
+>>> br.set_response(r)
+>>> try:
+...     br.select_form(nr=0)
+... except mechanize.ParseError:
+...     pass
+>>> br.select_form(nr=0)  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError: expected name token

Added: mechanize/tags/0.1.10/test/test_headers.py
===================================================================
--- mechanize/tags/0.1.10/test/test_headers.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_headers.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,139 @@
+"""Tests for ClientCookie._HeadersUtil."""
+
+from unittest import TestCase
+
+class IsHtmlTests(TestCase):
+    def test_is_html(self):
+        from mechanize._headersutil import is_html
+        for allow_xhtml in False, True:
+            for cths, ext, expect in [
+                (["text/html"], ".html", True),
+                (["text/html", "text/plain"], ".html", True),
+                # Content-type takes priority over file extension from URL
+                (["text/html"], ".txt", True),
+                (["text/plain"], ".html", False),
+                # use extension if no Content-Type
+                ([], ".html", True),
+                ([], ".gif", False),
+                # don't regard XHTML as HTML (unless user explicitly asks for it),
+                # since we don't yet handle XML properly
+                ([], ".xhtml", allow_xhtml),
+                (["text/xhtml"], ".xhtml", allow_xhtml),
+                ]:
+                url = "http://example.com/foo"+ext
+                self.assertEqual(expect, is_html(cths, url, allow_xhtml))
+
+class HeaderTests(TestCase):
+    def test_parse_ns_headers_expires(self):
+        from mechanize._headersutil import parse_ns_headers
+
+        # quotes should be stripped
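+        # (2209069412L is 01 Jan 2040 22:23:32 GMT expressed as seconds since
+        # the Unix epoch)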
+        assert parse_ns_headers(['foo=bar; expires=01 Jan 2040 22:23:32 GMT']) == \
+               [[('foo', 'bar'), ('expires', 2209069412L), ('version', '0')]]
+        assert parse_ns_headers(['foo=bar; expires="01 Jan 2040 22:23:32 GMT"']) == \
+               [[('foo', 'bar'), ('expires', 2209069412L), ('version', '0')]]
+
+    def test_parse_ns_headers_version(self):
+        from mechanize._headersutil import parse_ns_headers
+
+        # quotes should be stripped
+        expected = [[('foo', 'bar'), ('version', '1')]]
+        for hdr in [
+            'foo=bar; version="1"',
+            'foo=bar; Version="1"',
+            ]:
+            self.assertEquals(parse_ns_headers([hdr]), expected)
+
+    def test_parse_ns_headers_special_names(self):
+        # names such as 'expires' are not special in first name=value pair
+        # of Set-Cookie: header
+        from mechanize._headersutil import parse_ns_headers
+
+        # Cookie with name 'expires'
+        hdr = 'expires=01 Jan 2040 22:23:32 GMT'
+        expected = [[("expires", "01 Jan 2040 22:23:32 GMT"), ("version", "0")]]
+        self.assertEquals(parse_ns_headers([hdr]), expected)
+
+    def test_join_header_words(self):
+        from mechanize._headersutil import join_header_words
+
+        assert join_header_words([[
+            ("foo", None), ("bar", "baz"), (None, "value")
+            ]]) == "foo; bar=baz; value"
+
+        assert join_header_words([[]]) == ""
+
+    def test_split_header_words(self):
+        from mechanize._headersutil import split_header_words
+
+        tests = [
+            ("foo", [[("foo", None)]]),
+            ("foo=bar", [[("foo", "bar")]]),
+            ("   foo   ", [[("foo", None)]]),
+            ("   foo=   ", [[("foo", "")]]),
+            ("   foo=", [[("foo", "")]]),
+            ("   foo=   ; ", [[("foo", "")]]),
+            ("   foo=   ; bar= baz ", [[("foo", ""), ("bar", "baz")]]),
+            ("foo=bar bar=baz", [[("foo", "bar"), ("bar", "baz")]]),
+            # doesn't really matter if this next fails, but it works ATM
+            ("foo= bar=baz", [[("foo", "bar=baz")]]),
+            ("foo=bar;bar=baz", [[("foo", "bar"), ("bar", "baz")]]),
+            ('foo bar baz', [[("foo", None), ("bar", None), ("baz", None)]]),
+            ("a, b, c", [[("a", None)], [("b", None)], [("c", None)]]),
+            (r'foo; bar=baz, spam=, foo="\,\;\"", bar= ',
+             [[("foo", None), ("bar", "baz")],
+              [("spam", "")], [("foo", ',;"')], [("bar", "")]]),
+            ]
+
+        for arg, expect in tests:
+            try:
+                result = split_header_words([arg])
+            except:
+                import traceback, StringIO
+                f = StringIO.StringIO()
+                traceback.print_exc(None, f)
+                result = "(error -- traceback follows)\n\n%s" % f.getvalue()
+            assert result == expect, """
+When parsing: '%s'
+Expected:     '%s'
+Got:          '%s'
+""" % (arg, expect, result)
+
+    def test_roundtrip(self):
+        from mechanize._headersutil import split_header_words, join_header_words
+
+        tests = [
+            ("foo", "foo"),
+            ("foo=bar", "foo=bar"),
+            ("   foo   ", "foo"),
+            ("foo=", 'foo=""'),
+            ("foo=bar bar=baz", "foo=bar; bar=baz"),
+            ("foo=bar;bar=baz", "foo=bar; bar=baz"),
+            ('foo bar baz', "foo; bar; baz"),
+            (r'foo="\"" bar="\\"', r'foo="\""; bar="\\"'),
+            ('foo,,,bar', 'foo, bar'),
+            ('foo=bar,bar=baz', 'foo=bar, bar=baz'),
+
+            ('text/html; charset=iso-8859-1',
+             'text/html; charset="iso-8859-1"'),
+
+            ('foo="bar"; port="80,81"; discard, bar=baz',
+             'foo=bar; port="80,81"; discard, bar=baz'),
+
+            (r'Basic realm="\"foo\\\\bar\""',
+             r'Basic; realm="\"foo\\\\bar\""')
+            ]
+
+        for arg, expect in tests:
+            input = split_header_words([arg])
+            res = join_header_words(input)
+            assert res == expect, """
+When parsing: '%s'
+Expected:     '%s'
+Got:          '%s'
+Input was:    '%s'""" % (arg, expect, res, input)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_history.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_history.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_history.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,12 @@
+>>> from mechanize import History
+
+If nothing has been added, .close should work.
+
+>>> history = History()
+>>> history.close()
+
+Under some circumstances the response can be None; in that case
+this method should not raise an exception.
+
+>>> history.add(None, None)
+>>> history.close()

Added: mechanize/tags/0.1.10/test/test_html.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_html.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_html.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,262 @@
+>>> import mechanize
+>>> from mechanize._response import test_html_response
+>>> from mechanize._html import LinksFactory, FormsFactory, TitleFactory, \
+... MechanizeBs, \
+... RobustLinksFactory,  RobustFormsFactory, RobustTitleFactory
+
+mechanize.ParseError should be raised on parsing erroneous HTML.
+
+For backwards compatibility, mechanize.ParseError derives from
+exception classes that mechanize used to raise, prior to version
+0.1.6.
+
+>>> import sgmllib
+>>> import HTMLParser
+>>> import ClientForm
+>>> issubclass(mechanize.ParseError, sgmllib.SGMLParseError)
+True
+>>> issubclass(mechanize.ParseError, HTMLParser.HTMLParseError)
+True
+>>> issubclass(mechanize.ParseError, ClientForm.ParseError)
+True
+
+>>> def create_response(error=True):
+...     extra = ""
+...     if error:
+...         extra = "<!!!>"
+...     html = """\
+... <html>
+... <head>
+...     <title>Title</title>
+...     %s
+... </head>
+... <body>
+...     <p>Hello world
+... </body>
+... </html>
+... """ % extra
+...     return test_html_response(html)
+
+>>> f = LinksFactory()
+>>> f.set_response(create_response(), "http://example.com", "latin-1")
+>>> list(f.links())  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+>>> f = FormsFactory()
+>>> f.set_response(create_response(), "latin-1")
+>>> list(f.forms())  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+>>> f = TitleFactory()
+>>> f.set_response(create_response(), "latin-1")
+>>> f.title()  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+
+
+Accessing attributes on Factory may also raise ParseError
+
+>>> def factory_getattr(attr_name):
+...    fact = mechanize.DefaultFactory()
+...    fact.set_response(create_response())
+...    getattr(fact, attr_name)
+>>> factory_getattr("title")  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+>>> factory_getattr("global_form")  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+
+
+BeautifulSoup ParseErrors:
+
+XXX If I could come up with examples that break links and forms
+parsing, I'd uncomment these!
+
+>>> def create_soup(html):
+...     r = test_html_response(html)
+...     return MechanizeBs("latin-1", r.read())
+
+#>>> f = RobustLinksFactory()
+#>>> html = """\
+#... <a href="a">
+#... <frame src="b">
+#... <a href="c">
+#... <iframe src="d">
+#... </a>
+#... </area>
+#... </frame>
+#... """
+#>>> f.set_soup(create_soup(html), "http://example.com", "latin-1")
+#>>> list(f.links())  # doctest: +IGNORE_EXCEPTION_DETAIL
+#Traceback (most recent call last):
+#ParseError:
+
+>>> html = """\
+... <table>
+... <tr><td>
+... <input name='broken'>
+... </td>
+... </form>
+... </tr>
+... </form>
+... """
+>>> f = RobustFormsFactory()
+>>> f.set_response(create_response(), "latin-1")
+>>> list(f.forms())  # doctest: +IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+ParseError:
+
+#>>> f = RobustTitleFactory()
+#>>> f.set_soup(create_soup(""), "latin-1")
+#>>> f.title()  # doctest: +IGNORE_EXCEPTION_DETAIL
+#Traceback (most recent call last):
+#ParseError:
+
+
+
+Utility class for caching forms etc.
+
+>>> from mechanize._html import CachingGeneratorFunction
+
+>>> i = [1]
+>>> func = CachingGeneratorFunction(i)
+>>> list(func())
+[1]
+>>> list(func())
+[1]
+
+>>> i = [1, 2, 3]
+>>> func = CachingGeneratorFunction(i)
+>>> list(func())
+[1, 2, 3]
+
+>>> i = func()
+>>> i.next()
+1
+>>> i.next()
+2
+>>> i.next()
+3
+
+>>> i = func()
+>>> j = func()
+>>> i.next()
+1
+>>> j.next()
+1
+>>> i.next()
+2
+>>> j.next()
+2
+>>> j.next()
+3
+>>> i.next()
+3
+>>> i.next()
+Traceback (most recent call last):
+...
+StopIteration
+>>> j.next()
+Traceback (most recent call last):
+...
+StopIteration
+
+
+Link text parsing
+
+>>> def get_first_link_text_bs(html):
+...     factory = RobustLinksFactory()
+...     soup = MechanizeBs("utf-8", html)
+...     factory.set_soup(soup, "http://example.com/", "utf-8")
+...     return list(factory.links())[0].text
+
+>>> def get_first_link_text_sgmllib(html):
+...     factory = LinksFactory()
+...     response = test_html_response(html)
+...     factory.set_response(response, "http://example.com/", "utf-8")
+...     return list(factory.links())[0].text
+
+Whitespace gets compressed down to single spaces.  Tags are removed.
+
+>>> html = ("""\
+... <html><head><title>Title</title></head><body>
+... <p><a href="http://example.com/">The  quick\tbrown fox jumps
+...   over the <i><b>lazy</b></i> dog </a>
+... </body></html>
+... """)
+>>> get_first_link_text_bs(html)
+'The quick brown fox jumps over the lazy dog'
+>>> get_first_link_text_sgmllib(html)
+'The quick brown fox jumps over the lazy dog'
+
+Empty <a> links have empty link text
+
+>>> html = ("""\
+... <html><head><title>Title</title></head><body>
+... <p><a href="http://example.com/"></a>
+... </body></html>
+... """)
+>>> get_first_link_text_bs(html)
+''
+>>> get_first_link_text_sgmllib(html)
+''
+
+But for backwards-compatibility, empty non-<a> links have None link text
+
+>>> html = ("""\
+... <html><head><title>Title</title></head><body>
+... <p><frame src="http://example.com/"></frame>
+... </body></html>
+... """)
+>>> print get_first_link_text_bs(html)
+None
+>>> print get_first_link_text_sgmllib(html)
+None
+
+
+Title parsing.  We follow Firefox's behaviour with regard to child
+elements (haven't tested IE).
+
+>>> def get_title_bs(html):
+...     factory = RobustTitleFactory()
+...     soup = MechanizeBs("utf-8", html)
+...     factory.set_soup(soup, "utf-8")
+...     return factory.title()
+
+>>> def get_title_sgmllib(html):
+...     factory = TitleFactory()
+...     response = test_html_response(html)
+...     factory.set_response(response, "utf-8")
+...     return factory.title()
+
+>>> html = ("""\
+... <html><head>
+... <title>Title</title>
+... </head><body><p>Blah.<p></body></html>
+... """)
+>>> get_title_bs(html)
+'Title'
+>>> get_title_sgmllib(html)
+'Title'
+
+>>> html = ("""\
+... <html><head>
+... <title>  Ti<script type="text/strange">alert("this is valid HTML -- yuck!")</script>
+... tle &amp;&#38;
+... </title>
+... </head><body><p>Blah.<p></body></html>
+... """)
+>>> get_title_bs(html)
+'Ti<script type="text/strange">alert("this is valid HTML -- yuck!")</script> tle &&'
+>>> get_title_sgmllib(html)
+'Ti<script type="text/strange">alert("this is valid HTML -- yuck!")</script> tle &&'
+
+
+No more tags after <title> used to cause an exception
+
+>>> html = ("""\
+... <html><head>
+... <title>""")
+>>> get_title_sgmllib(html)
+''

Added: mechanize/tags/0.1.10/test/test_html.py
===================================================================
--- mechanize/tags/0.1.10/test/test_html.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_html.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,100 @@
+#!/usr/bin/env python
+
+from unittest import TestCase
+
+import mechanize
+from mechanize._response import test_html_response
+
+
+class RegressionTests(TestCase):
+
+    def test_close_base_tag(self):
+        # any document containing a </base> tag used to cause an exception
+        br = mechanize.Browser()
+        response = test_html_response("</base>")
+        br.set_response(response)
+        list(br.links())
+
+    def test_bad_base_tag(self):
+        # a document with a base tag with no href used to cause an exception
+        for factory in [mechanize.DefaultFactory(), mechanize.RobustFactory()]:
+            br = mechanize.Browser(factory=factory)
+            response = test_html_response(
+                "<BASE TARGET='_main'><a href='http://example.com/'>eg</a>")
+            br.set_response(response)
+            list(br.links())
+
+
+class CachingGeneratorFunctionTests(TestCase):
+
+    def _get_simple_cgenf(self, log):
+        from mechanize._html import CachingGeneratorFunction
+        todo = []
+        for ii in range(2):
+            def work(ii=ii):
+                log.append(ii)
+                return ii
+            todo.append(work)
+        def genf():
+            for a in todo:
+                yield a()
+        return CachingGeneratorFunction(genf())
+
+    def test_cache(self):
+        log = []
+        cgenf = self._get_simple_cgenf(log)
+        for repeat in range(2):
+            for ii, jj in zip(cgenf(), range(2)):
+                self.assertEqual(ii, jj)
+            self.assertEqual(log, range(2))  # work only done once
+
+    def test_interleaved(self):
+        log = []
+        cgenf = self._get_simple_cgenf(log)
+        cgen = cgenf()
+        self.assertEqual(cgen.next(), 0)
+        self.assertEqual(log, [0])
+        cgen2 = cgenf()
+        self.assertEqual(cgen2.next(), 0)
+        self.assertEqual(log, [0])
+        self.assertEqual(cgen.next(), 1)
+        self.assertEqual(log, [0, 1])
+        self.assertEqual(cgen2.next(), 1)
+        self.assertEqual(log, [0, 1])
+        self.assertRaises(StopIteration, cgen.next)
+        self.assertRaises(StopIteration, cgen2.next)
+
+
+class UnescapeTests(TestCase):
+
+    def test_unescape_charref(self):
+        from mechanize._html import unescape_charref
+        mdash_utf8 = u"\u2014".encode("utf-8")
+        for ref, codepoint, utf8, latin1 in [
+            ("38", 38, u"&".encode("utf-8"), "&"),
+            ("x2014", 0x2014, mdash_utf8, "&#x2014;"),
+            ("8212", 8212, mdash_utf8, "&#8212;"),
+            ]:
+            self.assertEqual(unescape_charref(ref, None), unichr(codepoint))
+            self.assertEqual(unescape_charref(ref, 'latin-1'), latin1)
+            self.assertEqual(unescape_charref(ref, 'utf-8'), utf8)
+
+    def test_unescape(self):
+        import htmlentitydefs
+        from mechanize._html import unescape
+        data = "&amp; &lt; &mdash; &#8212; &#x2014;"
+        mdash_utf8 = u"\u2014".encode("utf-8")
+        ue = unescape(data, htmlentitydefs.name2codepoint, "utf-8")
+        self.assertEqual("& < %s %s %s" % ((mdash_utf8,)*3), ue)
+
+        for text, expect in [
+            ("&a&amp;", "&a&"),
+            ("a&amp;", "a&"),
+            ]:
+            got = unescape(text, htmlentitydefs.name2codepoint, "latin-1")
+            self.assertEqual(got, expect)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_opener.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_opener.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_opener.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,58 @@
+>>> import urllib2, StringIO
+>>> from mechanize import _opener
+
+Normal case.  Response goes through hook function and is returned.
+
+>>> def urlopen(fullurl, data=None, timeout=None):
+...     print "fullurl %r" % fullurl
+...     print "data %r" % data
+...     return "response"
+>>> def response_hook(response):
+...     print "response %r" % response
+...     return "processed response"
+>>> _opener.wrapped_open(urlopen, response_hook,
+...                      "http://example.com", "data"
+...                      )
+fullurl 'http://example.com'
+data 'data'
+response 'response'
+'processed response'
+
+
+Raised HTTPError exceptions still go through the response hook but
+the result is raised rather than returned.
+
+>>> def urlopen(fullurl, data=None, timeout=None):
+...     print "fullurl %r" % fullurl
+...     print "data %r" % data
+...     raise urllib2.HTTPError(
+...         "http://example.com", 200, "OK", {}, StringIO.StringIO())
+>>> def response_hook(response):
+...     print "response class", response.__class__.__name__
+...     return Exception("processed response")
+>>> try:
+...     _opener.wrapped_open(urlopen, response_hook,
+...                          "http://example.com", "data"
+...                          )
+... except Exception, exc:
+...     print exc
+fullurl 'http://example.com'
+data 'data'
+response class HTTPError
+processed response
+
+Other exceptions are not passed to the response hook, since they're not
+response objects; they propagate unchanged.
+
+>>> def urlopen(fullurl, data=None, timeout=None):
+...     print "fullurl %r" % fullurl
+...     print "data %r" % data
+...     raise Exception("not caught")
+>>> try:
+...     _opener.wrapped_open(urlopen, response_hook,
+...                          "http://example.com", "data"
+...                          )
+... except Exception, exc:
+...     print exc
+fullurl 'http://example.com'
+data 'data'
+not caught

Added: mechanize/tags/0.1.10/test/test_opener.py
===================================================================
--- mechanize/tags/0.1.10/test/test_opener.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_opener.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,180 @@
+#!/usr/bin/env python
+
+import os, math, stat
+from unittest import TestCase
+
+import mechanize
+import mechanize._sockettimeout as _sockettimeout
+
+
+def killfile(filename):
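+    # Remove a file if possible; on Windows, retry after clearing the
+    # read-only bit.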
+    try:
+        os.remove(filename)
+    except OSError:
+        if os.name=='nt':
+            try:
+                os.chmod(filename, stat.S_IWRITE)
+                os.remove(filename)
+            except OSError:
+                pass
+
+class OpenerTests(TestCase):
+
+    def test_retrieve(self):
+        # The .retrieve() method deals with a number of different cases.  In
+        # each case, .read() should be called the expected number of times, the
+        # progress callback should be called as expected, and we should end up
+        # with a filename and some headers.
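+        #
+        # For orientation, a hedged sketch of how client code typically drives
+        # the API exercised below (never called in this test; the URL and
+        # filename are illustrative placeholders):
+        def _example_retrieve_usage(opener):
+            def report(block_nr, block_size, total_size):
+                # progress callback: block number, block size, total size
+                # (or -1 when the total is unknown)
+                pass
+            return opener.retrieve("http://example.com/", "example.html",
+                                   reporthook=report)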
+
+        class Opener(mechanize.OpenerDirector):
+            def __init__(self, content_length=None):
+                mechanize.OpenerDirector.__init__(self)
+                self.calls = []
+                self.block_size = mechanize.OpenerDirector.BLOCK_SIZE
+                self.nr_blocks = 2.5
+                self.data = int((self.block_size/8)*self.nr_blocks)*"01234567"
+                self.total_size = len(self.data)
+                self._content_length = content_length
+            def open(self, fullurl, data=None,
+                     timeout=_sockettimeout._GLOBAL_DEFAULT_TIMEOUT):
+                from mechanize import _response
+                self.calls.append((fullurl, data, timeout))
+                headers = [("Foo", "Bar")]
+                if self._content_length is not None:
+                    if self._content_length is True:
+                        content_length = str(len(self.data))
+                    else:
+                        content_length = str(self._content_length)
+                    headers.append(("content-length", content_length))
+                return _response.test_response(self.data, headers)
+
+        class CallbackVerifier:
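+            # Records how many times the reporthook is called and checks the
+            # arguments of each call against the expected block/total sizes.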
+            def __init__(self, testcase, total_size, block_size):
+                self.count = 0
+                self._testcase = testcase
+                self._total_size = total_size
+                self._block_size = block_size
+            def callback(self, block_nr, block_size, total_size):
+                self._testcase.assertEqual(block_nr, self.count)
+                self._testcase.assertEqual(block_size, self._block_size)
+                self._testcase.assertEqual(total_size, self._total_size)
+                self.count += 1
+
+        # ensure we start without the test file present
+        tfn = "mechanize_test_73940ukewrl.txt"
+        killfile(tfn)
+
+        # case 1: filename supplied
+        op = Opener()
+        verif = CallbackVerifier(self, -1, op.block_size)
+        url = "http://example.com/"
+        filename, headers = op.retrieve(
+            url, tfn, reporthook=verif.callback)
+        try:
+            self.assertEqual(filename, tfn)
+            self.assertEqual(headers["foo"], 'Bar')
+            self.assertEqual(open(filename, "rb").read(), op.data)
+            self.assertEqual(len(op.calls), 1)
+            self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+            op.close()
+            # .close()ing the opener does NOT remove non-temporary files
+            self.assert_(os.path.isfile(filename))
+        finally:
+            killfile(filename)
+
+        # case 2: no filename supplied, use a temporary file
+        op = Opener(content_length=True)
+        # We asked the Opener to add a content-length header to the response
+        # this time.  Verify that the total size passed to the callback in this
+        # case matches the content-length (rather than being -1).
+        verif = CallbackVerifier(self, op.total_size, op.block_size)
+        url = "http://example.com/"
+        filename, headers = op.retrieve(url, reporthook=verif.callback)
+        self.assertNotEqual(filename, tfn)  # (some temp filename instead)
+        self.assertEqual(headers["foo"], 'Bar')
+        self.assertEqual(open(filename, "rb").read(), op.data)
+        self.assertEqual(len(op.calls), 1)
+        # .close()ing the opener removes temporary files
+        self.assert_(os.path.exists(filename))
+        op.close()
+        self.failIf(os.path.exists(filename))
+        self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+
+        # case 3: "file:" URL with no filename supplied
+        # we DON'T create a temporary file, since there's a file there already
+        op = Opener()
+        verif = CallbackVerifier(self, -1, op.block_size)
+        tifn = "input_for_"+tfn
+        try:
+            f = open(tifn, 'wb')
+            try:
+                f.write(op.data)
+            finally:
+                f.close()
+            url = "file://" + tifn
+            filename, headers = op.retrieve(url, reporthook=verif.callback)
+            self.assertEqual(filename, None)  # this may change
+            self.assertEqual(headers["foo"], 'Bar')
+            self.assertEqual(open(tifn, "rb").read(), op.data)
+            # no .read()s took place, since we already have the disk file,
+            # and we weren't asked to write it to another filename
+            self.assertEqual(verif.count, 0)
+            op.close()
+            # .close()ing the opener does NOT remove the file!
+            self.assert_(os.path.isfile(tifn))
+        finally:
+            killfile(tifn)
+
+        # case 4: "file:" URL and filename supplied
+        # we DO create a new file in this case
+        op = Opener()
+        verif = CallbackVerifier(self, -1, op.block_size)
+        tifn = "input_for_"+tfn
+        try:
+            f = open(tifn, 'wb')
+            try:
+                f.write(op.data)
+            finally:
+                f.close()
+            url = "file://" + tifn
+            try:
+                filename, headers = op.retrieve(
+                    url, tfn, reporthook=verif.callback)
+                self.assertEqual(filename, tfn)
+                self.assertEqual(headers["foo"], 'Bar')
+                self.assertEqual(open(tifn, "rb").read(), op.data)
+                self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+                op.close()
+                # .close()ing the opener does NOT remove non-temporary files
+                self.assert_(os.path.isfile(tfn))
+            finally:
+                killfile(tfn)
+        finally:
+            killfile(tifn)
+
+        # A Content-Length that doesn't match the actual data length gives
+        # ContentTooShortError
+        big = 1024*32
+        op = Opener(content_length=big)
+        verif = CallbackVerifier(self, big, op.block_size)
+        url = "http://example.com/"
+        try:
+            try:
+                op.retrieve(url, reporthook=verif.callback)
+            except mechanize.ContentTooShortError, exc:
+                filename, headers = exc.result
+                self.assertNotEqual(filename, tfn)
+                self.assertEqual(headers["foo"], 'Bar')
+                # We still read and wrote to disk everything available, despite
+                # the exception.
+                self.assertEqual(open(filename, "rb").read(), op.data)
+                self.assertEqual(len(op.calls), 1)
+                self.assertEqual(verif.count, math.ceil(op.nr_blocks) + 1)
+                # cleanup should still take place
+                self.assert_(os.path.isfile(filename))
+                op.close()
+                self.failIf(os.path.isfile(filename))
+            else:
+                self.fail()
+        finally:
+            killfile(filename)
+

Added: mechanize/tags/0.1.10/test/test_password_manager.special_doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_password_manager.special_doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_password_manager.special_doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,148 @@
+Features common to HTTPPasswordMgr and HTTPProxyPasswordMgr
+===========================================================
+
+(mgr_class gets here through globs argument)
+
+>>> mgr = mgr_class()
+>>> add = mgr.add_password
+
+>>> add("Some Realm", "http://example.com/", "joe", "password")
+>>> add("Some Realm", "http://example.com/ni", "ni", "ni")
+>>> add("c", "http://example.com/foo", "foo", "ni")
+>>> add("c", "http://example.com/bar", "bar", "nini")
+>>> add("b", "http://example.com/", "first", "blah")
+>>> add("b", "http://example.com/", "second", "spam")
+>>> add("a", "http://example.com", "1", "a")
+>>> add("Some Realm", "http://c.example.com:3128", "3", "c")
+>>> add("Some Realm", "d.example.com", "4", "d")
+>>> add("Some Realm", "e.example.com:3128", "5", "e")
+
+>>> mgr.find_user_password("Some Realm", "example.com")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com/")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com/spam")
+('joe', 'password')
+>>> mgr.find_user_password("Some Realm", "http://example.com/spam/spam")
+('joe', 'password')
+>>> mgr.find_user_password("c", "http://example.com/foo")
+('foo', 'ni')
+>>> mgr.find_user_password("c", "http://example.com/bar")
+('bar', 'nini')
+
+Actually, this is really undefined ATM
+#Currently, we use the highest-level path where more than one match:
+#
+#>>> mgr.find_user_password("Some Realm", "http://example.com/ni")
+#('joe', 'password')
+
+Use latest add_password() in case of conflict:
+
+>>> mgr.find_user_password("b", "http://example.com/")
+('second', 'spam')
+
+No special relationship between a.example.com and example.com:
+
+>>> mgr.find_user_password("a", "http://example.com/")
+('1', 'a')
+>>> mgr.find_user_password("a", "http://a.example.com/")
+(None, None)
+
+Ports:
+
+>>> mgr.find_user_password("Some Realm", "c.example.com")
+(None, None)
+>>> mgr.find_user_password("Some Realm", "c.example.com:3128")
+('3', 'c')
+>>> mgr.find_user_password("Some Realm", "http://c.example.com:3128")
+('3', 'c')
+>>> mgr.find_user_password("Some Realm", "d.example.com")
+('4', 'd')
+>>> mgr.find_user_password("Some Realm", "e.example.com:3128")
+('5', 'e')
+
+
+Default port tests
+------------------
+
+>>> mgr = mgr_class()
+>>> add = mgr.add_password
+
+The point to note here is that we can't guess the default port if there's
+no scheme.  This applies to both add_password and find_user_password.
+
+>>> add("f", "http://g.example.com:80", "10", "j")
+>>> add("g", "http://h.example.com", "11", "k")
+>>> add("h", "i.example.com:80", "12", "l")
+>>> add("i", "j.example.com", "13", "m")
+>>> mgr.find_user_password("f", "g.example.com:100")
+(None, None)
+>>> mgr.find_user_password("f", "g.example.com:80")
+('10', 'j')
+>>> mgr.find_user_password("f", "g.example.com")
+(None, None)
+>>> mgr.find_user_password("f", "http://g.example.com:100")
+(None, None)
+>>> mgr.find_user_password("f", "http://g.example.com:80")
+('10', 'j')
+>>> mgr.find_user_password("f", "http://g.example.com")
+('10', 'j')
+>>> mgr.find_user_password("g", "h.example.com")
+('11', 'k')
+>>> mgr.find_user_password("g", "h.example.com:80")
+('11', 'k')
+>>> mgr.find_user_password("g", "http://h.example.com:80")
+('11', 'k')
+>>> mgr.find_user_password("h", "i.example.com")
+(None, None)
+>>> mgr.find_user_password("h", "i.example.com:80")
+('12', 'l')
+>>> mgr.find_user_password("h", "http://i.example.com:80")
+('12', 'l')
+>>> mgr.find_user_password("i", "j.example.com")
+('13', 'm')
+>>> mgr.find_user_password("i", "j.example.com:80")
+(None, None)
+>>> mgr.find_user_password("i", "http://j.example.com")
+('13', 'm')
+>>> mgr.find_user_password("i", "http://j.example.com:80")
+(None, None)
+
+
+Features specific to HTTPProxyPasswordMgr
+=========================================
+
+Default realm:
+
+>>> mgr = mechanize.HTTPProxyPasswordMgr()
+>>> add = mgr.add_password
+
+>>> mgr.find_user_password("d", "f.example.com")
+(None, None)
+>>> add(None, "f.example.com", "6", "f")
+>>> mgr.find_user_password("d", "f.example.com")
+('6', 'f')
+
+Default host/port:
+
+>>> mgr.find_user_password("e", "g.example.com")
+(None, None)
+>>> add("e", None, "7", "g")
+>>> mgr.find_user_password("e", "g.example.com")
+('7', 'g')
+
+Default realm and host/port:
+
+>>> mgr.find_user_password("f", "h.example.com")
+(None, None)
+>>> add(None, None, "8", "h")
+>>> mgr.find_user_password("f", "h.example.com")
+('8', 'h')
+
+Default realm beats default host/port:
+
+>>> add("d", None, "9", "i")
+>>> mgr.find_user_password("d", "f.example.com")
+('6', 'f')

Added: mechanize/tags/0.1.10/test/test_pullparser.py
===================================================================
--- mechanize/tags/0.1.10/test/test_pullparser.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_pullparser.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,327 @@
+#!/usr/bin/env python
+
+import sys
+from unittest import TestCase
+
+def peek_token(p):
+    tok = p.get_token()
+    p.unget_token(tok)
+    return tok
+
+
+class PullParserTests(TestCase):
+    from mechanize._pullparser import PullParser, TolerantPullParser
+    PARSERS = [(PullParser, False), (TolerantPullParser, True)]
+
+    def data_and_file(self):
+        from StringIO import StringIO
+        data = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+"http://www.w3.org/TR/html4/strict.dtd">
+<html>
+<head>
+<title an=attr>Title</title>
+</head>
+<body>
+<p>This is a data <img alt="blah &amp; &#097;"> &amp; that was an entityref and this &#097; is
+a charref.  <blah foo="bing" blam="wallop">.
+<!-- comment blah blah
+still a comment , blah and a space at the end 
+-->
+<!rheum>
+<?rhaponicum>
+<randomtag spam="eggs"/>
+</body>
+</html>
+""" #"
+        f = StringIO(data)
+        return data, f
+
+    def test_encoding(self):
+        from mechanize import _pullparser
+        #for pc, tolerant in [(pullparser.PullParser, False)]:#PullParserTests.PARSERS:
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_encoding(pc, tolerant)
+    def _test_encoding(self, parser_class, tolerant):
+        from StringIO import StringIO
+        datas = ["<a>&#1092;</a>", "<a>&#x444;</a>"]
+        def get_text(data, encoding):
+            p = _get_parser(data, encoding)
+            p.get_tag("a")
+            return p.get_text()
+        def get_attr(data, encoding, et_name, attr_name):
+            p = _get_parser(data, encoding)
+            while True:
+                tag = p.get_tag(et_name)
+                attrs = tag.attrs
+                if attrs is not None:
+                    break
+            return dict(attrs)[attr_name]
+        def _get_parser(data, encoding):
+            f = StringIO(data)
+            p = parser_class(f, encoding=encoding)
+            #print 'p._entitydefs>>%s<<' % p._entitydefs['&mdash;']
+            return p
+
+        for data in datas:
+            self.assertEqual(get_text(data, "KOI8-R"), "\xc6")
+            self.assertEqual(get_text(data, "UTF-8"), "\xd1\x84")
+
+        self.assertEqual(get_text("<a>&mdash;</a>", "UTF-8"),
+                         u"\u2014".encode('utf8'))
+        self.assertEqual(
+            get_attr('<a name="&mdash;">blah</a>', "UTF-8", "a", "name"),
+            u"\u2014".encode('utf8'))
+        self.assertEqual(get_text("<a>&mdash;</a>", "ascii"), "&mdash;")
+
+#        response = urllib.addinfourl(f, {"content-type": "text/html; charset=XXX"}, req.get_full_url())
+    def test_get_token(self):
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_get_token(pc, tolerant)
+    def _test_get_token(self, parser_class, tolerant):
+        data, f = self.data_and_file()
+        p = parser_class(f)
+        from mechanize._pullparser import NoMoreTokensError
+        self.assertEqual(
+            p.get_token(), ("decl",
+'''DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
+"http://www.w3.org/TR/html4/strict.dtd"''', None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("starttag", "html", []))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("starttag", "head", []))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("starttag", "title", [("an", "attr")]))
+        self.assertEqual(p.get_token(), ("data", "Title", None))
+        self.assertEqual(p.get_token(), ("endtag", "title", None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("endtag", "head", None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("starttag", "body", []))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("starttag", "p", []))
+        self.assertEqual(p.get_token(), ("data", "This is a data ", None))
+        self.assertEqual(p.get_token(), ("starttag", "img", [("alt", "blah & a")]))
+        self.assertEqual(p.get_token(), ("data", " ", None))
+        self.assertEqual(p.get_token(), ("entityref", "amp", None))
+        self.assertEqual(p.get_token(), ("data",
+                                         " that was an entityref and this ",
+                                         None))
+        self.assertEqual(p.get_token(), ("charref", "097", None))
+        self.assertEqual(p.get_token(), ("data", " is\na charref.  ", None))
+        self.assertEqual(p.get_token(), ("starttag", "blah",
+                                         [("foo", "bing"), ("blam", "wallop")]))
+        self.assertEqual(p.get_token(), ("data", ".\n", None))
+        self.assertEqual(p.get_token(), (
+            "comment", " comment blah blah\n"
+            "still a comment , blah and a space at the end \n", None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("decl", "rheum", None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("pi", "rhaponicum", None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), (
+            (tolerant and "starttag" or "startendtag"), "randomtag",
+            [("spam", "eggs")]))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("endtag", "body", None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertEqual(p.get_token(), ("endtag", "html", None))
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        self.assertRaises(NoMoreTokensError, p.get_token)
+#        print "token", p.get_token()
+#        sys.exit()
+
+    def test_unget_token(self):
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_unget_token(pc, tolerant)
+    def _test_unget_token(self, parser_class, tolerant):
+        from mechanize._pullparser import NoMoreTokensError
+        data, f = self.data_and_file()
+        p = parser_class(f)
+        p.get_token()
+        tok = p.get_token()
+        self.assertEqual(tok, ("data", "\n", None))
+        p.unget_token(tok)
+        self.assertEqual(p.get_token(), ("data", "\n", None))
+        tok = p.get_token()
+        self.assertEqual(tok, ("starttag", "html", []))
+        p.unget_token(tok)
+        self.assertEqual(tok, ("starttag", "html", []))
+
+    def test_get_tag(self):
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_get_tag(pc, tolerant)
+    def _test_get_tag(self, parser_class, tolerant):
+        from mechanize._pullparser import NoMoreTokensError
+        data, f = self.data_and_file()
+        p = parser_class(f)
+        self.assertEqual(p.get_tag(), ("starttag", "html", []))
+        self.assertEqual(p.get_tag("blah", "body", "title"),
+                     ("starttag", "title", [("an", "attr")]))
+        self.assertEqual(p.get_tag(), ("endtag", "title", None))
+        self.assertEqual(p.get_tag("randomtag"),
+                         ((tolerant and "starttag" or "startendtag"), "randomtag",
+                          [("spam", "eggs")]))
+        self.assertEqual(p.get_tag(), ("endtag", "body", None))
+        self.assertEqual(p.get_tag(), ("endtag", "html", None))
+        self.assertRaises(NoMoreTokensError, p.get_tag)
+#        print "tag", p.get_tag()
+#        sys.exit()
+
+    def test_get_text(self):
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_get_text(pc, tolerant)
+    def _test_get_text(self, parser_class, tolerant):
+        from mechanize._pullparser import NoMoreTokensError
+        data, f = self.data_and_file()
+        p = parser_class(f)
+        self.assertEqual(p.get_text(), "\n")
+        self.assertEqual(peek_token(p).data, "html")
+        self.assertEqual(p.get_text(), "")
+        self.assertEqual(peek_token(p).data, "html"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        self.assertEqual(p.get_text(), "Title"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        self.assertEqual(p.get_text(),
+                         "This is a data blah & a[IMG]"); p.get_token()
+        self.assertEqual(p.get_text(), " & that was an entityref "
+                         "and this a is\na charref.  "); p.get_token()
+        self.assertEqual(p.get_text(), ".\n\n\n\n"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        self.assertEqual(p.get_text(), "\n"); p.get_token()
+        # no more tokens, so we just get empty string
+        self.assertEqual(p.get_text(), "")
+        self.assertEqual(p.get_text(), "")
+        self.assertRaises(NoMoreTokensError, p.get_token)
+        #print "text", `p.get_text()`
+        #sys.exit()
+
+    def test_get_text_2(self):
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_get_text_2(pc, tolerant)
+    def _test_get_text_2(self, parser_class, tolerant):
+        # more complicated stuff
+        from mechanize._pullparser import NoMoreTokensError
+
+        # endat
+        data, f = self.data_and_file()
+        p = parser_class(f)
+        self.assertEqual(p.get_text(endat=("endtag", "html")),
+                     u"\n\n\nTitle\n\n\nThis is a data blah & a[IMG]"
+                     " & that was an entityref and this a is\na charref.  ."
+                     "\n\n\n\n\n\n")
+        f.close()
+
+        data, f = self.data_and_file()
+        p = parser_class(f)
+        self.assertEqual(p.get_text(endat=("endtag", "title")),
+                         "\n\n\nTitle")
+        self.assertEqual(p.get_text(endat=("starttag", "img")),
+                         "\n\n\nThis is a data blah & a[IMG]")
+        f.close()
+
+        # textify arg
+        data, f = self.data_and_file()
+        p = parser_class(f, textify={"title": "an", "img": lambda x: "YYY"})
+        self.assertEqual(p.get_text(endat=("endtag", "title")),
+                         "\n\n\nattr[TITLE]Title")
+        self.assertEqual(p.get_text(endat=("starttag", "img")),
+                         "\n\n\nThis is a data YYY")
+        f.close()
+
+        # get_compressed_text
+        data, f = self.data_and_file()
+        p = parser_class(f)
+        self.assertEqual(p.get_compressed_text(endat=("endtag", "html")),
+                         u"Title This is a data blah & a[IMG]"
+                         " & that was an entityref and this a is a charref. .")
+        f.close()
+
+    def test_tags(self):
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_tags(pc, tolerant)
+    def _test_tags(self, parser_class, tolerant):
+        from mechanize._pullparser import NoMoreTokensError
+
+        # no args
+        data, f = self.data_and_file()
+        p = parser_class(f)
+
+        expected_tag_names = [
+            "html", "head", "title", "title", "head", "body", "p", "img",
+            "blah", "randomtag", "body", "html"
+            ]
+
+        for i, token in enumerate(p.tags()):
+            self.assertEquals(token.data, expected_tag_names[i])
+        f.close()
+
+        # tag name args
+        data, f = self.data_and_file()
+        p = parser_class(f)
+
+        expected_tokens = [
+            ("starttag", "head", []),
+            ("endtag", "head", None),
+            ("starttag", "p", []),
+            ]
+
+        for i, token in enumerate(p.tags("head", "p")):
+            self.assertEquals(token, expected_tokens[i])
+        f.close()
+
+    def test_tokens(self):
+        for pc, tolerant in PullParserTests.PARSERS:
+            self._test_tokens(pc, tolerant)
+    def _test_tokens(self, parser_class, tolerant):
+        from mechanize._pullparser import NoMoreTokensError
+
+        # no args
+        data, f = self.data_and_file()
+        p = parser_class(f)
+
+        expected_token_types = [
+            "decl", "data", "starttag", "data", "starttag", "data", "starttag",
+            "data", "endtag", "data", "endtag", "data", "starttag", "data",
+            "starttag", "data", "starttag", "data", "entityref", "data",
+            "charref", "data", "starttag", "data", "comment", "data", "decl",
+            "data", "pi", "data", (tolerant and "starttag" or "startendtag"),
+            "data", "endtag", "data", "endtag", "data"
+            ]
+
+        for i, token in enumerate(p.tokens()):
+            self.assertEquals(token.type, expected_token_types[i])
+        f.close()
+
+        # token type args
+        data, f = self.data_and_file()
+        p = parser_class(f)
+
+        expected_tokens = [
+            ("entityref", "amp", None),
+            ("charref", "097", None),
+            ]
+
+        for i, token in enumerate(p.tokens("charref", "entityref")):
+            self.assertEquals(token, expected_tokens[i])
+        f.close()
+
+    def test_token_eq(self):
+        from mechanize._pullparser import Token
+        for (a, b) in [
+            (Token('endtag', 'html', None),
+             ('endtag', 'html', None)),
+            (Token('endtag', 'html', {'woof': 'bark'}),
+             ('endtag', 'html', {'woof': 'bark'})),
+            ]:
+            self.assertEquals(a, a)
+            self.assertEquals(a, b)
+            self.assertEquals(b, a)
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_request.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_request.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_request.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,66 @@
+>>> from mechanize import Request
+>>> r = Request("http://example.com/foo#frag")
+>>> r.get_selector()
+'/foo'
+
+
+Request Headers Dictionary
+--------------------------
+
+The Request.headers dictionary is not a documented interface.  It should
+stay that way, because the complete set of headers is only accessible
+through the .get_header(), .has_header(), .header_items() interface.
+However, .headers pre-dates those methods, and so real code will be using
+the dictionary.
+
+The introduction of those methods in Python 2.4 was a mistake for the same reason:
+code that previously saw all (urllib2 user)-provided headers in .headers
+now sees only a subset (and the function interface is ugly and incomplete).
+A better change would have been to replace .headers dict with a dict
+subclass (or UserDict.DictMixin instance?)  that preserved the .headers
+interface and also provided access to the "unredirected" headers.  It's
+probably too late to fix that, though.
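+
+A rough sketch (illustrative only, not part of mechanize) of the kind of
+dict subclass meant here, keeping the plain-dict interface while
+normalizing keys the same way .add_header() does:
+
+    import UserDict
+
+    class HeadersDict(UserDict.DictMixin):
+        # Hypothetical case-normalizing headers mapping; real code would
+        # also need to expose the "unredirected" headers somehow.
+        def __init__(self, items=()):
+            self._data = {}
+            for key, value in items:
+                self[key] = value
+        def __getitem__(self, key):
+            return self._data[key.capitalize()]
+        def __setitem__(self, key, value):
+            self._data[key.capitalize()] = value
+        def __delitem__(self, key):
+            del self._data[key.capitalize()]
+        def keys(self):
+            return self._data.keys()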
+
+
+Check .capitalize() case normalization:
+
+>>> url = "http://example.com"
+>>> Request(url, headers={"Spam-eggs": "blah"}).headers["Spam-eggs"]
+'blah'
+>>> Request(url, headers={"spam-EggS": "blah"}).headers["Spam-eggs"]
+'blah'
+
+Currently, looking up the .title()-cased name, e.g.
+Request(url, headers={"Spam-eggs": "blah"}).headers["Spam-Eggs"], raises
+KeyError, but that could be changed in future.
+
+
+Request Headers Methods
+-----------------------
+
+Note the case normalization of header names here, to .capitalize()-case.
+This should be preserved for backwards-compatibility.  (In the HTTP case,
+normalization to .title()-case is done by urllib2 before sending headers to
+httplib).
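+
+For reference, the difference between the two normalizations on a header
+name (plain Python string methods, shown purely for illustration):
+
+>>> "content-TYPE".capitalize()
+'Content-type'
+>>> "content-TYPE".title()
+'Content-Type'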
+
+>>> url = "http://example.com"
+>>> r = Request(url, headers={"Spam-eggs": "blah"})
+>>> r.has_header("Spam-eggs")
+True
+>>> r.header_items()
+[('Spam-eggs', 'blah')]
+>>> r.add_header("Foo-Bar", "baz")
+>>> items = r.header_items()
+>>> items.sort()
+>>> items
+[('Foo-bar', 'baz'), ('Spam-eggs', 'blah')]
+
+Note that e.g. r.has_header("spam-EggS") is currently False, and
+r.get_header("spam-EggS") returns None, but that could be changed in
+future.
+
+>>> r.has_header("Not-there")
+False
+>>> print r.get_header("Not-there")
+None
+>>> r.get_header("Not-there", "default")
+'default'

Added: mechanize/tags/0.1.10/test/test_response.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_response.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_response.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,227 @@
+The read_complete flag lets us know if all of the wrapped file's data
+has been read.  We want to know this because Browser.back() must
+.reload() the response if not.
+
+I've noted here the various cases where .read_complete may be set.
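+
+A minimal illustration (not the actual Browser code) of how the flag might
+be consulted when going back through history; reload_response stands in
+for a hypothetical callable that re-fetches the response:
+
+    def go_back(response, reload_response):
+        # If the wrapped file was never read to the end, seeking alone
+        # cannot recover the unread data, so re-fetch instead.
+        if not response.read_complete:
+            response = reload_response()
+        response.seek(0)
+        return response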
+
+>>> text = "To err is human, to moo, bovine.\n"*10
+>>> def get_wrapper():
+...     import cStringIO
+...     from mechanize._response import seek_wrapper
+...     f = cStringIO.StringIO(text)
+...     wr = seek_wrapper(f)
+...     return wr
+
+.read() case #1
+
+>>> wr = get_wrapper()
+>>> wr.read_complete
+False
+>>> junk = wr.read()
+>>> wr.read_complete
+True
+>>> wr.seek(0)
+>>> wr.read_complete
+True
+
+Exercise partial .read() and .readline(), and .seek() case #1
+
+>>> wr = get_wrapper()
+>>> junk = wr.read(10)
+>>> wr.read_complete
+False
+>>> junk = wr.readline()
+>>> wr.read_complete
+False
+>>> wr.seek(0, 2)
+>>> wr.read_complete
+True
+>>> wr.seek(0)
+>>> wr.read_complete
+True
+
+.readlines() case #1
+
+>>> wr = get_wrapper()
+>>> junk = wr.readlines()
+>>> wr.read_complete
+True
+>>> wr.seek(0)
+>>> wr.read_complete
+True
+
+.seek() case #2
+
+>>> wr = get_wrapper()
+>>> wr.seek(10)
+>>> wr.read_complete
+False
+>>> wr.seek(1000000)
+
+.read() case #2
+
+>>> wr = get_wrapper()
+>>> junk = wr.read(1000000)
+>>> wr.read_complete  # we read to the end, but don't know it yet
+False
+>>> junk = wr.read(10)
+>>> wr.read_complete
+True
+
+.readline() case #1
+
+>>> wr = get_wrapper()
+>>> junk = wr.read(len(text)-10)
+>>> wr.read_complete
+False
+>>> junk = wr.readline()
+>>> wr.read_complete  # we read to the end, but don't know it yet
+False
+>>> junk = wr.readline()
+>>> wr.read_complete
+True
+
+Test copying and sharing of .read_complete state
+
+>>> import copy
+>>> wr = get_wrapper()
+>>> wr2 = copy.copy(wr)
+>>> wr.read_complete
+False
+>>> wr2.read_complete
+False
+>>> junk = wr2.read()
+>>> wr.read_complete
+True
+>>> wr2.read_complete
+True
+
+
+Fix from -r36082: .read() after .close() used to break
+.read_complete state
+
+>>> from mechanize._response import test_response
+>>> r = test_response(text)
+>>> junk = r.read(64)
+>>> r.close()
+>>> r.read_complete
+False
+>>> r.read()
+''
+>>> r.read_complete
+False
+
+
+
+Tests for the truly horrendous upgrade_response()
+
+>>> def is_response(r):
+...     names = "get_data read readline readlines close seek code msg".split()
+...     for name in names:
+...         if not hasattr(r, name):
+...             return False
+...     return r.get_data() == "test data"
+
+>>> from cStringIO import StringIO
+>>> from mechanize._response import upgrade_response, make_headers, \
+...     make_response, closeable_response, seek_wrapper
+>>> data="test data"; url="http://example.com/"; code=200; msg="OK"
+
+Normal response (closeable_response wrapped with seek_wrapper): return a copy
+
+>>> r1 = make_response(data, [], url, code, msg)
+>>> r2 = upgrade_response(r1)
+>>> is_response(r2)
+True
+>>> r1 is not r2
+True
+>>> r1.wrapped is r2.wrapped
+True
+
+closeable_response with no seek_wrapper: wrap with seek_wrapper
+
+>>> r1 = closeable_response(StringIO(data), make_headers([]), url, code, msg)
+>>> is_response(r1)
+False
+>>> r2 = upgrade_response(r1)
+>>> is_response(r2)
+True
+>>> r1 is not r2
+True
+>>> r1 is r2.wrapped
+True
+
+urllib2.addinfourl: extract .fp and wrap it with closeable_response
+and seek_wrapper
+
+>>> import urllib2
+>>> r1= urllib2.addinfourl(StringIO(data), make_headers([]), url)
+>>> is_response(r1)
+False
+>>> r2 = upgrade_response(r1)
+>>> is_response(r2)
+True
+>>> r1 is not r2
+True
+>>> r1 is not r2.wrapped
+True
+>>> r1.fp is r2.wrapped.fp
+True
+
+addinfourl with code, msg
+
+>>> r1= urllib2.addinfourl(StringIO(data), make_headers([]), url)
+>>> r1.code = 206
+>>> r1.msg = "cool"
+>>> r2 = upgrade_response(r1)
+>>> is_response(r2)
+True
+>>> r2.code == r1.code
+True
+>>> r2.msg == r1.msg
+True
+
+addinfourl with seek wrapper: cached data is not lost
+
+>>> r1= urllib2.addinfourl(StringIO(data), make_headers([]), url)
+>>> r1 = seek_wrapper(r1)
+>>> r1.read(4)
+'test'
+>>> r2 = upgrade_response(r1)
+>>> is_response(r2)
+True
+
+addinfourl wrapped with HTTPError -- remains an HTTPError of the same
+subclass (through horrible trickery)
+
+>>> hdrs = make_headers([])
+>>> r1 = urllib2.addinfourl(StringIO(data), hdrs, url)
+>>> class MyHTTPError(urllib2.HTTPError): pass
+>>> r1 = MyHTTPError(url, code, msg, hdrs, r1)
+>>> is_response(r1)
+False
+>>> r2 = upgrade_response(r1)
+>>> is_response(r2)
+True
+>>> isinstance(r2, MyHTTPError)
+True
+>>> r2  # doctest: +ELLIPSIS
+<httperror_seek_wrapper (MyHTTPError instance) at ...
+
+The trickery does not cause double-wrapping
+
+>>> r3 = upgrade_response(r2)
+>>> is_response(r3)
+True
+>>> r3 is not r2
+True
+>>> r3.wrapped is r2.wrapped
+True
+
+Test dynamically-created class __repr__ for case where we have the
+module name
+
+>>> r4 = urllib2.addinfourl(StringIO(data), hdrs, url)
+>>> r4 = urllib2.HTTPError(url, code, msg, hdrs, r4)
+>>> upgrade_response(r4)  # doctest: +ELLIPSIS
+<httperror_seek_wrapper (urllib2.HTTPError instance) at ...

Added: mechanize/tags/0.1.10/test/test_response.py
===================================================================
--- mechanize/tags/0.1.10/test/test_response.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_response.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,218 @@
+"""Tests for mechanize._response.seek_wrapper and friends."""
+
+import copy
+import cStringIO
+from unittest import TestCase
+
+class TestUnSeekable:
+    def __init__(self, text):
+        self._file = cStringIO.StringIO(text)
+        self.log = []
+
+    def tell(self): return self._file.tell()
+
+    def seek(self, offset, whence=0): assert False
+
+    def read(self, size=-1):
+        self.log.append(("read", size))
+        return self._file.read(size)
+
+    def readline(self, size=-1):
+        self.log.append(("readline", size))
+        return self._file.readline(size)
+
+    def readlines(self, sizehint=-1):
+        self.log.append(("readlines", sizehint))
+        return self._file.readlines(sizehint)
+
+class TestUnSeekableResponse(TestUnSeekable):
+    def __init__(self, text, headers):
+        TestUnSeekable.__init__(self, text)
+        self.code = 200
+        self.msg = "OK"
+        self.headers = headers
+        self.url = "http://example.com/"
+
+    def geturl(self):
+        return self.url
+
+    def info(self):
+        return self.headers
+
+    def close(self):
+        pass
+
+
+class SeekableTests(TestCase):
+
+    text = """\
+The quick brown fox
+jumps over the lazy
+
+dog.
+
+"""
+    text_lines = map(lambda l: l+"\n", text.split("\n")[:-1])
+
+    def testSeekable(self):
+        from mechanize._response import seek_wrapper
+        text = self.text
+        text_lines = self.text_lines
+
+        for ii in range(1, 6):
+            fh = TestUnSeekable(text)
+            sfh = seek_wrapper(fh)
+            test = getattr(self, "_test%d" % ii)
+            test(sfh)
+
+        # copies have independent seek positions
+        fh = TestUnSeekable(text)
+        sfh = seek_wrapper(fh)
+        self._testCopy(sfh)
+
+    def _testCopy(self, sfh):
+        sfh2 = copy.copy(sfh)
+        sfh.read(10)
+        text = self.text
+        self.assertEqual(sfh2.read(10), text[:10])
+        sfh2.seek(5)
+        self.assertEqual(sfh.read(10), text[10:20])
+        self.assertEqual(sfh2.read(10), text[5:15])
+        sfh.seek(0)
+        sfh2.seek(0)
+        return sfh2
+
+    def _test1(self, sfh):
+        text = self.text
+        text_lines = self.text_lines
+        assert sfh.read(10) == text[:10]  # calls fh.read
+        assert sfh.log[-1] == ("read", 10)  # .log delegated to fh
+        sfh.seek(0)  # doesn't call fh.seek
+        assert sfh.read(10) == text[:10]  # doesn't call fh.read
+        assert len(sfh.log) == 1
+        sfh.seek(0)
+        assert sfh.read(5) == text[:5]  # read only part of cached data
+        assert len(sfh.log) == 1
+        sfh.seek(0)
+        assert sfh.read(25) == text[:25]  # calls fh.read
+        assert sfh.log[1] == ("read", 15)
+        lines = []
+        sfh.seek(-1, 1)
+        while 1:
+            l = sfh.readline()
+            if l == "": break
+            lines.append(l)
+        assert lines == ["s over the lazy\n"]+text_lines[2:]
+        assert sfh.log[2:] == [("readline", -1)]*5
+        sfh.seek(0)
+        lines = []
+        while 1:
+            l = sfh.readline()
+            if l == "": break
+            lines.append(l)
+        assert lines == text_lines
+
+    def _test2(self, sfh):
+        text = self.text
+        sfh.read(5)
+        sfh.seek(0)
+        assert sfh.read() == text
+        assert sfh.read() == ""
+        sfh.seek(0)
+        assert sfh.read() == text
+        sfh.seek(0)
+        assert sfh.readline(5) == "The q"
+        assert sfh.read() == text[5:]
+        sfh.seek(0)
+        assert sfh.readline(5) == "The q"
+        assert sfh.readline() == "uick brown fox\n"
+
+    def _test3(self, sfh):
+        text = self.text
+        text_lines = self.text_lines
+        sfh.read(25)
+        sfh.seek(-1, 1)
+        self.assertEqual(sfh.readlines(), ["s over the lazy\n"]+text_lines[2:])
+        nr_logs = len(sfh.log)
+        sfh.seek(0)
+        assert sfh.readlines() == text_lines
+
+    def _test4(self, sfh):
+        text = self.text
+        text_lines = self.text_lines
+        count = 0
+        limit = 10
+        while count < limit:
+            if count == 5:
+                self.assertRaises(StopIteration, sfh.next)
+                break
+            else:
+                assert sfh.next() == text_lines[count]
+            count = count + 1
+        else:
+            assert False, "StopIteration not raised"
+
+    def _test5(self, sfh):
+        text = self.text
+        sfh.read(10)
+        sfh.seek(5)
+        self.assert_(sfh.invariant())
+        sfh.seek(0, 2)
+        self.assert_(sfh.invariant())
+        sfh.seek(0)
+        self.assertEqual(sfh.read(), text)
+
+    def testResponseSeekWrapper(self):
+        from mechanize import response_seek_wrapper
+        hdrs = {"Content-type": "text/html"}
+        r = TestUnSeekableResponse(self.text, hdrs)
+        rsw = response_seek_wrapper(r)
+        rsw2 = self._testCopy(rsw)
+        self.assert_(rsw is not rsw2)
+        self.assertEqual(rsw.info(), rsw2.info())
+        self.assert_(rsw.info() is not rsw2.info())
+
+        # should be able to close already-closed object
+        rsw2.close()
+        rsw2.close()
+
+    def testSetResponseData(self):
+        from mechanize import response_seek_wrapper
+        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
+        rsw = response_seek_wrapper(r)
+        rsw.set_data("""\
+A Seeming somwhat more than View;
+  That doth instruct the Mind
+  In Things that ly behind,
+""")
+        self.assertEqual(rsw.read(9), "A Seeming")
+        self.assertEqual(rsw.read(13), " somwhat more")
+        rsw.seek(0)
+        self.assertEqual(rsw.read(9), "A Seeming")
+        self.assertEqual(rsw.readline(), " somwhat more than View;\n")
+        rsw.seek(0)
+        self.assertEqual(rsw.readline(), "A Seeming somwhat more than View;\n")
+        rsw.seek(-1, 1)
+        self.assertEqual(rsw.read(7), "\n  That")
+
+        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
+        rsw = response_seek_wrapper(r)
+        rsw.set_data(self.text)
+        self._test2(rsw)
+        rsw.seek(0)
+        self._test4(rsw)
+
+    def testGetResponseData(self):
+        from mechanize import response_seek_wrapper
+        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
+        rsw = response_seek_wrapper(r)
+
+        self.assertEqual(rsw.get_data(), self.text)
+        self._test2(rsw)
+        rsw.seek(0)
+        self._test4(rsw)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_rfc3986.doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_rfc3986.doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_rfc3986.doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,168 @@
+>>> from mechanize._rfc3986 import urlsplit, urljoin, remove_dot_segments
+
+Some common cases
+
+>>> urlsplit("http://example.com/spam/eggs/spam.html?apples=pears&a=b#foo")
+('http', 'example.com', '/spam/eggs/spam.html', 'apples=pears&a=b', 'foo')
+>>> urlsplit("http://example.com/spam.html#foo")
+('http', 'example.com', '/spam.html', None, 'foo')
+>>> urlsplit("ftp://example.com/foo.gif")
+('ftp', 'example.com', '/foo.gif', None, None)
+>>> urlsplit('ftp://joe:password@example.com:port')
+('ftp', 'joe:password@example.com:port', '', None, None)
+>>> urlsplit("mailto:jjl@pobox.com")
+('mailto', None, 'jjl@pobox.com', None, None)
+
+The five path productions
+
+path-abempty:
+
+>>> urlsplit("http://www.example.com")
+('http', 'www.example.com', '', None, None)
+>>> urlsplit("http://www.example.com/foo")
+('http', 'www.example.com', '/foo', None, None)
+
+path-absolute:
+
+>>> urlsplit("a:/")
+('a', None, '/', None, None)
+>>> urlsplit("a:/b:/c/")
+('a', None, '/b:/c/', None, None)
+
+path-noscheme:
+
+>>> urlsplit("a:b/:c/")
+('a', None, 'b/:c/', None, None)
+
+path-rootless:
+
+>>> urlsplit("a:b:/c/")
+('a', None, 'b:/c/', None, None)
+
+path-empty:
+
+>>> urlsplit("quack:")
+('quack', None, '', None, None)
+
+
+>>> remove_dot_segments("/a/b/c/./../../g")
+'/a/g'
+>>> remove_dot_segments("mid/content=5/../6")
+'mid/6'
+>>> remove_dot_segments("/b/c/.")
+'/b/c/'
+>>> remove_dot_segments("/b/c/./.")
+'/b/c/'
+>>> remove_dot_segments(".")
+''
+>>> remove_dot_segments("/.")
+'/'
+>>> remove_dot_segments("./")
+''
+>>> remove_dot_segments("/..")
+'/'
+>>> remove_dot_segments("/../")
+'/'
+
+
+Examples from RFC 3986 section 5.4
+
+Normal Examples
+
+>>> base = "http://a/b/c/d;p?q"
+>>> def join(uri): return urljoin(base, uri)
+>>> join("g:h")
+'g:h'
+>>> join("g")
+'http://a/b/c/g'
+>>> join("./g")
+'http://a/b/c/g'
+>>> join("g/")
+'http://a/b/c/g/'
+>>> join("/g")
+'http://a/g'
+>>> join("//g")
+'http://g'
+>>> join("?y")
+'http://a/b/c/d;p?y'
+>>> join("g?y")
+'http://a/b/c/g?y'
+>>> join("#s")
+'http://a/b/c/d;p?q#s'
+>>> join("g#s")
+'http://a/b/c/g#s'
+>>> join("g?y#s")
+'http://a/b/c/g?y#s'
+>>> join(";x")
+'http://a/b/c/;x'
+>>> join("g;x")
+'http://a/b/c/g;x'
+>>> join("g;x?y#s")
+'http://a/b/c/g;x?y#s'
+>>> join("")
+'http://a/b/c/d;p?q'
+>>> join(".")
+'http://a/b/c/'
+>>> join("./")
+'http://a/b/c/'
+>>> join("..")
+'http://a/b/'
+>>> join("../")
+'http://a/b/'
+>>> join("../g")
+'http://a/b/g'
+>>> join("../..")
+'http://a/'
+>>> join("../../")
+'http://a/'
+>>> join("../../g")
+'http://a/g'
+
+Abnormal Examples
+
+>>> join("../../../g")
+'http://a/g'
+>>> join("../../../../g")
+'http://a/g'
+>>> join("/./g")
+'http://a/g'
+>>> join("/../g")
+'http://a/g'
+>>> join("g.")
+'http://a/b/c/g.'
+>>> join(".g")
+'http://a/b/c/.g'
+>>> join("g..")
+'http://a/b/c/g..'
+>>> join("..g")
+'http://a/b/c/..g'
+>>> join("./../g")
+'http://a/b/g'
+>>> join("./g/.")
+'http://a/b/c/g/'
+>>> join("g/./h")
+'http://a/b/c/g/h'
+>>> join("g/../h")
+'http://a/b/c/h'
+>>> join("g;x=1/./y")
+'http://a/b/c/g;x=1/y'
+>>> join("g;x=1/../y")
+'http://a/b/c/y'
+>>> join("g?y/./x")
+'http://a/b/c/g?y/./x'
+>>> join("g?y/../x")
+'http://a/b/c/g?y/../x'
+>>> join("g#s/./x")
+'http://a/b/c/g#s/./x'
+>>> join("g#s/../x")
+'http://a/b/c/g#s/../x'
+>>> join("http:g")
+'http://a/b/c/g'
+
+
+Additional urljoin tests, not taken from RFC:
+
+>>> join("/..")
+'http://a/'
+>>> join("/../")
+'http://a/'

Added: mechanize/tags/0.1.10/test/test_robotfileparser.special_doctest
===================================================================
--- mechanize/tags/0.1.10/test/test_robotfileparser.special_doctest	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_robotfileparser.special_doctest	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,8 @@
+>>> from mechanize._http import MechanizeRobotFileParser
+
+Calling .set_opener() without args sets a default opener.
+
+>>> rfp = MechanizeRobotFileParser()
+>>> rfp.set_opener()
+>>> rfp._opener  # doctest: +ELLIPSIS
+<mechanize._opener.OpenerDirector instance at ...>

Added: mechanize/tags/0.1.10/test/test_urllib2.py
===================================================================
--- mechanize/tags/0.1.10/test/test_urllib2.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_urllib2.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,1330 @@
+"""Tests for urllib2-level functionality.
+
+This is made up of:
+
+ - tests that I've contributed back to stdlib test_urllib2.py
+
+ - tests for features that aren't in urllib2, but work at the level of the
+   interfaces exported by urllib2, especially the urllib2 "handler" interface,
+   but *excluding* the extended interfaces provided by mechanize.UserAgent
+   and mechanize.Browser.
+
+"""
+
+# XXX
+# Request (I'm too lazy)
+# CacheFTPHandler (hard to write)
+# parse_keqv_list, parse_http_list
+
+import unittest, StringIO, os, sys, UserDict, httplib, warnings
+
+import mechanize
+
+from mechanize._http import AbstractHTTPHandler, parse_head
+from mechanize._response import test_response
+from mechanize import HTTPRedirectHandler, HTTPRequestUpgradeProcessor, \
+     HTTPEquivProcessor, HTTPRefreshProcessor, SeekableProcessor, \
+     HTTPCookieProcessor, HTTPRefererProcessor, \
+     HTTPErrorProcessor, HTTPHandler
+from mechanize import OpenerDirector, build_opener, urlopen, Request
+from mechanize._util import hide_deprecations, reset_deprecations
+import mechanize._sockettimeout as _sockettimeout
+
+## from logging import getLogger, DEBUG
+## l = getLogger("mechanize")
+## l.setLevel(DEBUG)
+
+class AlwaysEqual:
+    def __cmp__(self, other):
+        return 0
+
+class MockOpener:
+    addheaders = []
+    def open(self, req, data=None):
+        self.req, self.data = req, data
+    def error(self, proto, *args):
+        self.proto, self.args = proto, args
+
+class MockFile:
+    def read(self, count=None): pass
+    def readline(self, count=None): pass
+    def close(self): pass
+
+def http_message(mapping):
+    """
+    >>> http_message({"Content-Type": "text/html"}).items()
+    [('content-type', 'text/html')]
+
+    """
+    f = []
+    for kv in mapping.items():
+        f.append("%s: %s" % kv)
+    f.append("")
+    msg = httplib.HTTPMessage(StringIO.StringIO("\r\n".join(f)))
+    return msg
+
+class MockResponse(StringIO.StringIO):
+    def __init__(self, code, msg, headers, data, url=None):
+        StringIO.StringIO.__init__(self, data)
+        self.code, self.msg, self.headers, self.url = code, msg, headers, url
+    def info(self):
+        return self.headers
+    def geturl(self):
+        return self.url
+
+class MockCookieJar:
+    def add_cookie_header(self, request, unverifiable=False):
+        self.ach_req, self.ach_u = request, unverifiable
+    def extract_cookies(self, response, request, unverifiable=False):
+        self.ec_req, self.ec_r, self.ec_u = request, response, unverifiable
+
+class MockMethod:
+    def __init__(self, meth_name, action, handle):
+        self.meth_name = meth_name
+        self.handle = handle
+        self.action = action
+    def __call__(self, *args):
+        return apply(self.handle, (self.meth_name, self.action)+args)
+
+class MockHandler:
+    processor_order = 500
+    def __init__(self, methods):
+        self._define_methods(methods)
+    def _define_methods(self, methods):
+        for spec in methods:
+            if len(spec) == 2: name, action = spec
+            else: name, action = spec, None
+            meth = MockMethod(name, action, self.handle)
+            setattr(self.__class__, name, meth)
+    def handle(self, fn_name, action, *args, **kwds):
+        self.parent.calls.append((self, fn_name, args, kwds))
+        if action is None:
+            return None
+        elif action == "return self":
+            return self
+        elif action == "return response":
+            res = MockResponse(200, "OK", {}, "")
+            return res
+        elif action == "return request":
+            return Request("http://blah/")
+        elif action.startswith("error"):
+            code = int(action[-3:])
+            res = MockResponse(200, "OK", {}, "")
+            return self.parent.error("http", args[0], res, code, "", {})
+        elif action == "raise":
+            raise mechanize.URLError("blah")
+        assert False
+    def close(self): pass
+    def add_parent(self, parent):
+        self.parent = parent
+        self.parent.calls = []
+    def __lt__(self, other):
+        if not hasattr(other, "handler_order"):
+            # Try to preserve the old behavior of having custom classes
+            # inserted after default ones (works only for custom user
+            # classes which are not aware of handler_order).
+            return True
+        return self.handler_order < other.handler_order
+
+
+def add_ordered_mock_handlers(opener, meth_spec):
+    handlers = []
+    count = 0
+    for meths in meth_spec:
+        class MockHandlerSubclass(MockHandler): pass
+        h = MockHandlerSubclass(meths)
+        h.handler_order = h.processor_order = 101+count
+        h.add_parent(opener)
+        count = count + 1
+        handlers.append(h)
+        opener.add_handler(h)
+    return handlers
+
+
+class OpenerDirectorTests(unittest.TestCase):
+
+    def test_handled(self):
+        # handler returning non-None means no more handlers will be called
+        o = OpenerDirector()
+        meth_spec = [
+            ["http_open", "ftp_open", "http_error_302"],
+            ["ftp_open"],
+            [("http_open", "return self")],
+            [("http_open", "return self")],
+            ]
+        handlers = add_ordered_mock_handlers(o, meth_spec)
+
+        req = Request("http://example.com/")
+        r = o.open(req)
+        # Second http_open gets called, third doesn't, since second returned
+        # non-None.  Handlers without http_open never get any methods called
+        # on them.
+        # In fact, second mock handler returns self (instead of response),
+        # which becomes the OpenerDirector's return value.
+        self.assert_(r == handlers[2])
+        calls = [(handlers[0], "http_open"), (handlers[2], "http_open")]
+        for i in range(len(o.calls)):
+            handler, name, args, kwds = o.calls[i]
+            self.assert_((handler, name) == calls[i])
+            self.assert_(args == (req,))
+
+    def test_reindex_handlers(self):
+        o = OpenerDirector()
+        class MockHandler:
+            def add_parent(self, parent): pass
+            def close(self):pass
+            def __lt__(self, other):
+                return self.handler_order < other.handler_order
+        # this first class is here as an obscure regression test for bug
+        # encountered during development: if something manages to get through
+        # to _maybe_reindex_handlers, make sure it's properly removed and
+        # doesn't affect adding of subsequent handlers
+        class NonHandler(MockHandler):
+            handler_order = 1
+        class Handler(MockHandler):
+            handler_order = 2
+            def http_open(self): pass
+        class Processor(MockHandler):
+            handler_order = 3
+            def any_response(self): pass
+            def http_response(self): pass
+        o.add_handler(NonHandler())
+        h = Handler()
+        o.add_handler(h)
+        p = Processor()
+        o.add_handler(p)
+        o._maybe_reindex_handlers()
+        self.assertEqual(o.handle_open, {"http": [h]})
+        self.assertEqual(len(o.process_response.keys()), 1)
+        self.assertEqual(list(o.process_response["http"]), [p])
+        self.assertEqual(list(o._any_response), [p])
+        self.assertEqual(o.handlers, [h, p])
+
+    def test_handler_order(self):
+        o = OpenerDirector()
+        handlers = []
+        for meths, handler_order in [
+            ([("http_open", "return self")], 500),
+            (["http_open"], 0),
+            ]:
+            class MockHandlerSubclass(MockHandler): pass
+            h = MockHandlerSubclass(meths)
+            h.handler_order = handler_order
+            handlers.append(h)
+            o.add_handler(h)
+
+        r = o.open("http://example.com/")
+        # handlers called in reverse order, thanks to their sort order
+        self.assert_(o.calls[0][0] == handlers[1])
+        self.assert_(o.calls[1][0] == handlers[0])
+
+    def test_raise(self):
+        # raising URLError stops processing of request
+        o = OpenerDirector()
+        meth_spec = [
+            [("http_open", "raise")],
+            [("http_open", "return self")],
+            ]
+        handlers = add_ordered_mock_handlers(o, meth_spec)
+
+        req = Request("http://example.com/")
+        self.assertRaises(mechanize.URLError, o.open, req)
+        self.assert_(o.calls == [(handlers[0], "http_open", (req,), {})])
+
+##     def test_error(self):
+##         # XXX this doesn't actually seem to be used in standard library,
+##         #  but should really be tested anyway...
+
+    def test_http_error(self):
+        # XXX http_error_default
+        # http errors are a special case
+        o = OpenerDirector()
+        meth_spec = [
+            [("http_open", "error 302")],
+            [("http_error_400", "raise"), "http_open"],
+            [("http_error_302", "return response"), "http_error_303",
+             "http_error"],
+            [("http_error_302")],
+            ]
+        handlers = add_ordered_mock_handlers(o, meth_spec)
+
+        class Unknown: pass
+
+        req = Request("http://example.com/")
+        r = o.open(req)
+        assert len(o.calls) == 2
+        calls = [(handlers[0], "http_open", (req,)),
+                 (handlers[2], "http_error_302", (req, Unknown, 302, "", {}))]
+        for i in range(len(calls)):
+            handler, method_name, args, kwds = o.calls[i]
+            self.assert_((handler, method_name) == calls[i][:2])
+            # check handler methods were called with expected arguments
+            expected_args = calls[i][2]
+            for j in range(len(args)):
+                if expected_args[j] is not Unknown:
+                    self.assert_(args[j] == expected_args[j])
+
+    def test_http_error_raised(self):
+        # should get an HTTPError if an HTTP handler returns a non-200 response
+        # XXX it worries me that this is the only test that exercises the else
+        # branch in HTTPDefaultErrorHandler
+        from mechanize import _response
+        o = mechanize.OpenerDirector()
+        o.add_handler(mechanize.HTTPErrorProcessor())
+        o.add_handler(mechanize.HTTPDefaultErrorHandler())
+        class HTTPHandler(AbstractHTTPHandler):
+            def http_open(self, req):
+                return _response.test_response(code=302)
+        o.add_handler(HTTPHandler())
+        self.assertRaises(mechanize.HTTPError, o.open, "http://example.com/")
+
+    def test_processors(self):
+        # *_request / *_response methods get called appropriately
+        o = OpenerDirector()
+        meth_spec = [
+            [("http_request", "return request"),
+             ("http_response", "return response")],
+            [("http_request", "return request"),
+             ("http_response", "return response")],
+            ]
+        handlers = add_ordered_mock_handlers(o, meth_spec)
+
+        req = Request("http://example.com/")
+        r = o.open(req)
+
+        # processor methods are called on *all* handlers that define them,
+        # not just the first handler
+        calls = [(handlers[0], "http_request"), (handlers[1], "http_request"),
+                 (handlers[0], "http_response"), (handlers[1], "http_response")]
+        self.assertEqual(len(o.calls), len(calls))
+        for i in range(len(o.calls)):
+            handler, name, args, kwds = o.calls[i]
+            if i < 2:
+                # *_request
+                self.assert_((handler, name) == calls[i])
+                self.assert_(len(args) == 1)
+                self.assert_(isinstance(args[0], Request))
+            else:
+                # *_response
+                self.assert_((handler, name) == calls[i])
+                self.assert_(len(args) == 2)
+                self.assert_(isinstance(args[0], Request))
+                # response from opener.open is None, because there's no
+                # handler that defines http_open to handle it
+                self.assert_(args[1] is None or
+                             isinstance(args[1], MockResponse))
+
+    def test_any(self):
+        # XXXXX two handlers case: ordering
+        o = OpenerDirector()
+        meth_spec = [[
+            ("http_request", "return request"),
+            ("http_response", "return response"),
+            ("ftp_request", "return request"),
+            ("ftp_response", "return response"),
+            ("any_request", "return request"),
+            ("any_response", "return response"),
+            ]]
+        handlers = add_ordered_mock_handlers(o, meth_spec)
+        handler = handlers[0]
+
+        for scheme in ["http", "ftp"]:
+            o.calls = []
+            req = Request("%s://example.com/" % scheme)
+            r = o.open(req)
+
+            calls = [(handler, "any_request"),
+                     (handler, ("%s_request" % scheme)),
+                     (handler, "any_response"),
+                     (handler, ("%s_response" % scheme)),
+                     ]
+            self.assertEqual(len(o.calls), len(calls))
+            for i, ((handler, name, args, kwds), calls) in (
+                enumerate(zip(o.calls, calls))):
+                if i < 2:
+                    # *_request
+                    self.assert_((handler, name) == calls)
+                    self.assert_(len(args) == 1)
+                    self.assert_(isinstance(args[0], Request))
+                else:
+                    # *_response
+                    self.assert_((handler, name) == calls)
+                    self.assert_(len(args) == 2)
+                    self.assert_(isinstance(args[0], Request))
+                    # response from opener.open is None, because there's no
+                    # handler that defines http_open to handle it
+                    self.assert_(args[1] is None or
+                                 isinstance(args[1], MockResponse))
+
+
+class MockHTTPResponse:
+    def __init__(self, fp, msg, status, reason):
+        self.fp = fp
+        self.msg = msg
+        self.status = status
+        self.reason = reason
+    def read(self):
+        return ''
+
+class MockHTTPClass:
+    def __init__(self):
+        self.req_headers = []
+        self.data = None
+        self.raise_on_endheaders = False
+    def __call__(self, host):
+        self.host = host
+        return self
+    def set_debuglevel(self, level):
+        self.level = level
+    def request(self, method, url, body=None, headers={}):
+        self.method = method
+        self.selector = url
+        self.req_headers.extend(headers.items())
+        if body:
+            self.data = body
+        if self.raise_on_endheaders:
+            import socket
+            raise socket.error()
+    def getresponse(self):
+        return MockHTTPResponse(MockFile(), {}, 200, "OK")
+
+class MockFTPWrapper:
+    def __init__(self, data): self.data = data
+    def retrfile(self, filename, filetype):
+        self.filename, self.filetype = filename, filetype
+        return StringIO.StringIO(self.data), len(self.data)
+
+class NullFTPHandler(mechanize.FTPHandler):
+    def __init__(self, data): self.data = data
+    def connect_ftp(self, user, passwd, host, port, dirs, timeout=None):
+        self.user, self.passwd = user, passwd
+        self.host, self.port = host, port
+        self.dirs = dirs
+        self.timeout = timeout
+        self.ftpwrapper = MockFTPWrapper(self.data)
+        return self.ftpwrapper
+
+def sanepathname2url(path):
+    import urllib
+    urlpath = urllib.pathname2url(path)
+    if os.name == "nt" and urlpath.startswith("///"):
+        urlpath = urlpath[2:]
+    # XXX don't ask me about the mac...
+    return urlpath
+
+class MockRobotFileParserClass:
+    def __init__(self):
+        self.calls = []
+        self._can_fetch = True
+    def clear(self):
+        self.calls = []
+    def __call__(self):
+        self.calls.append("__call__")
+        return self
+    def set_url(self, url):
+        self.calls.append(("set_url", url))
+    def set_timeout(self, timeout):
+        self.calls.append(("set_timeout", timeout))
+    def set_opener(self, opener):
+        self.calls.append(("set_opener", opener))
+    def read(self):
+        self.calls.append("read")
+    def can_fetch(self, ua, url):
+        self.calls.append(("can_fetch", ua, url))
+        return self._can_fetch
+
+class MockPasswordManager:
+    def add_password(self, realm, uri, user, password):
+        self.realm = realm
+        self.url = uri
+        self.user = user
+        self.password = password
+    def find_user_password(self, realm, authuri):
+        self.target_realm = realm
+        self.target_url = authuri
+        return self.user, self.password
+
+class HandlerTests(unittest.TestCase):
+
+    def test_ftp(self):
+        import ftplib, socket
+        data = "rheum rhaponicum"
+        h = NullFTPHandler(data)
+        o = h.parent = MockOpener()
+
+        for url, host, port, type_, dirs, timeout, filename, mimetype in [
+            ("ftp://localhost/foo/bar/baz.html",
+             "localhost", ftplib.FTP_PORT, "I",
+             ["foo", "bar"], _sockettimeout._GLOBAL_DEFAULT_TIMEOUT,
+             "baz.html", "text/html"),
+            # XXXX Bug: FTPHandler tries to gethostbyname "localhost:80",
+            #  with the port still there.
+            #("ftp://localhost:80/foo/bar/",
+            # "localhost", 80, "D",
+            # ["foo", "bar"], _sockettimeout._GLOBAL_DEFAULT_TIMEOUT,
+            # "", None),
+            # XXXX bug: second use of splitattr() in FTPHandler should be
+            #  splitvalue()
+            #("ftp://localhost/baz.gif;type=a",
+            # "localhost", ftplib.FTP_PORT, "A",
+            # [], _sockettimeout._GLOBAL_DEFAULT_TIMEOUT,
+            # "baz.gif", "image/gif"),
+            ]:
+            request = Request(url)
+            r = h.ftp_open(request)
+            # ftp authentication not yet implemented by FTPHandler
+            self.assert_(h.user == h.passwd == "")
+            self.assert_(h.host == socket.gethostbyname(host))
+            self.assert_(h.port == port)
+            self.assert_(h.dirs == dirs)
+            if sys.version_info >= (2, 6):
+                self.assertEquals(h.timeout, timeout)
+            self.assert_(h.ftpwrapper.filename == filename)
+            self.assert_(h.ftpwrapper.filetype == type_)
+            headers = r.info()
+            self.assert_(headers["Content-type"] == mimetype)
+            self.assert_(int(headers["Content-length"]) == len(data))
+
+        def test_file(self):
+            import time, rfc822, socket
+            h = mechanize.FileHandler()
+            o = h.parent = MockOpener()
+
+            #TESTFN = test_support.TESTFN
+            TESTFN = "test.txt"
+            urlpath = sanepathname2url(os.path.abspath(TESTFN))
+            towrite = "hello, world\n"
+            try:
+                fqdn = socket.gethostbyname(socket.gethostname())
+            except socket.gaierror:
+                fqdn = "localhost"
+            for url in [
+                "file://localhost%s" % urlpath,
+                "file://%s" % urlpath,
+                "file://%s%s" % (socket.gethostbyname('localhost'), urlpath),
+                "file://%s%s" % (fqdn, urlpath)
+                ]:
+                f = open(TESTFN, "wb")
+                try:
+                    try:
+                        f.write(towrite)
+                    finally:
+                        f.close()
+
+                    r = h.file_open(Request(url))
+                    try:
+                        data = r.read()
+                        headers = r.info()
+                        newurl = r.geturl()
+                    finally:
+                        r.close()
+                    stats = os.stat(TESTFN)
+                    modified = rfc822.formatdate(stats.st_mtime)
+                finally:
+                    os.remove(TESTFN)
+                self.assertEqual(data, towrite)
+                self.assertEqual(headers["Content-type"], "text/plain")
+                self.assertEqual(headers["Content-length"], "13")
+                self.assertEqual(headers["Last-modified"], modified)
+
+            for url in [
+                "file://localhost:80%s" % urlpath,
+    # XXXX bug: these fail with socket.gaierror, should be URLError
+    ##             "file://%s:80%s/%s" % (socket.gethostbyname('localhost'),
+    ##                                    os.getcwd(), TESTFN),
+    ##             "file://somerandomhost.ontheinternet.com%s/%s" %
+    ##             (os.getcwd(), TESTFN),
+                ]:
+                try:
+                    f = open(TESTFN, "wb")
+                    try:
+                        f.write(towrite)
+                    finally:
+                        f.close()
+
+                    self.assertRaises(mechanize.URLError,
+                                      h.file_open, Request(url))
+                finally:
+                    os.remove(TESTFN)
+
+            h = mechanize.FileHandler()
+            o = h.parent = MockOpener()
+            # XXXX why does // mean ftp (and /// mean not ftp!), and where
+            #  is file: scheme specified?  I think this is really a bug, and
+            #  what was intended was to distinguish between URLs like:
+            # file:/blah.txt (a file)
+            # file://localhost/blah.txt (a file)
+            # file:///blah.txt (a file)
+            # file://ftp.example.com/blah.txt (an ftp URL)
+            for url, ftp in [
+                ("file://ftp.example.com//foo.txt", True),
+                ("file://ftp.example.com///foo.txt", False),
+    # XXXX bug: fails with OSError, should be URLError
+                ("file://ftp.example.com/foo.txt", False),
+                ]:
+                req = Request(url)
+                try:
+                    h.file_open(req)
+                # XXXX remove OSError when bug fixed
+                except (mechanize.URLError, OSError):
+                    self.assert_(not ftp)
+                else:
+                    self.assert_(o.req is req)
+                    self.assertEqual(req.type, "ftp")
+
+    def test_http(self):
+        h = AbstractHTTPHandler()
+        o = h.parent = MockOpener()
+
+        url = "http://example.com/"
+        for method, data in [("GET", None), ("POST", "blah")]:
+            req = Request(url, data, {"Foo": "bar"})
+            req.add_unredirected_header("Spam", "eggs")
+            http = MockHTTPClass()
+            r = h.do_open(http, req)
+
+            # result attributes
+            r.read; r.readline  # wrapped MockFile methods
+            r.info; r.geturl  # addinfourl methods
+            self.assertEqual((r.code, r.msg), (200, "OK"))  # from MockHTTPClass.getresponse()
+            hdrs = r.info()
+            hdrs.get; hdrs.has_key  # r.info() gives dict from .getreply()
+            self.assert_(r.geturl() == url)
+
+            self.assert_(http.host == "example.com")
+            self.assert_(http.level == 0)
+            self.assert_(http.method == method)
+            self.assert_(http.selector == "/")
+            http.req_headers.sort()
+            self.assert_(http.req_headers == [
+                ("Connection", "close"),
+                ("Foo", "bar"), ("Spam", "eggs")])
+            self.assert_(http.data == data)
+
+        # check socket.error converted to URLError
+        http.raise_on_endheaders = True
+        self.assertRaises(mechanize.URLError, h.do_open, http, req)
+
+        # check adding of standard headers
+        o.addheaders = [("Spam", "eggs")]
+        for data in "", None:  # POST, GET
+            req = Request("http://example.com/", data)
+            r = MockResponse(200, "OK", {}, "")
+            newreq = h.do_request_(req)
+            if data is None:  # GET
+                self.assert_("Content-length" not in req.unredirected_hdrs)
+                self.assert_("Content-type" not in req.unredirected_hdrs)
+            else:  # POST
+                self.assert_(req.unredirected_hdrs["Content-length"] == "0")
+                self.assert_(req.unredirected_hdrs["Content-type"] ==
+                             "application/x-www-form-urlencoded")
+            # XXX the details of Host could be better tested
+            self.assert_(req.unredirected_hdrs["Host"] == "example.com")
+            self.assert_(req.unredirected_hdrs["Spam"] == "eggs")
+
+            # don't clobber existing headers
+            req.add_unredirected_header("Content-length", "foo")
+            req.add_unredirected_header("Content-type", "bar")
+            req.add_unredirected_header("Host", "baz")
+            req.add_unredirected_header("Spam", "foo")
+            newreq = h.do_request_(req)
+            self.assert_(req.unredirected_hdrs["Content-length"] == "foo")
+            self.assert_(req.unredirected_hdrs["Content-type"] == "bar")
+            self.assert_(req.unredirected_hdrs["Host"] == "baz")
+            self.assert_(req.unredirected_hdrs["Spam"] == "foo")
+
+    def test_request_upgrade(self):
+        import urllib2
+        new_req_class = hasattr(urllib2.Request, "has_header")
+
+        h = HTTPRequestUpgradeProcessor()
+        o = h.parent = MockOpener()
+
+        # urllib2.Request gets upgraded, unless it's the new Request
+        # class from 2.4
+        req = urllib2.Request("http://example.com/")
+        newreq = h.http_request(req)
+        if new_req_class:
+            self.assert_(newreq is req)
+        else:
+            self.assert_(newreq is not req)
+        if new_req_class:
+            self.assert_(newreq.__class__ is not Request)
+        else:
+            self.assert_(newreq.__class__ is Request)
+        # ClientCookie._urllib2_support.Request doesn't get upgraded
+        req = Request("http://example.com/")
+        newreq = h.http_request(req)
+        self.assert_(newreq is req)
+        self.assert_(newreq.__class__ is Request)
+
+    def test_referer(self):
+        h = HTTPRefererProcessor()
+        o = h.parent = MockOpener()
+
+        # normal case
+        url = "http://example.com/"
+        req = Request(url)
+        r = MockResponse(200, "OK", {}, "", url)
+        newr = h.http_response(req, r)
+        self.assert_(r is newr)
+        self.assert_(h.referer == url)
+        newreq = h.http_request(req)
+        self.assert_(req is newreq)
+        self.assert_(req.unredirected_hdrs["Referer"] == url)
+        # don't clobber existing Referer
+        ref = "http://set.by.user.com/"
+        req.add_unredirected_header("Referer", ref)
+        newreq = h.http_request(req)
+        self.assert_(req is newreq)
+        self.assert_(req.unredirected_hdrs["Referer"] == ref)
+
+    def test_errors(self):
+        from mechanize import _response
+        h = HTTPErrorProcessor()
+        o = h.parent = MockOpener()
+
+        req = Request("http://example.com")
+        # 200 OK is passed through
+        r = _response.test_response()
+        newr = h.http_response(req, r)
+        self.assert_(r is newr)
+        self.assert_(not hasattr(o, "proto"))  # o.error not called
+        # anything else calls o.error (and MockOpener returns None, here)
+        r = _response.test_response(code=201, msg="Created")
+        self.assert_(h.http_response(req, r) is None)
+        self.assert_(o.proto == "http")  # o.error called
+        self.assert_(o.args == (req, r, 201, "Created", AlwaysEqual()))
+
+    def test_raise_http_errors(self):
+        # HTTPDefaultErrorHandler should raise HTTPError if no error handler
+        # handled the error response
+        from mechanize import _response
+        h = mechanize.HTTPDefaultErrorHandler()
+
+        url = "http://example.com"; code = 500; msg = "Error"
+        request = mechanize.Request(url)
+        response = _response.test_response(url=url, code=code, msg=msg)
+
+        # case 1. it's not an HTTPError
+        try:
+            h.http_error_default(
+                request, response, code, msg, response.info())
+        except mechanize.HTTPError, exc:
+            self.assert_(exc is not response)
+            self.assert_(exc.fp is response)
+        else:
+            self.assert_(False)
+
+        # case 2. response object is already an HTTPError, so just re-raise it
+        error = mechanize.HTTPError(
+            url, code, msg, "fake headers", response)
+        try:
+            h.http_error_default(
+                request, error, code, msg, error.info())
+        except mechanize.HTTPError, exc:
+            self.assert_(exc is error)
+        else:
+            self.assert_(False)
+
+    def test_robots(self):
+        # XXX useragent
+        try:
+            import robotparser
+        except ImportError:
+            return  # skip test
+        else:
+            from mechanize import HTTPRobotRulesProcessor
+        opener = OpenerDirector()
+        rfpc = MockRobotFileParserClass()
+        h = HTTPRobotRulesProcessor(rfpc)
+        opener.add_handler(h)
+
+        url = "http://example.com:80/foo/bar.html"
+        req = Request(url)
+        # first time: initialise and set up robots.txt parser before checking
+        #  whether OK to fetch URL
+        h.http_request(req)
+        self.assertEquals(rfpc.calls, [
+            "__call__",
+            ("set_opener", opener),
+            ("set_url", "http://example.com:80/robots.txt"),
+            ("set_timeout", _sockettimeout._GLOBAL_DEFAULT_TIMEOUT),
+            "read",
+            ("can_fetch", "", url),
+            ])
+        # second time: just use existing parser
+        rfpc.clear()
+        req = Request(url)
+        h.http_request(req)
+        self.assert_(rfpc.calls == [
+            ("can_fetch", "", url),
+            ])
+        # different URL on same server: same again
+        rfpc.clear()
+        url = "http://example.com:80/blah.html"
+        req = Request(url)
+        h.http_request(req)
+        self.assert_(rfpc.calls == [
+            ("can_fetch", "", url),
+            ])
+        # disallowed URL
+        rfpc.clear()
+        rfpc._can_fetch = False
+        url = "http://example.com:80/rhubarb.html"
+        req = Request(url)
+        try:
+            h.http_request(req)
+        except mechanize.HTTPError, e:
+            self.assert_(e.request == req)
+            self.assert_(e.code == 403)
+        # new host: reload robots.txt (even though the host and port are
+        #  unchanged, we treat this as a new host because
+        #  "example.com" != "example.com:80")
+        rfpc.clear()
+        rfpc._can_fetch = True
+        url = "http://example.com/rhubarb.html"
+        req = Request(url)
+        h.http_request(req)
+        self.assertEquals(rfpc.calls, [
+            "__call__",
+            ("set_opener", opener),
+            ("set_url", "http://example.com/robots.txt"),
+            ("set_timeout", _sockettimeout._GLOBAL_DEFAULT_TIMEOUT),
+            "read",
+            ("can_fetch", "", url),
+            ])
+        # https url -> should fetch robots.txt from https url too
+        rfpc.clear()
+        url = "https://example.org/rhubarb.html"
+        req = Request(url)
+        h.http_request(req)
+        self.assertEquals(rfpc.calls, [
+            "__call__",
+            ("set_opener", opener),
+            ("set_url", "https://example.org/robots.txt"),
+            ("set_timeout", _sockettimeout._GLOBAL_DEFAULT_TIMEOUT),
+            "read",
+            ("can_fetch", "", url),
+            ])
+        # non-HTTP URL -> ignore robots.txt
+        rfpc.clear()
+        url = "ftp://example.com/"
+        req = Request(url)
+        h.http_request(req)
+        self.assert_(rfpc.calls == [])
+
+    def test_redirected_robots_txt(self):
+        # redirected robots.txt fetch shouldn't result in another attempted
+        # robots.txt fetch to check the redirection is allowed!
+        import mechanize
+        from mechanize import build_opener, HTTPHandler, \
+             HTTPDefaultErrorHandler, HTTPRedirectHandler, \
+             HTTPRobotRulesProcessor
+
+        class MockHTTPHandler(mechanize.BaseHandler):
+            def __init__(self):
+                self.requests = []
+            def http_open(self, req):
+                import mimetools, httplib, copy
+                from StringIO import StringIO
+                self.requests.append(copy.deepcopy(req))
+                if req.get_full_url() == "http://example.com/robots.txt":
+                    hdr = "Location: http://example.com/en/robots.txt\r\n\r\n"
+                    msg = mimetools.Message(StringIO(hdr))
+                    return self.parent.error(
+                        "http", req, test_response(), 302, "Blah", msg)
+                else:
+                    return test_response("Allow: *", [], req.get_full_url())
+
+        hh = MockHTTPHandler()
+        hdeh = HTTPDefaultErrorHandler()
+        hrh = HTTPRedirectHandler()
+        rh = HTTPRobotRulesProcessor()
+        o = build_test_opener(hh, hdeh, hrh, rh)
+        o.open("http://example.com/")
+        self.assertEqual([req.get_full_url() for req in hh.requests],
+                         ["http://example.com/robots.txt",
+                          "http://example.com/en/robots.txt",
+                          "http://example.com/",
+                          ])
+
+    def test_cookies(self):
+        cj = MockCookieJar()
+        h = HTTPCookieProcessor(cj)
+        o = h.parent = MockOpener()
+
+        req = Request("http://example.com/")
+        r = MockResponse(200, "OK", {}, "")
+        newreq = h.http_request(req)
+        self.assert_(cj.ach_req is req is newreq)
+        self.assert_(req.origin_req_host == "example.com")
+        self.assert_(cj.ach_u == False)
+        newr = h.http_response(req, r)
+        self.assert_(cj.ec_req is req)
+        self.assert_(cj.ec_r is r is newr)
+        self.assert_(cj.ec_u == False)
+
+    def test_seekable(self):
+        hide_deprecations()
+        try:
+            h = SeekableProcessor()
+        finally:
+            reset_deprecations()
+        o = h.parent = MockOpener()
+
+        req = mechanize.Request("http://example.com/")
+        class MockUnseekableResponse:
+            code = 200
+            msg = "OK"
+            def info(self): pass
+            def geturl(self): return ""
+        r = MockUnseekableResponse()
+        newr = h.any_response(req, r)
+        self.assert_(not hasattr(r, "seek"))
+        self.assert_(hasattr(newr, "seek"))
+
+    def test_http_equiv(self):
+        from mechanize import _response
+        h = HTTPEquivProcessor()
+        o = h.parent = MockOpener()
+
+        data = ('<html><head>'
+                '<meta http-equiv="Refresh" content="spam&amp;eggs">'
+                '</head></html>'
+                )
+        headers = [("Foo", "Bar"),
+                   ("Content-type", "text/html"),
+                   ("Refresh", "blah"),
+                   ]
+        url = "http://example.com/"
+        req = Request(url)
+        r = _response.make_response(data, headers, url, 200, "OK")
+        newr = h.http_response(req, r)
+
+        new_headers = newr.info()
+        self.assertEqual(new_headers["Foo"], "Bar")
+        self.assertEqual(new_headers["Refresh"], "spam&eggs")
+        self.assertEqual(new_headers.getheaders("Refresh"),
+                         ["blah", "spam&eggs"])
+
+    def test_refresh(self):
+        # XXX test processor constructor optional args
+        h = HTTPRefreshProcessor(max_time=None, honor_time=False)
+
+        for val, valid in [
+            ('0; url="http://example.com/foo/"', True),
+            ("2", True),
+            # in the past, this failed with UnboundLocalError
+            ('0; "http://example.com/foo/"', False),
+            ]:
+            o = h.parent = MockOpener()
+            req = Request("http://example.com/")
+            headers = http_message({"refresh": val})
+            r = MockResponse(200, "OK", headers, "", "http://example.com/")
+            newr = h.http_response(req, r)
+            if valid:
+                self.assertEqual(o.proto, "http")
+                self.assertEqual(o.args, (req, r, "refresh", "OK", headers))
+
+    def test_refresh_honor_time(self):
+        class SleepTester:
+            def __init__(self, test, seconds):
+                self._test = test
+                if seconds == 0:
+                    seconds = None  # don't expect a sleep for 0 seconds
+                self._expected = seconds
+                self._got = None
+            def sleep(self, seconds):
+                self._got = seconds
+            def verify(self):
+                self._test.assertEqual(self._expected, self._got)
+        class Opener:
+            called = False
+            def error(self, *args, **kwds):
+                self.called = True
+        def test(rp, header, refresh_after):
+            expect_refresh = refresh_after is not None
+            opener = Opener()
+            rp.parent = opener
+            st = SleepTester(self, refresh_after)
+            rp._sleep = st.sleep
+            rp.http_response(Request("http://example.com"),
+                             test_response(headers=[("Refresh", header)]),
+                             )
+            self.assertEqual(expect_refresh, opener.called)
+            st.verify()
+
+        # by default, only zero-time refreshes are honoured
+        test(HTTPRefreshProcessor(), "0", 0)
+        test(HTTPRefreshProcessor(), "2", None)
+
+        # if requested, more than zero seconds are allowed
+        test(HTTPRefreshProcessor(max_time=None), "2", 2)
+        test(HTTPRefreshProcessor(max_time=30), "2", 2)
+
+        # no sleep if we don't "honor_time"
+        test(HTTPRefreshProcessor(max_time=30, honor_time=False), "2", 0)
+
+        # request for too-long wait before refreshing --> no refresh occurs
+        test(HTTPRefreshProcessor(max_time=30), "60", None)
+
+    def test_redirect(self):
+        from_url = "http://example.com/a.html"
+        to_url = "http://example.com/b.html"
+        h = HTTPRedirectHandler()
+        o = h.parent = MockOpener()
+
+        # ordinary redirect behaviour
+        for code in 301, 302, 303, 307, "refresh":
+            for data in None, "blah\nblah\n":
+                method = getattr(h, "http_error_%s" % code)
+                req = Request(from_url, data)
+                req.add_header("Nonsense", "viking=withhold")
+                req.add_unredirected_header("Spam", "spam")
+                req.origin_req_host = "example.com"  # XXX
+                try:
+                    method(req, MockFile(), code, "Blah",
+                           http_message({"location": to_url}))
+                except mechanize.HTTPError:
+                    # 307 in response to POST requires user OK
+                    self.assert_(code == 307 and data is not None)
+                self.assert_(o.req.get_full_url() == to_url)
+                try:
+                    self.assert_(o.req.get_method() == "GET")
+                except AttributeError:
+                    self.assert_(not o.req.has_data())
+                self.assert_(o.req.headers["Nonsense"] == "viking=withhold")
+                self.assert_(not o.req.headers.has_key("Spam"))
+                self.assert_(not o.req.unredirected_hdrs.has_key("Spam"))
+
+        # loop detection
+        def redirect(h, req, url=to_url):
+            h.http_error_302(req, MockFile(), 302, "Blah",
+                             http_message({"location": url}))
+        # Note that the *original* request shares the same record of
+        # redirections with the sub-requests caused by the redirections.
+
+        # detect infinite loop redirect of a URL to itself
+        req = Request(from_url)
+        req.origin_req_host = "example.com"
+        count = 0
+        try:
+            while 1:
+                redirect(h, req, "http://example.com/")
+                count = count + 1
+        except mechanize.HTTPError:
+            # don't stop until max_repeats, because cookies may introduce state
+            self.assert_(count == HTTPRedirectHandler.max_repeats)
+
+        # detect endless non-repeating chain of redirects
+        req = Request(from_url)
+        req.origin_req_host = "example.com"
+        count = 0
+        try:
+            while 1:
+                redirect(h, req, "http://example.com/%d" % count)
+                count = count + 1
+        except mechanize.HTTPError:
+            self.assert_(count == HTTPRedirectHandler.max_redirections)
+
+    def test_redirect_bad_uri(self):
+        # bad URIs should be cleaned up before redirection
+        from mechanize._response import test_html_response
+        from_url = "http://example.com/a.html"
+        bad_to_url = "http://example.com/b. |html"
+        good_to_url = "http://example.com/b.%20%7Chtml"
+
+        h = HTTPRedirectHandler()
+        o = h.parent = MockOpener()
+
+        req = Request(from_url)
+        h.http_error_302(req, test_html_response(), 302, "Blah",
+                         http_message({"location": bad_to_url}),
+                         )
+        self.assertEqual(o.req.get_full_url(), good_to_url)
+
+    def test_refresh_bad_uri(self):
+        # bad URIs should be cleaned up before redirection
+        from mechanize._response import test_html_response
+        from_url = "http://example.com/a.html"
+        bad_to_url = "http://example.com/b. |html"
+        good_to_url = "http://example.com/b.%20%7Chtml"
+
+        h = HTTPRefreshProcessor(max_time=None, honor_time=False)
+        o = h.parent = MockOpener()
+
+        req = Request("http://example.com/")
+        r = test_html_response(
+            headers=[("refresh", '0; url="%s"' % bad_to_url)])
+        newr = h.http_response(req, r)
+        headers = o.args[-1]
+        self.assertEqual(headers["Location"], good_to_url)
+
+    def test_cookie_redirect(self):
+        # cookies shouldn't leak into redirected requests
+        import mechanize
+        from mechanize import CookieJar, build_opener, HTTPHandler, \
+             HTTPCookieProcessor, HTTPError, HTTPDefaultErrorHandler, \
+             HTTPRedirectHandler
+
+        from test_cookies import interact_netscape
+
+        cj = CookieJar()
+        interact_netscape(cj, "http://www.example.com/", "spam=eggs")
+        hh = MockHTTPHandler(302, "Location: http://www.cracker.com/\r\n\r\n")
+        hdeh = HTTPDefaultErrorHandler()
+        hrh = HTTPRedirectHandler()
+        cp = HTTPCookieProcessor(cj)
+        o = build_test_opener(hh, hdeh, hrh, cp)
+        o.open("http://www.example.com/")
+        self.assert_(not hh.req.has_header("Cookie"))
+
+    def test_proxy(self):
+        o = OpenerDirector()
+        ph = mechanize.ProxyHandler(dict(http="proxy.example.com:3128"))
+        o.add_handler(ph)
+        meth_spec = [
+            [("http_open", "return response")]
+            ]
+        handlers = add_ordered_mock_handlers(o, meth_spec)
+
+        o._maybe_reindex_handlers()
+
+        req = Request("http://acme.example.com/")
+        self.assertEqual(req.get_host(), "acme.example.com")
+        r = o.open(req)
+        self.assertEqual(req.get_host(), "proxy.example.com:3128")
+
+        self.assertEqual([(handlers[0], "http_open")],
+                         [tup[0:2] for tup in o.calls])
+
+    def test_basic_auth(self):
+        opener = OpenerDirector()
+        password_manager = MockPasswordManager()
+        auth_handler = mechanize.HTTPBasicAuthHandler(password_manager)
+        realm = "ACME Widget Store"
+        http_handler = MockHTTPHandler(
+            401, 'WWW-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(auth_handler)
+        opener.add_handler(http_handler)
+        self._test_basic_auth(opener, auth_handler, "Authorization",
+                              realm, http_handler, password_manager,
+                              "http://acme.example.com/protected",
+                              "http://acme.example.com/protected",
+                              )
+
+    def test_proxy_basic_auth(self):
+        opener = OpenerDirector()
+        ph = mechanize.ProxyHandler(dict(http="proxy.example.com:3128"))
+        opener.add_handler(ph)
+        password_manager = MockPasswordManager()
+        auth_handler = mechanize.ProxyBasicAuthHandler(password_manager)
+        realm = "ACME Networks"
+        http_handler = MockHTTPHandler(
+            407, 'Proxy-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(auth_handler)
+        opener.add_handler(http_handler)
+        self._test_basic_auth(opener, auth_handler, "Proxy-authorization",
+                              realm, http_handler, password_manager,
+                              "http://acme.example.com:3128/protected",
+                              "proxy.example.com:3128",
+                              )
+
+    def test_basic_and_digest_auth_handlers(self):
+        # HTTPDigestAuthHandler threw an exception if it couldn't handle a 40*
+        # response (http://python.org/sf/1479302), where it should instead
+        # return None to allow another handler (especially
+        # HTTPBasicAuthHandler) to handle the response.
+
+        # Also (http://python.org/sf/1479302, RFC 2617 section 1.2), we must
+        # try digest first (since it's the strongest auth scheme), so we record
+        # order of calls here to check digest comes first:
+        class RecordingOpenerDirector(OpenerDirector):
+            def __init__(self):
+                OpenerDirector.__init__(self)
+                self.recorded = []
+            def record(self, info):
+                self.recorded.append(info)
+        class TestDigestAuthHandler(mechanize.HTTPDigestAuthHandler):
+            def http_error_401(self, *args, **kwds):
+                self.parent.record("digest")
+                mechanize.HTTPDigestAuthHandler.http_error_401(self,
+                                                             *args, **kwds)
+        class TestBasicAuthHandler(mechanize.HTTPBasicAuthHandler):
+            def http_error_401(self, *args, **kwds):
+                self.parent.record("basic")
+                mechanize.HTTPBasicAuthHandler.http_error_401(self,
+                                                            *args, **kwds)
+
+        opener = RecordingOpenerDirector()
+        password_manager = MockPasswordManager()
+        digest_handler = TestDigestAuthHandler(password_manager)
+        basic_handler = TestBasicAuthHandler(password_manager)
+        realm = "ACME Networks"
+        http_handler = MockHTTPHandler(
+            401, 'WWW-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(digest_handler)
+        opener.add_handler(basic_handler)
+        opener.add_handler(http_handler)
+        opener._maybe_reindex_handlers()
+
+        # check basic auth isn't blocked by digest handler failing
+        self._test_basic_auth(opener, basic_handler, "Authorization",
+                              realm, http_handler, password_manager,
+                              "http://acme.example.com/protected",
+                              "http://acme.example.com/protected",
+                              )
+        # check digest was tried before basic (twice, because
+        # _test_basic_auth called .open() twice)
+        self.assertEqual(opener.recorded, ["digest", "basic"]*2)
+
+    def _test_basic_auth(self, opener, auth_handler, auth_header,
+                         realm, http_handler, password_manager,
+                         request_url, protected_url):
+        import base64, httplib
+        user, password = "wile", "coyote"
+
+        # .add_password() fed through to password manager
+        auth_handler.add_password(realm, request_url, user, password)
+        self.assertEqual(realm, password_manager.realm)
+        self.assertEqual(request_url, password_manager.url)
+        self.assertEqual(user, password_manager.user)
+        self.assertEqual(password, password_manager.password)
+
+        r = opener.open(request_url)
+
+        # should have asked the password manager for the username/password
+        self.assertEqual(password_manager.target_realm, realm)
+        self.assertEqual(password_manager.target_url, protected_url)
+
+        # expect one request without authorization, then one with
+        self.assertEqual(len(http_handler.requests), 2)
+        self.failIf(http_handler.requests[0].has_header(auth_header))
+        userpass = '%s:%s' % (user, password)
+        auth_hdr_value = 'Basic '+base64.encodestring(userpass).strip()
+        self.assertEqual(http_handler.requests[1].get_header(auth_header),
+                         auth_hdr_value)
+
+        # if the password manager can't find a password, the handler won't
+        # handle the HTTP auth error
+        password_manager.user = password_manager.password = None
+        http_handler.reset()
+        r = opener.open(request_url)
+        self.assertEqual(len(http_handler.requests), 1)
+        self.failIf(http_handler.requests[0].has_header(auth_header))
+
+
+class HeadParserTests(unittest.TestCase):
+
+    def test(self):
+        # XXX XHTML
+        from mechanize import HeadParser
+        htmls = [
+            ("""<meta http-equiv="refresh" content="1; http://example.com/">
+            """,
+            [("refresh", "1; http://example.com/")]
+            ),
+            ("""
+            <html><head>
+            <meta http-equiv="refresh" content="1; http://example.com/">
+            <meta name="spam" content="eggs">
+            <meta http-equiv="foo" content="bar">
+            <p> <!-- p is not allowed in head, so parsing should stop here-->
+            <meta http-equiv="moo" content="cow">
+            </html>
+            """,
+             [("refresh", "1; http://example.com/"), ("foo", "bar")]),
+            ("""<meta http-equiv="refresh">
+            """,
+             [])
+            ]
+        for html, result in htmls:
+            self.assertEqual(parse_head(StringIO.StringIO(html), HeadParser()), result)
+
+
+def build_test_opener(*handler_instances):
+    opener = OpenerDirector()
+    for h in handler_instances:
+        opener.add_handler(h)
+    return opener
+
+class MockHTTPHandler(mechanize.BaseHandler):
+    # useful for testing redirections and auth
+    # sends supplied headers and code as first response
+    # sends 200 OK as second response
+    def __init__(self, code, headers):
+        self.code = code
+        self.headers = headers
+        self.reset()
+    def reset(self):
+        self._count = 0
+        self.requests = []
+    def http_open(self, req):
+        import mimetools, httplib, copy
+        from StringIO import StringIO
+        self.requests.append(copy.deepcopy(req))
+        if self._count == 0:
+            self._count = self._count + 1
+            msg = mimetools.Message(StringIO(self.headers))
+            return self.parent.error(
+                "http", req, test_response(), self.code, "Blah", msg)
+        else:
+            self.req = req
+            return test_response("", [], req.get_full_url())
+
+
+class MyHTTPHandler(HTTPHandler): pass
+class FooHandler(mechanize.BaseHandler):
+    def foo_open(self): pass
+class BarHandler(mechanize.BaseHandler):
+    def bar_open(self): pass
+
+class A:
+    def a(self): pass
+class B(A):
+    def a(self): pass
+    def b(self): pass
+class C(A):
+    def c(self): pass
+class D(C, B):
+    def a(self): pass
+    def d(self): pass
+
+class FunctionTests(unittest.TestCase):
+
+    def test_build_opener(self):
+        o = build_opener(FooHandler, BarHandler)
+        self.opener_has_handler(o, FooHandler)
+        self.opener_has_handler(o, BarHandler)
+
+        # can take a mix of classes and instances
+        o = build_opener(FooHandler, BarHandler())
+        self.opener_has_handler(o, FooHandler)
+        self.opener_has_handler(o, BarHandler)
+
+        # subclasses of default handlers override default handlers
+        o = build_opener(MyHTTPHandler)
+        self.opener_has_handler(o, MyHTTPHandler)
+
+        # a particular case of overriding: default handlers can be passed
+        # in explicitly
+        o = build_opener()
+        self.opener_has_handler(o, HTTPHandler)
+        o = build_opener(HTTPHandler)
+        self.opener_has_handler(o, HTTPHandler)
+        o = build_opener(HTTPHandler())
+        self.opener_has_handler(o, HTTPHandler)
+
+    def opener_has_handler(self, opener, handler_class):
+        for h in opener.handlers:
+            if h.__class__ == handler_class:
+                break
+        else:
+            self.assert_(False)
+
+
+if __name__ == "__main__":
+    import doctest
+    doctest.testmod()
+    unittest.main()

Added: mechanize/tags/0.1.10/test/test_useragent.py
===================================================================
--- mechanize/tags/0.1.10/test/test_useragent.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test/test_useragent.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,58 @@
+#!/usr/bin/env python
+
+from unittest import TestCase
+
+import mechanize
+
+from test_browser import make_mock_handler
+
+
+class UserAgentTests(TestCase):
+
+    def test_set_handled_schemes(self):
+        class MockHandlerClass(make_mock_handler()):
+            def __call__(self): return self
+        class BlahHandlerClass(MockHandlerClass): pass
+        class BlahProcessorClass(MockHandlerClass): pass
+        BlahHandler = BlahHandlerClass([("blah_open", None)])
+        BlahProcessor = BlahProcessorClass([("blah_request", None)])
+        class TestUserAgent(mechanize.UserAgent):
+            default_others = []
+            default_features = []
+            handler_classes = mechanize.UserAgent.handler_classes.copy()
+            handler_classes.update(
+                {"blah": BlahHandler, "_blah": BlahProcessor})
+        ua = TestUserAgent()
+
+        self.assertEqual(len(ua.handlers), 4)
+        ua.set_handled_schemes(["http", "https"])
+        self.assertEqual(len(ua.handlers), 2)
+        self.assertRaises(ValueError,
+            ua.set_handled_schemes, ["blah", "non-existent"])
+        self.assertRaises(ValueError,
+            ua.set_handled_schemes, ["blah", "_blah"])
+        ua.set_handled_schemes(["blah"])
+
+        req = mechanize.Request("blah://example.com/")
+        r = ua.open(req)
+        exp_calls = [("blah_open", (req,), {})]
+        assert len(ua.calls) == len(exp_calls)
+        for got, expect in zip(ua.calls, exp_calls):
+            self.assertEqual(expect, got[1:])
+
+        ua.calls = []
+        req = mechanize.Request("blah://example.com/")
+        ua._set_handler("_blah", True)
+        r = ua.open(req)
+        exp_calls = [
+            ("blah_request", (req,), {}),
+            ("blah_open", (req,), {})]
+        assert len(ua.calls) == len(exp_calls)
+        for got, expect in zip(ua.calls, exp_calls):
+            self.assertEqual(expect, got[1:])
+        ua._set_handler("_blah", True)
+
+
+if __name__ == "__main__":
+    import unittest
+    unittest.main()

Added: mechanize/tags/0.1.10/test-tools/cookietest.cgi
===================================================================
--- mechanize/tags/0.1.10/test-tools/cookietest.cgi	                        (rev 0)
+++ mechanize/tags/0.1.10/test-tools/cookietest.cgi	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,58 @@
+#!/usr/bin/python
+# -*-python-*-
+
+# This is used by functional_tests.py
+
+#import cgitb; cgitb.enable()
+
+import time
+
+print "Content-Type: text/html"
+year_plus_one = time.localtime(time.time())[0] + 1
+expires = "expires=09-Nov-%d 23:12:40 GMT" % (year_plus_one,)
+print "Set-Cookie: foo=bar; %s\n" % expires
+import sys, os, string, cgi, Cookie, urllib
+from xml.sax import saxutils
+
+from types import ListType
+
+print "<html><head><title>Cookies and form submission parameters</title>"
+cookie = Cookie.SimpleCookie()
+cookieHdr = os.environ.get("HTTP_COOKIE", "")
+cookie.load(cookieHdr)
+form = cgi.FieldStorage()
+refresh_value = None
+if form.has_key("refresh"):
+    refresh = form["refresh"]
+    if not isinstance(refresh, ListType):
+        refresh_value = refresh.value
+if refresh_value is not None:
+    print '<meta http-equiv="refresh" content=%s>' % (
+        saxutils.quoteattr(urllib.unquote_plus(refresh_value)))
+elif not cookie.has_key("foo"):
+    print '<meta http-equiv="refresh" content="5">'
+
+print "</head>"
+print "<p>Received cookies:</p>"
+print "<pre>"
+print cgi.escape(os.environ.get("HTTP_COOKIE", ""))
+print "</pre>"
+if cookie.has_key("foo"):
+    print "Your browser supports cookies!"
+print "<p>Referer:</p>"
+print "<pre>"
+print cgi.escape(os.environ.get("HTTP_REFERER", ""))
+print "</pre>"
+print "<p>Received parameters:</p>"
+print "<pre>"
+for k in form.keys():
+    v = form[k]
+    if isinstance(v, ListType):
+        vs = []
+        for item in v:
+            vs.append(item.value)
+        text = string.join(vs, ", ")
+    else:
+        text = v.value
+    print "%s: %s" % (cgi.escape(k), cgi.escape(text))
+print "</pre></html>"


Property changes on: mechanize/tags/0.1.10/test-tools/cookietest.cgi
___________________________________________________________________
Added: svn:executable
   + 

Added: mechanize/tags/0.1.10/test-tools/doctest.py
===================================================================
--- mechanize/tags/0.1.10/test-tools/doctest.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test-tools/doctest.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,2695 @@
+# Module doctest.
+# Released to the public domain 16-Jan-2001, by Tim Peters (tim at python.org).
+# Major enhancements and refactoring by:
+#     Jim Fulton
+#     Edward Loper
+
+# Provided as-is; use at your own risk; no warranty; no promises; enjoy!
+
+r"""Module doctest -- a framework for running examples in docstrings.
+
+In simplest use, end each module M to be tested with:
+
+def _test():
+    import doctest
+    doctest.testmod()
+
+if __name__ == "__main__":
+    _test()
+
+Then running the module as a script will cause the examples in the
+docstrings to get executed and verified:
+
+python M.py
+
+This won't display anything unless an example fails, in which case the
+failing example(s) and the cause(s) of the failure(s) are printed to stdout
+(why not stderr? because stderr is a lame hack <0.2 wink>), and the final
+line of output is "Test failed.".
+
+Run it with the -v switch instead:
+
+python M.py -v
+
+and a detailed report of all examples tried is printed to stdout, along
+with assorted summaries at the end.
+
+You can force verbose mode by passing "verbose=True" to testmod, or prohibit
+it by passing "verbose=False".  In either of those cases, sys.argv is not
+examined by testmod.
+
+There are a variety of other ways to run doctests, including integration
+with the unittest framework, and support for running non-Python text
+files containing doctests.  There are also many ways to override parts
+of doctest's default behaviors.  See the Library Reference Manual for
+details.
+"""
+
+__docformat__ = 'reStructuredText en'
+
+__all__ = [
+    # 0, Option Flags
+    'register_optionflag',
+    'DONT_ACCEPT_TRUE_FOR_1',
+    'DONT_ACCEPT_BLANKLINE',
+    'NORMALIZE_WHITESPACE',
+    'ELLIPSIS',
+    'SKIP',
+    'IGNORE_EXCEPTION_DETAIL',
+    'COMPARISON_FLAGS',
+    'REPORT_UDIFF',
+    'REPORT_CDIFF',
+    'REPORT_NDIFF',
+    'REPORT_ONLY_FIRST_FAILURE',
+    'REPORTING_FLAGS',
+    # 1. Utility Functions
+    'is_private',
+    # 2. Example & DocTest
+    'Example',
+    'DocTest',
+    # 3. Doctest Parser
+    'DocTestParser',
+    # 4. Doctest Finder
+    'DocTestFinder',
+    # 5. Doctest Runner
+    'DocTestRunner',
+    'OutputChecker',
+    'DocTestFailure',
+    'UnexpectedException',
+    'DebugRunner',
+    # 6. Test Functions
+    'testmod',
+    'testfile',
+    'run_docstring_examples',
+    # 7. Tester
+    'Tester',
+    # 8. Unittest Support
+    'DocTestSuite',
+    'DocFileSuite',
+    'set_unittest_reportflags',
+    # 9. Debugging Support
+    'script_from_examples',
+    'testsource',
+    'debug_src',
+    'debug',
+]
+
+import __future__
+
+import sys, traceback, inspect, linecache_copy, os, re, types
+import unittest, difflib, pdb, tempfile
+import warnings
+from StringIO import StringIO
+
+# Don't whine about the deprecated is_private function in this
+# module's tests.
+warnings.filterwarnings("ignore", "is_private", DeprecationWarning,
+                        __name__, 0)
+
+# There are 4 basic classes:
+#  - Example: a <source, want> pair, plus an intra-docstring line number.
+#  - DocTest: a collection of examples, parsed from a docstring, plus
+#    info about where the docstring came from (name, filename, lineno).
+#  - DocTestFinder: extracts DocTests from a given object's docstring and
+#    its contained objects' docstrings.
+#  - DocTestRunner: runs DocTest cases, and accumulates statistics.
+#
+# So the basic picture is:
+#
+#                             list of:
+# +------+                   +---------+                   +-------+
+# |object| --DocTestFinder-> | DocTest | --DocTestRunner-> |results|
+# +------+                   +---------+                   +-------+
+#                            | Example |
+#                            |   ...   |
+#                            | Example |
+#                            +---------+
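+#
+# A minimal sketch of that pipeline in code, using the public classes defined
+# below (`mymodule` is a placeholder for whatever module is being checked):
+#
+#     finder = DocTestFinder()
+#     runner = DocTestRunner(verbose=False)
+#     for test in finder.find(mymodule):
+#         runner.run(test)
+#     failures, tried = runner.summarize()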
+
+# Option constants.
+
+OPTIONFLAGS_BY_NAME = {}
+def register_optionflag(name):
+    flag = 1 << len(OPTIONFLAGS_BY_NAME)
+    OPTIONFLAGS_BY_NAME[name] = flag
+    return flag
+
+DONT_ACCEPT_TRUE_FOR_1 = register_optionflag('DONT_ACCEPT_TRUE_FOR_1')
+DONT_ACCEPT_BLANKLINE = register_optionflag('DONT_ACCEPT_BLANKLINE')
+NORMALIZE_WHITESPACE = register_optionflag('NORMALIZE_WHITESPACE')
+ELLIPSIS = register_optionflag('ELLIPSIS')
+SKIP = register_optionflag('SKIP')
+IGNORE_EXCEPTION_DETAIL = register_optionflag('IGNORE_EXCEPTION_DETAIL')
+
+COMPARISON_FLAGS = (DONT_ACCEPT_TRUE_FOR_1 |
+                    DONT_ACCEPT_BLANKLINE |
+                    NORMALIZE_WHITESPACE |
+                    ELLIPSIS |
+                    SKIP |
+                    IGNORE_EXCEPTION_DETAIL)
+
+REPORT_UDIFF = register_optionflag('REPORT_UDIFF')
+REPORT_CDIFF = register_optionflag('REPORT_CDIFF')
+REPORT_NDIFF = register_optionflag('REPORT_NDIFF')
+REPORT_ONLY_FIRST_FAILURE = register_optionflag('REPORT_ONLY_FIRST_FAILURE')
+
+REPORTING_FLAGS = (REPORT_UDIFF |
+                   REPORT_CDIFF |
+                   REPORT_NDIFF |
+                   REPORT_ONLY_FIRST_FAILURE)
+
+# Special string markers for use in `want` strings:
+BLANKLINE_MARKER = '<BLANKLINE>'
+ELLIPSIS_MARKER = '...'
+
+######################################################################
+## Table of Contents
+######################################################################
+#  1. Utility Functions
+#  2. Example & DocTest -- store test cases
+#  3. DocTest Parser -- extracts examples from strings
+#  4. DocTest Finder -- extracts test cases from objects
+#  5. DocTest Runner -- runs test cases
+#  6. Test Functions -- convenient wrappers for testing
+#  7. Tester Class -- for backwards compatibility
+#  8. Unittest Support
+#  9. Debugging Support
+# 10. Example Usage
+
+######################################################################
+## 1. Utility Functions
+######################################################################
+
+def is_private(prefix, base):
+    """prefix, base -> true iff name prefix + "." + base is "private".
+
+    Prefix may be an empty string, and base does not contain a period.
+    Prefix is ignored (although functions you write conforming to this
+    protocol may make use of it).
+    Return true iff base begins with an (at least one) underscore, but
+    does not both begin and end with (at least) two underscores.
+
+    >>> is_private("a.b", "my_func")
+    False
+    >>> is_private("____", "_my_func")
+    True
+    >>> is_private("someclass", "__init__")
+    False
+    >>> is_private("sometypo", "__init_")
+    True
+    >>> is_private("x.y.z", "_")
+    True
+    >>> is_private("_x.y.z", "__")
+    False
+    >>> is_private("", "")  # senseless but consistent
+    False
+    """
+    warnings.warn("is_private is deprecated; it wasn't useful; "
+                  "examine DocTestFinder.find() lists instead",
+                  DeprecationWarning, stacklevel=2)
+    return base[:1] == "_" and not base[:2] == "__" == base[-2:]
+
+def _extract_future_flags(globs):
+    """
+    Return the compiler-flags associated with the future features that
+    have been imported into the given namespace (globs).
+    """
+    flags = 0
+    for fname in __future__.all_feature_names:
+        feature = globs.get(fname, None)
+        if feature is getattr(__future__, fname):
+            flags |= feature.compiler_flag
+    return flags
+
+def _normalize_module(module, depth=2):
+    """
+    Return the module specified by `module`.  In particular:
+      - If `module` is a module, then return module.
+      - If `module` is a string, then import and return the
+        module with that name.
+      - If `module` is None, then return the calling module.
+        The calling module is assumed to be the module of
+        the stack frame at the given depth in the call stack.
+    """
+    if inspect.ismodule(module):
+        return module
+    elif isinstance(module, (str, unicode)):
+        return __import__(module, globals(), locals(), ["*"])
+    elif module is None:
+        return sys.modules[sys._getframe(depth).f_globals['__name__']]
+    else:
+        raise TypeError("Expected a module, string, or None")
+
+def _load_testfile(filename, package, module_relative):
+    if module_relative:
+        package = _normalize_module(package, 3)
+        filename = _module_relative_path(package, filename)
+        if hasattr(package, '__loader__'):
+            if hasattr(package.__loader__, 'get_data'):
+                return package.__loader__.get_data(filename), filename
+    return open(filename).read(), filename
+
+def _indent(s, indent=4):
+    """
+    Add the given number of space characters to the beginning of every
+    non-blank line in `s`, and return the result.
+    """
+    # This regexp matches the start of non-blank lines:
+    return re.sub('(?m)^(?!$)', indent*' ', s)
+
+def _exception_traceback(exc_info):
+    """
+    Return a string containing a traceback message for the given
+    exc_info tuple (as returned by sys.exc_info()).
+    """
+    # Get a traceback message.
+    excout = StringIO()
+    exc_type, exc_val, exc_tb = exc_info
+    traceback.print_exception(exc_type, exc_val, exc_tb, file=excout)
+    return excout.getvalue()
+
+# Override some StringIO methods.
+class _SpoofOut(StringIO):
+    def getvalue(self):
+        result = StringIO.getvalue(self)
+        # If anything at all was written, make sure there's a trailing
+        # newline.  There's no way for the expected output to indicate
+        # that a trailing newline is missing.
+        if result and not result.endswith("\n"):
+            result += "\n"
+        # Prevent softspace from screwing up the next test case, in
+        # case they used print with a trailing comma in an example.
+        if hasattr(self, "softspace"):
+            del self.softspace
+        return result
+
+    def truncate(self, size=None):
+        StringIO.truncate(self, size)
+        if hasattr(self, "softspace"):
+            del self.softspace
+
+# Worst-case linear-time ellipsis matching.
+def _ellipsis_match(want, got):
+    """
+    Essentially the only subtle case:
+    >>> _ellipsis_match('aa...aa', 'aaa')
+    False
+    """
+    if ELLIPSIS_MARKER not in want:
+        return want == got
+
+    # Find "the real" strings.
+    ws = want.split(ELLIPSIS_MARKER)
+    assert len(ws) >= 2
+
+    # Deal with exact matches possibly needed at one or both ends.
+    startpos, endpos = 0, len(got)
+    w = ws[0]
+    if w:   # starts with exact match
+        if got.startswith(w):
+            startpos = len(w)
+            del ws[0]
+        else:
+            return False
+    w = ws[-1]
+    if w:   # ends with exact match
+        if got.endswith(w):
+            endpos -= len(w)
+            del ws[-1]
+        else:
+            return False
+
+    if startpos > endpos:
+        # Exact end matches required more characters than we have, as in
+        # _ellipsis_match('aa...aa', 'aaa')
+        return False
+
+    # For the rest, we only need to find the leftmost non-overlapping
+    # match for each piece.  If there's no overall match that way alone,
+    # there's no overall match period.
+    for w in ws:
+        # w may be '' at times, if there are consecutive ellipses, or
+        # due to an ellipsis at the start or end of `want`.  That's OK.
+        # Search for an empty string succeeds, and doesn't change startpos.
+        startpos = got.find(w, startpos, endpos)
+        if startpos < 0:
+            return False
+        startpos += len(w)
+
+    return True
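+
+# A few more illustrative cases, following directly from the algorithm above
+# (exact matches are pinned at the ends, middle pieces found left to right):
+#
+#     _ellipsis_match('a...c', 'abc')    -> True
+#     _ellipsis_match('a...c', 'axx')    -> False  (no exact match at the end)
+#     _ellipsis_match('...b...', 'abc')  -> True   (middle piece 'b' found)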
+
+def _comment_line(line):
+    "Return a commented form of the given line"
+    line = line.rstrip()
+    if line:
+        return '# '+line
+    else:
+        return '#'
+
+class _OutputRedirectingPdb(pdb.Pdb):
+    """
+    A specialized version of the python debugger that redirects stdout
+    to a given stream when interacting with the user.  Stdout is *not*
+    redirected when traced code is executed.
+    """
+    def __init__(self, out):
+        self.__out = out
+        self.__debugger_used = False
+        pdb.Pdb.__init__(self)
+
+    def set_trace(self):
+        self.__debugger_used = True
+        pdb.Pdb.set_trace(self)
+
+    def set_continue(self):
+        # Calling set_continue unconditionally would break unit test coverage
+        # reporting, as Bdb.set_continue calls sys.settrace(None).
+        if self.__debugger_used:
+            pdb.Pdb.set_continue(self)
+
+    def trace_dispatch(self, *args):
+        # Redirect stdout to the given stream.
+        save_stdout = sys.stdout
+        sys.stdout = self.__out
+        # Call Pdb's trace dispatch method.
+        try:
+            return pdb.Pdb.trace_dispatch(self, *args)
+        finally:
+            sys.stdout = save_stdout
+
+# [XX] Normalize with respect to os.path.pardir?
+def _module_relative_path(module, path):
+    if not inspect.ismodule(module):
+        raise TypeError('Expected a module: %r' % module)
+    if path.startswith('/'):
+        raise ValueError('Module-relative files may not have absolute paths')
+
+    # Find the base directory for the path.
+    if hasattr(module, '__file__'):
+        # A normal module/package
+        basedir = os.path.split(module.__file__)[0]
+    elif module.__name__ == '__main__':
+        # An interactive session.
+        if len(sys.argv)>0 and sys.argv[0] != '':
+            basedir = os.path.split(sys.argv[0])[0]
+        else:
+            basedir = os.curdir
+    else:
+        # A module w/o __file__ (this includes builtins)
+        raise ValueError("Can't resolve paths relative to the module "
+                         "%r (it has no __file__)" % (module,))
+
+    # Combine the base directory and the path.
+    return os.path.join(basedir, *(path.split('/')))
+
+######################################################################
+## 2. Example & DocTest
+######################################################################
+## - An "example" is a <source, want> pair, where "source" is a
+##   fragment of source code, and "want" is the expected output for
+##   "source."  The Example class also includes information about
+##   where the example was extracted from.
+##
+## - A "doctest" is a collection of examples, typically extracted from
+##   a string (such as an object's docstring).  The DocTest class also
+##   includes information about where the string was extracted from.
+
+class Example:
+    """
+    A single doctest example, consisting of source code and expected
+    output.  `Example` defines the following attributes:
+
+      - source: A single Python statement, always ending with a newline.
+        The constructor adds a newline if needed.
+
+      - want: The expected output from running the source code (either
+        from stdout, or a traceback in case of exception).  `want` ends
+        with a newline unless it's empty, in which case it's an empty
+        string.  The constructor adds a newline if needed.
+
+      - exc_msg: The exception message generated by the example, if
+        the example is expected to generate an exception; or `None` if
+        it is not expected to generate an exception.  This exception
+        message is compared against the return value of
+        `traceback.format_exception_only()`.  `exc_msg` ends with a
+        newline unless it's `None`.  The constructor adds a newline
+        if needed.
+
+      - lineno: The line number within the DocTest string containing
+        this Example where the Example begins.  This line number is
+        zero-based, with respect to the beginning of the DocTest.
+
+      - indent: The example's indentation in the DocTest string.
+        I.e., the number of space characters that precede the
+        example's first prompt.
+
+      - options: A dictionary mapping from option flags to True or
+        False, which is used to override default options for this
+        example.  Any option flags not contained in this dictionary
+        are left at their default value (as specified by the
+        DocTestRunner's optionflags).  By default, no options are set.
+    """
+    def __init__(self, source, want, exc_msg=None, lineno=0, indent=0,
+                 options=None):
+        # Normalize inputs.
+        if not source.endswith('\n'):
+            source += '\n'
+        if want and not want.endswith('\n'):
+            want += '\n'
+        if exc_msg is not None and not exc_msg.endswith('\n'):
+            exc_msg += '\n'
+        # Store properties.
+        self.source = source
+        self.want = want
+        self.lineno = lineno
+        self.indent = indent
+        if options is None: options = {}
+        self.options = options
+        self.exc_msg = exc_msg
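+
+    # For illustration: the constructor normalizes trailing newlines, so
+    #
+    #     ex = Example(source="1 + 1", want="2")
+    #
+    # gives ex.source == "1 + 1\n", ex.want == "2\n" and ex.exc_msg is None.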
+
+class DocTest:
+    """
+    A collection of doctest examples that should be run in a single
+    namespace.  Each `DocTest` defines the following attributes:
+
+      - examples: the list of examples.
+
+      - globs: The namespace (aka globals) that the examples should
+        be run in.
+
+      - name: A name identifying the DocTest (typically, the name of
+        the object whose docstring this DocTest was extracted from).
+
+      - filename: The name of the file that this DocTest was extracted
+        from, or `None` if the filename is unknown.
+
+      - lineno: The line number within filename where this DocTest
+        begins, or `None` if the line number is unavailable.  This
+        line number is zero-based, with respect to the beginning of
+        the file.
+
+      - docstring: The string that the examples were extracted from,
+        or `None` if the string is unavailable.
+    """
+    def __init__(self, examples, globs, name, filename, lineno, docstring):
+        """
+        Create a new DocTest containing the given examples.  The
+        DocTest's globals are initialized with a copy of `globs`.
+        """
+        assert not isinstance(examples, basestring), \
+               "DocTest no longer accepts str; use DocTestParser instead"
+        self.examples = examples
+        self.docstring = docstring
+        self.globs = globs.copy()
+        self.name = name
+        self.filename = filename
+        self.lineno = lineno
+
+    def __repr__(self):
+        if len(self.examples) == 0:
+            examples = 'no examples'
+        elif len(self.examples) == 1:
+            examples = '1 example'
+        else:
+            examples = '%d examples' % len(self.examples)
+        return ('<DocTest %s from %s:%s (%s)>' %
+                (self.name, self.filename, self.lineno, examples))
+
+
+    # This lets us sort tests by name:
+    def __cmp__(self, other):
+        if not isinstance(other, DocTest):
+            return -1
+        return cmp((self.name, self.filename, self.lineno, id(self)),
+                   (other.name, other.filename, other.lineno, id(other)))
+
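+# DocTest objects are normally built by DocTestParser.get_doctest() or
+# DocTestFinder.find() rather than constructed directly; for instance
+# (`docstring` here is a placeholder for any string containing examples):
+#
+#     test = DocTestParser().get_doctest(docstring, globs={}, name="example",
+#                                        filename=None, lineno=0)
+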
+######################################################################
+## 3. DocTestParser
+######################################################################
+
+class DocTestParser:
+    """
+    A class used to parse strings containing doctest examples.
+    """
+    # This regular expression is used to find doctest examples in a
+    # string.  It defines three groups: `source` is the source code
+    # (including leading indentation and prompts); `indent` is the
+    # indentation of the first (PS1) line of the source code; and
+    # `want` is the expected output (including leading indentation).
+    _EXAMPLE_RE = re.compile(r'''
+        # Source consists of a PS1 line followed by zero or more PS2 lines.
+        (?P<source>
+            (?:^(?P<indent> [ ]*) >>>    .*)    # PS1 line
+            (?:\n           [ ]*  \.\.\. .*)*)  # PS2 lines
+        \n?
+        # Want consists of any non-blank lines that do not start with PS1.
+        (?P<want> (?:(?![ ]*$)    # Not a blank line
+                     (?![ ]*>>>)  # Not a line starting with PS1
+                     .*$\n?       # But any other line
+                  )*)
+        ''', re.MULTILINE | re.VERBOSE)
+
+    # A regular expression for handling `want` strings that contain
+    # expected exceptions.  It divides `want` into three pieces:
+    #    - the traceback header line (`hdr`)
+    #    - the traceback stack (`stack`)
+    #    - the exception message (`msg`), as generated by
+    #      traceback.format_exception_only()
+    # `msg` may have multiple lines.  We assume/require that the
+    # exception message is the first non-indented line starting with a word
+    # character following the traceback header line.
+    _EXCEPTION_RE = re.compile(r"""
+        # Grab the traceback header.  Different versions of Python have
+        # said different things on the first traceback line.
+        ^(?P<hdr> Traceback\ \(
+            (?: most\ recent\ call\ last
+            |   innermost\ last
+            ) \) :
+        )
+        \s* $                # toss trailing whitespace on the header.
+        (?P<stack> .*?)      # don't blink: absorb stuff until...
+        ^ (?P<msg> \w+ .*)   #     a line *starts* with alphanum.
+        """, re.VERBOSE | re.MULTILINE | re.DOTALL)
+
+    # A callable returning a true value iff its argument is a blank line
+    # or contains a single comment.
+    _IS_BLANK_OR_COMMENT = re.compile(r'^[ ]*(#.*)?$').match
+
+    def parse(self, string, name='<string>'):
+        """
+        Divide the given string into examples and intervening text,
+        and return them as a list of alternating Examples and strings.
+        Line numbers for the Examples are 0-based.  The optional
+        argument `name` is a name identifying this string, and is only
+        used for error messages.
+        """
+        string = string.expandtabs()
+        # If all lines begin with the same indentation, then strip it.
+        min_indent = self._min_indent(string)
+        if min_indent > 0:
+            string = '\n'.join([l[min_indent:] for l in string.split('\n')])
+
+        output = []
+        charno, lineno = 0, 0
+        # Find all doctest examples in the string:
+        for m in self._EXAMPLE_RE.finditer(string):
+            # Add the pre-example text to `output`.
+            output.append(string[charno:m.start()])
+            # Update lineno (lines before this example)
+            lineno += string.count('\n', charno, m.start())
+            # Extract info from the regexp match.
+            (source, options, want, exc_msg) = \
+                     self._parse_example(m, name, lineno)
+            # Create an Example, and add it to the list.
+            if not self._IS_BLANK_OR_COMMENT(source):
+                output.append( Example(source, want, exc_msg,
+                                    lineno=lineno,
+                                    indent=min_indent+len(m.group('indent')),
+                                    options=options) )
+            # Update lineno (lines inside this example)
+            lineno += string.count('\n', m.start(), m.end())
+            # Update charno.
+            charno = m.end()
+        # Add any remaining post-example text to `output`.
+        output.append(string[charno:])
+        return output
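+
+    # A sketch of the alternation parse() returns: for the two-line fragment
+    # ">>> 2 + 2\n4\n" it gives
+    #
+    #     ['', <Example source="2 + 2\n" want="4\n">, '']
+    #
+    # i.e. empty strings for the (absent) text before and after the example.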
+
+    def get_doctest(self, string, globs, name, filename, lineno):
+        """
+        Extract all doctest examples from the given string, and
+        collect them into a `DocTest` object.
+
+        `globs`, `name`, `filename`, and `lineno` are attributes for
+        the new `DocTest` object.  See the documentation for `DocTest`
+        for more information.
+        """
+        return DocTest(self.get_examples(string, name), globs,
+                       name, filename, lineno, string)
+
+    def get_examples(self, string, name='<string>'):
+        """
+        Extract all doctest examples from the given string, and return
+        them as a list of `Example` objects.  Line numbers are
+        0-based, because it's most common in doctests that nothing
+        interesting appears on the same line as opening triple-quote,
+        and so the first interesting line is called \"line 1\" then.
+
+        The optional argument `name` is a name identifying this
+        string, and is only used for error messages.
+        """
+        return [x for x in self.parse(string, name)
+                if isinstance(x, Example)]
+
+    def _parse_example(self, m, name, lineno):
+        """
+        Given a regular expression match from `_EXAMPLE_RE` (`m`),
+        return a tuple `(source, options, want, exc_msg)`, where `source`
+        is the matched example's source code (with prompts and indentation
+        stripped); `options` is a dictionary of option-directive overrides;
+        `want` is the example's expected output (with indentation stripped);
+        and `exc_msg` is the expected exception message, or None if no
+        exception is expected.
+
+        `name` is the string's name, and `lineno` is the line number
+        where the example starts; both are used for error messages.
+        """
+        # Get the example's indentation level.
+        indent = len(m.group('indent'))
+
+        # Divide source into lines; check that they're properly
+        # indented; and then strip their indentation & prompts.
+        source_lines = m.group('source').split('\n')
+        self._check_prompt_blank(source_lines, indent, name, lineno)
+        self._check_prefix(source_lines[1:], ' '*indent + '.', name, lineno)
+        source = '\n'.join([sl[indent+4:] for sl in source_lines])
+
+        # Divide want into lines; check that it's properly indented; and
+        # then strip the indentation.  Spaces before the last newline should
+        # be preserved, so plain rstrip() isn't good enough.
+        want = m.group('want')
+        want_lines = want.split('\n')
+        if len(want_lines) > 1 and re.match(r' *$', want_lines[-1]):
+            del want_lines[-1]  # forget final newline & spaces after it
+        self._check_prefix(want_lines, ' '*indent, name,
+                           lineno + len(source_lines))
+        want = '\n'.join([wl[indent:] for wl in want_lines])
+
+        # If `want` contains a traceback message, then extract it.
+        m = self._EXCEPTION_RE.match(want)
+        if m:
+            exc_msg = m.group('msg')
+        else:
+            exc_msg = None
+
+        # Extract options from the source.
+        options = self._find_options(source, name, lineno)
+
+        return source, options, want, exc_msg
+
+    # This regular expression looks for option directives in the
+    # source code of an example.  Option directives are comments
+    # starting with "doctest:".  Warning: this may give false
+    # positives for string-literals that contain the string
+    # "#doctest:".  Eliminating these false positives would require
+    # actually parsing the string; but we limit them by ignoring any
+    # line containing "#doctest:" that is *followed* by a quote mark.
+    _OPTION_DIRECTIVE_RE = re.compile(r'#\s*doctest:\s*([^\n\'"]*)$',
+                                      re.MULTILINE)
+
+    def _find_options(self, source, name, lineno):
+        """
+        Return a dictionary containing option overrides extracted from
+        option directives in the given source string.
+
+        `name` is the string's name, and `lineno` is the line number
+        where the example starts; both are used for error messages.
+        """
+        options = {}
+        # (note: with the current regexp, this will match at most once:)
+        for m in self._OPTION_DIRECTIVE_RE.finditer(source):
+            option_strings = m.group(1).replace(',', ' ').split()
+            for option in option_strings:
+                if (option[0] not in '+-' or
+                    option[1:] not in OPTIONFLAGS_BY_NAME):
+                    raise ValueError('line %r of the doctest for %s '
+                                     'has an invalid option: %r' %
+                                     (lineno+1, name, option))
+                flag = OPTIONFLAGS_BY_NAME[option[1:]]
+                options[flag] = (option[0] == '+')
+        if options and self._IS_BLANK_OR_COMMENT(source):
+            raise ValueError('line %r of the doctest for %s has an option '
+                             'directive on a line with no example: %r' %
+                             (lineno, name, source))
+        return options
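+
+    # For example, a source line such as
+    #
+    #     print range(20)   # doctest: +ELLIPSIS
+    #
+    # produces {ELLIPSIS: True}; a directive whose flag lacks a leading
+    # '+'/'-' or names an unregistered option raises ValueError.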
+
+    # This regular expression finds the indentation of every non-blank
+    # line in a string.
+    _INDENT_RE = re.compile('^([ ]*)(?=\S)', re.MULTILINE)
+
+    def _min_indent(self, s):
+        "Return the minimum indentation of any non-blank line in `s`"
+        indents = [len(indent) for indent in self._INDENT_RE.findall(s)]
+        if len(indents) > 0:
+            return min(indents)
+        else:
+            return 0
+
+    def _check_prompt_blank(self, lines, indent, name, lineno):
+        """
+        Given the lines of a source string (including prompts and
+        leading indentation), check to make sure that every prompt is
+        followed by a space character.  If any line is not followed by
+        a space character, then raise ValueError.
+        """
+        for i, line in enumerate(lines):
+            if len(line) >= indent+4 and line[indent+3] != ' ':
+                raise ValueError('line %r of the docstring for %s '
+                                 'lacks blank after %s: %r' %
+                                 (lineno+i+1, name,
+                                  line[indent:indent+3], line))
+
+    def _check_prefix(self, lines, prefix, name, lineno):
+        """
+        Check that every line in the given list starts with the given
+        prefix; if any line does not, then raise a ValueError.
+        """
+        for i, line in enumerate(lines):
+            if line and not line.startswith(prefix):
+                raise ValueError('line %r of the docstring for %s has '
+                                 'inconsistent leading whitespace: %r' %
+                                 (lineno+i+1, name, line))
+
+
+######################################################################
+## 4. DocTest Finder
+######################################################################
+
+class DocTestFinder:
+    """
+    A class used to extract the DocTests that are relevant to a given
+    object, from its docstring and the docstrings of its contained
+    objects.  Doctests can currently be extracted from the following
+    object types: modules, functions, classes, methods, staticmethods,
+    classmethods, and properties.
+    """
+
+    def __init__(self, verbose=False, parser=DocTestParser(),
+                 recurse=True, _namefilter=None, exclude_empty=True):
+        """
+        Create a new doctest finder.
+
+        The optional argument `parser` specifies a class or
+        function that should be used to create new DocTest objects (or
+        objects that implement the same interface as DocTest).  The
+        signature for this factory function should match the signature
+        of the DocTest constructor.
+
+        If the optional argument `recurse` is false, then `find` will
+        only examine the given object, and not any contained objects.
+
+        If the optional argument `exclude_empty` is false, then `find`
+        will include tests for objects with empty docstrings.
+        """
+        self._parser = parser
+        self._verbose = verbose
+        self._recurse = recurse
+        self._exclude_empty = exclude_empty
+        # _namefilter is undocumented, and exists only for temporary backward-
+        # compatibility support of testmod's deprecated isprivate mess.
+        self._namefilter = _namefilter
+
+    def find(self, obj, name=None, module=None, globs=None,
+             extraglobs=None):
+        """
+        Return a list of the DocTests that are defined by the given
+        object's docstring, or by any of its contained objects'
+        docstrings.
+
+        The optional parameter `module` is the module that contains
+        the given object.  If the module is not specified or is None, then
+        the test finder will attempt to automatically determine the
+        correct module.  The object's module is used:
+
+            - As a default namespace, if `globs` is not specified.
+            - To prevent the DocTestFinder from extracting DocTests
+              from objects that are imported from other modules.
+            - To find the name of the file containing the object.
+            - To help find the line number of the object within its
+              file.
+
+        Contained objects whose module does not match `module` are ignored.
+
+        If `module` is False, no attempt to find the module will be made.
+        This is obscure, of use mostly in tests:  if `module` is False, or
+        is None but cannot be found automatically, then all objects are
+        considered to belong to the (non-existent) module, so all contained
+        objects will (recursively) be searched for doctests.
+
+        The globals for each DocTest are formed by combining `globs`
+        and `extraglobs` (bindings in `extraglobs` override bindings
+        in `globs`).  A new copy of the globals dictionary is created
+        for each DocTest.  If `globs` is not specified, then it
+        defaults to the module's `__dict__`, if specified, or {}
+        otherwise.  If `extraglobs` is not specified, then it defaults
+        to {}.
+
+        """
+        # If name was not specified, then extract it from the object.
+        if name is None:
+            name = getattr(obj, '__name__', None)
+            if name is None:
+                raise ValueError("DocTestFinder.find: name must be given "
+                        "when obj.__name__ doesn't exist: %r" %
+                                 (type(obj),))
+
+        # Find the module that contains the given object (if obj is
+        # a module, then module=obj.).  Note: this may fail, in which
+        # case module will be None.
+        if module is False:
+            module = None
+        elif module is None:
+            module = inspect.getmodule(obj)
+
+        # Read the module's source code.  This is used by
+        # DocTestFinder._find_lineno to find the line number for a
+        # given object's docstring.
+        try:
+            file = inspect.getsourcefile(obj) or inspect.getfile(obj)
+            source_lines = linecache_copy.getlines(file)
+            if not source_lines:
+                source_lines = None
+        except TypeError:
+            source_lines = None
+
+        # Initialize globals, and merge in extraglobs.
+        if globs is None:
+            if module is None:
+                globs = {}
+            else:
+                globs = module.__dict__.copy()
+        else:
+            globs = globs.copy()
+        if extraglobs is not None:
+            globs.update(extraglobs)
+
+        # Recursively explore `obj`, extracting DocTests.
+        tests = []
+        self._find(tests, obj, name, module, source_lines, globs, {})
+        return tests
+
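+    # A minimal usage sketch (`mymodule` is a hypothetical module whose
+    # docstrings contain doctest examples):
+    #
+    #     finder = DocTestFinder()
+    #     for test in finder.find(mymodule):
+    #         print test.name, len(test.examples)
+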
+    def _filter(self, obj, prefix, base):
+        """
+        Return true if the given object should not be examined.
+        """
+        return (self._namefilter is not None and
+                self._namefilter(prefix, base))
+
+    def _from_module(self, module, object):
+        """
+        Return true if the given object is defined in the given
+        module.
+        """
+        if module is None:
+            return True
+        elif inspect.isfunction(object):
+            return module.__dict__ is object.func_globals
+        elif inspect.isclass(object):
+            return module.__name__ == object.__module__
+        elif inspect.getmodule(object) is not None:
+            return module is inspect.getmodule(object)
+        elif hasattr(object, '__module__'):
+            return module.__name__ == object.__module__
+        elif isinstance(object, property):
+            return True # [XX] no way to be sure.
+        else:
+            raise ValueError("object must be a class or function")
+
+    def _find(self, tests, obj, name, module, source_lines, globs, seen):
+        """
+        Find tests for the given object and any contained objects, and
+        add them to `tests`.
+        """
+        if self._verbose:
+            print 'Finding tests in %s' % name
+
+        # If we've already processed this object, then ignore it.
+        if id(obj) in seen:
+            return
+        seen[id(obj)] = 1
+
+        # Find a test for this object, and add it to the list of tests.
+        test = self._get_test(obj, name, module, globs, source_lines)
+        if test is not None:
+            tests.append(test)
+
+        # Look for tests in a module's contained objects.
+        if inspect.ismodule(obj) and self._recurse:
+            for valname, val in obj.__dict__.items():
+                # Check if this contained object should be ignored.
+                if self._filter(val, name, valname):
+                    continue
+                valname = '%s.%s' % (name, valname)
+                # Recurse to functions & classes.
+                if ((inspect.isfunction(val) or inspect.isclass(val)) and
+                    self._from_module(module, val)):
+                    self._find(tests, val, valname, module, source_lines,
+                               globs, seen)
+
+        # Look for tests in a module's __test__ dictionary.
+        if inspect.ismodule(obj) and self._recurse:
+            for valname, val in getattr(obj, '__test__', {}).items():
+                if not isinstance(valname, basestring):
+                    raise ValueError("DocTestFinder.find: __test__ keys "
+                                     "must be strings: %r" %
+                                     (type(valname),))
+                if not (inspect.isfunction(val) or inspect.isclass(val) or
+                        inspect.ismethod(val) or inspect.ismodule(val) or
+                        isinstance(val, basestring)):
+                    raise ValueError("DocTestFinder.find: __test__ values "
+                                     "must be strings, functions, methods, "
+                                     "classes, or modules: %r" %
+                                     (type(val),))
+                valname = '%s.__test__.%s' % (name, valname)
+                self._find(tests, val, valname, module, source_lines,
+                           globs, seen)
+
+        # Look for tests in a class's contained objects.
+        if inspect.isclass(obj) and self._recurse:
+            for valname, val in obj.__dict__.items():
+                # Check if this contained object should be ignored.
+                if self._filter(val, name, valname):
+                    continue
+                # Special handling for staticmethod/classmethod.
+                if isinstance(val, staticmethod):
+                    val = getattr(obj, valname)
+                if isinstance(val, classmethod):
+                    val = getattr(obj, valname).im_func
+
+                # Recurse to methods, properties, and nested classes.
+                if ((inspect.isfunction(val) or inspect.isclass(val) or
+                      isinstance(val, property)) and
+                      self._from_module(module, val)):
+                    valname = '%s.%s' % (name, valname)
+                    self._find(tests, val, valname, module, source_lines,
+                               globs, seen)
+
+    def _get_test(self, obj, name, module, globs, source_lines):
+        """
+        Return a DocTest for the given object, if it defines a docstring;
+        otherwise, return None.
+        """
+        # Extract the object's docstring.  If it doesn't have one,
+        # then return None (no test for this object).
+        if isinstance(obj, basestring):
+            docstring = obj
+        else:
+            try:
+                if obj.__doc__ is None:
+                    docstring = ''
+                else:
+                    docstring = obj.__doc__
+                    if not isinstance(docstring, basestring):
+                        docstring = str(docstring)
+            except (TypeError, AttributeError):
+                docstring = ''
+
+        # Find the docstring's location in the file.
+        lineno = self._find_lineno(obj, source_lines)
+
+        # Don't bother if the docstring is empty.
+        if self._exclude_empty and not docstring:
+            return None
+
+        # Return a DocTest for this object.
+        if module is None:
+            filename = None
+        else:
+            filename = getattr(module, '__file__', module.__name__)
+            if filename[-4:] in (".pyc", ".pyo"):
+                filename = filename[:-1]
+        return self._parser.get_doctest(docstring, globs, name,
+                                        filename, lineno)
+
+    def _find_lineno(self, obj, source_lines):
+        """
+        Return a line number of the given object's docstring.  Note:
+        this method assumes that the object has a docstring.
+        """
+        lineno = None
+
+        # Find the line number for modules.
+        if inspect.ismodule(obj):
+            lineno = 0
+
+        # Find the line number for classes.
+        # Note: this could be fooled if a class is defined multiple
+        # times in a single file.
+        if inspect.isclass(obj):
+            if source_lines is None:
+                return None
+            pat = re.compile(r'^\s*class\s*%s\b' %
+                             getattr(obj, '__name__', '-'))
+            for i, line in enumerate(source_lines):
+                if pat.match(line):
+                    lineno = i
+                    break
+
+        # Find the line number for functions & methods.
+        if inspect.ismethod(obj): obj = obj.im_func
+        if inspect.isfunction(obj): obj = obj.func_code
+        if inspect.istraceback(obj): obj = obj.tb_frame
+        if inspect.isframe(obj): obj = obj.f_code
+        if inspect.iscode(obj):
+            lineno = getattr(obj, 'co_firstlineno', None)-1
+
+        # Find the line number where the docstring starts.  Assume
+        # that it's the first line that begins with a quote mark.
+        # Note: this could be fooled by a multiline function
+        # signature, where a continuation line begins with a quote
+        # mark.
+        if lineno is not None:
+            if source_lines is None:
+                return lineno+1
+            pat = re.compile('(^|.*:)\s*\w*("|\')')
+            for lineno in range(lineno, len(source_lines)):
+                if pat.match(source_lines[lineno]):
+                    return lineno
+
+        # We couldn't find the line number.
+        return None
+
+######################################################################
+## 5. DocTest Runner
+######################################################################
+
+class DocTestRunner:
+    """
+    A class used to run DocTest test cases, and accumulate statistics.
+    The `run` method is used to process a single DocTest case.  It
+    returns a tuple `(f, t)`, where `t` is the number of test cases
+    tried, and `f` is the number of test cases that failed.
+
+        >>> tests = DocTestFinder().find(_TestClass)
+        >>> runner = DocTestRunner(verbose=False)
+        >>> for test in tests:
+        ...     print runner.run(test)
+        (0, 2)
+        (0, 1)
+        (0, 2)
+        (0, 2)
+
+    The `summarize` method prints a summary of all the test cases that
+    have been run by the runner, and returns an aggregated `(f, t)`
+    tuple:
+
+        >>> runner.summarize(verbose=1)
+        4 items passed all tests:
+           2 tests in _TestClass
+           2 tests in _TestClass.__init__
+           2 tests in _TestClass.get
+           1 tests in _TestClass.square
+        7 tests in 4 items.
+        7 passed and 0 failed.
+        Test passed.
+        (0, 7)
+
+    The aggregated number of tried examples and failed examples is
+    also available via the `tries` and `failures` attributes:
+
+        >>> runner.tries
+        7
+        >>> runner.failures
+        0
+
+    The comparison between expected outputs and actual outputs is done
+    by an `OutputChecker`.  This comparison may be customized with a
+    number of option flags; see the documentation for `testmod` for
+    more information.  If the option flags are insufficient, then the
+    comparison may also be customized by passing a subclass of
+    `OutputChecker` to the constructor.
+
+    The test runner's display output can be controlled in two ways.
+    First, an output function (`out`) can be passed to
+    `DocTestRunner.run`; this function will be called with strings that
+    should be displayed.  It defaults to `sys.stdout.write`.  If
+    capturing the output is not sufficient, then the display output
+    can be also customized by subclassing DocTestRunner, and
+    overriding the methods `report_start`, `report_success`,
+    `report_unexpected_exception`, and `report_failure`.
+    """
+    # This divider string is used to separate failure messages, and to
+    # separate sections of the summary.
+    DIVIDER = "*" * 70
+
+    def __init__(self, checker=None, verbose=None, optionflags=0):
+        """
+        Create a new test runner.
+
+        Optional keyword arg `checker` is the `OutputChecker` that
+        should be used to compare the expected outputs and actual
+        outputs of doctest examples.
+
+        Optional keyword arg 'verbose' prints lots of stuff if true,
+        only failures if false; by default, it's true iff '-v' is in
+        sys.argv.
+
+        Optional argument `optionflags` can be used to control how the
+        test runner compares expected output to actual output, and how
+        it displays failures.  See the documentation for `testmod` for
+        more information.
+        """
+        self._checker = checker or OutputChecker()
+        if verbose is None:
+            verbose = '-v' in sys.argv
+        self._verbose = verbose
+        self.optionflags = optionflags
+        self.original_optionflags = optionflags
+
+        # Keep track of the examples we've run.
+        self.tries = 0
+        self.failures = 0
+        self._name2ft = {}
+
+        # Create a fake output target for capturing doctest output.
+        self._fakeout = _SpoofOut()
+
+    #/////////////////////////////////////////////////////////////////
+    # Reporting methods
+    #/////////////////////////////////////////////////////////////////
+
+    def report_start(self, out, test, example):
+        """
+        Report that the test runner is about to process the given
+        example.  (Only displays a message if verbose=True)
+        """
+        if self._verbose:
+            if example.want:
+                out('Trying:\n' + _indent(example.source) +
+                    'Expecting:\n' + _indent(example.want))
+            else:
+                out('Trying:\n' + _indent(example.source) +
+                    'Expecting nothing\n')
+
+    def report_success(self, out, test, example, got):
+        """
+        Report that the given example ran successfully.  (Only
+        displays a message if verbose=True)
+        """
+        if self._verbose:
+            out("ok\n")
+
+    def report_failure(self, out, test, example, got):
+        """
+        Report that the given example failed.
+        """
+        out(self._failure_header(test, example) +
+            self._checker.output_difference(example, got, self.optionflags))
+
+    def report_unexpected_exception(self, out, test, example, exc_info):
+        """
+        Report that the given example raised an unexpected exception.
+        """
+        out(self._failure_header(test, example) +
+            'Exception raised:\n' + _indent(_exception_traceback(exc_info)))
+
+    def _failure_header(self, test, example):
+        out = [self.DIVIDER]
+        if test.filename:
+            if test.lineno is not None and example.lineno is not None:
+                lineno = test.lineno + example.lineno + 1
+            else:
+                lineno = '?'
+            out.append('File "%s", line %s, in %s' %
+                       (test.filename, lineno, test.name))
+        else:
+            out.append('Line %s, in %s' % (example.lineno+1, test.name))
+        out.append('Failed example:')
+        source = example.source
+        out.append(_indent(source))
+        return '\n'.join(out)
+
+    #/////////////////////////////////////////////////////////////////
+    # DocTest Running
+    #/////////////////////////////////////////////////////////////////
+
+    def __run(self, test, compileflags, out):
+        """
+        Run the examples in `test`.  Write the outcome of each example
+        with one of the `DocTestRunner.report_*` methods, using the
+        writer function `out`.  `compileflags` is the set of compiler
+        flags that should be used to execute examples.  Return a tuple
+        `(f, t)`, where `t` is the number of examples tried, and `f`
+        is the number of examples that failed.  The examples are run
+        in the namespace `test.globs`.
+        """
+        # Keep track of the number of failures and tries.
+        failures = tries = 0
+
+        # Save the option flags (since option directives can be used
+        # to modify them).
+        original_optionflags = self.optionflags
+
+        SUCCESS, FAILURE, BOOM = range(3) # `outcome` state
+
+        check = self._checker.check_output
+
+        # Process each example.
+        for examplenum, example in enumerate(test.examples):
+
+            # If REPORT_ONLY_FIRST_FAILURE is set, then suppress
+            # reporting after the first failure.
+            quiet = (self.optionflags & REPORT_ONLY_FIRST_FAILURE and
+                     failures > 0)
+
+            # Merge in the example's options.
+            self.optionflags = original_optionflags
+            if example.options:
+                for (optionflag, val) in example.options.items():
+                    if val:
+                        self.optionflags |= optionflag
+                    else:
+                        self.optionflags &= ~optionflag
+
+            # If 'SKIP' is set, then skip this example.
+            if self.optionflags & SKIP:
+                continue
+
+            # Record that we started this example.
+            tries += 1
+            if not quiet:
+                self.report_start(out, test, example)
+
+            # Use a special filename for compile(), so we can retrieve
+            # the source code during interactive debugging (see
+            # __patched_linecache_getlines).
+            filename = '<doctest %s[%d]>' % (test.name, examplenum)
+
+            # Run the example in the given context (globs), and record
+            # any exception that gets raised.  (But don't intercept
+            # keyboard interrupts.)
+            try:
+                # Don't blink!  This is where the user's code gets run.
+                exec compile(example.source, filename, "single",
+                             compileflags, 1) in test.globs
+                self.debugger.set_continue() # ==== Example Finished ====
+                exception = None
+            except KeyboardInterrupt:
+                raise
+            except:
+                exception = sys.exc_info()
+                self.debugger.set_continue() # ==== Example Finished ====
+
+            got = self._fakeout.getvalue()  # the actual output
+            self._fakeout.truncate(0)
+            outcome = FAILURE   # guilty until proved innocent or insane
+
+            # If the example executed without raising any exceptions,
+            # verify its output.
+            if exception is None:
+                if check(example.want, got, self.optionflags):
+                    outcome = SUCCESS
+
+            # The example raised an exception:  check if it was expected.
+            else:
+                exc_info = sys.exc_info()
+                exc_msg = traceback.format_exception_only(*exc_info[:2])[-1]
+                if not quiet:
+                    got += _exception_traceback(exc_info)
+
+                # If `example.exc_msg` is None, then we weren't expecting
+                # an exception.
+                if example.exc_msg is None:
+                    outcome = BOOM
+
+                # We expected an exception:  see whether it matches.
+                elif check(example.exc_msg, exc_msg, self.optionflags):
+                    outcome = SUCCESS
+
+                # Another chance if they didn't care about the detail.
+                elif self.optionflags & IGNORE_EXCEPTION_DETAIL:
+                    m1 = re.match(r'[^:]*:', example.exc_msg)
+                    m2 = re.match(r'[^:]*:', exc_msg)
+                    if m1 and m2 and check(m1.group(0), m2.group(0),
+                                           self.optionflags):
+                        outcome = SUCCESS
+
+            # Report the outcome.
+            if outcome is SUCCESS:
+                if not quiet:
+                    self.report_success(out, test, example, got)
+            elif outcome is FAILURE:
+                if not quiet:
+                    self.report_failure(out, test, example, got)
+                failures += 1
+            elif outcome is BOOM:
+                if not quiet:
+                    self.report_unexpected_exception(out, test, example,
+                                                     exc_info)
+                failures += 1
+            else:
+                assert False, ("unknown outcome", outcome)
+
+        # Restore the option flags (in case they were modified)
+        self.optionflags = original_optionflags
+
+        # Record and return the number of failures and tries.
+        self.__record_outcome(test, failures, tries)
+        return failures, tries
+
+    def __record_outcome(self, test, f, t):
+        """
+        Record the fact that the given DocTest (`test`) generated `f`
+        failures out of `t` tried examples.
+        """
+        f2, t2 = self._name2ft.get(test.name, (0,0))
+        self._name2ft[test.name] = (f+f2, t+t2)
+        self.failures += f
+        self.tries += t
+
+    __LINECACHE_FILENAME_RE = re.compile(r'<doctest '
+                                         r'(?P<name>[\w\.]+)'
+                                         r'\[(?P<examplenum>\d+)\]>$')
+    def __patched_linecache_getlines(self, filename, module_globals=None):
+        m = self.__LINECACHE_FILENAME_RE.match(filename)
+        if m and m.group('name') == self.test.name:
+            example = self.test.examples[int(m.group('examplenum'))]
+            return example.source.splitlines(True)
+        else:
+            return self.save_linecache_getlines(filename, module_globals)
+
+    def run(self, test, compileflags=None, out=None, clear_globs=True):
+        """
+        Run the examples in `test`, and display the results using the
+        writer function `out`.
+
+        The examples are run in the namespace `test.globs`.  If
+        `clear_globs` is true (the default), then this namespace will
+        be cleared after the test runs, to help with garbage
+        collection.  If you would like to examine the namespace after
+        the test completes, then use `clear_globs=False`.
+
+        `compileflags` gives the set of flags that should be used by
+        the Python compiler when running the examples.  If not
+        specified, then it will default to the set of future-import
+        flags that apply to `globs`.
+
+        The output of each example is checked using
+        `OutputChecker.check_output`, and the results are formatted by
+        the `DocTestRunner.report_*` methods.
+        """
+        self.test = test
+
+        if compileflags is None:
+            compileflags = _extract_future_flags(test.globs)
+
+        save_stdout = sys.stdout
+        if out is None:
+            out = save_stdout.write
+        sys.stdout = self._fakeout
+
+        # Patch pdb.set_trace to restore sys.stdout during interactive
+        # debugging (so it's not still redirected to self._fakeout).
+        # Note that the interactive output will go to *our*
+        # save_stdout, even if that's not the real sys.stdout; this
+        # allows us to write test cases for the set_trace behavior.
+        save_set_trace = pdb.set_trace
+        self.debugger = _OutputRedirectingPdb(save_stdout)
+        self.debugger.reset()
+        pdb.set_trace = self.debugger.set_trace
+
+        # Patch linecache_copy.getlines, so we can see the example's source
+        # when we're inside the debugger.
+        self.save_linecache_getlines = linecache_copy.getlines
+        linecache_copy.getlines = self.__patched_linecache_getlines
+
+        try:
+            return self.__run(test, compileflags, out)
+        finally:
+            sys.stdout = save_stdout
+            pdb.set_trace = save_set_trace
+            linecache_copy.getlines = self.save_linecache_getlines
+            if clear_globs:
+                test.globs.clear()
+
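+    # A minimal sketch of capturing a run's report output instead of writing
+    # it to stdout (`test` is assumed to be a DocTest built elsewhere):
+    #
+    #     from StringIO import StringIO
+    #     buf = StringIO()
+    #     runner = DocTestRunner(verbose=False)
+    #     failures, tries = runner.run(test, out=buf.write)
+    #     report = buf.getvalue()
+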
+    #/////////////////////////////////////////////////////////////////
+    # Summarization
+    #/////////////////////////////////////////////////////////////////
+    def summarize(self, verbose=None):
+        """
+        Print a summary of all the test cases that have been run by
+        this DocTestRunner, and return a tuple `(f, t)`, where `f` is
+        the total number of failed examples, and `t` is the total
+        number of tried examples.
+
+        The optional `verbose` argument controls how detailed the
+        summary is.  If the verbosity is not specified, then the
+        DocTestRunner's verbosity is used.
+        """
+        if verbose is None:
+            verbose = self._verbose
+        notests = []
+        passed = []
+        failed = []
+        totalt = totalf = 0
+        for x in self._name2ft.items():
+            name, (f, t) = x
+            assert f <= t
+            totalt += t
+            totalf += f
+            if t == 0:
+                notests.append(name)
+            elif f == 0:
+                passed.append( (name, t) )
+            else:
+                failed.append(x)
+        if verbose:
+            if notests:
+                print len(notests), "items had no tests:"
+                notests.sort()
+                for thing in notests:
+                    print "   ", thing
+            if passed:
+                print len(passed), "items passed all tests:"
+                passed.sort()
+                for thing, count in passed:
+                    print " %3d tests in %s" % (count, thing)
+        if failed:
+            print self.DIVIDER
+            print len(failed), "items had failures:"
+            failed.sort()
+            for thing, (f, t) in failed:
+                print " %3d of %3d in %s" % (f, t, thing)
+        if verbose:
+            print totalt, "tests in", len(self._name2ft), "items."
+            print totalt - totalf, "passed and", totalf, "failed."
+        if totalf:
+            print "***Test Failed***", totalf, "failures."
+        elif verbose:
+            print "Test passed."
+        return totalf, totalt
+
+    #/////////////////////////////////////////////////////////////////
+    # Backward compatibility cruft to maintain doctest.master.
+    #/////////////////////////////////////////////////////////////////
+    def merge(self, other):
+        d = self._name2ft
+        for name, (f, t) in other._name2ft.items():
+            if name in d:
+                print "*** DocTestRunner.merge: '" + name + "' in both" \
+                    " testers; summing outcomes."
+                f2, t2 = d[name]
+                f = f + f2
+                t = t + t2
+            d[name] = f, t
+
+class OutputChecker:
+    """
+    A class used to check whether the actual output from a doctest
+    example matches the expected output.  `OutputChecker` defines two
+    methods: `check_output`, which compares a given pair of outputs,
+    and returns true if they match; and `output_difference`, which
+    returns a string describing the differences between two outputs.
+    """
+    def check_output(self, want, got, optionflags):
+        """
+        Return True iff the actual output from an example (`got`)
+        matches the expected output (`want`).  These strings are
+        always considered to match if they are identical; but
+        depending on what option flags the test runner is using,
+        several non-exact match types are also possible.  See the
+        documentation for `DocTestRunner` for more information about
+        option flags.
+        """
+        # Handle the common case first, for efficiency:
+        # if they're string-identical, always return true.
+        if got == want:
+            return True
+
+        # The values True and False replaced 1 and 0 as the return
+        # value for boolean comparisons in Python 2.3.
+        if not (optionflags & DONT_ACCEPT_TRUE_FOR_1):
+            if (got,want) == ("True\n", "1\n"):
+                return True
+            if (got,want) == ("False\n", "0\n"):
+                return True
+
+        # <BLANKLINE> can be used as a special sequence to signify a
+        # blank line, unless the DONT_ACCEPT_BLANKLINE flag is used.
+        if not (optionflags & DONT_ACCEPT_BLANKLINE):
+            # Replace <BLANKLINE> in want with a blank line.
+            want = re.sub('(?m)^%s\s*?$' % re.escape(BLANKLINE_MARKER),
+                          '', want)
+            # If a line in got contains only spaces, then remove the
+            # spaces.
+            got = re.sub('(?m)^\s*?$', '', got)
+            if got == want:
+                return True
+
+        # This flag causes doctest to ignore any differences in the
+        # contents of whitespace strings.  Note that this can be used
+        # in conjunction with the ELLIPSIS flag.
+        if optionflags & NORMALIZE_WHITESPACE:
+            got = ' '.join(got.split())
+            want = ' '.join(want.split())
+            if got == want:
+                return True
+
+        # The ELLIPSIS flag says to let the sequence "..." in `want`
+        # match any substring in `got`.
+        if optionflags & ELLIPSIS:
+            if _ellipsis_match(want, got):
+                return True
+
+        # We didn't find any match; return false.
+        return False
+
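+    # For example (a minimal sketch):
+    #
+    #     checker = OutputChecker()
+    #     checker.check_output("1 2 3\n", "1  2  3\n", 0)
+    #     # -> False: the strings differ and no relaxing flags are set
+    #     checker.check_output("1 2 3\n", "1  2  3\n", NORMALIZE_WHITESPACE)
+    #     # -> True: runs of whitespace are collapsed before comparing
+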
+    # Should we do a fancy diff?
+    def _do_a_fancy_diff(self, want, got, optionflags):
+        # Not unless they asked for a fancy diff.
+        if not optionflags & (REPORT_UDIFF |
+                              REPORT_CDIFF |
+                              REPORT_NDIFF):
+            return False
+
+        # If expected output uses ellipsis, a meaningful fancy diff is
+        # too hard ... or maybe not.  In two real-life failures Tim saw,
+        # a diff was a major help anyway, so this is commented out.
+        # [todo] _ellipsis_match() knows which pieces do and don't match,
+        # and could be the basis for a kick-ass diff in this case.
+        ##if optionflags & ELLIPSIS and ELLIPSIS_MARKER in want:
+        ##    return False
+
+        # ndiff does intraline difference marking, so can be useful even
+        # for 1-line differences.
+        if optionflags & REPORT_NDIFF:
+            return True
+
+        # The other diff types need at least a few lines to be helpful.
+        return want.count('\n') > 2 and got.count('\n') > 2
+
+    def output_difference(self, example, got, optionflags):
+        """
+        Return a string describing the differences between the
+        expected output for a given example (`example`) and the actual
+        output (`got`).  `optionflags` is the set of option flags used
+        to compare `want` and `got`.
+        """
+        want = example.want
+        # If <BLANKLINE>s are being used, then replace blank lines
+        # with <BLANKLINE> in the actual output string.
+        if not (optionflags & DONT_ACCEPT_BLANKLINE):
+            got = re.sub('(?m)^[ ]*(?=\n)', BLANKLINE_MARKER, got)
+
+        # Check if we should use diff.
+        if self._do_a_fancy_diff(want, got, optionflags):
+            # Split want & got into lines.
+            want_lines = want.splitlines(True)  # True == keep line ends
+            got_lines = got.splitlines(True)
+            # Use difflib to find their differences.
+            if optionflags & REPORT_UDIFF:
+                diff = difflib.unified_diff(want_lines, got_lines, n=2)
+                diff = list(diff)[2:] # strip the diff header
+                kind = 'unified diff with -expected +actual'
+            elif optionflags & REPORT_CDIFF:
+                diff = difflib.context_diff(want_lines, got_lines, n=2)
+                diff = list(diff)[2:] # strip the diff header
+                kind = 'context diff with expected followed by actual'
+            elif optionflags & REPORT_NDIFF:
+                engine = difflib.Differ(charjunk=difflib.IS_CHARACTER_JUNK)
+                diff = list(engine.compare(want_lines, got_lines))
+                kind = 'ndiff with -expected +actual'
+            else:
+                assert 0, 'Bad diff option'
+            # Remove trailing whitespace on diff output.
+            diff = [line.rstrip() + '\n' for line in diff]
+            return 'Differences (%s):\n' % kind + _indent(''.join(diff))
+
+        # If we're not using diff, then simply list the expected
+        # output followed by the actual output.
+        if want and got:
+            return 'Expected:\n%sGot:\n%s' % (_indent(want), _indent(got))
+        elif want:
+            return 'Expected:\n%sGot nothing\n' % _indent(want)
+        elif got:
+            return 'Expected nothing\nGot:\n%s' % _indent(got)
+        else:
+            return 'Expected nothing\nGot nothing\n'
+
+class DocTestFailure(Exception):
+    """A DocTest example has failed in debugging mode.
+
+    The exception instance has variables:
+
+    - test: the DocTest object being run
+
+    - example: the Example object that failed
+
+    - got: the actual output
+    """
+    def __init__(self, test, example, got):
+        self.test = test
+        self.example = example
+        self.got = got
+
+    def __str__(self):
+        return str(self.test)
+
+class UnexpectedException(Exception):
+    """A DocTest example has encountered an unexpected exception
+
+    The exception instance has variables:
+
+    - test: the DocTest object being run
+
+    - example: the Example object that failed
+
+    - exc_info: the exception info
+    """
+    def __init__(self, test, example, exc_info):
+        self.test = test
+        self.example = example
+        self.exc_info = exc_info
+
+    def __str__(self):
+        return str(self.test)
+
+class DebugRunner(DocTestRunner):
+    r"""Run doc tests but raise an exception as soon as there is a failure.
+
+       If an unexpected exception occurs, an UnexpectedException is raised.
+       It contains the test, the example, and the original exception:
+
+         >>> runner = DebugRunner(verbose=False)
+         >>> test = DocTestParser().get_doctest('>>> raise KeyError\n42',
+         ...                                    {}, 'foo', 'foo.py', 0)
+         >>> try:
+         ...     runner.run(test)
+         ... except UnexpectedException, failure:
+         ...     pass
+
+         >>> failure.test is test
+         True
+
+         >>> failure.example.want
+         '42\n'
+
+         >>> exc_info = failure.exc_info
+         >>> raise exc_info[0], exc_info[1], exc_info[2]
+         Traceback (most recent call last):
+         ...
+         KeyError
+
+       We wrap the original exception to give the calling application
+       access to the test and example information.
+
+       If the output doesn't match, then a DocTestFailure is raised:
+
+         >>> test = DocTestParser().get_doctest('''
+         ...      >>> x = 1
+         ...      >>> x
+         ...      2
+         ...      ''', {}, 'foo', 'foo.py', 0)
+
+         >>> try:
+         ...    runner.run(test)
+         ... except DocTestFailure, failure:
+         ...    pass
+
+       DocTestFailure objects provide access to the test:
+
+         >>> failure.test is test
+         True
+
+       As well as to the example:
+
+         >>> failure.example.want
+         '2\n'
+
+       and the actual output:
+
+         >>> failure.got
+         '1\n'
+
+       If a failure or error occurs, the globals are left intact:
+
+         >>> del test.globs['__builtins__']
+         >>> test.globs
+         {'x': 1}
+
+         >>> test = DocTestParser().get_doctest('''
+         ...      >>> x = 2
+         ...      >>> raise KeyError
+         ...      ''', {}, 'foo', 'foo.py', 0)
+
+         >>> runner.run(test)
+         Traceback (most recent call last):
+         ...
+         UnexpectedException: <DocTest foo from foo.py:0 (2 examples)>
+
+         >>> del test.globs['__builtins__']
+         >>> test.globs
+         {'x': 2}
+
+       But the globals are cleared if there is no error:
+
+         >>> test = DocTestParser().get_doctest('''
+         ...      >>> x = 2
+         ...      ''', {}, 'foo', 'foo.py', 0)
+
+         >>> runner.run(test)
+         (0, 1)
+
+         >>> test.globs
+         {}
+
+       """
+
+    def run(self, test, compileflags=None, out=None, clear_globs=True):
+        r = DocTestRunner.run(self, test, compileflags, out, False)
+        if clear_globs:
+            test.globs.clear()
+        return r
+
+    def report_unexpected_exception(self, out, test, example, exc_info):
+        raise UnexpectedException(test, example, exc_info)
+
+    def report_failure(self, out, test, example, got):
+        raise DocTestFailure(test, example, got)
+
+######################################################################
+## 6. Test Functions
+######################################################################
+# These should be backwards compatible.
+
+# For backward compatibility, a global instance of a DocTestRunner
+# class, updated by testmod.
+master = None
+
+def testmod(m=None, name=None, globs=None, verbose=None, isprivate=None,
+            report=True, optionflags=0, extraglobs=None,
+            raise_on_error=False, exclude_empty=False):
+    """m=None, name=None, globs=None, verbose=None, isprivate=None,
+       report=True, optionflags=0, extraglobs=None, raise_on_error=False,
+       exclude_empty=False
+
+    Test examples in docstrings in functions and classes reachable
+    from module m (or the current module if m is not supplied), starting
+    with m.__doc__.  Unless isprivate is specified, private names
+    are not skipped.
+
+    Also test examples reachable from dict m.__test__ if it exists and is
+    not None.  m.__test__ maps names to functions, classes and strings;
+    function and class docstrings are tested even if the name is private;
+    strings are tested directly, as if they were docstrings.
+
+    Return (#failures, #tests).
+
+    See doctest.__doc__ for an overview.
+
+    Optional keyword arg "name" gives the name of the module; by default
+    use m.__name__.
+
+    Optional keyword arg "globs" gives a dict to be used as the globals
+    when executing examples; by default, use m.__dict__.  A copy of this
+    dict is actually used for each docstring, so that each docstring's
+    examples start with a clean slate.
+
+    Optional keyword arg "extraglobs" gives a dictionary that should be
+    merged into the globals that are used to execute examples.  By
+    default, no extra globals are used.  This is new in 2.4.
+
+    Optional keyword arg "verbose" prints lots of stuff if true, prints
+    only failures if false; by default, it's true iff "-v" is in sys.argv.
+
+    Optional keyword arg "report" prints a summary at the end when true,
+    else prints nothing at the end.  In verbose mode, the summary is
+    detailed, else very brief (in fact, empty if all tests passed).
+
+    Optional keyword arg "optionflags" or's together module constants,
+    and defaults to 0.  This is new in 2.3.  Possible values (see the
+    docs for details):
+
+        DONT_ACCEPT_TRUE_FOR_1
+        DONT_ACCEPT_BLANKLINE
+        NORMALIZE_WHITESPACE
+        ELLIPSIS
+        SKIP
+        IGNORE_EXCEPTION_DETAIL
+        REPORT_UDIFF
+        REPORT_CDIFF
+        REPORT_NDIFF
+        REPORT_ONLY_FIRST_FAILURE
+
+    Optional keyword arg "raise_on_error" raises an exception on the
+    first unexpected exception or failure. This allows failures to be
+    post-mortem debugged.
+
+    Deprecated in Python 2.4:
+    Optional keyword arg "isprivate" specifies a function used to
+    determine whether a name is private.  The default function is to
+    treat all functions as public.  Optionally, "isprivate" can be
+    set to doctest.is_private to skip over functions marked as private
+    using the underscore naming convention; see its docs for details.
+
+    Advanced tomfoolery:  testmod runs methods of a local instance of
+    class doctest.Tester, then merges the results into (or creates)
+    global Tester instance doctest.master.  Methods of doctest.master
+    can be called directly too, if you want to do something unusual.
+    Passing report=0 to testmod is especially useful then, to delay
+    displaying a summary.  Invoke doctest.master.summarize(verbose)
+    when you're done fiddling.
+    """
+    global master
+
+    if isprivate is not None:
+        warnings.warn("the isprivate argument is deprecated; "
+                      "examine DocTestFinder.find() lists instead",
+                      DeprecationWarning)
+
+    # If no module was given, then use __main__.
+    if m is None:
+        # DWA - m will still be None if this wasn't invoked from the command
+        # line, in which case the following TypeError is about as good an error
+        # as we should expect
+        m = sys.modules.get('__main__')
+
+    # Check that we were actually given a module.
+    if not inspect.ismodule(m):
+        raise TypeError("testmod: module required; %r" % (m,))
+
+    # If no name was given, then use the module's name.
+    if name is None:
+        name = m.__name__
+
+    # Find, parse, and run all tests in the given module.
+    finder = DocTestFinder(_namefilter=isprivate, exclude_empty=exclude_empty)
+
+    if raise_on_error:
+        runner = DebugRunner(verbose=verbose, optionflags=optionflags)
+    else:
+        runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
+
+    for test in finder.find(m, name, globs=globs, extraglobs=extraglobs):
+        runner.run(test)
+
+    if report:
+        runner.summarize()
+
+    if master is None:
+        master = runner
+    else:
+        master.merge(runner)
+
+    return runner.failures, runner.tries
+
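+# A minimal usage sketch (assumes this module is importable as `doctest`):
+#
+#     if __name__ == "__main__":
+#         import doctest
+#         doctest.testmod(verbose=False)
+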
+def testfile(filename, module_relative=True, name=None, package=None,
+             globs=None, verbose=None, report=True, optionflags=0,
+             extraglobs=None, raise_on_error=False, parser=DocTestParser()):
+    """
+    Test examples in the given file.  Return (#failures, #tests).
+
+    Optional keyword arg "module_relative" specifies how filenames
+    should be interpreted:
+
+      - If "module_relative" is True (the default), then "filename"
+         specifies a module-relative path.  By default, this path is
+         relative to the calling module's directory; but if the
+         "package" argument is specified, then it is relative to that
+         package.  To ensure os-independence, "filename" should use
+         "/" characters to separate path segments, and should not
+         be an absolute path (i.e., it may not begin with "/").
+
+      - If "module_relative" is False, then "filename" specifies an
+        os-specific path.  The path may be absolute or relative (to
+        the current working directory).
+
+    Optional keyword arg "name" gives the name of the test; by default
+    use the file's basename.
+
+    Optional keyword argument "package" is a Python package or the
+    name of a Python package whose directory should be used as the
+    base directory for a module relative filename.  If no package is
+    specified, then the calling module's directory is used as the base
+    directory for module relative filenames.  It is an error to
+    specify "package" if "module_relative" is False.
+
+    Optional keyword arg "globs" gives a dict to be used as the globals
+    when executing examples; by default, use {}.  A copy of this dict
+    is actually used for each docstring, so that each docstring's
+    examples start with a clean slate.
+
+    Optional keyword arg "extraglobs" gives a dictionary that should be
+    merged into the globals that are used to execute examples.  By
+    default, no extra globals are used.
+
+    Optional keyword arg "verbose" prints lots of stuff if true, prints
+    only failures if false; by default, it's true iff "-v" is in sys.argv.
+
+    Optional keyword arg "report" prints a summary at the end when true,
+    else prints nothing at the end.  In verbose mode, the summary is
+    detailed, else very brief (in fact, empty if all tests passed).
+
+    Optional keyword arg "optionflags" or's together module constants,
+    and defaults to 0.  Possible values (see the docs for details):
+
+        DONT_ACCEPT_TRUE_FOR_1
+        DONT_ACCEPT_BLANKLINE
+        NORMALIZE_WHITESPACE
+        ELLIPSIS
+        SKIP
+        IGNORE_EXCEPTION_DETAIL
+        REPORT_UDIFF
+        REPORT_CDIFF
+        REPORT_NDIFF
+        REPORT_ONLY_FIRST_FAILURE
+
+    Optional keyword arg "raise_on_error" raises an exception on the
+    first unexpected exception or failure. This allows failures to be
+    post-mortem debugged.
+
+    Optional keyword arg "parser" specifies a DocTestParser (or
+    subclass) that should be used to extract tests from the files.
+
+    Advanced tomfoolery:  testfile runs methods of a local instance of
+    class doctest.Tester, then merges the results into (or creates)
+    global Tester instance doctest.master.  Methods of doctest.master
+    can be called directly too, if you want to do something unusual.
+    Passing report=0 to testfile is especially useful then, to delay
+    displaying a summary.  Invoke doctest.master.summarize(verbose)
+    when you're done fiddling.
+    """
+    global master
+
+    if package and not module_relative:
+        raise ValueError("Package may only be specified for module-"
+                         "relative paths.")
+
+    # Relativize the path
+    text, filename = _load_testfile(filename, package, module_relative)
+
+    # If no name was given, then use the file's name.
+    if name is None:
+        name = os.path.basename(filename)
+
+    # Assemble the globals.
+    if globs is None:
+        globs = {}
+    else:
+        globs = globs.copy()
+    if extraglobs is not None:
+        globs.update(extraglobs)
+
+    if raise_on_error:
+        runner = DebugRunner(verbose=verbose, optionflags=optionflags)
+    else:
+        runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
+
+    # Read the file, convert it to a test, and run it.
+    test = parser.get_doctest(text, globs, name, filename, 0)
+    runner.run(test)
+
+    if report:
+        runner.summarize()
+
+    if master is None:
+        master = runner
+    else:
+        master.merge(runner)
+
+    return runner.failures, runner.tries
+
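+# A minimal usage sketch (assumes a doctest file "example.txt" alongside the
+# calling module, and that this module is importable as `doctest`):
+#
+#     import doctest
+#     doctest.testfile("example.txt", optionflags=doctest.ELLIPSIS)
+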
+def run_docstring_examples(f, globs, verbose=False, name="NoName",
+                           compileflags=None, optionflags=0):
+    """
+    Test examples in the given object's docstring (`f`), using `globs`
+    as globals.  Optional argument `name` is used in failure messages.
+    If the optional argument `verbose` is true, then generate output
+    even if there are no failures.
+
+    `compileflags` gives the set of flags that should be used by the
+    Python compiler when running the examples.  If not specified, then
+    it will default to the set of future-import flags that apply to
+    `globs`.
+
+    Optional keyword arg `optionflags` specifies options for the
+    testing and output.  See the documentation for `testmod` for more
+    information.
+    """
+    # Find, parse, and run all tests in the given module.
+    finder = DocTestFinder(verbose=verbose, recurse=False)
+    runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
+    for test in finder.find(f, name, globs=globs):
+        runner.run(test, compileflags=compileflags)
+
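+# A minimal usage sketch (`square` is a hypothetical function defined for
+# the example):
+#
+#     def square(x):
+#         """
+#         >>> square(3)
+#         9
+#         """
+#         return x * x
+#
+#     run_docstring_examples(square, {'square': square}, verbose=False)
+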
+######################################################################
+## 7. Tester
+######################################################################
+# This is provided only for backwards compatibility.  It's not
+# actually used in any way.
+
+class Tester:
+    def __init__(self, mod=None, globs=None, verbose=None,
+                 isprivate=None, optionflags=0):
+
+        warnings.warn("class Tester is deprecated; "
+                      "use class doctest.DocTestRunner instead",
+                      DeprecationWarning, stacklevel=2)
+        if mod is None and globs is None:
+            raise TypeError("Tester.__init__: must specify mod or globs")
+        if mod is not None and not inspect.ismodule(mod):
+            raise TypeError("Tester.__init__: mod must be a module; %r" %
+                            (mod,))
+        if globs is None:
+            globs = mod.__dict__
+        self.globs = globs
+
+        self.verbose = verbose
+        self.isprivate = isprivate
+        self.optionflags = optionflags
+        self.testfinder = DocTestFinder(_namefilter=isprivate)
+        self.testrunner = DocTestRunner(verbose=verbose,
+                                        optionflags=optionflags)
+
+    def runstring(self, s, name):
+        test = DocTestParser().get_doctest(s, self.globs, name, None, None)
+        if self.verbose:
+            print "Running string", name
+        (f,t) = self.testrunner.run(test)
+        if self.verbose:
+            print f, "of", t, "examples failed in string", name
+        return (f,t)
+
+    def rundoc(self, object, name=None, module=None):
+        f = t = 0
+        tests = self.testfinder.find(object, name, module=module,
+                                     globs=self.globs)
+        for test in tests:
+            (f2, t2) = self.testrunner.run(test)
+            (f,t) = (f+f2, t+t2)
+        return (f,t)
+
+    def rundict(self, d, name, module=None):
+        import new
+        m = new.module(name)
+        m.__dict__.update(d)
+        if module is None:
+            module = False
+        return self.rundoc(m, name, module)
+
+    def run__test__(self, d, name):
+        import new
+        m = new.module(name)
+        m.__test__ = d
+        return self.rundoc(m, name)
+
+    def summarize(self, verbose=None):
+        return self.testrunner.summarize(verbose)
+
+    def merge(self, other):
+        self.testrunner.merge(other.testrunner)
+
+######################################################################
+## 8. Unittest Support
+######################################################################
+
+_unittest_reportflags = 0
+
+def set_unittest_reportflags(flags):
+    """Sets the unittest option flags.
+
+    The old flag is returned so that a runner could restore the old
+    value if it wished to:
+
+      >>> import doctest
+      >>> old = doctest._unittest_reportflags
+      >>> doctest.set_unittest_reportflags(REPORT_NDIFF |
+      ...                          REPORT_ONLY_FIRST_FAILURE) == old
+      True
+
+      >>> doctest._unittest_reportflags == (REPORT_NDIFF |
+      ...                                   REPORT_ONLY_FIRST_FAILURE)
+      True
+
+    Only reporting flags can be set:
+
+      >>> doctest.set_unittest_reportflags(ELLIPSIS)
+      Traceback (most recent call last):
+      ...
+      ValueError: ('Only reporting flags allowed', 8)
+
+      >>> doctest.set_unittest_reportflags(old) == (REPORT_NDIFF |
+      ...                                   REPORT_ONLY_FIRST_FAILURE)
+      True
+    """
+    global _unittest_reportflags
+
+    if (flags & REPORTING_FLAGS) != flags:
+        raise ValueError("Only reporting flags allowed", flags)
+    old = _unittest_reportflags
+    _unittest_reportflags = flags
+    return old
+
+
+class DocTestCase(unittest.TestCase):
+
+    def __init__(self, test, optionflags=0, setUp=None, tearDown=None,
+                 checker=None):
+
+        unittest.TestCase.__init__(self)
+        self._dt_optionflags = optionflags
+        self._dt_checker = checker
+        self._dt_test = test
+        self._dt_setUp = setUp
+        self._dt_tearDown = tearDown
+
+    def setUp(self):
+        test = self._dt_test
+
+        if self._dt_setUp is not None:
+            self._dt_setUp(test)
+
+    def tearDown(self):
+        test = self._dt_test
+
+        if self._dt_tearDown is not None:
+            self._dt_tearDown(test)
+
+        test.globs.clear()
+
+    def runTest(self):
+        test = self._dt_test
+        old = sys.stdout
+        new = StringIO()
+        optionflags = self._dt_optionflags
+
+        if not (optionflags & REPORTING_FLAGS):
+            # The option flags don't include any reporting flags,
+            # so add the default reporting flags
+            optionflags |= _unittest_reportflags
+
+        runner = DocTestRunner(optionflags=optionflags,
+                               checker=self._dt_checker, verbose=False)
+
+        try:
+            runner.DIVIDER = "-"*70
+            failures, tries = runner.run(
+                test, out=new.write, clear_globs=False)
+        finally:
+            sys.stdout = old
+
+        if failures:
+            raise self.failureException(self.format_failure(new.getvalue()))
+
+    def format_failure(self, err):
+        test = self._dt_test
+        if test.lineno is None:
+            lineno = 'unknown line number'
+        else:
+            lineno = '%s' % test.lineno
+        lname = '.'.join(test.name.split('.')[-1:])
+        return ('Failed doctest test for %s\n'
+                '  File "%s", line %s, in %s\n\n%s'
+                % (test.name, test.filename, lineno, lname, err)
+                )
+
+    def debug(self):
+        r"""Run the test case without results and without catching exceptions
+
+           The unit test framework includes a debug method on test cases
+           and test suites to support post-mortem debugging.  The test code
+           is run in such a way that errors are not caught.  This way a
+           caller can catch the errors and initiate post-mortem debugging.
+
+           The DocTestCase provides a debug method that raises
+           UnexpectedException errors if there is an unexpected
+           exception:
+
+             >>> test = DocTestParser().get_doctest('>>> raise KeyError\n42',
+             ...                {}, 'foo', 'foo.py', 0)
+             >>> case = DocTestCase(test)
+             >>> try:
+             ...     case.debug()
+             ... except UnexpectedException, failure:
+             ...     pass
+
+           The UnexpectedException contains the test, the example, and
+           the original exception:
+
+             >>> failure.test is test
+             True
+
+             >>> failure.example.want
+             '42\n'
+
+             >>> exc_info = failure.exc_info
+             >>> raise exc_info[0], exc_info[1], exc_info[2]
+             Traceback (most recent call last):
+             ...
+             KeyError
+
+           If the output doesn't match, then a DocTestFailure is raised:
+
+             >>> test = DocTestParser().get_doctest('''
+             ...      >>> x = 1
+             ...      >>> x
+             ...      2
+             ...      ''', {}, 'foo', 'foo.py', 0)
+             >>> case = DocTestCase(test)
+
+             >>> try:
+             ...    case.debug()
+             ... except DocTestFailure, failure:
+             ...    pass
+
+           DocTestFailure objects provide access to the test:
+
+             >>> failure.test is test
+             True
+
+           As well as to the example:
+
+             >>> failure.example.want
+             '2\n'
+
+           and the actual output:
+
+             >>> failure.got
+             '1\n'
+
+           """
+
+        self.setUp()
+        runner = DebugRunner(optionflags=self._dt_optionflags,
+                             checker=self._dt_checker, verbose=False)
+        runner.run(self._dt_test)
+        self.tearDown()
+
+    def id(self):
+        return self._dt_test.name
+
+    def __repr__(self):
+        name = self._dt_test.name.split('.')
+        return "%s (%s)" % (name[-1], '.'.join(name[:-1]))
+
+    __str__ = __repr__
+
+    def shortDescription(self):
+        return "Doctest: " + self._dt_test.name
+
+def DocTestSuite(module=None, globs=None, extraglobs=None, test_finder=None,
+                 **options):
+    """
+    Convert doctest tests for a module to a unittest test suite.
+
+    This converts each documentation string in a module that
+    contains doctest tests to a unittest test case.  If any of the
+    tests in a doc string fail, then the test case fails.  An exception
+    is raised showing the name of the file containing the test and a
+    (sometimes approximate) line number.
+
+    The `module` argument provides the module to be tested.  The argument
+    can be either a module or a module name.
+
+    If no argument is given, the calling module is used.
+
+    A number of options may be provided as keyword arguments:
+
+    setUp
+      A set-up function.  This is called before running the
+      tests in each docstring.  The setUp function will be passed a DocTest
+      object.  The setUp function can access the test globals as the
+      globs attribute of the test passed.
+
+    tearDown
+      A tear-down function.  This is called after running the
+      tests in each docstring.  The tearDown function will be passed a DocTest
+      object.  The tearDown function can access the test globals as the
+      globs attribute of the test passed.
+
+    globs
+      A dictionary containing initial global variables for the tests.
+
+    optionflags
+       A set of doctest option flags expressed as an integer.
+    """
+
+    if test_finder is None:
+        test_finder = DocTestFinder()
+
+    module = _normalize_module(module)
+    tests = test_finder.find(module, globs=globs, extraglobs=extraglobs)
+    if globs is None:
+        globs = module.__dict__
+    if not tests:
+        # Why do we want to do this? Because it reveals a bug that might
+        # otherwise be hidden.
+        raise ValueError(module, "has no tests")
+
+    tests.sort()
+    suite = unittest.TestSuite()
+    for test in tests:
+        if len(test.examples) == 0:
+            continue
+        if not test.filename:
+            filename = module.__file__
+            if filename[-4:] in (".pyc", ".pyo"):
+                filename = filename[:-1]
+            test.filename = filename
+        suite.addTest(DocTestCase(test, **options))
+
+    return suite
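+
+# An illustrative sketch of typical DocTestSuite usage (example only; the
+# target module name is a placeholder for any module whose docstrings
+# contain doctests).  The resulting suite plugs straight into a standard
+# unittest runner.
+def _example_doctest_suite():
+    # Collect every docstring doctest in the named module into one suite
+    # and run it with the plain text runner.
+    suite = DocTestSuite("mechanize._rfc3986",
+                         optionflags=ELLIPSIS | NORMALIZE_WHITESPACE)
+    return unittest.TextTestRunner(verbosity=2).run(suite)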
+
+class DocFileCase(DocTestCase):
+
+    def id(self):
+        return '_'.join(self._dt_test.name.split('.'))
+
+    def __repr__(self):
+        return self._dt_test.filename
+    __str__ = __repr__
+
+    def format_failure(self, err):
+        return ('Failed doctest test for %s\n  File "%s", line 0\n\n%s'
+                % (self._dt_test.name, self._dt_test.filename, err)
+                )
+
+def DocFileTest(path, module_relative=True, package=None,
+                globs=None, parser=DocTestParser(), **options):
+    if globs is None:
+        globs = {}
+    else:
+        globs = globs.copy()
+
+    if package and not module_relative:
+        raise ValueError("Package may only be specified for module-"
+                         "relative paths.")
+
+    # Relativize the path.
+    doc, path = _load_testfile(path, package, module_relative)
+
+    if "__file__" not in globs:
+        globs["__file__"] = path
+
+    # Find the file and read it.
+    name = os.path.basename(path)
+
+    # Convert it to a test, and wrap it in a DocFileCase.
+    test = parser.get_doctest(doc, globs, name, path, 0)
+    return DocFileCase(test, **options)
+
+def DocFileSuite(*paths, **kw):
+    """A unittest suite for one or more doctest files.
+
+    The path to each doctest file is given as a string; the
+    interpretation of that string depends on the keyword argument
+    "module_relative".
+
+    A number of options may be provided as keyword arguments:
+
+    module_relative
+      If "module_relative" is True, then the given file paths are
+      interpreted as os-independent module-relative paths.  By
+      default, these paths are relative to the calling module's
+      directory; but if the "package" argument is specified, then
+      they are relative to that package.  To ensure os-independence,
+      "filename" should use "/" characters to separate path
+      segments, and may not be an absolute path (i.e., it may not
+      begin with "/").
+
+      If "module_relative" is False, then the given file paths are
+      interpreted as os-specific paths.  These paths may be absolute
+      or relative (to the current working directory).
+
+    package
+      A Python package or the name of a Python package whose directory
+      should be used as the base directory for module relative paths.
+      If "package" is not specified, then the calling module's
+      directory is used as the base directory for module relative
+      filenames.  It is an error to specify "package" if
+      "module_relative" is False.
+
+    setUp
+      A set-up function.  This is called before running the
+      tests in each file. The setUp function will be passed a DocTest
+      object.  The setUp function can access the test globals as the
+      globs attribute of the test passed.
+
+    tearDown
+      A tear-down function.  This is called after running the
+      tests in each file.  The tearDown function will be passed a DocTest
+      object.  The tearDown function can access the test globals as the
+      globs attribute of the test passed.
+
+    globs
+      A dictionary containing initial global variables for the tests.
+
+    optionflags
+      A set of doctest option flags expressed as an integer.
+
+    parser
+      A DocTestParser (or subclass) that should be used to extract
+      tests from the files.
+    """
+    suite = unittest.TestSuite()
+
+    # We do this here so that _normalize_module is called at the right
+    # level.  If it were called in DocFileTest, then this function
+    # would be the caller and we might guess the package incorrectly.
+    if kw.get('module_relative', True):
+        kw['package'] = _normalize_module(kw.get('package'))
+
+    for path in paths:
+        suite.addTest(DocFileTest(path, **kw))
+
+    return suite
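+
+# An illustrative sketch of DocFileSuite usage (example only).  The paths
+# below name doctest files under the distribution's test/ directory and are
+# treated as ordinary cwd-relative paths because module_relative is False.
+def _example_docfile_suite():
+    suite = DocFileSuite("test/test_browser.doctest",
+                         "test/test_forms.doctest",
+                         module_relative=False,
+                         optionflags=ELLIPSIS)
+    return unittest.TextTestRunner().run(suite)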
+
+######################################################################
+## 9. Debugging Support
+######################################################################
+
+def script_from_examples(s):
+    r"""Extract script from text with examples.
+
+       Converts text with examples to a Python script.  Example input is
+       converted to regular code.  Example output and all other words
+       are converted to comments:
+
+       >>> text = '''
+       ...       Here are examples of simple math.
+       ...
+       ...           Python has super accurate integer addition
+       ...
+       ...           >>> 2 + 2
+       ...           5
+       ...
+       ...           And very friendly error messages:
+       ...
+       ...           >>> 1/0
+       ...           To Infinity
+       ...           And
+       ...           Beyond
+       ...
+       ...           You can use logic if you want:
+       ...
+       ...           >>> if 0:
+       ...           ...    blah
+       ...           ...    blah
+       ...           ...
+       ...
+       ...           Ho hum
+       ...           '''
+
+       >>> print script_from_examples(text)
+       # Here are examples of simple math.
+       #
+       #     Python has super accurate integer addition
+       #
+       2 + 2
+       # Expected:
+       ## 5
+       #
+       #     And very friendly error messages:
+       #
+       1/0
+       # Expected:
+       ## To Infinity
+       ## And
+       ## Beyond
+       #
+       #     You can use logic if you want:
+       #
+       if 0:
+          blah
+          blah
+       #
+       #     Ho hum
+       <BLANKLINE>
+       """
+    output = []
+    for piece in DocTestParser().parse(s):
+        if isinstance(piece, Example):
+            # Add the example's source code (strip trailing NL)
+            output.append(piece.source[:-1])
+            # Add the expected output:
+            want = piece.want
+            if want:
+                output.append('# Expected:')
+                output += ['## '+l for l in want.split('\n')[:-1]]
+        else:
+            # Add non-example text.
+            output += [_comment_line(l)
+                       for l in piece.split('\n')[:-1]]
+
+    # Trim junk on both ends.
+    while output and output[-1] == '#':
+        output.pop()
+    while output and output[0] == '#':
+        output.pop(0)
+    # Combine the output, and return it.
+    # Add a courtesy newline to prevent exec from choking (see bug #1172785)
+    return '\n'.join(output) + '\n'
+
+def testsource(module, name):
+    """Extract the test sources from a doctest docstring as a script.
+
+    Provide the module (or dotted name of the module) containing the
+    test to be debugged and the name (within the module) of the object
+    with the doc string with tests to be debugged.
+    """
+    module = _normalize_module(module)
+    tests = DocTestFinder().find(module)
+    test = [t for t in tests if t.name == name]
+    if not test:
+        raise ValueError(name, "not found in tests")
+    test = test[0]
+    testsrc = script_from_examples(test.docstring)
+    return testsrc
+
+def debug_src(src, pm=False, globs=None):
+    """Debug a single doctest docstring, in argument `src`'"""
+    testsrc = script_from_examples(src)
+    debug_script(testsrc, pm, globs)
+
+def debug_script(src, pm=False, globs=None):
+    "Debug a test script.  `src` is the script, as a string."
+    import pdb
+
+    # Note that tempfile.NamedTemporaryFile() cannot be used.  As the
+    # docs say, a file so created cannot be opened by name a second time
+    # on modern Windows boxes, and execfile() needs to open it.
+    srcfilename = tempfile.mktemp(".py", "doctestdebug")
+    f = open(srcfilename, 'w')
+    f.write(src)
+    f.close()
+
+    try:
+        if globs:
+            globs = globs.copy()
+        else:
+            globs = {}
+
+        if pm:
+            try:
+                execfile(srcfilename, globs, globs)
+            except:
+                print sys.exc_info()[1]
+                pdb.post_mortem(sys.exc_info()[2])
+        else:
+            # Note that %r is vital here.  '%s' instead can, e.g., cause
+            # backslashes to get treated as metacharacters on Windows.
+            pdb.run("execfile(%r)" % srcfilename, globs, globs)
+
+    finally:
+        os.remove(srcfilename)
+
+def debug(module, name, pm=False):
+    """Debug a single doctest docstring.
+
+    Provide the module (or dotted name of the module) containing the
+    test to be debugged and the name (within the module) of the object
+    with the docstring with tests to be debugged.
+    """
+    module = _normalize_module(module)
+    testsrc = testsource(module, name)
+    debug_script(testsrc, pm, module.__dict__)
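+
+# An illustrative sketch of debug() usage (example only; the module and
+# dotted object name are placeholders for any object whose docstring
+# contains doctests).  With pm=False the extracted script runs under pdb
+# from the start; with pm=True it runs normally and drops into a
+# post-mortem pdb session on the first error.
+def _example_debug_session():
+    debug("mechanize._headersutil", "mechanize._headersutil.is_html",
+          pm=False)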
+
+######################################################################
+## 10. Example Usage
+######################################################################
+class _TestClass:
+    """
+    A pointless class, for sanity-checking of docstring testing.
+
+    Methods:
+        square()
+        get()
+
+    >>> _TestClass(13).get() + _TestClass(-12).get()
+    1
+    >>> hex(_TestClass(13).square().get())
+    '0xa9'
+    """
+
+    def __init__(self, val):
+        """val -> _TestClass object with associated value val.
+
+        >>> t = _TestClass(123)
+        >>> print t.get()
+        123
+        """
+
+        self.val = val
+
+    def square(self):
+        """square() -> square TestClass's associated value
+
+        >>> _TestClass(13).square().get()
+        169
+        """
+
+        self.val = self.val ** 2
+        return self
+
+    def get(self):
+        """get() -> return TestClass's associated value.
+
+        >>> x = _TestClass(-42)
+        >>> print x.get()
+        -42
+        """
+
+        return self.val
+
+__test__ = {"_TestClass": _TestClass,
+            "string": r"""
+                      Example of a string object, searched as-is.
+                      >>> x = 1; y = 2
+                      >>> x + y, x * y
+                      (3, 2)
+                      """,
+
+            "bool-int equivalence": r"""
+                                    In 2.2, boolean expressions displayed
+                                    0 or 1.  By default, we still accept
+                                    them.  This can be disabled by passing
+                                    DONT_ACCEPT_TRUE_FOR_1 to the new
+                                    optionflags argument.
+                                    >>> 4 == 4
+                                    1
+                                    >>> 4 == 4
+                                    True
+                                    >>> 4 > 4
+                                    0
+                                    >>> 4 > 4
+                                    False
+                                    """,
+
+            "blank lines": r"""
+                Blank lines can be marked with <BLANKLINE>:
+                    >>> print 'foo\n\nbar\n'
+                    foo
+                    <BLANKLINE>
+                    bar
+                    <BLANKLINE>
+            """,
+
+            "ellipsis": r"""
+                If the ellipsis flag is used, then '...' can be used to
+                elide substrings in the desired output:
+                    >>> print range(1000) #doctest: +ELLIPSIS
+                    [0, 1, 2, ..., 999]
+            """,
+
+            "whitespace normalization": r"""
+                If the whitespace normalization flag is used, then
+                differences in whitespace are ignored.
+                    >>> print range(30) #doctest: +NORMALIZE_WHITESPACE
+                    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
+                     15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
+                     27, 28, 29]
+            """,
+           }
+
+def _test():
+    r = unittest.TextTestRunner()
+    r.run(DocTestSuite())
+
+if __name__ == "__main__":
+    _test()

Added: mechanize/tags/0.1.10/test-tools/linecache_copy.py
===================================================================
--- mechanize/tags/0.1.10/test-tools/linecache_copy.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test-tools/linecache_copy.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,132 @@
+"""Cache lines from files.
+
+This is intended to read lines from modules imported -- hence if a filename
+is not found, it will look down the module search path for a file by
+that name.
+"""
+
+import sys
+import os
+
+__all__ = ["getline", "clearcache", "checkcache"]
+
+def getline(filename, lineno, module_globals=None):
+    lines = getlines(filename, module_globals)
+    if 1 <= lineno <= len(lines):
+        return lines[lineno-1]
+    else:
+        return ''
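+
+# An illustrative sketch of the typical call pattern (example only).  The
+# name "os.py" is not an absolute path, so if it is not found directly the
+# cache falls back to searching sys.path for it, as described in the module
+# docstring above.
+def _example_getline():
+    first = getline("os.py", 1)
+    # Later calls for the same filename are served from the in-memory cache
+    # until checkcache() or clearcache() is called.
+    clearcache()
+    return first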
+
+
+# The cache
+
+cache = {} # The cache
+
+
+def clearcache():
+    """Clear the cache entirely."""
+
+    global cache
+    cache = {}
+
+
+def getlines(filename, module_globals=None):
+    """Get the lines for a file from the cache.
+    Update the cache if it doesn't contain an entry for this file already."""
+
+    if filename in cache:
+        return cache[filename][2]
+    else:
+        return updatecache(filename, module_globals)
+
+
+def checkcache(filename=None):
+    """Discard cache entries that are out of date.
+    (This is not checked upon each call!)"""
+
+    if filename is None:
+        filenames = cache.keys()
+    else:
+        if filename in cache:
+            filenames = [filename]
+        else:
+            return
+
+    for filename in filenames:
+        size, mtime, lines, fullname = cache[filename]
+        if mtime is None:
+            continue   # no-op for files loaded via a __loader__
+        try:
+            stat = os.stat(fullname)
+        except os.error:
+            del cache[filename]
+            continue
+        if size != stat.st_size or mtime != stat.st_mtime:
+            del cache[filename]
+
+
+def updatecache(filename, module_globals=None):
+    """Update a cache entry and return its list of lines.
+    If something's wrong, print a message, discard the cache entry,
+    and return an empty list."""
+
+    if filename in cache:
+        del cache[filename]
+    if not filename or filename[0] + filename[-1] == '<>':
+        return []
+
+    fullname = filename
+    try:
+        stat = os.stat(fullname)
+    except os.error, msg:
+        basename = os.path.split(filename)[1]
+
+        # Try for a __loader__, if available
+        if module_globals and '__loader__' in module_globals:
+            name = module_globals.get('__name__')
+            loader = module_globals['__loader__']
+            get_source = getattr(loader, 'get_source', None)
+
+            if name and get_source:
+                if basename.startswith(name.split('.')[-1]+'.'):
+                    try:
+                        data = get_source(name)
+                    except (ImportError, IOError):
+                        pass
+                    else:
+                        cache[filename] = (
+                            len(data), None,
+                            [line+'\n' for line in data.splitlines()], fullname
+                        )
+                        return cache[filename][2]
+
+        # Try looking through the module search path.
+
+        for dirname in sys.path:
+            # When using imputil, sys.path may contain things other than
+            # strings; ignore them when it happens.
+            try:
+                fullname = os.path.join(dirname, basename)
+            except (TypeError, AttributeError):
+                # Not sufficiently string-like to do anything useful with.
+                pass
+            else:
+                try:
+                    stat = os.stat(fullname)
+                    break
+                except os.error:
+                    pass
+        else:
+            # No luck
+##          print '*** Cannot stat', filename, ':', msg
+            return []
+    try:
+        fp = open(fullname, 'rU')
+        lines = fp.readlines()
+        fp.close()
+    except IOError, msg:
+##      print '*** Cannot open', fullname, ':', msg
+        return []
+    size, mtime = stat.st_size, stat.st_mtime
+    cache[filename] = size, mtime, lines, fullname
+    return lines

Added: mechanize/tags/0.1.10/test-tools/testprogram.py
===================================================================
--- mechanize/tags/0.1.10/test-tools/testprogram.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test-tools/testprogram.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,330 @@
+"""Local server and cgitb support."""
+
+import cgitb
+#cgitb.enable(format="text")
+
+import errno
+import glob
+import logging
+import os
+import subprocess
+import sys
+import time
+import traceback
+from unittest import defaultTestLoader, TextTestRunner, TestSuite, TestCase, \
+     _TextTestResult
+
+
+class ServerStartupError(Exception):
+
+    pass
+
+
+class ServerProcess:
+
+    def __init__(self, filename, name=None):
+        if filename is None:
+            raise ValueError('filename arg must be a string')
+        if name is None:
+            name = filename
+        self.name = os.path.basename(name)
+        self.port = None
+        self.report_hook = lambda msg: None
+        self._filename = filename
+        self._args = None
+        self._process = None
+
+    def _get_args(self):
+        """Return list of command line arguments.
+
+        Override me.
+        """
+        return []
+
+    def start(self):
+        self._args = [sys.executable, self._filename]+self._get_args()
+        self.report_hook("starting (%s)" % (self._args,))
+        self._process = subprocess.Popen(self._args)
+        self.report_hook("waiting for startup")
+        self._wait_for_startup()
+        self.report_hook("running")
+
+    def _wait_for_startup(self):
+        import socket
+        def connect():
+            self._process.poll()
+            if self._process.returncode is not None:
+                message = ("server exited on startup with status %d: %r" %
+                           (self._process.returncode, self._args))
+                raise ServerStartupError(message)
+            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+            sock.settimeout(1.0)
+            try:
+                sock.connect(('127.0.0.1', self.port))
+            finally:
+                sock.close()
+        backoff(connect, (socket.error,))
+
+    def stop(self):
+        """Kill process (forcefully if necessary)."""
+        pid = self._process.pid
+        if os.name == 'nt':
+            kill_windows(pid, self.report_hook)
+        else:
+            kill_posix(pid, self.report_hook)
+
+def backoff(func, errors,
+            initial_timeout=1., hard_timeout=60., factor=1.2):
+    starttime = time.time()
+    timeout = initial_timeout
+    while time.time() < starttime + hard_timeout - 0.01:
+        try:
+            func()
+        except errors, exc:
+            time.sleep(timeout)
+            timeout *= factor
+            hard_limit = hard_timeout - (time.time() - starttime)
+            timeout = min(timeout, hard_limit)
+        else:
+            break
+    else:
+        raise
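+
+# An illustrative sketch of backoff() usage (example only; the host and port
+# are placeholders).  backoff() keeps calling the function, sleeping between
+# attempts (each sleep `factor` times longer than the last, capped by the
+# remaining hard timeout) until the call stops raising one of the listed
+# exceptions, and re-raises once the hard timeout is exhausted.
+# _wait_for_startup() above is the real caller in this module.
+def _example_backoff():
+    import socket
+    def _poll():
+        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+        sock.settimeout(1.0)
+        try:
+            sock.connect(("127.0.0.1", 8000))
+        finally:
+            sock.close()
+    backoff(_poll, (socket.error,), initial_timeout=0.5, hard_timeout=10.)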
+
+def kill_windows(handle, report_hook):
+    try:
+        import win32api
+    except ImportError:
+        import ctypes
+        ctypes.windll.kernel32.TerminateProcess(int(handle), -1)
+    else:
+        win32api.TerminateProcess(int(handle), -1)
+
+def kill_posix(pid, report_hook):
+    import signal
+    os.kill(pid, signal.SIGTERM)
+
+    timeout = 10.
+    starttime = time.time()
+    report_hook("waiting for exit")
+    def do_nothing(*args):
+        pass
+    old_handler = signal.signal(signal.SIGCHLD, do_nothing)
+    try:
+        while time.time() < starttime + timeout - 0.01:
+            exited_pid, sts = os.waitpid(pid, os.WNOHANG)
+            if exited_pid != 0:
+                # exited, or error
+                break
+            # Don't overshoot the overall timeout, and never pass a negative
+            # value to time.sleep().
+            newtimeout = max(0., timeout - (time.time() - starttime) - 1.)
+            time.sleep(newtimeout)  # wait for signal
+        else:
+            report_hook("forcefully killing")
+            try:
+                os.kill(pid, signal.SIGKILL)
+            except OSError, exc:
+                if exc.errno != errno.ECHILD:
+                    raise
+    finally:
+        signal.signal(signal.SIGCHLD, old_handler)
+
+class TwistedServerProcess(ServerProcess):
+
+    def __init__(self, name=None):
+        top_level_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
+        path = os.path.join(top_level_dir, "test-tools/twisted-localserver.py")
+        ServerProcess.__init__(self, path, name)
+
+    def _get_args(self):
+        return [str(self.port)]
+
+
+class CgitbTextResult(_TextTestResult):
+    def _exc_info_to_string(self, err, test):
+        """Converts a sys.exc_info()-style tuple of values into a string."""
+        exctype, value, tb = err
+        # Skip test runner traceback levels
+        while tb and self._is_relevant_tb_level(tb):
+            tb = tb.tb_next
+        # cgitb renders the remaining traceback in full, for failures and
+        # errors alike.
+        return cgitb.text((exctype, value, tb))
+
+class CgitbTextTestRunner(TextTestRunner):
+    def _makeResult(self):
+        return CgitbTextResult(self.stream, self.descriptions, self.verbosity)
+
+def add_uri_attribute_to_test_cases(suite, uri):
+    for test in suite._tests:
+        if isinstance(test, TestCase):
+            test.uri = uri
+        else:
+            try:
+                add_uri_attribute_to_test_cases(test, uri)
+            except AttributeError:
+                pass
+
+
+class TestProgram:
+    """A command-line program that runs a set of tests; this is primarily
+       for making test modules conveniently executable.
+    """
+    USAGE = """\
+Usage: %(progName)s [options] [test] [...]
+
+Note that not all of the functional tests respect the --uri argument yet --
+some currently always access the internet regardless of the --uri and
+--run-local-server options.
+
+Options:
+  -l, --run-local-server
+                   Run a local Twisted HTTP server for the functional
+                   tests.  You need Twisted installed for this to work.
+                   The server is run on the port given in the --uri
+                   option.  If --run-local-server is given but no --uri is
+                   given, http://127.0.0.1:8000 is used as the base URI.
+                   Also, if you're on Windows and don't have pywin32 or
+                   ctypes installed, this option won't work, and you'll
+                   have to start up test-tools/localserver.py manually.
+  --uri=URL        Base URI for functional tests
+                   (test.py does not access the network, unless you tell
+                   it to run module functional_tests;
+                   functional_tests.py does access the network)
+                   e.g. --uri=http://127.0.0.1:8000/
+  -h, --help       Show this message
+  -v, --verbose    Verbose output
+  -q, --quiet      Minimal output
+
+The following options are only available through test.py (you can still run the
+functional tests through test.py, just give 'functional_tests' as the module
+name to run):
+
+  -u               Skip plain (non-doctest) unittests
+  -d               Skip doctests
+  -c               Run coverage (requires coverage.py, seems buggy)
+  -t               Display tracebacks using cgitb's text mode
+
+"""
+    USAGE_EXAMPLES = """
+Examples:
+  %(progName)s
+                 - run all tests
+  %(progName)s test_cookies
+                 - run module 'test_cookies'
+  %(progName)s test_cookies.CookieTests
+                 - run all 'test*' test methods in test_cookies.CookieTests
+  %(progName)s test_cookies.CookieTests.test_expires
+                 - run test_cookies.CookieTests.test_expires
+
+  %(progName)s functional_tests
+                 - run the functional tests
+  %(progName)s -l functional_tests
+                 - start a local Twisted HTTP server and run the functional
+                   tests against that, rather than against SourceForge
+                   (quicker!)
+"""
+    def __init__(self, moduleNames, localServerProcess, defaultTest=None,
+                 argv=None, testRunner=None, testLoader=defaultTestLoader,
+                 defaultUri="http://wwwsearch.sourceforge.net/",
+                 usageExamples=USAGE_EXAMPLES,
+                 ):
+        self.modules = []
+        for moduleName in moduleNames:
+            module = __import__(moduleName)
+            for part in moduleName.split('.')[1:]:
+                module = getattr(module, part)
+            self.modules.append(module)
+        self.uri = None
+        self._defaultUri = defaultUri
+        if argv is None:
+            argv = sys.argv
+        self.verbosity = 1
+        self.defaultTest = defaultTest
+        self.testRunner = testRunner
+        self.testLoader = testLoader
+        self.progName = os.path.basename(argv[0])
+        self.usageExamples = usageExamples
+        self.runLocalServer = False
+        self.parseArgs(argv)
+        if self.runLocalServer:
+            import urllib
+            from mechanize._rfc3986 import urlsplit
+            authority = urlsplit(self.uri)[1]
+            host, port = urllib.splitport(authority)
+            if port is None:
+                port = "80"
+            try:
+                port = int(port)
+            except:
+                self.usageExit("port in --uri value must be an integer "
+                               "(try --uri=http://127.0.0.1:8000/)")
+            self._serverProcess = localServerProcess
+            def report(msg):
+                print "%s: %s" % (localServerProcess.name, msg)
+            localServerProcess.port = port
+            localServerProcess.report_hook = report
+
+    def usageExit(self, msg=None):
+        if msg: print msg
+        print (self.USAGE + self.usageExamples) % self.__dict__
+        sys.exit(2)
+
+    def parseArgs(self, argv):
+        import getopt
+        try:
+            options, args = getopt.getopt(
+                argv[1:],
+                'hHvql',
+                ['help','verbose','quiet', 'uri=', 'run-local-server'],
+                )
+            uri = None
+            for opt, value in options:
+                if opt in ('-h','-H','--help'):
+                    self.usageExit()
+                if opt in ('--uri',):
+                    uri = value
+                if opt in ('-q','--quiet'):
+                    self.verbosity = 0
+                if opt in ('-v','--verbose'):
+                    self.verbosity = 2
+                if opt in ('-l', '--run-local-server'):
+                    self.runLocalServer = True
+            if uri is None:
+                if self.runLocalServer:
+                    uri = "http://127.0.0.1:8000"
+                else:
+                    uri = self._defaultUri
+            self.uri = uri
+            if len(args) == 0 and self.defaultTest is None:
+                suite = TestSuite()
+                for module in self.modules:
+                    test = self.testLoader.loadTestsFromModule(module)
+                    suite.addTest(test)
+                self.test = suite
+                add_uri_attribute_to_test_cases(self.test, self.uri)
+                return
+            if len(args) > 0:
+                self.testNames = args
+            else:
+                self.testNames = (self.defaultTest,)
+            self.createTests()
+            add_uri_attribute_to_test_cases(self.test, self.uri)
+        except getopt.error, msg:
+            self.usageExit(msg)
+
+    def createTests(self):
+        self.test = self.testLoader.loadTestsFromNames(self.testNames)
+
+    def runTests(self):
+        if self.testRunner is None:
+            self.testRunner = TextTestRunner(verbosity=self.verbosity)
+
+        if self.runLocalServer:
+            self._serverProcess.start()
+        try:
+            result = self.testRunner.run(self.test)
+        finally:
+            if self.runLocalServer:
+                self._serverProcess.stop()
+        return result

Added: mechanize/tags/0.1.10/test-tools/twisted-localserver.py
===================================================================
--- mechanize/tags/0.1.10/test-tools/twisted-localserver.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test-tools/twisted-localserver.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,214 @@
+#!/usr/bin/env python
+"""
+%prog port
+
+e.g. %prog 8000
+
+Runs a local server to point the mechanize functional tests at.  Example:
+
+python test-tools/twisted-localserver.py 8042
+python functional_tests.py --uri=http://localhost:8042/
+
+You need twisted.web2 to run it.  On Ubuntu Feisty, you can install it like so:
+
+sudo apt-get install python-twisted-web2
+"""
+
+import sys, re
+
+from twisted.cred import portal, checkers
+from twisted.internet import reactor
+from twisted.web2 import server, http, resource, channel, \
+     http_headers, responsecode, twcgi
+from twisted.web2.auth import basic, digest, wrapper
+from twisted.web2.auth.interfaces import IHTTPUser
+
+from zope.interface import implements
+
+
+def html(title=None):
+    f = open("README.html", "r")
+    html = f.read()
+    if title is not None:
+        html = re.sub("<title>(.*)</title>", "<title>%s</title>" % title, html)
+    return html
+
+MECHANIZE_HTML = html()
+ROOT_HTML = html("Python bits")
+RELOAD_TEST_HTML = """\
+<html>
+<head><title>Title</title></head>
+<body>
+
+<a href="/mechanize">near the start</a>
+
+<p>Now some data to prevent HEAD parsing from reading the link near
+the end.
+
+<pre>
+%s</pre>
+
+<a href="/mechanize">near the end</a>
+
+</body>
+
+</html>""" % (("0123456789ABCDEF"*4+"\n")*61)
+REFERER_TEST_HTML = """\
+<html>
+<head>
+<title>mechanize Referer (sic) test page</title>
+</head>
+<body>
+<p>This page exists to test the Referer functionality of <a href="/mechanize">mechanize</a>.
+<p><a href="/cgi-bin/cookietest.cgi">Here</a> is a link to a page that displays the Referer header.
+</body>
+</html>"""
+
+
+BASIC_AUTH_PAGE = """
+<html>
+<head>
+<title>Basic Auth Protected Area</title>
+</head>
+<body>
+<p>Hello, basic auth world.
+<p>
+</body>
+</html>
+"""
+
+
+DIGEST_AUTH_PAGE = """
+<html>
+<head>
+<title>Digest Auth Protected Area</title>
+</head>
+<body>
+<p>Hello, digest auth world.
+<p>
+</body>
+</html>
+"""
+
+
+class TestHTTPUser(object):
+    """
+    Test avatar implementation for http auth with cred
+    """
+    implements(IHTTPUser)
+
+    username = None
+
+    def __init__(self, username):
+        """
+        @param username: The str username sent as part of the HTTP auth
+            response.
+        """
+        self.username = username
+
+
+class TestAuthRealm(object):
+    """
+    Test realm that supports the IHTTPUser interface
+    """
+
+    implements(portal.IRealm)
+
+    def requestAvatar(self, avatarId, mind, *interfaces):
+        if IHTTPUser in interfaces:
+            if avatarId == checkers.ANONYMOUS:
+                return IHTTPUser, TestHTTPUser('anonymous')
+
+            return IHTTPUser, TestHTTPUser(avatarId)
+
+        raise NotImplementedError("Only IHTTPUser interface is supported")
+
+
+class Page(resource.Resource):
+
+    addSlash = True
+    content_type = http_headers.MimeType("text", "html")
+
+    def render(self, ctx):
+        return http.Response(
+            responsecode.OK,
+            {"content-type": self.content_type},
+            self.text)
+
+def _make_page(parent, name, text, content_type, wrapper,
+               leaf=False):
+    page = Page()
+    page.text = text
+    base_type, specific_type = content_type.split("/")
+    page.content_type = http_headers.MimeType(base_type, specific_type)
+    page.addSlash = not leaf
+    parent.putChild(name, wrapper(page))
+    return page
+
+def make_page(parent, name, text,
+              content_type="text/html", wrapper=lambda page: page):
+    return _make_page(parent, name, text, content_type, wrapper, leaf=False)
+
+def make_leaf_page(parent, name, text,
+                   content_type="text/html", wrapper=lambda page: page):
+    return _make_page(parent, name, text, content_type, wrapper, leaf=True)
+
+def make_redirect(parent, name, location_relative_ref):
+    redirect = resource.RedirectResource(path=location_relative_ref)
+    setattr(parent, "child_"+name, redirect)
+    return redirect
+
+def make_cgi_bin(parent, name, dir_name):
+    cgi_bin = twcgi.CGIDirectory(dir_name)
+    setattr(parent, "child_"+name, cgi_bin)
+    return cgi_bin
+
+def require_basic_auth(resource):
+    p = portal.Portal(TestAuthRealm())
+    c = checkers.InMemoryUsernamePasswordDatabaseDontUse()
+    c.addUser("john", "john")
+    p.registerChecker(c)
+    cred_factory = basic.BasicCredentialFactory("Basic Auth protected area")
+    return wrapper.HTTPAuthResource(resource,
+                                    [cred_factory],
+                                    p,
+                                    interfaces=(IHTTPUser,))
+
+def require_digest_auth(resource):
+    p = portal.Portal(TestAuthRealm())
+    c = checkers.InMemoryUsernamePasswordDatabaseDontUse()
+    c.addUser("digestuser", "digestuser")
+    p.registerChecker(c)
+    cred_factory = digest.DigestCredentialFactory("MD5",
+                                                  "Digest Auth protected area")
+    return wrapper.HTTPAuthResource(resource,
+                                    [cred_factory],
+                                    p,
+                                    interfaces=(IHTTPUser,))
+
+def main():
+    root = Page()
+    root.text = ROOT_HTML
+    make_page(root, "mechanize", MECHANIZE_HTML)
+    make_leaf_page(root, "robots.txt",
+                   "User-Agent: *\nDisallow: /norobots",
+                   "text/plain")
+    make_leaf_page(root, "robots", "Hello, robots.", "text/plain")
+    make_leaf_page(root, "norobots", "Hello, non-robots.", "text/plain")
+    bits = make_page(root, "bits", "GeneralFAQ.html")
+    make_leaf_page(bits, "cctest2.txt",
+                   "Hello ClientCookie functional test suite.",
+                   "text/plain")
+    make_leaf_page(bits, "referertest.html", REFERER_TEST_HTML)
+    make_leaf_page(bits, "mechanize_reload_test.html", RELOAD_TEST_HTML)
+    make_redirect(root, "redirected", "/doesnotexist")
+    make_cgi_bin(root, "cgi-bin", "test-tools")
+    make_page(root, "basic_auth", BASIC_AUTH_PAGE, wrapper=require_basic_auth)
+    make_page(root, "digest_auth", DIGEST_AUTH_PAGE,
+              wrapper=require_digest_auth)
+
+    site = server.Site(root)
+    reactor.listenTCP(int(sys.argv[1]), channel.HTTPFactory(site))
+    reactor.run()
+
+main()

Added: mechanize/tags/0.1.10/test.py
===================================================================
--- mechanize/tags/0.1.10/test.py	                        (rev 0)
+++ mechanize/tags/0.1.10/test.py	2008-12-07 13:26:32 UTC (rev 93744)
@@ -0,0 +1,151 @@
+#!/usr/bin/env python
+
+"""Test runner.
+
+For further help, enter this at a command prompt:
+
+python test.py --help
+
+"""
+
+# Modules containing tests to run -- a test is anything named *Tests, which
+# should be classes deriving from unittest.TestCase.
+MODULE_NAMES = ["test_date", "test_browser", "test_response", "test_cookies",
+                "test_headers", "test_urllib2", "test_pullparser",
+                "test_useragent", "test_html", "test_opener",
+                ]
+
+import sys, os, logging, glob
+
+
+if __name__ == "__main__":
+    # XXX
+    # temporary stop-gap to run doctests &c.
+    # should switch to nose or something
+
+    top_level_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
+
+    # XXXX coverage output seems incorrect ATM
+    run_coverage = "-c" in sys.argv
+    if run_coverage:
+        sys.argv.remove("-c")
+    use_cgitb = "-t" in sys.argv
+    if use_cgitb:
+        sys.argv.remove("-t")
+    run_doctests = "-d" not in sys.argv
+    if not run_doctests:
+        sys.argv.remove("-d")
+    run_unittests = "-u" not in sys.argv
+    if not run_unittests:
+        sys.argv.remove("-u")
+    log = "-l" in sys.argv
+    if log:
+        sys.argv.remove("-l")
+        level = logging.DEBUG
+#         level = logging.INFO
+#         level = logging.WARNING
+#         level = logging.NOTSET
+        logger = logging.getLogger("mechanize")
+        logger.setLevel(level)
+        handler = logging.StreamHandler(sys.stdout)
+        handler.setLevel(level)
+        logger.addHandler(handler)
+
+    # import local copy of Python 2.5 doctest
+    assert os.path.isdir("test")
+    sys.path.insert(0, "test")
+    # needed for recent doctest / linecache -- this is only for testing
+    # purposes, these don't get installed
+    # doctest.py revision 45701 and linecache.py revision 45940.  Since
+    # linecache is used by Python itself, linecache.py is renamed
+    # linecache_copy.py, and this copy of doctest is modified (only) to use
+    # that renamed module.
+    sys.path.insert(0, "test-tools")
+    import doctest
+    import testprogram
+
+    if run_coverage:
+        import coverage
+        print 'running coverage'
+        coverage.erase()
+        coverage.start()
+
+    import mechanize
+
+    class DefaultResult:
+        def wasSuccessful(self):
+            return True
+    result = DefaultResult()
+
+    if run_doctests:
+        # run .doctest files needing special support
+        common_globs = {"mechanize": mechanize}
+        pm_doctest_filename = os.path.join(
+            "test", "test_password_manager.special_doctest")
+        for globs in [
+            {"mgr_class": mechanize.HTTPPasswordMgr},
+            {"mgr_class": mechanize.HTTPProxyPasswordMgr},
+            ]:
+            globs.update(common_globs)
+            doctest.testfile(pm_doctest_filename, globs=globs)
+        try:
+            import robotparser
+        except ImportError:
+            pass
+        else:
+            doctest.testfile(os.path.join(
+                    "test", "test_robotfileparser.special_doctest"))
+
+        # run .doctest files
+        doctest_files = glob.glob(os.path.join("test", "*.doctest"))
+        for df in doctest_files:
+            doctest.testfile(df)
+
+        # run doctests in docstrings
+        from mechanize import _headersutil, _auth, _clientcookie, _pullparser, \
+             _http, _rfc3986, _useragent
+        doctest.testmod(_headersutil)
+        doctest.testmod(_rfc3986)
+        doctest.testmod(_auth)
+        doctest.testmod(_clientcookie)
+        doctest.testmod(_pullparser)
+        doctest.testmod(_http)
+        doctest.testmod(_useragent)
+
+    if run_unittests:
+        # run vanilla unittest tests
+        import unittest
+        test_path = os.path.join(os.path.dirname(sys.argv[0]), "test")
+        sys.path.insert(0, test_path)
+        test_runner = None
+        if use_cgitb:
+            test_runner = testprogram.CgitbTextTestRunner()
+        prog = testprogram.TestProgram(
+            MODULE_NAMES,
+            testRunner=test_runner,
+            localServerProcess=testprogram.TwistedServerProcess(),
+            )
+        result = prog.runTests()
+
+    if run_coverage:
+        # HTML coverage report
+        import colorize
+        try:
+            os.mkdir("coverage")
+        except OSError:
+            pass
+        private_modules = glob.glob("mechanize/_*.py")
+        private_modules.remove("mechanize/__init__.py")
+        for module_filename in private_modules:
+            module_name = module_filename.replace("/", ".")[:-3]
+            print module_name
+            module = sys.modules[module_name]
+            f, s, m, mf = coverage.analysis(module)
+            fo = open(os.path.join('coverage', os.path.basename(f)+'.html'), 'wb')
+            colorize.colorize_file(f, outstream=fo, not_covered=mf)
+            fo.close()
+            coverage.report(module)
+            #print coverage.analysis(module)
+
+    # XXX exit status is wrong -- does not take account of doctests
+    sys.exit(not result.wasSuccessful())


Property changes on: mechanize/tags/0.1.10/test.py
___________________________________________________________________
Added: svn:executable
   + 


