[Zope3-dev] RFC: Make HTTP streaming of large data simpler

Mon Dec 5 11:13:31 EST 2005

Philipp von Weitershausen wrote:
> Hi there,
> 
> while pondering on http://www.zope.org/Collectors/Zope3-dev/480, I came
> across an overcomplication of the Zope 3 publishing API:
> 
> 
> Status quo
> ----------
> 
> In order to stream large data over an HTTP connection, a view method
> may, instead of simply returning data, set an IResult object on the
> response. This would look like this (assuming that the view's context
> looks like a file handle)::
> 
>   from zope.interface import implements
>   from zope.publisher.browser import BrowserView
>   from zope.publisher.interfaces.http import IResult
> 
>   class StreamResult(object):
>       implements(IResult)
> 
>       def __init__(self, context):
>           self.context = context
> 
>       headers = ()
> 
>       @property
>       def body(self):
>           chunk = self.context.read(CHUNKSIZE)
>           while chunk:
>               yield chunk
>               chunk = self.context.read(CHUNKSIZE)
> 
>   class StreamView(BrowserView):
> 
>       def __call__(self):
>           return StreamResult(self.context)
> 
> The publisher will call the view, obtain a result object and call
> HTTPResponse.setResult() with it. This method expects a) a string, b) an
> object providing IResult (like in this case) or c) an object that is
> adaptable to IResult. In case of an IResult object or adapter, the
> 'body' attribute is supposed to be an iterable, in the above example
> it's a generator which would probably be a typical iterator for this case.
> 
> 
> Proposal
> --------
> 
> This is too complicated; the indirection through an extra object seems
> unnecessary, especially because the iterable is not the object itself
> but the 'body' attribute. I'm not even sure if the whole IResult thing
> is necessary at all and whether somebody actually has a use case for it.

First, I should apologize for not creating a proposal for this
new API.  It was created in a sprint and we took a short cut that we
shouldn't have.

There are a number of reasons we needed IResult:

- We want to be able to adapt existing output, especially
   string output, and we needed an interface to adapt to.

- An adapter may need to affect output headers, so IResult
   needed to provide header data.

- We needed iterable data for WSGI.
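
The three reasons above can be illustrated with a minimal sketch in plain
modern Python. It does not use the real zope.publisher machinery:
StringResult and adapt_to_result are hypothetical names, the text/plain
content type is an assumption, and only the 'headers' and 'body'
attributes mirror the IResult shape shown in the quoted example.

```python
class StringResult:
    """Hypothetical stand-in for an IResult adapter around string output."""

    def __init__(self, text, charset="utf-8"):
        data = text.encode(charset)
        # The adapter can contribute output headers of its own.
        self.headers = (
            ("Content-Type", "text/plain;charset=%s" % charset),
            ("Content-Length", str(len(data))),
        )
        # WSGI wants an iterable of byte strings, not one bare string.
        self.body = [data]


def adapt_to_result(output):
    """Return output unchanged if it already looks like a result, else wrap it."""
    if hasattr(output, "headers") and hasattr(output, "body"):
        return output
    if isinstance(output, str):
        return StringResult(output)
    raise TypeError("cannot adapt %r to a result" % (output,))


result = adapt_to_result("hello")
```

Because the publisher side only ever sees the adapted object, a different
adapter could be swapped in without touching the view.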

There are two interesting use cases that would drive
applications to pay attention to IResult:

A. Returning large amounts of data

B. Dribbling data from the application, for example
    to provide progress on a long-running application.

For A, you want to compute that data and then leave
application code.  You don't want to stay in the
application, holding application resources, like database
connections, while the data is being consumed.  In this case,
you generally want to create a temporary file and return that
as the IResult body.  This means that implementations like
the one you give:

> I would really like to be able to simply write:
> 
>   class StreamView(BrowserView):
> 
>       def __call__(self):
>           chunk = self.context.read(CHUNKSIZE)
>           while chunk:
>               yield chunk
>               chunk = self.context.read(CHUNKSIZE)

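won't work: the generator keeps reading from the context, and so
keeps holding application resources, while the data is being consumed.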

BTW, your implementation also doesn't work because it doesn't
set the content length.
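
For use case A, the temporary-file approach might look roughly like the
following sketch. It is plain modern Python and does not touch the real
publisher: TempFileResult, spool_to_result and the CHUNKSIZE value are
hypothetical, and only the 'headers' and 'body' attributes follow the
IResult shape from the quoted example.

```python
import io
import tempfile

CHUNKSIZE = 64 * 1024  # illustrative value, not taken from Zope


class TempFileResult:
    """Hypothetical result object: headers plus an iterable body."""

    def __init__(self, tmp, length):
        self._tmp = tmp
        self.headers = (("Content-Length", str(length)),)

    @property
    def body(self):
        # Reads only from the temporary file, so no application
        # resources are needed while the client consumes the data.
        try:
            chunk = self._tmp.read(CHUNKSIZE)
            while chunk:
                yield chunk
                chunk = self._tmp.read(CHUNKSIZE)
        finally:
            self._tmp.close()


def spool_to_result(source):
    """Copy a file-like source into a temporary file inside the request."""
    tmp = tempfile.TemporaryFile()
    length = 0
    chunk = source.read(CHUNKSIZE)
    while chunk:
        tmp.write(chunk)
        length += len(chunk)
        chunk = source.read(CHUNKSIZE)
    tmp.seek(0)
    return TempFileResult(tmp, length)


result = spool_to_result(io.BytesIO(b"x" * 100000))
```

The copy happens while application code is still running; the content
length is known by the time the result is handed back, and the body can
be consumed after the application has been left.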

Unfortunately, we still aren't addressing use case B above.
Some more API enhancements will be needed to address that.
There will need to be some way to signal that the publisher
should not release application resources (not call
publication.endRequest and request.close) until after the data
has been streamed.  In any case, this needs more thought and
a proposal before we attack this.

We'll need a way to inspect the output to determine which
strategy is being used.  An interface seems to be a good
way to do this.

I think that both of these use cases are advanced and
should be handled explicitly.

Yet another use case was to make pluggable the traditional
implicit determination of output content type and text
encoding.  The adaptation to IResult allows this to be
customized.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org