[Zope3-dev] RFC: Make HTTP streaming of large data simpler
Philipp von Weitershausen
philipp at weitershausen.de
Mon Dec 5 22:31:49 EST 2005
Jim Fulton wrote:
> There are a number of reasons we needed IResult:
>
> - We want to be able to adapt existing output, especially
> string output and we needed an interface to adapt to.
I see. I presume this is for the reason you state below, namely to be able to customize
the setting of the Content Type, etc. I guess that's a valid use case, but it seems like a
separate issue from streaming or trickling data. Maybe we should make them separate
interfaces: The IOutputHeaders adapter would be responsible for figuring out output
headers on a result object (incl. a string), the IBodyIterator adapter would be
responsible for streaming/trickling.
> - An adapter may need to affect output headers, so IResult
> needed to provide header data.
Well, the view, before falling into iteration, could set response headers itself.
> - We needed iterable data for WSGI.
I don't understand how my example of a generator view fails there. It *does* provide
iterable data to the publishing framework. In fact, the generator itself is the iterable
data.
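A minimal sketch of that point in plain WSGI terms, outside of Zope's publisher. The names trickle_body, application and run_once are made up for illustration; the only thing assumed from WSGI is that an application may return any iterable of byte strings, which a generator satisfies:

```python
def trickle_body():
    """Hypothetical generator standing in for a view's output."""
    yield b"first chunk\n"
    yield b"second chunk\n"


def application(environ, start_response):
    # Headers are set before iteration starts; the generator object
    # itself is then handed back as the iterable response body.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return trickle_body()


def run_once(app):
    """Drive the app roughly the way a server would and collect the result."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = list(app({}, start_response))
    return captured["status"], captured["headers"], chunks
```

Calling run_once(application) consumes the generator chunk by chunk, which is all a server needs from the body.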
> There are two interesting use cases that would drive
> applications to pay attention to IResult:
>
> A. Returning large amounts of data
>
> B. Dribbling data from the application, for example
> to provide progress on a long-running application.
>
> For A, you want to compute that data and then leave
> application code. You don't want to stay in the
> application, holding application resources, like database
> connections, while the data is being consumed. In this case,
> you generally want to create a temporary file and return that
> as the IResult body.
Ah, yes, good point. So, while IResult seems to be needed for the decoupling of
application space and server space, I still think the interface itself is too
complicated. Instead of requiring this 'body' attribute which is iterable, IResult itself
should be iterable. I propose to change it to:

    class IResult(Interface):
        ...
        headers = Attribute('A sequence of tuples of result headers, such as '
                            '"Content-Type" and "Content-Length", etc.')

        def __iter__(self):
            """Provide the body data of the response"""

Or, if we adopt my suggestion of separating headers from body iterators, we'd have two
interfaces:

    class IOutputHeaders(IReadMapping):
        """Provide headers for the response output"""

    class IBodyIterator(Interface):
        """Provide the response body in an iterable manner"""

        def __iter__(self):
            """Provide the body data of the response"""

Implementations of IBodyIterator that would create temporary files like you suggest could
then easily implement __iter__ by returning iter(file_handle_of_the_tempfile).
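A rough sketch of such a temp-file-backed implementation, using only the standard library. TempFileBody and its content_length attribute are hypothetical names, and the interface declaration is left out so the example runs without Zope:

```python
import tempfile


class TempFileBody:
    """Hypothetical IBodyIterator-style object backed by a temporary file."""

    def __init__(self, chunks):
        # Spool everything to disk while still in application space, so
        # application resources can be released before the data is consumed.
        self._file = tempfile.TemporaryFile()
        for chunk in chunks:
            self._file.write(chunk)
        # The size is known once spooling is done (assumption: the caller
        # may want it for a Content-Length header).
        self.content_length = self._file.tell()
        self._file.seek(0)

    def __iter__(self):
        # Delegate to the file object's own iterator.
        return iter(self._file)
```

Note that iterating a file object yields lines, which is fine for text-like output; for arbitrary binary data a real implementation would more likely read fixed-size blocks.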
> BTW, your implementation also doesn't work because it doesn't
> set the content length.
I don't think setting content length is mandatory. It's definitely nice, though,
especially for the usability of the app.
> Unfortunately, we still aren't addressing use case B above.
> Some more API enhancements will be needed to address that.
> There will need to be some way to signal that the publisher
> should not release application resources (not call
> publication.endRequest and request.close) until after the data
> has been streamed. In any case, this needs more thought and
> a proposal before we attack this.
Indeed. Also, I still haven't given up my implementation for case B, but of course I'm not
attached to it. My goal is to have a *simple* way of writing views that A) stream large
data (I guess the indirection of a temporary file masked by IResult/IBodyIterator is
needed here) and B) trickle data to the client.
I would presume for B) we could still also use IResult/IBodyIterator by writing something
like this (assuming my suggestion of making IResult objects iterable from above):

    class StreamingView(BrowserView):
        implements(IBodyIterator)

        def __iter__(self):
            return self

        def next(self):
            data = self.context.getMoreDataToTrickle()
            if not data:
                raise StopIteration
            return data

This would be sufficiently simple, I think, but a simple generator like my original
suggestion or the one below would still be more straightforward.
> We'll need a way to inspect the output to determine which
> strategy is being used. An interface seems to be a good
> way to do this.
Yes.
> I think that either of these use cases is advanced and
> should be handled explicitly.
Sure. So what would be wrong with:

    class TrickleView(BrowserView):
        # tell the publisher that we'll be trickling data to the client slowly
        # so that all resources stay available for that period of time
        implements(IWantToStayInApplicationSpacePlease)

        def __call__(self):
            yield self.context.getDataToTrickle()
            yield self.context.getMoreDataToTrickle()
            yield self.context.getEvenMoreDataToTrickle()

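One hypothetical way a publisher could honor such a marker is to wrap the view's iterable so that cleanup happens only once the last chunk has been consumed. This is just a sketch of the idea, not how the Zope publisher works; close_after is an invented helper, and cleanup stands in for whatever would call publication.endRequest and request.close:

```python
def close_after(body, cleanup):
    """Yield every chunk from body, then run cleanup exactly once."""
    try:
        for chunk in body:
            yield chunk
    finally:
        # Runs when the body is exhausted, and also if the consumer
        # closes the generator early or an error occurs mid-stream.
        cleanup()
```

Because this is a generator, nothing runs until the server starts iterating, so application resources stay available for exactly as long as data is being trickled.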
> Yet another use case was to make pluggable the traditional
> implicit determination of output content type and text
> encoding. The adaptation to IResult allows this to be
> customized.
Like I said, this seems like a separate issue from streaming/trickling data to the client.
Philipp