[ZODB-Dev] RFC: Blobs in S3

Jim Fulton jim at zope.com
Thu Jul 7 10:06:19 EDT 2011


Gaaaa, I sent this before I was done.  Had some sort of Gmail misfire;
I thought the email was lost. :/

On Wed, Jul 6, 2011 at 2:44 PM, Jim Fulton <jim at zope.com> wrote:
> We're evaluating AWS for some of our applications and I'm thinking of adding
> some options to support using S3 to store Blobs:
>
> 1. Allow a storage in a ZEO storage server to store Blobs in S3.
>    This would probably be through some sort of abstraction to make
>    this not actually depend on S3.  It would likely leverage the fact that
>    a storage server's interaction with blobs is more limited than application
>    code.
>
> 2. Extend blob objects to provide an optional URL to fetch data
>    from. This would allow applications to provide S3 (or similar service)
>    URLs for blobs, rather than serving blob data themselves.
>
>
>    2.1 If I did this I think I'd also add a blob size property, so you could
>          get a blob's size without opening the blob file or downloading
>          it from a database server.
>
> Option 3.  Handle blob URLs at the application level.
>
>   To make this work for the S3 case, I think we'd have to use a
>   ZEO server connection to be called by application code.  Something like:
>
>       self.blob = ZODB.blob.Blob()
>       f = self.blob.open('w')
>       f.write(some_data)
>
>
> Option 1 is fairly straightforward, and low risk.
>
> Option 2 is much trickier:
>
> - It's an API change
> - There are bits of implementation that depend on the
>   current blob record format.  I'm not sure if these
>   bits extend beyond the ZODB code base.
> - The handling of blob object state would be a little
>   delicate, since some of the state would be set on the storage
>   server.
> -  The win depends on being able to load a blob
>    file independently of loading blob objects, although
>    the ZEO blob cache implementation already depends
>    on this.
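
A minimal sketch of the kind of abstraction option 1 hints at, assuming
the storage server only needs to save, load and drop blob data per
object revision.  The names BlobStore and MemoryBlobStore and their
methods are hypothetical, not existing ZODB or ZEO API; an S3-backed
class would implement the same three methods.

```python
import abc


class BlobStore(abc.ABC):
    """Hypothetical interface: the few blob operations a storage server needs."""

    @abc.abstractmethod
    def store(self, oid: bytes, tid: bytes, data: bytes) -> None:
        """Save the blob data for one object revision."""

    @abc.abstractmethod
    def load(self, oid: bytes, tid: bytes) -> bytes:
        """Return the blob data for one object revision."""

    @abc.abstractmethod
    def delete(self, oid: bytes, tid: bytes) -> None:
        """Drop one revision, for example when packing."""


class MemoryBlobStore(BlobStore):
    """In-memory stand-in, used here only to exercise the interface."""

    def __init__(self):
        self._data = {}  # maps (oid, tid) to blob bytes

    def store(self, oid, tid, data):
        self._data[(oid, tid)] = data

    def load(self, oid, tid):
        return self._data[(oid, tid)]

    def delete(self, oid, tid):
        self._data.pop((oid, tid), None)


store = MemoryBlobStore()
oid, tid = b"\x00" * 7 + b"\x01", b"\x00" * 7 + b"\x02"  # made-up 8-byte ids
store.store(oid, tid, b"some data")
```

Keeping the interface this small is what would let the server swap the
local blob directory for S3 (or anything else) without application code
noticing.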

Before I accidentally sent this, I was going to mention a third option
involving ZEO extension methods. The idea is that you'd do
something like:

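  self.blob = ZODB.blob.Blob()
  f = self.blob.open('w')   # open() defaults to read mode
  f.write(some_data)
  f.close()
  transaction.commit()      # the blob data reaches the storage here
  # get_blob_url would be the new ZEO extension method
  self.url = self._p_jar.db().storage.get_blob_url(self.blob._p_oid)
  transaction.commit()      # save the URL on the object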

This is less risky, from an API point of view, but is messy in a
number of ways.
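
One source of the mess is probably the double commit.  A small
stand-alone sketch, assuming the storage can only hand out a URL once
the blob data has been committed (my reading of the snippet, not
something the proposal states).  FakeBlobStorage, commit_blob and the
example.com base URL are made up for illustration; only the name
get_blob_url comes from the snippet.

```python
class FakeBlobStorage:
    """Stand-in for a storage with a hypothetical get_blob_url method."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.committed = set()  # oids whose blob data has been committed

    def commit_blob(self, oid):
        # Plays the role of the first transaction.commit().
        self.committed.add(oid)

    def get_blob_url(self, oid):
        # Assumption: no URL exists until the blob has been committed.
        if oid not in self.committed:
            raise KeyError("blob not committed yet: %r" % (oid,))
        return "%s/%s" % (self.base_url, oid.hex())


storage = FakeBlobStorage("https://blobs.example.com")
oid = b"\x00" * 7 + b"\x2a"  # made-up 8-byte object id

try:
    storage.get_blob_url(oid)
    url_before_commit = True
except KeyError:
    url_before_commit = False  # asking too early fails

storage.commit_blob(oid)
url = storage.get_blob_url(oid)  # now the storage can answer
```

Under that assumption the application needs a second transaction just to
record the URL, and there is a window where the blob exists but the
object has no URL yet.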

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton

