[ZODB-Dev] RFC: Blobs in S3

Jim Fulton jim at zope.com
Wed Jul 6 14:44:11 EDT 2011


We're evaluating AWS for some of our applications and I'm thinking of adding
some options to support using S3 to store Blobs:

1. Allow a storage in a ZEO storage server to store Blobs in S3.
    This would probably go through some sort of abstraction so that it
    doesn't actually depend on S3.  It would likely leverage the fact that
    a storage server's interaction with blobs is more limited than
    application code's.

2. Extend blob objects to provide an optional URL to fetch data
    from. This would allow applications to provide S3 (or similar service)
    URLs for blobs, rather than serving blob data themselves.


    2.1 If I did this, I think I'd also add a blob size property, so you
        could get a blob's size without opening the blob file or
        downloading it from a database server (see the sketch below).
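
    For illustration, a rough sketch of how 2 and 2.1 might look from
    application code.  The url and size attributes are hypothetical, as
    is the response object; none of this is an existing API:

        def serve_image(self, response):
            blob = self.image_blob
            if blob.url is not None:
                # the storage put the blob data in S3 (or similar);
                # redirect the client there rather than streaming the
                # data ourselves
                response.redirect(blob.url)
            else:
                # fall back to serving the blob file directly
                response.setHeader('Content-Length', str(blob.size))
                response.write(blob.open('r').read())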

3. Handle blob URLs at the application level.

    To make this work for the S3 case, I think we'd have to add a ZEO
    server call for application code to use.  Something like:

        # create a blob and fill it with data as usual
        self.blob = ZODB.blob.Blob()
        f = self.blob.open('w')
        f.write(some_data)
        f.close()
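
    For the S3 upload itself, a rough sketch using boto; the bucket
    name, key naming, and expiry are made up for illustration:

        import boto

        def copy_blob_file_to_s3(filename, key_name):
            # upload a blob's data file to S3 and return a URL that
            # clients could fetch it from
            bucket = boto.connect_s3().get_bucket('my-blob-bucket')
            key = bucket.new_key(key_name)
            key.set_contents_from_filename(filename)
            return key.generate_url(expires_in=3600)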


Option 1 is fairly straightforward, and low risk.
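
A rough sketch of the kind of abstraction option 1 might use on the
storage server side; the class and method names are hypothetical, not
an existing ZODB or ZEO interface:

    import binascii
    import boto

    class S3BlobBackend:
        """Where the storage server puts committed blob files, instead
        of (or in addition to) its blob directory on disk.
        """

        def __init__(self, bucket_name):
            self.bucket = boto.connect_s3().get_bucket(bucket_name)

        def _name(self, oid, tid):
            return '%s/%s' % (binascii.hexlify(oid), binascii.hexlify(tid))

        def store(self, oid, tid, filename):
            # called by the server after a transaction with blob data commits
            key = self.bucket.new_key(self._name(oid, tid))
            key.set_contents_from_filename(filename)

        def url(self, oid, tid):
            # what option 2 (or a client) could use to fetch the data
            return self.bucket.get_key(self._name(oid, tid)).generate_url(
                expires_in=3600)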

Option 2 is much trickier:

- It's an API change.
- There are bits of implementation that depend on the
  current blob record format.  I'm not sure if these
  bits extend beyond the ZODB code base.
- The handling of blob object state would be a little
  delicate, since some of the state would be set on the storage
  server.
- The win depends on being able to load a blob
  file independently of loading blob objects (sketched below),
  although the ZEO blob cache implementation already depends
  on this.
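
For example, once a client has a blob's URL (however it gets it), it
could fill its blob cache straight from S3 without asking the ZEO
server for the file.  A rough sketch, with error handling and cache
locking elided:

    import shutil
    import urllib2

    def fetch_blob_file(url, cache_filename):
        # download the blob data directly from S3 into the local
        # blob cache file, bypassing the ZEO server
        src = urllib2.urlopen(url)
        try:
            with open(cache_filename, 'wb') as dst:
                shutil.copyfileobj(src, dst)
        finally:
            src.close()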



-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton

