[ZODB-Dev] Getting all OIDs from a storage.

Jim Fulton jim at zope.com
Mon May 1 14:14:35 EDT 2006


Tim Peters wrote:
> [Christian Theune]
> 
>>>Hmm. Sorry, but could you point out where the API is defined? I might
>>>not have looked hard enough. I only found internals to exploit. :(
> 
> 
> [Jim Fulton]
> 
>>I wish I could.  I'm almost certain that Chris McDonough implemented
>>one at PyCon 2005 and that Stephan Richter made use of this, but
>>I can't find it.
> 
> 
> He did, and it's described in NEWS.txt for ZODB 3.4a1:
> 
> """
> - Added a record iteration protocol to FileStorage.  You can use the
>   record iterator to iterate over all current revisions of data
>   pickles in the storage.
> 
>   In order to support calling via ZEO, we don't implement this as an
>   actual iterator.  An example of using the record iterator protocol
>   is as follows::
> 
>       storage = FileStorage('anexisting.fs')
>       next_oid = None
>       while True:
>           oid, tid, data, next_oid = storage.record_iternext(next_oid)
>           # do something with oid, tid and data
>           if next_oid is None:
>               break
> 
>   The behavior of the iteration protocol is now to iterate over all
>   current records in the database in ascending oid order, although
>   this is not a promise to do so in the future.
> """
> 
> I don't believe it was implemented for ZEO, or for anything else other
> than FileStorage.

Ah. Thanks for the reminder. :)

Obviously, this wasn't meant to be an end-user API. :)

The intent was that there'd be a database API that would use this.
The database API *would* return a Python iterator.
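
A minimal sketch of how such a wrapper might look, assuming only the
record_iternext protocol quoted from NEWS.txt above.  The names
iter_current_records and FakeStorage are hypothetical, not an existing
ZODB API; FakeStorage is an in-memory stand-in so the example runs
without a real FileStorage, and the empty-storage case is not handled:

```python
def iter_current_records(storage):
    """Yield (oid, tid, data) for each current record in `storage`.

    Hypothetical helper: it only assumes that `storage` provides
    record_iternext(next_oid) -> (oid, tid, data, next_oid).
    """
    next_oid = None
    while True:
        oid, tid, data, next_oid = storage.record_iternext(next_oid)
        yield oid, tid, data
        if next_oid is None:
            break


class FakeStorage:
    """Hypothetical in-memory stand-in that follows the same protocol."""

    def __init__(self, records):
        # records maps oid -> (tid, data)
        self._records = dict(records)
        self._oids = sorted(self._records)

    def record_iternext(self, next_oid=None):
        # None means "start from the first oid".
        position = 0 if next_oid is None else self._oids.index(next_oid)
        oid = self._oids[position]
        tid, data = self._records[oid]
        has_more = position + 1 < len(self._oids)
        following = self._oids[position + 1] if has_more else None
        return oid, tid, data, following


storage = FakeStorage({b"\x01": (b"t1", b"a"), b"\x02": (b"t2", b"b")})
oids = [oid for oid, tid, data in iter_current_records(storage)]
```

Because the wrapper is a generator, a caller gets an ordinary Python
iterator while the storage itself keeps the call-by-call protocol that
could, in principle, be passed through ZEO.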

I'm surprised I couldn't find a proposal for this.  I
think I'll work on one.

> [Dieter Maurer]
> 
>>Are you aware that such an API would pose interesting
>>concurrency issues?

...

I think the quality of service for this should be modest.

I suggest that:

- The storage should be required to return the OIDs that were in the
   database at approximately the time the call was made.  It should
   be acceptable to omit recent items.  The idea is that OIDs
   generated while the request is being satisfied might be
   omitted.

- It should be acceptable to return OIDs that have been deleted.

It should be noted that this API is aimed at the use case
of data conversion.  As a result, it doesn't matter if new data are
written while the iteration is taking place because new data
should not need to be converted.
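
To illustrate why these modest guarantees are enough for conversion,
here is a rough sketch of a conversion loop that tolerates OIDs that
were deleted after they were listed.  Everything here is hypothetical:
convert_snapshot, load, store and convert are illustrative names, and a
plain dict stands in for the database, with KeyError playing the role
of "this object no longer exists":

```python
def convert_snapshot(oids, load, store, convert):
    """Convert every record named in `oids`, skipping vanished ones.

    `load(oid)` is assumed to raise KeyError for a deleted oid;
    `store(oid, data)` writes the converted data back.
    Returns a (converted, skipped) pair of counts.
    """
    converted = 0
    skipped = 0
    for oid in oids:
        try:
            data = load(oid)
        except KeyError:
            # The oid was reported but has since been deleted.
            skipped += 1
            continue
        store(oid, convert(data))
        converted += 1
    return converted, skipped


db = {1: "old-a", 2: "old-b"}
snapshot = [1, 2, 3]  # oid 3 was deleted after the listing was taken
result = convert_snapshot(
    snapshot,
    db.__getitem__,
    db.__setitem__,
    lambda data: data.replace("old", "new"),
)
```

OIDs created after the snapshot never appear in the loop at all, which
is fine under the assumption above that new data are already written in
the new format.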

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

