[Zope-dev] Experiments with ORMapping

Shane Hathaway shane@digicool.com
Mon, 14 May 2001 10:28:04 -0400


"Phillip J. Eby" wrote:
> 
> At 05:42 PM 5/11/01 -0400, Shane Hathaway wrote:
> >"Phillip J. Eby" wrote:
> >> I'm not quite clear on how exactly you suggest mapping from RDBMS ->
> >> ZODB.  There's a *significant* (IMHO) impedance mismatch between ZODB's
> >> arbitrarily identified variably structured single records and SQL's
> >> content-identified fixed-structure record sets.  This is what application
> >> frameworks/toolkits (such as your own DBAPI product) are needed for.
> >
> >If you implement this at the Storage level, yes, there is a major
> >mismatch.  But at the Connection level it makes a lot of sense.
> >Connection basically exposes a pile of pickles as objects; an OR mapping
> >exposes a complex schema as objects.
> >
> >I think that understanding will change the rest of your response. :-)
> >
> 
> Nope.  As far as I can see, all that does is leverage ZODB Connections as a
> crude sort of Rack that only provides a mapping-like interface for
> retrieving objects, without helping any of the higher-order needs of an
> application.  If you define O-R mapping as "can you store and
> retrieve an object's properties from a database", then I guess you've
> succeeded at that point.  But if that was all my O-R apps needed, there
> would've been no reason to use the RDBMS; I could've just used ZODB and a
> BTree with less bother.

I'm telling you there's a lot more you can do with the code that makes
up ZODB.  The mapping interface is not the key to the way Connection
does its work.  OIDs don't have to be strings.  If we just use
cPersistence, cPickleCache (which is misnamed), and Transaction to
implement an OR mapping, here's what we get for free:

- A transparent, transactional database.

- Potential interoperability with everything written for ZODB.

- Robust, tested code.

- Optimizations in C.

Since you challenged me :-) I decided to put together a prototype.  What
I came up with was a database connection that sets a different _p_jar
for each persistent object.  The _p_jar is an object that manages the
storage for only one persistent object, making it possible to customize
the storage.  There was only one hurdle in this approach, but I hacked
around it: the tpc_* messages get sent to each _p_jar.  Then I wrote a
very minimal test that doesn't connect to an RDBMS but stores and
retrieves simple objects in a dictionary.
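
To make the shape of this concrete, here is a minimal sketch of one
possible reading of the idea, not the prototype itself.  It uses plain
Python rather than ZODB, and every name in it (DictJar, Record,
Connection, add, get) is hypothetical.  Each object gets its own jar,
the connection forwards the tpc_* messages to every jar, and a
dictionary stands in for the RDBMS.

```python
# Hypothetical sketch: none of these classes are part of ZODB.

class DictJar:
    """Manages storage for exactly one object, backed by a shared dict."""

    def __init__(self, store, oid):
        self.store = store      # stand-in for an RDBMS table
        self.oid = oid
        self.pending = None     # state staged during a transaction

    def load(self, obj):
        # Copy the stored state into the object.
        obj.__dict__.update(self.store[self.oid])

    def commit(self, obj):
        # Stage the object's state, leaving out bookkeeping like _p_jar.
        self.pending = {k: v for k, v in obj.__dict__.items()
                        if not k.startswith('_p_')}

    def tpc_begin(self):
        self.pending = None

    def tpc_finish(self):
        # Only now does the staged state reach the store.
        if self.pending is not None:
            self.store[self.oid] = self.pending
            self.pending = None

    def tpc_abort(self):
        self.pending = None


class Record:
    """Plain object standing in for a persistent class."""
    _p_jar = None


class Connection:
    """Hands out one jar per object and drives the commit for all of them."""

    def __init__(self, store):
        self.store = store
        self.changed = []

    def add(self, obj, oid):
        obj._p_jar = DictJar(self.store, oid)
        self.changed.append(obj)

    def get(self, oid):
        obj = Record()
        obj._p_jar = DictJar(self.store, oid)
        obj._p_jar.load(obj)
        return obj

    def commit(self):
        objs, self.changed = self.changed, []
        # The tpc_* messages go to each object's own jar.
        for obj in objs:
            obj._p_jar.tpc_begin()
        for obj in objs:
            obj._p_jar.commit(obj)
        for obj in objs:
            obj._p_jar.tpc_finish()
```

Because each jar only knows about one object, a different jar class
could be substituted per schema without the application noticing.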

OIDs are still necessary to support multithreaded invalidation
messages.  But the OIDs in the prototype are a tuple consisting of a
schema ID and a record ID.  That way the record IDs only need to be
unique within an RDBMS table (or combination of tables).
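
As a rough illustration of why tuple OIDs are enough for invalidation,
here is a small sketch.  It is an assumption about how this could look,
not code from the prototype; make_oid and SharedInvalidations are
hypothetical names, and plain dicts stand in for per-connection caches.

```python
# Hypothetical sketch: illustrative names only, not ZODB's API.
import threading


def make_oid(schema_id, record_id):
    """An OID is just (schema ID, record ID); tuples are hashable."""
    return (schema_id, record_id)


class SharedInvalidations:
    """Tells every other connection's cache to drop a changed object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._caches = []

    def register(self, cache):
        with self._lock:
            self._caches.append(cache)

    def invalidate(self, oid, source=None):
        # Drop the OID from every cache except the one that wrote it.
        with self._lock:
            for cache in self._caches:
                if cache is not source:
                    cache.pop(oid, None)
```

Two records can share record ID 7 as long as they live in different
schemas, since ('customer', 7) and ('order', 7) are distinct keys.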

I didn't think anything like this was possible before I got to know
Jim.  I still didn't understand when he presented the idea months ago.
But now I see the idea really is workable.  The advantage is that it
lets us completely isolate RDBMS storage details from the application.

The next thing to do is to write a fishbowl proposal.

Shane