[ZODB-Dev] Newbie ZODB Questions

Jeremy Hylton jeremy@alum.mit.edu
Mon, 17 Sep 2001 23:52:14 -0400 (EDT)


>>>>> "KH" == Kent Hoxsey <khoxsey@caspiannetworks.com> writes:

  KH> I'm an Oracle DBA as well, and have spent some time getting used
  KH> to the Zope mindset and the way the ZODB operates. There are
  KH> several features of the FileStorage that make it operate in a
  KH> way that is quite similar to an Oracle database in ARCHIVELOG
  KH> mode.

Kent,

Thanks for answering these questions.  It's really helpful to me to
see these answers from someone who knows both systems.  Would you and
Jennifer mind if I collected the questions and answers in the ZODB
Wiki? 

  >> (3) I did some reading and saw that the Data.fs is a file where
  >> transactions are appended.  Where does the rest of the object
  >> data exist and is that data in a platform independent format?  If
  >> not, is there a way to get the data out in platform-independent
  >> way?

  [...]

  KH> In contrast, in Zope when an object is changed within a
  KH> transaction and the transaction committed, the entire object is
  KH> appended to the Data.fs file. I believe a "rollback"
  KH> (transaction.abort) doesn't actually touch the Data.fs file,
  KH> because the transaction was not committed, hence no objects were
  KH> written. (ZODB-Gurus please help out).

That's right.  Oracle uses pessimistic concurrency control; when you
modify an object, you acquire a lock on the object and then update the
value.  When all the objects are updated, you release the locks.  You
need to roll back because you update data as you go.  ZODB uses
optimistic concurrency control with backwards validation.  In the case
of FileStorage, it writes all the data to the end of the Data.fs.
Once the data is written, it updates the index to point to the new
locations and marks the transaction as committed.  If there is a
failure before the transaction is marked as committed, the old data is
still considered "current" and there is no need to roll back changes.
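
As a rough illustration of that commit sequence, here is a toy model in
Python.  It is only a sketch: the class name, the record tuples, and the
fail_before_marker flag are made up for the example and do not reflect
FileStorage's real on-disk format or API.

```python
class ToyAppendOnlyStorage:
    """Toy model of an append-only store (hypothetical, not FileStorage)."""

    def __init__(self):
        self.log = []    # stands in for the append-only Data.fs
        self.index = {}  # oid -> position of the current record in the log

    def commit(self, changes, fail_before_marker=False):
        # Step 1: append every new object state to the end of the log.
        pending = {}
        for oid, state in changes.items():
            self.log.append(("data", oid, state))
            pending[oid] = len(self.log) - 1
        if fail_before_marker:
            # Simulated crash: the index still points at the old records.
            raise RuntimeError("simulated failure before commit")
        # Step 2: mark the transaction committed and repoint the index.
        self.log.append(("committed",))
        self.index.update(pending)

    def load(self, oid):
        # Only records reachable through the index count as "current".
        return self.log[self.index[oid]][2]


storage = ToyAppendOnlyStorage()
storage.commit({"a": "first"})
try:
    storage.commit({"a": "lost"}, fail_before_marker=True)
except RuntimeError:
    pass
value_after_failure = storage.load("a")  # old data is still current
storage.commit({"a": "second"})
value_after_commit = storage.load("a")
```

The failed transaction leaves a stray record in the log, but nothing
points at it, so there is nothing to undo.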

  KH> The object data itself is stored in "pickles", which are
  KH> platform-independent as far as I know.

Right.  ZODB uses Python's cPickle module, which uses a
platform-independent format.
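
A small round trip shows the idea.  This sketch uses the standard pickle
module (in Python 3 the C implementation that used to be cPickle is used
by pickle automatically); the state dictionary is an invented example,
not an actual ZODB record.

```python
import pickle

# Invented object state; a persistent object's state is often a plain dict.
state = {"owner": "kent", "balance": 100, "tags": ["dba", "zope"]}

# The result is a byte string that does not depend on the machine's
# word size or byte order, so it can be loaded on another platform.
data = pickle.dumps(state, protocol=2)
restored = pickle.loads(data)
```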

  >> (4) Is there any way to perform adhoc queries on ZODB?

  KH> Not really in the sense you would with Oracle. However, once you
  KH> get the hang of referencing objects from the python command line
  KH> in the ZODB, you can do the same sorts of things.

Right.  There is no query language for ZODB, but you can write a
little Python script to perform an arbitrary query.
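
For example, a query is just a loop with a filter.  In the sketch below a
plain dict stands in for the root mapping you would get from an open
database connection, and the Employee class and query helper are
hypothetical names invented for the illustration.

```python
class Employee:
    """Hypothetical application object; not part of ZODB."""

    def __init__(self, name, dept, salary):
        self.name = name
        self.dept = dept
        self.salary = salary


def query(container, predicate):
    """Yield (key, obj) for every object in the mapping that matches."""
    for key, obj in container.items():
        if predicate(obj):
            yield key, obj


# Stand-in for the database root; a real script would open the database
# and use the root mapping of the connection instead.
root = {
    "e1": Employee("Ann", "ops", 50000),
    "e2": Employee("Bob", "dev", 65000),
    "e3": Employee("Cy", "dev", 48000),
}

# Roughly: SELECT name FROM employees WHERE dept = 'dev' AND salary > 50000
matches = [obj.name for key, obj in
           query(root, lambda e: e.dept == "dev" and e.salary > 50000)]
```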

  >> (6) Is RAID the only way to implement redundancy for ZODB?

  KH> If by "redundancy" you mean for the data only, then I would
  KH> agree that currently the options available are raid-like,
  KH> including clustered storage like a NetApp. One thing to remember
  KH> is that the other redundancy available in Zope is the
  KH> distribution and load-sharing model of ZEO clients. Since each
  KH> ZEO client maintains its own object cache, losing your data
  KH> server does not mean your service must be off the air.

We have a new project underway called MasterSlaveReplication.  See
http://www.zope.org/Wikis/ZODB/MasterSlaveReplication.  At a high
level, it will work like the Data Guard facility in Oracle 9i: You can
keep one or more standby storages that are always up-to-date, but
aren't used unless you need to take the main storage down or there is
a failure.

We don't yet have a delivery schedule for this product, but it is my
top priority at the moment.  I'd guess we'll have an alpha in weeks
rather than months.  

In the interim, you can use Toby Dickenson's ReplicatedFile package,
which creates standby FileStorages.  (Our strategy for standby
databases will support any combination of storages.)

  >> (8) Where do the application usernames/passwords physically exist
  >> in Zope?

  KH> They are stored as attributes of objects in the ZODB. I
  KH> recommend the EncryptedUserFolders patch. We've been using it in
  KH> concert with the CMF for a while now, with no noticeable
  KH> problems.

The user and password data is normally stored in the Data.fs along
with everything else.  Shane Hathaway recently implemented a password
encryption mechanism, which works like LDAP's SSHA mechanism.  I
expect it will be available in the next version of Zope.
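
For reference, the LDAP SSHA scheme stores the base64 encoding of
SHA-1(password + salt) followed by the salt, with an "{SSHA}" prefix.
The sketch below shows that scheme only; the function names are made up
here and this is not the code that will ship in Zope.

```python
import base64
import hashlib
import hmac
import os

PREFIX = b"{SSHA}"


def ssha_encode(password, salt=None):
    """Return b'{SSHA}' + base64(sha1(password + salt) + salt)."""
    if salt is None:
        salt = os.urandom(4)  # random salt; the length is a free choice
    digest = hashlib.sha1(password + salt).digest()
    return PREFIX + base64.b64encode(digest + salt)


def ssha_verify(password, encoded):
    """Recompute the digest with the stored salt and compare."""
    if not encoded.startswith(PREFIX):
        return False
    raw = base64.b64decode(encoded[len(PREFIX):])
    digest, salt = raw[:20], raw[20:]  # a SHA-1 digest is 20 bytes
    candidate = hashlib.sha1(password + salt).digest()
    return hmac.compare_digest(candidate, digest)


stored = ssha_encode(b"secret")
```

Because the salt is stored next to the digest, the same password encodes
differently for different users, yet can still be verified.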

  >> (11) If I am "reading" an object, do I continue to get a
  >> read-consistent view of it, even if someone else performs a write
  >> to it, while I am reading it?

  KH> There has been some discussion about reporting
  KH> ReadConflictErrors, and there are proposals for transaction
  KH> isolation levels for the ZODB.  However, I am not sure how much
  KH> of either is currently available.

Neither is currently available.  If transaction 1 (T1) reads an object
and T2 modifies the object before T1 commits, T1 will raise a
ConflictError caused by its read.  We expect to implement
multi-version concurrency control, which would allow T1 to read
consistent but possibly stale data.  Not sure about the schedule.
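
The T1/T2 case can be sketched with a toy model of backward validation.
Everything here is invented for the example (ToyDB, ToyTransaction,
ToyConflictError), and the check is done at commit time for simplicity,
which is an assumption and may not match when ZODB actually raises the
error.

```python
class ToyConflictError(Exception):
    """Hypothetical stand-in for a conflict error; not ZODB's class."""


class ToyDB:
    def __init__(self):
        self.data = {}   # oid -> (serial, value)
        self.serial = 0  # serial of the last committed transaction


class ToyTransaction:
    def __init__(self, db):
        self.db = db
        self.read_serials = {}  # oid -> serial seen when first read
        self.writes = {}        # oid -> new value, buffered until commit

    def read(self, oid):
        if oid in self.writes:
            return self.writes[oid]
        serial, value = self.db.data[oid]
        self.read_serials.setdefault(oid, serial)
        return value

    def write(self, oid, value):
        self.writes[oid] = value

    def commit(self):
        # Validate: every object we read must be unchanged since we read it.
        for oid, seen in self.read_serials.items():
            if self.db.data[oid][0] != seen:
                raise ToyConflictError(oid)
        # Validation passed: publish the buffered writes under a new serial.
        self.db.serial += 1
        for oid, value in self.writes.items():
            self.db.data[oid] = (self.db.serial, value)


db = ToyDB()
db.data["x"] = (0, "old")
t1 = ToyTransaction(db)
t2 = ToyTransaction(db)
t1.read("x")          # T1 reads the object
t2.write("x", "new")
t2.commit()           # T2 modifies it before T1 commits
try:
    t1.commit()
    conflicted = False
except ToyConflictError:
    conflicted = True
```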

Jeremy