[Zope-dev] FTP interface being worked on?

Fred Wilson Horch fhorch@ecoaccess.org
Mon, 19 Mar 2001 10:24:42 -0500


I hadn't thought of the issues you raise.  Thanks for mentioning them.

"John D. Heintz" wrote in part:
> If we
> standardize "properties" to an XML file, then optionally dump other
> files to expose specific aspects of an instance for serialized editing
> it might not be as big a problem as I was thinking.

I think that is the shared vision.  Some aspects of each object could be
serialized into a format that is easy to edit.  For those aspects we
leave it up to the developer of the object to write a serialization
method -- we don't try to guess what an "easy to use" format would look
like.

Other aspects of objects might be impossible to serialize into a
meaningful format.  For those we have a default like XML pickle --
essentially a black box.
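
To make that concrete, here is a rough sketch of the dispatch I have in
mind (the manage_dumpToFS hook name is made up, and a plain pickle stands
in for the XML pickle):

    from cPickle import dumps

    def serialize(obj):
        """Return (format, data) for one object."""
        custom = getattr(obj, 'manage_dumpToFS', None)
        if custom is not None:
            # The object's author chose an easy-to-edit representation.
            return 'custom', custom()
        # No custom method: fall back to an opaque "black box" dump.
        # (A plain pickle here; the real thing would be the XML pickle.)
        return 'blackbox', dumps(obj)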

> I guess I would suggest that the serialized form of a Zope instance by
> default would be a single XML file, but that arbitrary sections of that
> XML file could be custom dumped to separate serialized files with
> similar names.  That way authors would have a pretty easy job of
> overriding sections of the dump process to spit out one or more simple
> files that have little parsing overhead.

Sounds reasonable.
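
Just to sketch what I picture (the names here are made up, not a
convention we have agreed on): a section with a custom dumper gets
written to a similarly named side file, and the main XML file keeps only
a small reference to it.

    import os

    def dump_section(base_path, section_id, inline_xml, custom_dumper=None):
        """Return the XML to keep in the main file for one section."""
        if custom_dumper is None:
            # No custom dump registered: keep the section inline.
            return inline_xml
        # Custom dump: write it to base_path.section_id.dump and leave a
        # reference element behind in the main XML file.
        side_file = '%s.%s.dump' % (base_path, section_id)
        open(side_file, 'w').write(custom_dumper())
        return '<external-section id="%s" href="%s"/>' % (
            section_id, os.path.basename(side_file))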

> >> 2) A lesser problem is when trying to edit the serialized "files".
> >> Because objects are methods and state, how you modify an object can
> >> be guided if not controlled.  When we have serialized the
> >> objects in a Zope system to files, we have exported only the state
> >> of the objects in the ZODB.  We then have to live with the ability
> >> to foul up invariants across many objects by changing some data in
> >> the serialized format.  A good example would be ZCatalogs. [...]
> >
> > Yup... it's probably easiest to make ZCatalogs a black box.
> 
> Black box doesn't solve this problem, only the first one.  Imagine that
> I move a serialized version of a Zope object that is indexed by an
> instance of ZCatalog (or many, for that matter).  When I move it, the
> ZCatalogs must be notified to handle the change, but only at import time
> because ZCatalogs are serialized as binary for lots of good reasons.

I see the problem.  I think the example you give can be handled
adequately at import time.

But I can see other examples where allowing edits to the serialized
representation could create problems that would be impossible to resolve
at import.

So it seems like we might want to make some things read-only.  That is,
when you serialize the objects in the Zope ODB to a filesystem, some of
those serialized files are read-only "black boxes".  A comment in those
files could let a developer know that to change the information in that
file she needs to do an import, or edit the ODB directly.
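
For example, the dump code could prepend a warning like this to every
black-box file (the wording is just illustrative):

    READ_ONLY_BANNER = (
        '<!-- This file is a read-only "black box" dump.\n'
        '     Do not edit it by hand; do an import, or edit the\n'
        '     ODB directly. -->\n'
    )

    def write_blackbox(path, xml_data):
        # Write the warning comment, then the opaque dump itself.
        f = open(path, 'w')
        f.write(READ_ONLY_BANNER)
        f.write(xml_data)
        f.close()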

> When I import the object
> from the serialized format all I can know is that something changed, but
> without expensive processing (XML diffing is hard in the general case,
> we might be able to limit the structures to manageable scope though) we
> can't know that the "foo" ZCatalog should be updated instead of the
> "bar" ZCatalog.

Seems like we will need to consider the import code very carefully.

I don't know enough about how ZCatalog works to discuss the options
intelligently.  But in other indexing systems I have worked with, there
have been solutions for reindexing when making updates to the corpus.
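
For what it's worth, the actual recataloging calls look simple -- the
hard part is knowing which catalog and which old path, as you say.
Roughly (assuming the importer can work out old_path somehow):

    def recatalog_on_import(catalog, obj, old_path, new_path):
        # Drop the stale entry, if we know where the object used to live.
        if old_path is not None:
            catalog.uncatalog_object(old_path)
        # Index the freshly imported state under its new path.
        catalog.catalog_object(obj, new_path)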

> >> a) XML is structured enough that it can reliably hold the
> >> data from the
> >> ZODB.  The current XML dump is not useful for this - it
> >> would need to
> >> create individual files and folders to represent
> >> containment.
> >
> >
> > This is pretty easy right now.  Ten lines of recursive code
> > can walk the whole tree if necessary and export only leaf
> > objects.

Great.  Maybe I am closer than I realize to the CVS management
solution.  I need to look more closely at ZCVSMixin to see what it
does.  But for our immediate need (which is to allow a distributed team
of developers to share code and track changes via a central CVS
repository), maybe it makes the most sense just to segment the existing
XML export into directories and files and enhance the existing import to
allow overwriting objects.
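
Something like the "ten lines of recursive code" you mention, which
would also give me the directory-per-folder layout (export_leaf is a
stand-in for whatever per-object dump we settle on):

    import os

    def export_tree(obj, directory, export_leaf):
        children = getattr(obj, 'objectValues', None)
        if children is None:
            # Leaf object: serialize it into the current directory.
            export_leaf(obj, directory)
            return
        # Container: mirror it as a directory and recurse.
        subdir = os.path.join(directory, obj.getId())
        if not os.path.isdir(subdir):
            os.makedirs(subdir)
        for child in children():
            export_tree(child, subdir, export_leaf)

Overwriting at import could then be a matter of checking for an existing
id and calling manage_delObjects before re-creating the object, though I
have not thought that part through yet.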

> >> b) A hybrid XML and custom dump solution.  An Image for
> >> example could dump out as a binary image file with meta-data in a
> >> similarly named XML file.
> >
> > Yes, each object should make its own policy regarding its
> > body.  Its metadata format should be standardized, however.

I like this idea.
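
A rough sketch of the Image case (the metadata fields shown are just
illustrative, not a standardized schema):

    import os

    def dump_image(image, directory):
        base = os.path.join(directory, image.getId())
        # Body: the raw image bytes, usable with ordinary tools.
        open(base, 'wb').write(str(image.data))
        # Metadata: a similarly named XML side file.
        open(base + '.metadata.xml', 'w').write(
            '<metadata>\n'
            '  <title>%s</title>\n'
            '  <content-type>%s</content-type>\n'
            '</metadata>\n' % (image.title, image.content_type))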

After I have the XML export/import working in a way that fits better
with CVS (even if the serialized representation is essentially a black
box), then I can tackle how each object represents its body in a
"morally plain text" serialized format.

In other words, first get the default XML representation and
export/import working for all objects.  Then start with the easiest type
of objects to serialize (such as DTML Methods) and create an easy-to-use
serialization representation.  Then work on the import for that
serialized format.
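
For DTML Methods I imagine the round trip is about this simple (assuming
document_src() and manage_edit() behave the way I expect; no error
handling here):

    import os

    def export_dtml_method(method, directory):
        # Dump the method's source as a plain .dtml file for editing.
        path = os.path.join(directory, method.getId() + '.dtml')
        open(path, 'w').write(method.document_src())

    def import_dtml_method(method, path):
        # Overwrite the existing method's source with the edited file.
        method.manage_edit(open(path).read(), title=method.title)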

I think this approach would be different from FSDump and ZCVSMixin,
right?  As far as I understand it, FSDump just goes one way (ZODB ->
filesystem) and only for certain types of objects.  I don't understand
what ZCVSMixin does (will need to spend some time looking at it --
unlike FSDump, ZCVSMixin is not obvious from the documentation and a
quick review).

Thanks for helping with this project!
Fred
-- 
Fred Wilson Horch			mailto:fhorch@ecoaccess.org
Executive Director, EcoAccess		http://ecoaccess.org/
P.O. Box 2823, Durham, NC 27715-2823	phone: 919.419-8354