[Zope] doing a "diff" of two XML export files.

Robert Leftwich robert@leftfieldcorp.com
Wed, 15 Sep 1999 18:02:20 +1000


If you are not fundamentally opposed to Java, see
http://www.alphaWorks.ibm.com/tech/xmltreediff

from which I obtained the following text:

  XML TreeDiff is a package of beans that provide the ability to
  efficiently differentiate and update DOM trees, just like diff and
  patch differentiate and update data files.

  XMLTreeDiff is a set of Java beans designed to perform fast
  differentiation and update of DOM structures. XMLTreeDiff works in
  many ways like diff and patch. However, rather than differentiating
  the file representations of the documents (that is, the XML files),
  XMLTreeDiff runs directly on the DOMs themselves. This way, the
  differences are directly expressed in terms of native tree operations
  like change node, delete node or insert node, rather than line
  mismatches. The advantages of this approach are several: it avoids
  the need to convert the DOM trees to file format prior to comparing
  them; with that, it eliminates the 'false negative' reports caused by
  dissimilar file representations of equivalent DOM structures; finally
  it avoids the need to infer the tree structural meaning of a line
  difference report.
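To make the idea of reporting differences as tree operations concrete, here is a minimal sketch in Python. It is not XMLTreeDiff and does not use its API: the function names (normalized, tree_diff) and the sample documents are made up for illustration, and children are paired by position only, which is far cruder than a real tree-differencing algorithm. It does show why attribute order and indentation stop mattering once you compare parsed trees instead of lines.

```python
import xml.etree.ElementTree as ET


def normalized(elem):
    """Return (tag, attributes, text) with surrounding whitespace stripped."""
    return (elem.tag, dict(elem.attrib), (elem.text or "").strip())


def tree_diff(old, new, path=""):
    """Yield (operation, path, detail) tuples describing how new differs from old.

    Naive sketch: children are paired by position, so an insertion in the
    middle of a child list shows up as a run of changes.
    """
    here = f"{path}/{old.tag}"
    if normalized(old) != normalized(new):
        yield ("change", here, normalized(new))
    old_kids, new_kids = list(old), list(new)
    for i, (a, b) in enumerate(zip(old_kids, new_kids)):
        yield from tree_diff(a, b, f"{here}[{i}]")
    # Children present only in the old tree were deleted.
    for i in range(len(new_kids), len(old_kids)):
        yield ("delete", f"{here}[{i}]/{old_kids[i].tag}", None)
    # Children present only in the new tree were inserted.
    for i in range(len(old_kids), len(new_kids)):
        yield ("insert", f"{here}[{i}]/{new_kids[i].tag}", normalized(new_kids[i]))


# Hypothetical sample data: same content for doc/title, different layout.
OLD = '<doc a="1" b="2"><title>Home</title><body>Hello</body></doc>'
NEW = """<doc b="2" a="1">
  <title>Home</title>
  <body>Goodbye</body>
  <footer>x</footer>
</doc>"""

for op in tree_diff(ET.fromstring(OLD), ET.fromstring(NEW)):
    print(op)
```

A line-based diff of OLD and NEW would flag every line, whereas the walk above reports only the changed body text and the inserted footer element.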

  It is well known that the process of differentiating two labeled tree
  structures is an expensive one, with a cost (for ordered trees) at
  least quadratic in the number of tree nodes. This has traditionally
  held developers back from using direct tree to tree comparison tools.
  XMLTreeDiff uses an optimal tree differentiating algorithm together
  with a fast subtree matching procedure to make direct tree
  differentiation a practical tool. XMLTreeDiff is particularly well
  suited to do version management of XML documents and tree structured
  data in general.
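The text above does not say how the subtree matching is done, so the following is only a guess at the general idea, not a description of XMLTreeDiff. One common way to find identical subtrees cheaply is to hash every subtree bottom-up, so each node is visited once, and then treat subtrees with equal digests as unchanged before running any expensive comparison on what is left. The function names and sample documents are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def subtree_digests(root):
    """Map each element to a digest of its whole subtree, computed bottom-up."""
    digests = {}

    def visit(elem):
        h = hashlib.sha1()
        h.update(elem.tag.encode())
        # Sort attributes so their order in the file does not matter.
        for key in sorted(elem.attrib):
            h.update(f"\0{key}={elem.attrib[key]}".encode())
        h.update(b"\0" + (elem.text or "").strip().encode())
        # Fold in the children's digests, in document order.
        for child in elem:
            h.update(visit(child))
        digests[elem] = h.digest()
        return digests[elem]

    visit(root)
    return digests


def unchanged_subtrees(old_root, new_root):
    """Return elements of new_root whose subtree also occurs somewhere in old_root."""
    known = set(subtree_digests(old_root).values())
    return [e for e, d in subtree_digests(new_root).items() if d in known]


# Hypothetical sample data: page 2 is untouched (but moved), page 1 is edited.
OLD = '<site><page id="1"><t>A</t></page><page id="2"><t>B</t></page></site>'
NEW = '<site><page id="2"><t>B</t></page><page id="1"><t>C</t></page></site>'

for elem in unchanged_subtrees(ET.fromstring(OLD), ET.fromstring(NEW)):
    print(elem.tag, elem.attrib)
```

Only the subtrees that fail to match this way would need to go through the costly node-by-node differencing.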

  XMLTreeDiff is packaged as a set of Java beans, and allows both
  command line and programming access to the differentiation and update
  tools. It includes a differentiating tool, an update tool, and a
  graphical user interface to display the differences directly on the
  compared trees. Difference reports are output in XML format as well.


Anthony Baxter wrote:
> 
> It seems to me that it would be a Cool Thing to be able to take
> two Zope XML export files, say from different days, and feed them into
> something that would read the two and produce a "change file" which
> specifies what has changed (think "diff" for XML).
> 
> This would be useful for:
> 
> * ZClass based packages, and making new versions of same.
> * Tracking what's done on a site on a day-by-day basis (something
>   that would produce a list at the end of the day showing exactly
>   what changed, and where).
> 
> Just doing a diff of the XML files is one extraordinarily yucky way
> to do this, but it's not exactly easy to then reapply that as a patch.
> Something that walked through the files would be more useful...
> 
> thoughts? A little bird told me that people are already considering
> something for this task... any ideas?
> 
> would the XML folks have already put something like this together?
> 
> Anthony