[ZODB-Dev] Experiences with Spread?

Tim Peters tim at zope.com
Mon Jan 3 15:27:29 EST 2005


[A.M. Kuchling]
> I'm thinking about applying the Spread toolkit to a data replication
> problem.  Because the Zope Replication Service uses Spread, I'd like to
> ask about your experiences with Spread.  Did it work OK for you? Does it
> seem reasonably free of bugs?  Any design, configuration, or usage issues
> to be aware of?

Spread works fine, although the Spread users list is probably a better place
to ask for dirt.

Use of Spread in ZRS may be overkill given the relatively simple one-way
replication ZRS supplies.  We picked Spread for ZRS when more ambitious
plans seemed more imminently realistic, partly based on (the more ambitious)
Postgres-R's use of Spread.

Some downsides in practice:

- As for all systems that need to be told about network topology, Spread
  configuration is delicate and utterly unforgiving.  I don't know why,
  but sysadmins seem to have a hard time keeping config files in
  sync across machines.  Earlier versions of Spread exacerbated this
  problem by failing to do even simple sanity checks (name too long,
  name duplicated, ...) on spread.conf files.  This is better in the
  current Spread, in part based on our feedback about typos in ZRS users'
  spread.conf files that caused no end of grief.  Getting Spread
  running has usually been a real effort, but the trouble has usually
  come down to nothing more than Spread's config files on participating
  machines containing incorrect info about the actual network topology,
  and/or
  inconsistent info across participating machines (e.g., calling a machine
  "andrew" in one box's spread.conf but "amk" in another's).  IOW, pilot
  error.  Unfortunately, also as for other networked systems, so long as
  the config is incorrect the only real symptom is "huh -- nothing seems
  to be happening".

- Spread has extensive logging facilities, but you can't change what's
  being logged short of restarting Spread.  When an error is logged,
  chances are high it won't make any sense to you; OTOH, the logging is
  good in the sense that if you post the relevant piece of the log file
  to the Spread users list, one of the Spread developers can usually
  deduce a lot from it.

- There doesn't appear to be a practical way to rotate Spread log
  files short of restarting Spread.  Some attempts at piping Spread's
  log output to a process that did its own rotation didn't work out,
  although I don't recall any details; there was something peculiar
  about Spread's output behavior that interfered with "the obvious"
  workarounds.

- Spread doesn't work in the presence of NAT.  "Real" network addresses
  get embedded in Spread packets, and NAT breaks that.

- The Python Spread wrapper module is suffering from neglect, and is
  still set up to work with Spread 3.17.1.  I don't even know if it
  *can* work with 3.17.3; I do know it at least needs changes to work
  with 3.17.3 on Windows because the Spread project changed the names
  of some files on Windows.  OTOH, there are no bug reports open against
  it, apart from one crazy bug due to someone changing symbols in
  Spread's .h files and getting into alignment problems as a result.
  So it's somewhat out of date, but solid.
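
For the config-file problem in the first item above, a small script run
before starting Spread can catch the usual pilot errors.  What follows is
only a rough sketch, not anything shipped with Spread or ZRS.  It assumes
the simple "Spread_Segment addr:port { name ip ... }" layout of spread.conf,
does not handle nested multi-homed host blocks, and the 20-character name
limit (MAX_NAME_LEN) is an assumption you should check against your Spread
version.  The function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed limit on daemon names; verify against your Spread version.
MAX_NAME_LEN = 20

# Matches "Spread_Segment <addr:port> { ... }" with no nested braces.
SEGMENT_RE = re.compile(r"Spread_Segment\s+(\S+)\s*\{([^}]*)\}")


def parse_segments(text):
    """Return {segment_address: [(name, ip_or_None), ...]} from conf text."""
    text = re.sub(r"#.*", "", text)  # drop '#' comments to end of line
    segments = {}
    for addr, body in SEGMENT_RE.findall(text):
        hosts = []
        for line in body.splitlines():
            fields = line.split()
            if fields:
                ip = fields[1] if len(fields) > 1 else None
                hosts.append((fields[0], ip))
        segments[addr] = hosts
    return segments


def check_config(text):
    """Report duplicated or over-long machine names within one file."""
    names = [n for hosts in parse_segments(text).values() for n, _ in hosts]
    problems = []
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"duplicate name: {name}")
        if len(name) > MAX_NAME_LEN:
            problems.append(f"name too long: {name}")
    return problems


def compare_configs(configs):
    """configs maps a machine label to its spread.conf text.

    Returns the labels whose segment definitions differ from the first one.
    """
    parsed = {label: parse_segments(text) for label, text in configs.items()}
    labels = list(parsed)
    reference = parsed[labels[0]]
    return [label for label in labels[1:] if parsed[label] != reference]
```

Feeding compare_configs() the spread.conf text collected from each
participating box flags exactly the "andrew" on one machine vs. "amk" on
another kind of mismatch, before you get to the "nothing seems to be
happening" stage.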
