[Zope-dev] Xron fragility

Chris McDonough chrism@digicool.com
Wed, 11 Oct 2000 08:42:03 -0400


Steve,

I am also interested in scheduling... though I haven't looked closely at
Xron.


> The Xron product seems rather "fragile" in use.
>
> That is, if things go wrong when an Xron DTML Method is triggered, that
> method doesn't get rescheduled.

Does Xron take an optimistic approach to repeating jobs?  In other words,
does it assume every job is a one-time job and that the last duty that a
repeating job performs is to reschedule itself?
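
To make the question concrete, here is a rough sketch of the opposite,
"pessimistic" approach: the scheduler queues the next run before it calls
the job, so a job that blows up still keeps its slot. This is only an
illustration in Python. The Scheduler class and its schedule, run_due and
pending methods are made up for the example and are not Xron's API, which
I have not looked at.

```python
import heapq


class Scheduler:
    """Toy scheduler for illustration only; not Xron's actual design."""

    def __init__(self):
        self._queue = []  # heap of (run_at, seq, job, interval)
        self._seq = 0     # tie-breaker so job callables are never compared

    def schedule(self, run_at, job, interval=None):
        """Queue job for run_at; a positive interval makes it repeat."""
        heapq.heappush(self._queue, (run_at, self._seq, job, interval))
        self._seq += 1

    def run_due(self, now):
        """Run every job due at or before now; return the errors raised."""
        errors = []
        while self._queue and self._queue[0][0] <= now:
            run_at, _, job, interval = heapq.heappop(self._queue)
            # Reschedule first, so a failing job still gets its next slot.
            # The interval must be positive or this loop never ends.
            if interval is not None:
                self.schedule(run_at + interval, job, interval)
            try:
                job()
            except Exception as exc:
                errors.append(exc)
        return errors

    def pending(self):
        """Return the queued run times in ascending order."""
        return sorted(run_at for run_at, _, _, _ in self._queue)
```

With this arrangement a job that raises on its first run is still queued
for the next interval, and the caller gets the exception back to log.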

> An example of this is that my intranet DNS server needed to be rebooted.
> Xron couldn't look up the appropriate domain name, and stopped working.
>
> ------
> 2000-10-11T08:16:34 PROBLEM(100) Products.Xron.Loggerr
> Trigger event: http://my.development.server:4080/my_xron_method
> Trigger time: 2000/10/11 09:14:00 GMT+1
> Failed to trigger event.
> Type=bci.NotAvailable
> Val=host not found (File:
> http://my.development.server/my_xron_method/trigger Line: [])
> None None for None
> ------
> 2000-10-11T08:19:04 PROBLEM(100) Products.Xron.Loggerr Failed to disarm
> event
>
>
> Also, if something bad happens in the Dispatcher thread, the thread
> dies.

What does the Dispatcher thread do?  Does it fire off worker threads?

> Before I leap in and start patching Xron, I'd like to have a discussion
> about how Xron should handle problems.
>
> My first thought is that on errors in the dispatcher thread, Xron should
> enter an "error state" where it probes every so often to see whether it
> can resume normal operation. The length of time between probes could
> increase with each probe, to give good performance with transient
> problems whilst preventing Xron from choking resources.

What are the other threads in Xron, and what are they doing while the
Dispatcher thread is hosed?
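
On the probing idea itself, the increasing interval you describe might look
roughly like the sketch below. It is an assumption-laden illustration, not
a patch: probe_until_healthy, its parameters and the probe callable are all
hypothetical names, and the default delays are arbitrary.

```python
import time


def probe_until_healthy(probe, initial_delay=1.0, factor=2.0,
                        max_delay=300.0, sleep=time.sleep):
    """Call probe() until it returns a true value, backing off in between.

    The wait grows by `factor` after each failed probe, capped at
    `max_delay`. Returns the list of delays that were actually slept.
    """
    delays = []
    delay = initial_delay
    while True:
        try:
            if probe():
                return delays
        except Exception:
            pass  # a probe that raises counts as "still broken"
        sleep(delay)
        delays.append(delay)
        delay = min(delay * factor, max_delay)
```

The cap keeps a long outage from pushing the wait out indefinitely, and
passing in the sleep function makes the loop easy to exercise without
actually waiting.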