[Zope3-dev] WORKFLOW: Notes on WFMC and XPDL

Mon Nov 29 14:31:28 EST 2004

I've been studying some literature on workflow, in preparation for
proposing an activity-based workflow framework for Zope 3.  The
Workflow-Management Coalition (WFMC) has published a number of
documents and standards:

   http://www.wfmc.org/standards/docs.htm

to promote interoperability of workflow systems.  The WFMC standards
are interesting for two (somewhat obvious) reasons:

- They represent years of industry experience.

- They provide interfaces that Zope-based workflow systems would
   usefully support, for the sake of interoperability.

One of the more important standards is the XML-based Workflow Process
Description Language (XPDL,
http://www.wfmc.org/standards/docs/TC-1025_10_xpdl_102502.pdf), which
specifies both an interchange format and process-definition
architecture.  The WFMC/XPDL model is not without its limitations
(see, for example:
http://tmitwww.tm.tue.nl/research/patterns/download/ce-xpdl.pdf), but
it seems like a good place to start.

This document provides a summary of the XPDL from an architectural
point of view and some thoughts on how to interpret the architecture
in Zope 3.

Ignored Features
----------------

For this summary, I'll ignore a number of features:

- Packages and models provide advanced packaging support that is not
   essential for thinking about the architecture.

- Redefinable headers are poorly documented and are inessential to the
   architecture.

- Block activities (using activity sets) and subflows are advanced
   features to support very complex workflows.  These are features that
   have potential value and that we might want eventually, but are
   inessential, especially for an initial architecture.

- Extended attributes are used to support vendor extensions.

- Route activities can be modeled as activities without tools.

- Activity start and finish modes are used to accommodate "manual"
   activities, which are used to model activities not controlled by a
   workflow system. This is a feature that seems poorly documented by
   the standard and is a complexity that I think we can omit for
   now. (It can be handled using an "automatic" activity with a work
   item that provides manual control.)

- Simulation data is used to simulate workflow models. This is a
   complication that we can ignore for now.

- Process access levels provide hiding of subprocesses.  Because I'm
   ignoring subprocesses for now, I'll also ignore this.

- External references are an implementation detail of the XML-based
   implementation.

- Multiple transition restrictions on activities are very poorly
   documented. In particular, the semantics of multiple joins or splits
   on an activity aren't specified. For some chosen semantics, the
   effect can be easily provided by introducing "route" activities.

- Transition lists on activity splits aren't needed and don't make
   sense if only one split is allowed on an activity.

Adapting Process Definitions to a System
----------------------------------------

Before getting into the details of what makes up process definitions,
it's worthwhile considering how process definitions fit into a system
architecture.  Process definitions, as their name suggests, model
dynamic processes. They specify activities and transitions between
them. They also specify **abstractly** the functions to be performed
for activities and the workflow participants that perform the
activities.  The *specific* functions to be performed and the people
or systems that perform them are specified as a part of the integration
of a process definition with a system.  The process definition can be
seen as a component that provides only control logic for an application.
It does *not* provide the application logic.

Process definitions maintain data, called workflow-relevant data.
This data is used in the control logic.  The data is updated by
calling applications during activities. The applications can take
workflow-relevant data as inputs and produce new values for
workflow-relevant data as outputs.
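
As a rough illustration of the paragraph above, here is a minimal
Python sketch.  The names (run_application, review, the data keys) are
hypothetical and not part of XPDL or Zope, and the sketch ignores that
real application invocation is asynchronous.  It only shows data going
in as inputs, coming back as outputs, and then being used by control
logic:

```python
# Hypothetical sketch: workflow-relevant data as a plain dict that
# applications read from and write back to.

def review(document_id):
    """Stand-in application: takes one input, produces one output."""
    # A real application would involve a person or an external system.
    return {"approved": document_id.startswith("ok-")}


def run_application(application, data, inputs):
    """Call an application with selected inputs and merge its outputs."""
    arguments = {name: data[name] for name in inputs}
    outputs = application(**arguments)
    data.update(outputs)
    return data


data = {"document_id": "ok-42"}
run_application(review, data, inputs=["document_id"])

# Control logic (for example a transition condition) consults the data.
next_activity = "publish" if data["approved"] else "reject"
```

The application knows nothing about activities or transitions; it only
sees the values it was given and returns new ones.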

Contents of a workflow definition
---------------------------------

Here, briefly, are the contents of a workflow definition, in outline
form:

- id

- name

- creation time

- description

- priority

- time limit

- valid from

- valid to

- time estimation

   - waiting time (time necessary to set up process)

   - working time (time spent actually executing the process)

   - duration

- data fields (schema for workflow-relevant data)

- Duration unit

- Participants

   This is a series of abstract participants. Actual participants are
   provided for these at run time.

   For each participant:

   - id

   - name

   - type (resource set, organizational unit, role, human, or system)

- Applications

   These are abstract definitions of the applications to be provided by
   the surrounding system to perform workflow tasks in activities.

   For each application:

   - id

   - name

   - description

   - formal parameters

- Activities

   - id

   - name

   - description

   - documentation url

   - time limit

   - priority

   - icon

   - tools

     The work done by an activity is performed by application of zero or
     more tools. An activity with no tools is called a "route" activity
     and is used to either provide start and stop activities, or to
     provide more flexibility in modeling flow.  Tools map activities
     and workflow-relevant data onto the applications defined in the
     applications section.

     - id

     - name

     - description

     - application id

     - formal parameters

   - performer

     The performer is an expression (in terms of workflow-relevant
     data) that selects a workflow participant.

   - join type ("and" or "xor")

     An "and" join delays the start of an activity until each of the
     incoming transitions has been activated.

     An "xor" join causes the activity to commence if any of the
     incoming transitions fires.

   - split type ("and" or "xor")

     An "and" split fires all of the outgoing transitions with
     conditions that evaluate to true.  (There is a special case of an
     "otherwise" condition that fires if and only if none of the other
     outgoing transitions fire.)

     An "xor" split fires one transition, evaluating transitions in
     order until one fires. (Note that a transition with an "otherwise"
     condition will be evaluated last and will fire if no other
     transitions have fired.)

   - deadline

     - deadline condition

     - exception name

     - execution ("async" or "sync")

       If a deadline is reached (by virtue of a deadline condition
       evaluating to true) then transitions will be allowed to fire.
       The activity will continue if the execution is "async", and will
       end if the execution is "sync".

- Transitions

   - id

   - name

   - description

   - from activity id

   - to activity id

   - condition
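
The join and split rules in the outline can be sketched in a few lines
of Python.  This is only my reading of the rules as summarized above,
not code from the standard; the function names, the OTHERWISE marker,
and the representation of transitions as (id, condition) pairs are all
assumptions made for illustration:

```python
# Hypothetical sketch of the split and join rules described in the outline.
OTHERWISE = object()  # marker standing in for an "otherwise" condition


def fire_transitions(split_type, transitions, data):
    """Return the ids of the outgoing transitions that fire.

    `transitions` is an ordered list of (id, condition) pairs, where a
    condition is either a callable taking the data or OTHERWISE.
    """
    regular = [(tid, cond) for tid, cond in transitions if cond is not OTHERWISE]
    otherwise = [tid for tid, cond in transitions if cond is OTHERWISE]

    if split_type == "and":
        # Every transition whose condition is true fires.
        fired = [tid for tid, cond in regular if cond(data)]
    elif split_type == "xor":
        # Conditions are tried in order; the first true one wins.
        fired = []
        for tid, cond in regular:
            if cond(data):
                fired = [tid]
                break
    else:
        raise ValueError("unknown split type: %r" % (split_type,))

    # An "otherwise" transition fires only if nothing else fired.
    if not fired:
        fired = otherwise[:1]
    return fired


def join_ready(join_type, incoming, activated):
    """Decide whether an activity may start, given activated transition ids."""
    if join_type == "and":
        return all(tid in activated for tid in incoming)
    if join_type == "xor":
        return any(tid in activated for tid in incoming)
    raise ValueError("unknown join type: %r" % (join_type,))


transitions = [
    ("to_publish", lambda d: d["approved"]),
    ("to_notify", lambda d: d["approved"] and d["notify"]),
    ("to_reject", OTHERWISE),
]
data = {"approved": True, "notify": True}
and_fired = fire_transitions("and", transitions, data)
xor_fired = fire_transitions("xor", transitions, data)
rejected = fire_transitions("and", transitions, {"approved": False, "notify": True})
```

With both conditions true, the "and" split fires both regular
transitions while the "xor" split fires only the first; when neither
condition holds, only the "otherwise" transition fires.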

Possible Zope 3 Realization
---------------------------

It seems reasonable to realize process definitions as components that
encapsulate the process models and provide process control.  Process
definitions act as factories for process instances, which, in turn,
are factories and containers for activity instances.  Process
definitions would then be realized as utilities.

A key issue is how to interface process definitions with the rest of
the system.  Process definitions integrate with the rest of the system
through application and participant definitions. Applications appear
to be like functions, with input and output arguments.  There is an
important difference, however: application invocation and return are
asynchronous. An application instance is created and usually can't be
executed immediately (for example, if there is a human performer).  I
suggest modelling applications as factories that are passed activity
instances and return work items.  When a work item is completed, a
workItemCompleted method of the activity is called, passing the work
item and any output values.  An important task in integrating a
workflow definition with a system is the provision of concrete
application implementations for the abstract application definitions
in the workflow.
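
To make the suggestion concrete, here is a minimal Python sketch of
applications as work-item factories.  Only the workItemCompleted name
comes from the text above; the classes, the other method names, and the
way outputs are passed (as keyword arguments) are assumptions, and
persistence is left out entirely:

```python
# Hypothetical sketch: applications as work-item factories.

class Activity:
    """Minimal activity instance that tracks outstanding work items."""

    def __init__(self, process_data):
        self.process_data = process_data
        self.pending = []
        self.finished = False

    def start(self, factories):
        # Each factory is passed the activity and returns a work item.
        for factory in factories:
            self.pending.append(factory(self))

    def workItemCompleted(self, work_item, **outputs):
        # Called by a work item, possibly much later, when it is done.
        self.pending.remove(work_item)
        self.process_data.update(outputs)
        if not self.pending:
            self.finished = True


class ReviewWorkItem:
    """Work item that waits for a human decision."""

    def __init__(self, activity):
        self.activity = activity

    def decide(self, approved):
        # Invoked from application code, for example a form submission.
        self.activity.workItemCompleted(self, approved=approved)


data = {}
activity = Activity(data)
activity.start([ReviewWorkItem])
still_waiting = not activity.finished  # nothing has completed yet
activity.pending[0].decide(approved=True)
```

Starting the activity only creates the work item.  The activity
finishes later, when the work item reports back, which is the
asynchronous behavior that distinguishes applications from plain
function calls.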

Similarly, a workflow definition provides abstract participant
definitions that need to be realized within an environment. It's much
less clear what the role of participants is.  Work is ultimately
performed by applications. It's up to applications to decide how to
use participant information.

I'm unsure of the best way to configure what concrete applications
(work-item factories) or concrete participants should be used.  I can
think of a number of options.  Here are two:

1. Make the application and participant mapping part of the workflow
    configuration.  For example, there might be a zcml directive that
    defines a workflow definition from an xpdl file and that provides the
    application and participant mapping.

    Here's an example::

      <workflow:definition
          id="my.reviewprocess"
          file="process.xpdl"
          >

        <application id="prepare" factory="my.workitems.Prepare" />
        <application id="review"  factory="my.workitems.Review"  />
        <application id="publish" factory="my.workitems.Publish" />
        <application id="reject"  factory="my.workitems.Reject"  />

        <participant id="author"   factory="my.participants.Author"   />
        <participant id="reviewer" factory="my.participants.Reviewer" />

      </workflow:definition>

2. Register applications and participants as (IWorkflowApplication and
    IWorkflowParticipant) utilities.  The process definition would then
    look up these utilities as needed.

    Here's an example::

      <workflow:definition
          id="my.reviewprocess"
          file="process.xpdl"
          />

      <utility
          provides="zope.app.workflow.interfaces.IWorkflowApplication"
          name="my.reviewprocess.prepare"
          factory="my.workitems.Prepare"
          />

      <utility
          provides="zope.app.workflow.interfaces.IWorkflowApplication"
          name="my.reviewprocess.review"
          factory="my.workitems.Review"
          />

      <utility
          provides="zope.app.workflow.interfaces.IWorkflowApplication"
          name="my.reviewprocess.publish"
          factory="my.workitems.Publish"
          />

      <utility
          provides="zope.app.workflow.interfaces.IWorkflowApplication"
          name="my.reviewprocess.reject"
          factory="my.workitems.Reject"
          />

      <utility
          provides="zope.app.workflow.interfaces.IWorkflowParticipant"
          name="my.reviewprocess.author"
          factory="my.participants.Author"
          />

      <utility
          provides="zope.app.workflow.interfaces.IWorkflowParticipant"
          name="my.reviewprocess.reviewer"
          factory="my.participants.Reviewer"
          />

    This second option takes advantage of the component architecture to
    keep the workflow-definition directive simpler and to make the
    application definitions a bit more pluggable. Someone can redefine
    (override) an application or participant definition without
    overriding the entire workflow configuration.  On the other hand,
    this second approach is far more verbose, and the connection between
    the workflow definition and the applications is far less explicit
    than in the first option.
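
For the second option, the lookup a process definition would perform
might look roughly like the following Python sketch.  A plain
dictionary stands in for the utility registry, and the helper names and
the "<definition id>.<application id>" naming rule are assumptions read
off the utility names in the example, not an existing API:

```python
# Hypothetical stand-in for a utility registry keyed by (interface, name).
registry = {}


def provide_utility(interface, name, component):
    """Register (or override) a component under an interface and name."""
    registry[(interface, name)] = component


def lookup_application(definition_id, application_id):
    """Find the concrete application for an abstract application id."""
    name = "%s.%s" % (definition_id, application_id)
    try:
        return registry[("IWorkflowApplication", name)]
    except KeyError:
        raise LookupError("no application registered as %r" % name)


class Prepare:
    """Placeholder work-item factory."""


class StricterPrepare(Prepare):
    """Placeholder replacement supplied by a site-specific override."""


provide_utility("IWorkflowApplication", "my.reviewprocess.prepare", Prepare)
first = lookup_application("my.reviewprocess", "prepare")

# Overriding one registration leaves the rest of the configuration alone.
provide_utility("IWorkflowApplication", "my.reviewprocess.prepare", StricterPrepare)
second = lookup_application("my.reviewprocess", "prepare")
```

The sketch also shows the cost noted above: nothing in the workflow
definition itself says which registrations it depends on, so a missing
one is discovered only at lookup time.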

I think I prefer the first option. Thoughts?

I'm inclined to use XPDL for process definitions.  This allows process
definition using third-party process definition tools, including the
free editor, "jawe", http://jawe.objectweb.org/.

Activity instances will have a 'process' attribute that provides
access to their process instances.  Process instances will have two
attributes, 'workflow_relevant_data' and 'application_relevant_data'.
The 'workflow_relevant_data' attribute will be a mapping object
containing workflow-relevant data, as defined by the process
definition.  Application code should not write to this data.  The
'application_relevant_data' attribute is a writable dictionary that
application code can use to pass information between applications
(work items).  For example, work items in a review process could store
in-progress work in the application-relevant data mapping.  Process and
activity instances and their data mappings will be persistent.  Work
items should be persistent as well.
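
A minimal Python sketch of these two mappings follows.  The attribute
names are the ones proposed above (spelled "relevant" here); the
read-only view, the private update method, and the omission of
persistence are assumptions made to keep the example short:

```python
from types import MappingProxyType


class ProcessInstance:
    """Holds the two data mappings described above (not persistent here)."""

    def __init__(self, initial_data):
        self._workflow_data = dict(initial_data)
        # Free-form scratch space shared by work items.
        self.application_relevant_data = {}

    @property
    def workflow_relevant_data(self):
        # Application code gets a read-only view of the data.
        return MappingProxyType(self._workflow_data)

    def _update_workflow_data(self, **values):
        # Reserved for the workflow machinery, e.g. when outputs arrive.
        self._workflow_data.update(values)


class ActivityInstance:
    """Gives work items access to the process through 'process'."""

    def __init__(self, process):
        self.process = process


process = ProcessInstance({"approved": False})
activity = ActivityInstance(process)

# A work item may stash in-progress work in the application mapping.
activity.process.application_relevant_data["draft"] = "first pass"
```

Writing through the read-only view raises TypeError, so accidental
updates to workflow-relevant data from application code fail loudly
rather than silently changing control flow.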

Workflow definition components should only be responsible for process
control. Features like the management of work lists or
workflow-process instances will be provided by other components.
Workflow definitions will generate workflow-relevant events.  Among
other things, these events could be used to add and remove workflow
instances to and from containers as they are created and destroyed, if
necessary.
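
As a sketch of how such events could keep a container of active
processes up to date, consider the following Python.  The event
classes, the notify function, and the subscriber list are hypothetical
stand-ins for whatever event mechanism is actually used:

```python
# Hypothetical sketch: process control emits events, and a separate
# component reacts to them by maintaining a container of processes.
subscribers = []


def notify(event):
    """Deliver an event to every registered subscriber."""
    for subscriber in list(subscribers):
        subscriber(event)


class ProcessStarted:
    def __init__(self, process):
        self.process = process


class ProcessFinished:
    def __init__(self, process):
        self.process = process


class Process:
    """Process instance that only knows how to announce state changes."""

    def __init__(self, process_id):
        self.id = process_id

    def start(self):
        notify(ProcessStarted(self))

    def finish(self):
        notify(ProcessFinished(self))


active_processes = {}


def track_processes(event):
    """Subscriber that adds and removes processes from a container."""
    if isinstance(event, ProcessStarted):
        active_processes[event.process.id] = event.process
    elif isinstance(event, ProcessFinished):
        active_processes.pop(event.process.id, None)


subscribers.append(track_processes)
```

The process class never touches the container directly, which matches
the division of responsibility described above: process control in one
component, management of instances in another.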

There are many more details that would need to be addressed if this
approach were pursued.  My goal here is to provide a relatively
high-level summary and solicit initial comments and questions.

Comments? Questions?

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org