Wednesday, November 8, 2017

SMF service - dependencies

Another set of descriptions in an SMF service's metadata concerns its interdependencies with other services, instances and milestones. The idea is to build cross-service conditions that determine whether participating instances should start or stay running (and otherwise stop). This is important information that drives the overall orchestration, from start-up optimization, through consistency of running states, to troubleshooting of service availability, all major features of SMF.

Except for the correct preposition 😉,
the basic idea of interdependencies is straightforward:
 
  • One may depend upon others (dependency tag)
  • Others may depend on one (dependent tag)

Either way, upon or on, an interdependency relationship is always given in terms of an FMRI (see smf(5) and svcs(1) for the definition of an FMRI). In general, the FMRI refers just to a service name, but it can be instance-specific as well.

Each kind of interdependency relationship is grouped under the aforementioned tags, whose attributes specify how SMF is supposed to keep availability consistent, that is, the fault propagation model, in the face of the interdependencies defined by the SMF service developer.

The common interdependency grouping attributes are:

name

For the dependent tag it's highly recommended to add a distinguishing prefix to the value of this attribute, such as a namespace marker, to avoid name clashes.


grouping

Specifies how the items in the group, taken together, satisfy the dependency.
The associated value attribute can be:
require_any  : at least one must be running or exist.
require_all  : all must be running or exist.
exclude_all  : none may be running or exist.
optional_all : all must be running, or else be unable to run without administrative action (for example disabled, in maintenance or not present).
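
As a sketch of how the grouping value plays out in a manifest, the fragment below declares a require_any group over two services. The FMRIs under svc:/site/ and the dependency name are hypothetical, chosen only for illustration; the fragment is meant to sit inside a service or instance element.

```xml
<!-- Satisfied as soon as at least one of the two listed services
     is running (online or degraded). Names are hypothetical. -->
<dependency type="service" name="example_backend"
 grouping="require_any" restart_on="none" >
  <service_fmri value="svc:/site/example-db:default" />
  <service_fmri value="svc:/site/example-db-replica:default" />
</dependency >
```

With grouping="require_all" instead, both services would have to be running.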

restart_on

States which interdependency events trigger a restart.
The associated value attribute can be:
none    : never
error   : upon interdependency restart from a fault
restart : upon interdependency restart of any kind
refresh : upon interdependency restart or refresh 

In fact it's a bit more complicated than the one-line descriptions above, because they don't make the complementary stopping strategies clear. If a required interdependency cannot be or is not actually restarted, then SMF may (depending on grouping) also have to stop the "dependency-declaring" (dependent) instance in order to keep the overall cross-declared availability rules consistent.

There are two stopping strategies depending on grouping:
  1. The require_* and optional_all dependencies
  2. The exclude_all dependencies
case 1)

If a running (online or degraded) dependency is stopped, the "dependency-declaring" (dependent) instance will also stop, depending on the dependency stop-event and the restart_on value, as in the following table:

            |          restart_on
 dependency |--------------------------------
 stop-event |  none  error  restart  refresh
------------+--------------------------------
     fault  |     -   stop     stop     stop
    normal  |     -      -     stop     stop
   refresh  |     -      -        -     stop
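
To make the table concrete, here is a minimal sketch of one dependency read against it. The FMRI and the dependency name are hypothetical.

```xml
<!-- restart_on="error" column of the table: this instance is stopped
     only when the dependency stops because of a fault; a normal stop
     or a refresh of the dependency leaves this instance alone. -->
<dependency type="service" name="example_backend"
 grouping="require_all" restart_on="error" >
  <service_fmri value="svc:/site/example-db:default" />
</dependency >
```

Changing the value to "refresh" would select the rightmost column, where every kind of stop-event propagates.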
 
case 2)

If an "impeding" dependency starts running, the "dependency-declaring" (dependent) instance will stop, except when restart_on is none.

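A minimal sketch of case 2, assuming a hypothetical conflicting service svc:/site/example-legacy:default (the dependency name is made up as well):

```xml
<!-- This instance may run only while the listed service is not running.
     Because restart_on is not "none", this instance is stopped if the
     listed service starts. Names are hypothetical. -->
<dependency type="service" name="example_conflict"
 grouping="exclude_all" restart_on="error" >
  <service_fmri value="svc:/site/example-legacy:default" />
</dependency >
```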

The dependency tag supports an additional type attribute.
The associated value attribute can be:

  • service
    Items in the dependency group are given by FMRIs.
       
  • path
    Items in the dependency group are given by file paths.
    An interesting kind of dependency, isn't it?
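
A path dependency might look like the sketch below. It assumes the file://localhost/ URI form that stock Solaris manifests appear to use for path items; the file name and the dependency name are hypothetical.

```xml
<!-- Satisfied when the (hypothetical) configuration file exists. -->
<dependency type="path" name="example_config"
 grouping="require_all" restart_on="none" >
  <service_fmri value="file://localhost/etc/example/example.conf" />
</dependency >
```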


NOTE
Both dependency types use the same sub-tag service_fmri.
But I wonder if fnmatch(5) is supported as in svcs(1).

NOTE
A service is considered to be running if it is online or degraded.
The term exist applies to a path dependency, that is, the file is present.

NOTE
Interestingly, an interdependency group can optionally contain single-valued (propval tag) or multi-valued (property tag) properties for attaching further metadata.

Example:

...

<instance enabled="..." name="..." >

  <!-- my (re)start depends upon -->
  <dependency type="path" name="..."
   grouping="..." restart_on="..." >
    <service_fmri value="/..." />
    ...
    <propval name="..." type="..." value="..." />

    ...
  </dependency >

  
  <!-- my (re)start depends upon -->
  <dependency type="service" name="..."
   grouping="..." restart_on="..." >
    <service_fmri value="svc:/..." />
    ...
  </dependency >

 
  <!-- my (re)start is to come before -->
  <dependent name="NAMESPACE_..."
   grouping="..." restart_on="..." >
    <service_fmri value="svc:/..." />

    ...
    <propval name="..." type="..." value="..." />

    ...
  </dependent >


...

</instance >

...

NOTE
For the dependent tag it seems important to avoid conflicts on the name attribute value, hence it's recommended to add some sort of prefix to it (NAMESPACE_ in the above example).

Without delving into the complexities and subtleties of SMF milestones, let's just recap that an SMF milestone is a collection of other SMF services upon which it depends; that is, an SMF milestone is essentially a big dependency list of other SMF services. Furthermore, SMF milestones are only loosely related to (and thus not exactly the same as) the traditional system run levels they will eventually supersede. For instance, run-level 1 (system administration mode) is not exactly the same (see init(1M)) as run-level S (single-user mode), but for SMF both are svc:/milestone/single-user:default.

Notwithstanding the issues of transitioning from system run-levels to SMF milestones, one milestone that is often useful, either as a dependency or as a dependent, while developing new SMF services is svc:/milestone/multi-user:default. The possibilities are:

  • An instance may choose to depend upon svc:/milestone/multi-user:default, which means it will start only after that milestone is reached.
     
  • Alternatively, an instance may choose to have svc:/milestone/multi-user:default as a dependent, which essentially means that the instance will become part of this milestone.

Example:

...

<dependency type="service" name="..."
 grouping="require_all" restart_on="none" >
  <service_fmri value="svc:/milestone/multi-user:default" />
</dependency >

...

or

...

<dependent name="NAMESPACE_..."
 grouping="require_all" restart_on="none" >
  <service_fmri value="svc:/milestone/multi-user:default" />
</dependent >

...