Learnings on Solaris™: 2017

Saturday, November 18, 2017

The Firefox issue

The Firefox versions shipped with Solaris 11 Express (across all of its SRUs) and Solaris 11.3 GA (and probably across all of its SRUs as well) are, sadly, rather outdated (even more so for Solaris 11 Express). For Solaris 11.3 the version is 31.8.0:

Multiple X displays

I once talked about Compiz Fusion, which led me to a little X11 issue related to X11 screen composition settings and extensions. At that time I had just one monitor connected to my nVIDIA GT 610 video card:

$ xrandr
Screen 0: minimum 8 x 8, current 1680 x 1050, ...
DVI-I-0 disconnected primary (...)
VGA-0 disconnected (...)
DVI-I-1 connected 1680x1050+0+0 (...)
...
HDMI-0 disconnected (...)
...

Since then I've revived another old monitor, a 19" (TFT) Samsung SyncMaster 950B, that can be physically rotated. I thought it would be useful as a secondary monitor for easing long reading sessions, helping with lengthier writing or coding and so on.

SMF service - dependencies

Another set of descriptions in SMF service metadata relates to the interdependencies with other services, instances and milestones. The idea is to build cross-service conditions that determine whether participating instances should start or stay running (and otherwise stop). This is important information that supports overall orchestration, ranging from start-up optimization, through consistency of running states, up to troubleshooting of service availability, all major features of SMF.

Except for the correct preposition 😉,
the basic idea of interdependencies is straightforward:

One may depend upon others (dependency tag)
Others may depend on one (dependent tag)

Either way, upon or on, an interdependency relationship is always given in terms of an FMRI (see smf(5) and svcs(1) for the definition of an FMRI). In general, the FMRI refers just to a service name, but it can be instance-specific as well.
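
As a rough illustration of the service versus instance distinction, the following Python sketch splits an FMRI of the svc:/service:instance form used in this post. The function name split_fmri is hypothetical, and the sketch deliberately ignores the other forms described in smf(5), such as the svc://localhost/ prefix or property FMRIs.

```python
def split_fmri(fmri):
    """Split 'svc:/service/name[:instance]' into (service, instance or None).

    Illustrative only: handles just the plain 'svc:/' form.
    """
    prefix = "svc:/"
    if not fmri.startswith(prefix) or fmri.startswith("svc://"):
        raise ValueError("only the plain svc:/ form is handled in this sketch")
    rest = fmri[len(prefix):]
    # The instance name, when present, follows the first colon.
    service, _, instance = rest.partition(":")
    if not service:
        raise ValueError("missing service name")
    return service, (instance or None)
```

With this, svc:/milestone/multi-user:default yields the service milestone/multi-user and the instance default, whereas svc:/milestone/multi-user yields no instance at all.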

Each kind of interdependency relationship is grouped under the aforementioned tags, where a few attributes specify how SMF is supposed to manage availability consistency, that is, the fault propagation model, in the face of the interdependencies defined by the SMF service developer.

The common interdependency grouping attributes are:

name
For the dependent tag it's highly recommended to add a distinguishing prefix to the corresponding attribute value, such as a namespace marker, to avoid name clashes.

grouping
Specifies the external relationship to a group.
The associated value attribute can be:

require_any : any must be running or exist.
require_all : all must be running or exist.
exclude_all : none must be running or exist.
optional_all :  either require_all or exclude_all

restart_on
States which interdependency events trigger a restart.
The associated value attribute can be:

none    : never
error   : upon interdependency restart from a fault
restart : upon interdependency restart of any kind
refresh : upon interdependency restart or refresh

But in fact it's a bit more complicated than the above one-line descriptions suggest, because they don't make the complementary stopping strategies clear. If a required interdependency cannot be or is not actually restarted, then SMF may (depending on grouping) also have to stop the "dependency-declaring" (dependent) instance in order to keep the overall cross-declared availability rules consistent.

There are two stopping strategies depending on grouping:

1) The require_* and optional_all dependencies

2) The exclude_all dependencies

case 1)

"... if a running (online or degraded) dependency is stopped, the "dependency-declaring" (dependent) instance will also stop according to the dependency stop-event and the restart_on rule as in the following table:

            |            restart_on
 dependency |----------------------------------
 stop-event | none   error   restart   refresh
------------+----------------------------------
      fault |   -    stop    stop      stop
     normal |   -      -     stop      stop
    refresh |   -      -       -       stop
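
The table can also be read as a simple lookup. The Python sketch below encodes it that way. The names STOPPING_RULES and dependent_stops are hypothetical, and the code only mirrors the table above rather than any actual SMF interface.

```python
# For each dependency stop-event, the restart_on values under which
# the dependent instance is also stopped (the "stop" cells of the table).
STOPPING_RULES = {
    "fault": {"error", "restart", "refresh"},
    "normal": {"restart", "refresh"},
    "refresh": {"refresh"},
}

RESTART_ON_VALUES = {"none", "error", "restart", "refresh"}


def dependent_stops(stop_event, restart_on):
    """Return True if the dependent stops for this stop-event and restart_on."""
    if restart_on not in RESTART_ON_VALUES:
        raise ValueError("unknown restart_on value: %r" % (restart_on,))
    if stop_event not in STOPPING_RULES:
        raise ValueError("unknown stop-event: %r" % (stop_event,))
    return restart_on in STOPPING_RULES[stop_event]
```

For example, a dependency that stops normally takes the dependent down with it only when restart_on is restart or refresh, and restart_on set to none never stops the dependent.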

case 2)

"... if an "impeding" dependency starts running, the "dependency-declaring" (dependent) instance will stop, except when restart_on is none.

The dependency tag supports an additional type attribute.
The associated value attribute can be:

service
Items in the dependency group are given by FMRIs.
path
Items in the dependency group are given by file paths.
An interesting kind of dependency, isn't it?

NOTE

Both dependency types use the same sub-tag service_fmri.
But I wonder if fnmatch(5) is supported as in svcs(1).

NOTE

A service is considered to be running if it is online or degraded.
The term exist is a way to refer to a path dependency.

NOTE

Interestingly, the interdependency group can optionally contain single (propval tag) or multi-valued (property tag) properties for associating some more metadata.

Example:

...

<instance enabled="..." name="..." >


<dependency type="path" name="..."
   grouping="..." restart_on="..." >
    <service_fmri value="/..." />
    ...
    <propval name="..." type="..." value="..." />
    ...
</dependency >


<dependency type="service" name="..."
   grouping="..." restart_on="..." >
    <service_fmri value="svc:/..." />
    ...
</dependency >


<dependent name="NAMESPACE_..."
   grouping="..." restart_on="..." >
    <service_fmri value="svc:/..." />
    ...
    <propval name="..." type="..." value="..." />
    ...
</dependent >

...

</instance >

...

NOTE

For the dependent tag it seems important to avoid conflicts on the name attribute value, hence it's recommended to add some sort of prefix to it (NAMESPACE_ in the above example).

Without delving into the complexities and subtleties of SMF milestones, let's just recap that a SMF milestone is a collection of other SMF services upon which it depends, that is, a SMF milestone is essentially a big dependency list of other SMF services. Furthermore, SMF milestones are only loosely related to (and thus not exactly the same as) the traditional system run levels they will eventually supersede. For instance, run-level 1 (system administration mode) is not exactly the same (see init(1M)) as run-level S (single-user mode), but for SMF both are svc:/milestone/single-user:default.

Notwithstanding the issues of transitioning from system run-levels to SMF milestones, one milestone that is often useful either as a dependency or as a dependent while developing new SMF services is svc:/milestone/multi-user:default. The possibilities are:

An instance may choose to depend upon svc:/milestone/multi-user:default, which means it will start only after that milestone is achieved.
Otherwise, an instance may choose to have svc:/milestone/multi-user:default as a dependent, which essentially means that the instance will become part of this milestone.

Example:

...

<dependency type="service" name="..."
grouping="require_all" restart_on="none" >
<service_fmri value="svc:/milestone/multi-user:default" />
</dependency >

...

or

...

<dependent name="NAMESPACE_..."
grouping="require_all" restart_on="none" >
<service_fmri value="svc:/milestone/multi-user:default" />
</dependent >

...

Friday, November 3, 2017

SMF service - contexts

An important piece of SMF service metadata is the process context under which a service restarter method should be invoked. Its importance spans major subsystems, including security (the least-privilege principle), accounting (projects) and resource control.

For this purpose, the SMF framework provides the method_context tag together with a few sub-tags: method_credential, method_profile and method_environment.

NOTE

The method_credential and method_profile sub-tags are mutually exclusive, and method_context must precede any method in the same scope.

Example:

...

<instance enabled="..." name="..." >

<method_context
   working_directory="..."
   resource_pool="..." project="..." >

    
    <method_credential
     user="..." group="..." supp_groups="..."
     privileges="..." limit_privileges="..." />

    
    

    <method_environment >
      <envvar name="..." value="..." />
...
</method_environment >

</method_context >


<exec_method type="method" name="start"
   exec="..."
   timeout_seconds="..." />

...

</instance >

...

SMF service - methods

Related to an SMF service restarter are the methods it supports for influencing the (running) state of the service instances which have been assigned to it. A method implementation typically invokes a binary (daemon), a script (for more complex start-ups) or a special SMF method keyword (:true or :kill).

For the master restarter, the default restarter, two mandatory methods must be implemented (or inherited from the service) by an instance:

start
May invoke a binary or a script to launch the instance.
Except for child model, the instance must fully come on-line.
Any non-zero return value or exceeded timeout is an error.
stop
If (re)invoked on an already stopped or stopping instance: not an error.
If the instance doesn't (imperatively) stop: must flag an error.

An optional refresh method is also supported by svc.startd for:

Reloading configuration from the SMF repository or standard files.
By all means, the instance must not be otherwise disturbed.
For daemons, traditionally implemented via a SIGHUP.
If not supported by the instance, it must be omitted.
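
To make the SIGHUP convention concrete, here is a minimal Python sketch of the daemon side only (it is not an SMF refresh method, which would typically just send the signal). The class and function names are hypothetical. The handler merely records the request, so that the running work is not otherwise disturbed and the configuration is reloaded at a convenient point in the main loop.

```python
import signal


class ReloadFlag:
    """Records that a refresh was requested; the main loop acts on it later."""

    def __init__(self):
        self.pending = False

    def handler(self, signum, frame):
        # Only set a flag here: no real work is done inside the signal handler.
        self.pending = True

    def consume(self):
        """Return True once per pending request, then reset the flag."""
        pending, self.pending = self.pending, False
        return pending


def install(flag):
    # SIGHUP exists on Unix-like systems only.
    signal.signal(signal.SIGHUP, flag.handler)
```

A main loop would call install(flag) once at start-up and then re-read its configuration whenever flag.consume() returns True.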

NOTE

It's not clear from the public documentation whether omitting the refresh method, when the instance doesn't support it, has the same net effect as implementing it as :true.

Example:

...

<service type="service" version="..."
name="..." >

...


<exec_method type="method" name="stop"
   exec="..."
   timeout_seconds="..." />

<instance enabled="..." name="..." >


    <exec_method type="method" name="start"
     exec="..."
     timeout_seconds="..." />

    ...

</instance >

<instance enabled="..." name="..." >


    <exec_method type="method" name="start"
     exec="..."
     timeout_seconds="..." />

    ...

</instance >

...

</service >

...

NOTE

The timeout_seconds values aren't shown in the example above. Good practice recommends scaling them according to an instance's typical start-up time, along the lines of the following sample scheme:

Up to   Time-out
-----   --------
   1s :     60s
  30s :    300s

From which, perhaps, I could infer that:

Up to   Time-out
-----   --------
 300s :    900s
 600s :   1200s
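
Purely as an illustration, the two small tables can be combined into a lookup like the Python sketch below. The names TIERS and suggested_timeout are hypothetical, and the last two tiers are only the inference made above, not documented values.

```python
import bisect

# (typical start-up time upper bound in seconds, timeout_seconds)
# First two tiers: the sample scheme. Last two tiers: inferred, not documented.
TIERS = [(1, 60), (30, 300), (300, 900), (600, 1200)]


def suggested_timeout(startup_seconds):
    """Return the time-out of the first tier whose bound covers the start-up time."""
    if startup_seconds < 0:
        raise ValueError("start-up time cannot be negative")
    bounds = [bound for bound, _ in TIERS]
    # bisect_left finds the first tier whose bound is >= startup_seconds.
    index = bisect.bisect_left(bounds, startup_seconds)
    if index == len(TIERS):
        raise ValueError("start-up time beyond the tabulated tiers")
    return TIERS[index][1]
```

An instance that typically needs about 20 seconds to start would thus get a timeout_seconds of 300.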

SMF service - restarter

SMF properties are also important for telling the SMF framework how a SMF service is to be handled, that is, how the framework should attempt to maintain the availability of a particular service: which restart methods and capabilities the designated restarter should use (by default, svc.startd, a.k.a. system/svc/restarter or the master restarter). This all comes down to two key words: service model.

A service model describes the restart capabilities that the service restarter supports for a service. At the SMF development level, the interface to the default service restarter consists of certain overrides in a special property group known as startd, of type framework. This way, it's possible to tell the framework which service model the service restarter should apply to the given service.

Documentation about the service models supported by svc.startd seems old and scarce. In essence, the model is selected through the duration property of the special startd property group.

Currently, 3 service models have been documented:

Contract (default)
The usual daemons that should run forever.
Death of all processes is an error and will trigger a restart.
Wait
A service that runs for the lifetime of the child process.
When that child process exits, the service is restarted.
Transient
Usually a first-run / one-time service.
Example: a configuration and/or cleanup service.
The SMF framework won't bother maintaining availability.
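
The three models can be summarized as a tiny decision function. The Python sketch below is a deliberate simplification of the descriptions above: the function name and the two event labels are hypothetical, and real svc.startd behaviour involves more cases than shown here.

```python
def restarter_restarts(duration, event):
    """Return True if the restarter would restart the instance.

    duration: 'contract', 'wait' or 'transient'
    event:    'all_processes_exited' or 'child_exited' (illustrative labels)
    """
    if event not in ("all_processes_exited", "child_exited"):
        raise ValueError("unknown event: %r" % (event,))
    if duration == "transient":
        # Availability is not maintained for transient services.
        return False
    if duration == "wait":
        # The service lives exactly as long as its child process.
        return event == "child_exited"
    if duration == "contract":
        # Only the death of all processes counts as an error here.
        return event == "all_processes_exited"
    raise ValueError("unknown duration: %r" % (duration,))
```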

Example:

...

<service type="service" version="..."
name="..." >

...


<property_group type="framework" name="startd" >

<propval type="astring"
name="duration" value="transient" />

</property_group >

...

</service >

...

NOTE

Another possibly useful property of svc.startd is ignore_error, especially when the given instance has sub-processes whose occasional core dumps shouldn't be treated as faults meriting SMF attention, or when external kill signals sent to them shouldn't be considered errors by SMF.

Example:

<propval type="astring"
name="ignore_error" value="core,signal" />
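
The value is a comma-separated list, so its effect can be pictured with the following Python sketch. The helper names are hypothetical and the logic is only a reading of the description above, not of how svc.startd is implemented.

```python
def parse_ignore_error(value):
    """Turn a value such as 'core,signal' into a set of ignored event names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def is_fault(event, ignore_error=""):
    """An event counts as a fault unless it is listed in ignore_error."""
    return event not in parse_ignore_error(ignore_error)
```

With ignore_error set to core,signal, neither a core dump nor an external signal would be treated as a fault. With the property left empty, both would.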

NOTE

Beyond svc.startd (the default restarter), Solaris 11.3 documentation also refers to a svc.periodicd restarter, which seems to behave similarly to crontab.

SMF service - properties

A usually important part of a SMF service or instance metadata is a series of key/value entries known as service properties. They can be single-valued (the propval tag) or multi-valued (the property, typename_list and value_node tags) and must always appear within a property_group tag at the bottom of an instance description or a service description. Which of the two depends on whether the properties refer to a specific instance or are common to all instances (that is, factored out one level up to the abstract service description, the common parent of all of its concrete instance descriptions).

Example:

...

<service type="service" version="..."
name="..." >

<instance enabled="..." name="default" />
<single_instance />

...


<property_group type="application" name="..." >


<propval type="..." name="..." value="..." />
...


    <property type="typename" name="..." >
<typename_list >
        <value_node value="..." />
        <value_node value="..." />
        ...
</typename_list >
</property >
...

</property_group >

...

</service >

<service type="service" version="..."
name="..." >

<instance enabled="..." name="..." >

...

<property_group type="application" name="..." >


<propval type="..." name="..." value="..." />
...


      <property type="typename" name="..." >
<typename_list >
          <value_node value="..." />
          <value_node value="..." />
          ...
</typename_list >
</property >
...

</property_group >

...

</instance >

<instance enabled="..." name="..." >

...


<property_group type="application" name="..." >


<propval type="..." name="..." value="..." />
...

</property_group >


<property_group type="application" name="..." >


<propval type="..." name="..." value="..." />
...


      <property type="typename" name="..." >
<typename_list >
          <value_node value="..." />
          <value_node value="..." />
          ...
</typename_list >
</property >
...

</property_group >

...

</instance >

...


<property_group type="application" name="..." >


<propval type="..." name="..." value="..." />
...


    <property type="typename" name="..." >
<typename_list >
        <value_node value="..." />
        <value_node value="..." />
        ...
</typename_list >
</property >
...

</property_group >

...

</service >

...

NOTE

The above example is a simplified excerpt of a service bundle description containing two service descriptions, one single-instance and one multi-instance. For a single-instance service, there's no real purpose in having instance-specific properties (the instance "can easily get confused" with the service). The type attribute of the property_group tag should be set to application, meaning that it's service-specific (internal) and the framework shouldn't care about it. Lastly, there can be more than one property group, which may help organize the properties better, if needed.

NOTE

The datatype documentation for properties isn't easy to find, but it is in the scf_value_create(3SCF) man page. Possible values are:

count          : unsigned int (64-bit)
integer        : int (64-bit)
boolean        : true or false (bit)
astring        : Null-terminated ASCII string
ustring        : Null-terminated UTF-8 string
net_address_v4 : IPv4 address
net_address_v6 : IPv6 address
net_address    : net_address_v4 or net_address_v6
hostname       : fully-qualified domain name (FQDN)
host           : hostname or net_address
time           : int, 64-bit seconds and 32-bit nanoseconds
fmri           : obvious
uri            : obvious
opaque         : a byte sequence
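
As a small worked example of the numeric types, the Python sketch below checks whether a value fits the count, integer or boolean type as described in the list. The names RANGES and fits are hypothetical and the other types are left out on purpose.

```python
# Inclusive ranges for the two 64-bit numeric types.
RANGES = {
    "count": (0, 2**64 - 1),           # unsigned 64-bit
    "integer": (-(2**63), 2**63 - 1),  # signed 64-bit
}


def fits(type_name, value):
    """Return True if value is representable as the given property type."""
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name not in RANGES:
        raise ValueError("type not covered by this sketch: %r" % (type_name,))
    low, high = RANGES[type_name]
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and low <= value <= high
```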

NOTE

Sometimes it's good to know that after a service has been integrated into the SMF framework, the enabled property is automatically placed in a special property group called general. This may be of particular interest when interactively creating a new instance of a service.
