In most Unix environments, the startup process consists of a handful of autonomous boot scripts. They act independently of one another, unaware of which scripts have already run or which ones will run after them. When they are invoked, there is no serious error checking and no recourse if the script fails.
For Solaris 10, Sun introduced the Service Management Facility. SMF is a framework that handles system boot-up, process management, and self-healing. It addresses the shortcomings of startup scripts and creates an infrastructure to manage daemons after the host has booted.
A System V Unix host will start the sendmail daemon with the script S80sendmail from either the /etc/rc2.d or /etc/rc3.d directory. The script contains commands to start or stop sendmail, depending on its invocation. The S portion of the filename denotes that this is a startup script, and the 80 is a sequence number that says when the script should run.
When S80sendmail runs, it won't be aware of any previous problems such as a NIS failure or /var never properly mounting. You could write tests into the script, but that increases startup time and the complexity of each script.
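The naming convention described above can be sketched as a small shell function. This is only an illustration, assuming two-digit sequence numbers; the function name parse_rc_name is made up for this example, and the K prefix for stop scripts is the usual System V counterpart to S rather than something taken from the text above.

```shell
# Hypothetical helper: split an rc script name such as S80sendmail.
# Prints: <action> <sequence> <name>
parse_rc_name() {
    script=$1
    case $script in
        S[0-9][0-9]*) action=start ;;   # S scripts run at startup
        K[0-9][0-9]*) action=stop ;;    # K scripts run at shutdown (assumed convention)
        *) echo "not an rc script name: $script" >&2; return 1 ;;
    esac
    rest=${script#?}            # drop the leading S or K
    name=${rest#??}             # everything after the two-digit sequence
    seq=${rest%"$name"}         # the two-digit sequence itself
    printf '%s %s %s\n' "$action" "$seq" "$name"
}
```

For example, parse_rc_name S80sendmail prints "start 80 sendmail".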
In the SMF environment, sendmail is a service. Solaris 10 defines a service as a persistent program that handles system or user requests. Services are expected to be fault tolerant and manageable by the operating system.
Services are identified by a URI known as a Fault Management Resource Identifier. The FMRI is broken up in a category hierarchy to help identify the service and what it is responsible for.
Here are the FMRIs for sendmail, ssh, and other services running on a host:
svc:/network/smtp:sendmail
svc:/network/ssh:default
svc:/system/filesystem/local:default
lrc:/etc/rc2_d/S99audit
Here is the breakdown of the FMRI structure:
schema | category | service name | instance |
---|---|---|---|
svc: | /network | /smtp | :sendmail |
svc: | /network | /ssh | :default |
svc: | /system | /filesystem/local | :default |
lrc: | /etc | /rc2_d/S99audit | |
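The same breakdown can be done with plain shell parameter expansion. The sketch below assumes the short FMRI form used in the listing (no hostname component), and the function names are invented for this example.

```shell
# Hypothetical helpers that split an FMRI such as svc:/network/smtp:sendmail.
# Assumes the short form without a hostname.

fmri_scheme() {
    printf '%s\n' "${1%%:*}"            # text before the first colon
}

fmri_service() {
    rest=${1#*:/}                       # drop "svc:/" or "lrc:/"
    printf '%s\n' "${rest%%:*}"         # drop ":instance" if present
}

fmri_category() {
    service=$(fmri_service "$1")
    printf '%s\n' "${service%%/*}"      # first path segment
}

fmri_instance() {
    rest=${1#*:/}
    case $rest in
        *:*) printf '%s\n' "${rest##*:}" ;;   # text after the last colon
        *)   printf '\n' ;;                   # legacy scripts have no instance
    esac
}
```

With these definitions, fmri_service svc:/system/filesystem/local:default prints "system/filesystem/local" and fmri_instance prints "default" for the same FMRI.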
Each service has a manifest that describes the service and its management needs. It lists the service dependencies, the control scripts, and the actions to take when the service fails. The manifest starts out as an XML file that SMF imports into a central repository, which records the properties of all the services.
Sendmail will not run without the following dependencies:
- Local filesystems are mounted
- Basic network services are up
- The host is aware of its domain name
- The /etc/nsswitch.conf file exists
- The /etc/mail/sendmail.cf file exists
- Any nameservices in use (NIS, LDAP) are running
- The auto filesystem, if in use, is running
- Syslog, if in use, is running
Services in the SMF environment start up in parallel, but each service will become available only when all its listed dependencies are. This means the host will have a faster boot-up, and it will reduce the chances of a cascading failure of services. There is no explicit order to service startup, so sendmail or its dependencies could start up at any time.
Almost all services under SMF are controlled by one service known as the restarter. The restarter controls the svc.startd daemon, which in turn starts the other services, tests their dependencies, and restarts them if they fail. When Solaris 10 boots up, svc.startd is one of the first programs spawned from /sbin/init.
It's still possible to use rcN.d scripts under Solaris 10; however, the programs started from these scripts will not be under SMF control. These are referred to as legacy run scripts. They have an FMRI, like normal services do, but the schema prefix is lrc: rather than svc:. Legacy run scripts are not initialized until all SMF services are up and running. When the host shuts down, they are the first stop scripts run before the SMF services are disabled.
Administering SMF
The two most common commands used to administer services are svcs and svcadm. The svcs command reports on the state of configured services, while the svcadm command controls the services.
$ svcs
STATE STIME FMRI
...
legacy_run Sep_22 lrc:/etc/rc2_d/S99audit
...
online Sep_22 svc:/system/svc/restarter:default
online Sep_22 svc:/system/filesystem/autofs:default
online Sep_22 svc:/system/system-log:default
online Sep_22 svc:/network/smtp:sendmail
online Sep_22 svc:/system/filesystem/local:default
online Sep_22 svc:/network/ssh:default
online Sep_22 svc:/system/dumpadm:default
online Sep_22 svc:/network/loopback:default
...
Running svcs without arguments lists the enabled services, most of which will be online. The STATE column reports the service status; the STIME column shows when the service state last changed; and the FMRI column identifies the service. If you want to list all services, including disabled ones, use the -a option.
The svcs command can also examine a single service by using either a full or partial FMRI. You can add the -v or -x options for extended output on the service. The -d option will list all the dependencies of a service.
$ svcs svc://localhost/network/ssh:default
STATE STIME FMRI
online Sep_22 svc:/network/ssh:default
$ svcs -v svc:/network/ssh
STATE NSTATE STIME CTID FMRI
online - Sep_22 52 svc:/network/ssh:default
$ svcs -x network/ssh
svc:/network/ssh:default (SSH server)
State: online since Thu Sep 22 07:51:15 2005
See: sshd(1M)
See: /var/svc/log/network-ssh:default.log
Impact: None.
$ svcs -d ssh
STATE STIME FMRI
online Sep_22 svc:/network/loopback:default
online Sep_22 svc:/network/physical:default
online Sep_22 svc:/system/cryptosvc:default
online Sep_22 svc:/system/filesystem/local:default
online Sep_22 svc:/system/utmp:default
online Sep_22 svc:/system/filesystem/autofs:default
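Because the svcs -d listing is plain columns, it is easy to filter for dependencies that are holding a service back. A minimal sketch, assuming the three-column STATE/STIME/FMRI layout shown above; deps_not_online is a hypothetical helper name, not part of SMF.

```shell
# Hypothetical helper: read `svcs -d` output on stdin and print the FMRI
# of every dependency whose state is not "online".
deps_not_online() {
    awk '$1 != "STATE" && NF >= 3 && $1 != "online" { print $3 }'
}

# Possible usage on a Solaris 10 host:
#   svcs -d ssh | deps_not_online
```

An empty result would suggest that every listed dependency is online.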
You can add the hostname localhost to an FMRI, or you can abbreviate it by removing the instance name and/or the categories. If the abbreviation results in multiple matches, they will all be listed. Here are two services that each have the name local in the last segment of the service name:
$ svcs local
STATE STIME FMRI
online Sep_22 svc:/system/device/local:default
online Sep_22 svc:/system/filesystem/local:default
You can also perform basic glob matching on service names:
$ svcs "*network*"
STATE STIME FMRI
disabled Sep_22 svc:/network/rpc/keyserv:default
disabled Sep_22 svc:/network/rpc/nisplus:default
disabled Sep_22 svc:/network/nis/client:default
.....
online Sep_22 svc:/network/nfs/client:default
online Sep_22 svc:/network/security/ktkt_warn:default
online Sep_22 svc:/network/telnet:default
online Sep_22 svc:/network/nfs/rquota:default
$
Services can manage a running process or an OS state. By using the -p option with svcs, you can identify the processes associated with a service.
$ svcs -p svc:/network/ssh
STATE STIME FMRI
online Sep_22 svc:/network/ssh:default
Sep_22 345 sshd
The time the process started is listed under the STIME column.
In some cases, services do not have running processes associated with them. Tasks such as bringing a network interface up or mounting a disk partition do not require continuously running processes. The svc:/system/filesystem/local:default service runs the mount command once to mount all local filesystems, and then the script exits. SMF refers to these as transient services.
$ svcs -p svc:/system/filesystem/local:default
STATE STIME FMRI
online Sep_22 svc:/system/filesystem/local:default
Finally, there are services that have running processes only when they are in use. When Sun designed the Service Management Facility, it folded in inetd and the way it handles network daemons. All the daemons that previously appeared in the /etc/inetd.conf file are now SMF-managed services. The difference is that these services use the inetd daemon as their restarter, instead of svc.startd.
$ svcs -p rlogin
STATE STIME FMRI
online Sep_22 svc:/network/login:rlogin
$ rlogin localhost
Password:
Last login: Sun Feb 19 23:49:56 from localhost
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
$ svcs -p rlogin
STATE STIME FMRI
online Sep_22 svc:/network/login:rlogin
23:50:41 23833 in.rlogind
23:50:41 23836 bash
23:50:48 23840 svcs
$ exit
logout
Connection to localhost closed.
$ svcs -p rlogin
STATE STIME FMRI
online Sep_22 svc:/network/login:rlogin
If you kill a process under the control of service management, the program that originally started it will restart it. Here's an example of an Apache2 service that has been running since January 5. First, I double-checked the service by grepping for the process IDs, which match the ones listed with the service. Then, I sent the TERM signal to the parent of all of the child processes.
# svcs -p http
STATE STIME FMRI
online Jan_05 svc:/application/http:apache2
Jan_05 12377 httpd
Jan_05 12378 httpd
Jan_05 12379 httpd
Jan_05 12380 httpd
# ps -ef | grep http
root 12377 1 0 Jan 05 ? 2:14 /opt/apache2/bin/httpd -DPERL
root 23521 23520 0 20:33:01 pts/1 0:00 grep http
http 12378 12377 0 Jan 05 ? 0:00 /opt/apache2/bin/httpd -DPERL
http 12380 12377 0 Jan 05 ? 0:00 /opt/apache2/bin/httpd -DPERL
# kill -TERM 12377
# ps -ef | grep http
root 23527 23520 0 20:33:25 pts/1 0:00 grep http
root 23580 1 0 20:33:09 ? 0:01 /opt/apache2/bin/httpd -DPERL
http 23581 23580 0 20:33:10 ? 0:00 /opt/apache2/bin/httpd -DPERL
http 23582 23580 0 20:33:12 ? 0:00 /opt/apache2/bin/httpd -DPERL
http 23583 23580 0 20:33:12 ? 0:00 /opt/apache2/bin/httpd -DPERL
# svcs -p svc:/application/http:apache2
STATE STIME FMRI
online 20:33:09 svc:/application/http:apache2
20:33:09 23580 httpd
20:33:10 23581 httpd
20:33:11 23582 httpd
20:33:11 23583 httpd
I then rechecked for the httpd processes and found that the svc.startd daemon had started new Apache servers. Then I examined the http service. It reported that the state time had changed, and listed the new process IDs.
The following table lists some SMF services, their associated processes, and their restarter FMRI:
Service | Processes | Restarter |
---|---|---|
svc:/system/svc/restarter:default | svc.startd | none |
svc:/network/smtp:sendmail | sendmail | svc:/system/svc/restarter:default |
svc:/network/ssh:default | sshd | svc:/system/svc/restarter:default |
svc:/system/sac:default | sac ttymon | svc:/system/svc/restarter:default |
svc:/network/inetd:default | inetd | svc:/system/svc/restarter:default |
svc:/network/telnet:default | in.telnetd | svc:/network/inetd:default |
If you want to know the restarter for a service, use svcs -l. Use svcs -R with a full FMRI to list all of the services a restarter service controls.
$ svcs -l network/ssh
fmri svc:/network/ssh:default
name SSH server
enabled true
state online
next_state none
state_time Thu Sep 22 07:51:15 2005
logfile /var/svc/log/network-ssh:default.log
restarter svc:/system/svc/restarter:default
contract_id 52
dependency require_all/none svc:/system/filesystem/local (online)
dependency optional_all/none svc:/system/filesystem/autofs (online)
dependency require_all/none svc:/network/loopback (online)
dependency require_all/none svc:/network/physical (online)
dependency require_all/none svc:/system/cryptosvc (online)
dependency require_all/none svc:/system/utmp (online)
dependency require_all/restart file://localhost/etc/ssh/sshd_config (online)
$ svcs -R svc:/system/svc/restarter:default
STATE STIME FMRI
disabled Sep_22 svc:/system/metainit:default
disabled Sep_22 svc:/network/rpc/keyserv:default
online Sep_22 svc:/system/svc/restarter:default
online Sep_22 svc:/network/pfil:default
online Sep_22 svc:/milestone/name-services:default
online Sep_22 svc:/network/loopback:default
....
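The svcs -l output is a simple key/value listing, so individual fields can be pulled out with awk. The following is a sketch that assumes the field layout shown above; restarter_of and dependencies_of are hypothetical names chosen for this example.

```shell
# Hypothetical helpers that read `svcs -l` output on stdin.

# Print the restarter FMRI from the "restarter" line.
restarter_of() {
    awk '$1 == "restarter" { print $2 }'
}

# Print the FMRI (third field) of each "dependency" line.
dependencies_of() {
    awk '$1 == "dependency" { print $3 }'
}

# Possible usage on a Solaris 10 host:
#   svcs -l network/ssh | restarter_of
#   svcs -l network/ssh | dependencies_of
```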
Controlling Services
Enable or disable a service using the svcadm command:
# svcs -x telnet
svc:/network/telnet:default (Telnet server)
State: online since Thu Sep 22 07:51:11 2005
See: in.telnetd(1M)
See: telnetd(1M)
Impact: None.
# svcadm disable svc:/network/telnet:default
# svcs -x telnet
svc:/network/telnet:default (Telnet server)
State: disabled since Sun Feb 19 23:32:40 2006
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: in.telnetd(1M)
See: telnetd(1M)
Impact: This service is not running.
The configuration state of a service is recorded in the service repository, so changes to that state persist across reboots. If you disable telnet, rebooting the host won't bring it back up. You must explicitly reenable it from the command line. Make a temporary change to the state of a service by adding the -t option to svcadm:
# svcadm disable -t network/telnet
There are six different service states for configured SMF services.
online
- The service is enabled and is running or available to run, or the tasks associated with this service are complete.
offline
- The service is enabled but has not yet reached the online state. It is either in the process of starting up, or the dependencies of the service are not yet online.
disabled
- The service is not enabled and should not be running.
degraded
- The service is running but in a limited capacity. The Sun documentation is very vague about what "degraded" means, and suggests that the programs associated with the service are responsible for making that determination.
maintenance
- The service has a problem, and it cannot continue to run or complete a task. A service in this state usually requires administrative intervention. The restarter for the service won't try to bring the service online until it has been cleared.
legacy-run
- This is the default state for legacy run services.
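To get a quick picture of how many services sit in each of these states, the svcs listing can be tallied. This is a minimal sketch assuming the STATE/STIME/FMRI column layout shown earlier; count_states is a made-up helper name.

```shell
# Hypothetical helper: read `svcs` output on stdin and print a count per state.
count_states() {
    awk '$1 != "STATE" && NF >= 3 { n[$1]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Possible usage on a Solaris 10 host (-a includes disabled services):
#   svcs -a | count_states
```

A nonzero count for maintenance or offline is usually the cue to look closer with svcs -x.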
The svcadm command gives administrators a standard interface for controlling services. svcadm recognizes several service management commands:
enable
- Brings the service online.
disable
- Stops the service and marks it disabled.
restart
- Restarts the service process, either by performing a disable followed by an enable, or a specific programmed method to restart the service.
refresh
- The refresh method rereads the service properties from the repository. This is useful if someone made configuration changes to the service definition. If that service is controlled by svc.startd and that service also defines an internal refresh method, then the refresh method runs. A program that is refreshed usually rereads its configuration file.
clear
- Resets a service that is in the maintenance state.
mark (degraded or maintenance)
- Deliberately sets the state of a service to either degraded or maintenance. This is usually used for debugging a service.
The svcadm command is pickier about wildcards than svcs. You can still use abbreviated FMRIs and wildcards, as long as they match only one full FMRI.
# svcadm refresh svc:/network/login
svcadm: Pattern 'svc:/network/login' matches multiple instances:
svc:/network/login:rlogin
svc:/network/login:klogin
svc:/network/login:eklogin
# svcadm refresh "svc:/*rlogin"
# svcs "*rlogin"
STATE STIME FMRI
online 23:24:17 svc:/network/login:rlogin
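In a script, one way to avoid the "matches multiple instances" error is to resolve the pattern with svcs first and only call svcadm when exactly one FMRI comes back. The sketch below assumes one FMRI per input line; single_match is a hypothetical helper name.

```shell
# Hypothetical helper: read a list of FMRIs on stdin (one per line).
# Prints the FMRI and succeeds only if exactly one was listed.
single_match() {
    awk 'NF { n++; last = $0 }
         END { if (n == 1) print last; exit (n == 1 ? 0 : 1) }'
}

# Possible usage on a Solaris 10 host (-H drops the header, -o picks columns):
#   fmri=$(svcs -H -o fmri "svc:/*rlogin" | single_match) && svcadm refresh "$fmri"
```

If the pattern is ambiguous or matches nothing, single_match fails and the svcadm call is skipped.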
Boot-up and Runlevels
Because rc scripts are no longer the preferred method used to manage programs, Sun has enhanced the runlevel model with service milestones.
In Unix, runlevel one is single user mode, two is multiuser mode, and three is multiuser mode with file sharing or network services. In each runlevel, there is a core set of services that must be brought online.
For example, levels one, two, and three all require a minimum set of local filesystems to be mounted and network interfaces to be online. Runlevel two requires all internet services to be online, and users must be able to log on to the host. Runlevel three requires everything level two does, plus the ability to share files by NFS.
Milestones are services that don't run any applications but do have a dependent list of services. Once those services are online, the milestone is marked online. The milestone ensures an expected group of services are up and running, so you don't have to check each individual service.
Here is a list of milestones currently online. In this case, seven milestones are online because they all had their dependencies met.
$ svcs "svc:/milestone/*"
online Sep_22 svc:/milestone/name-services:default
online Sep_22 svc:/milestone/network:default
online Sep_22 svc:/milestone/devices:default
online Sep_22 svc:/milestone/single-user:default
online Sep_22 svc:/milestone/sysconfig:default
online Sep_22 svc:/milestone/multi-user:default
online Sep_22 svc:/milestone/multi-user-server:default
Here is a list of milestones and their equivalent rc levels.
Milestone | RC Level | Description |
---|---|---|
svc:/milestone/devices:default | | Devices |
svc:/milestone/network:default | | Network interfaces online |
svc:/milestone/single-user:default | 1 | Single-user mode |
svc:/milestone/sysconfig:default | | Basic system configuration |
svc:/milestone/name-services:default | | Any one of the NIS, NIS+, DNS, or LDAP services |
svc:/milestone/multi-user:default | 2 | Multiuser mode |
svc:/milestone/multi-user-server:default | 3 | Multiuser server mode |
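For scripts that still think in rc levels, the three numbered rows of the table can be expressed as a lookup. This is a sketch based only on the table above; milestone_for_rc_level is an invented name.

```shell
# Hypothetical helper: map an rc level from the table to its milestone FMRI.
milestone_for_rc_level() {
    case $1 in
        1) echo svc:/milestone/single-user:default ;;
        2) echo svc:/milestone/multi-user:default ;;
        3) echo svc:/milestone/multi-user-server:default ;;
        *) echo "no milestone listed for rc level $1" >&2; return 1 ;;
    esac
}

# Possible usage on a Solaris 10 host:
#   svcadm milestone "$(milestone_for_rc_level 2)"
```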
Consider the dependencies for svc:/milestone/multi-user:default:
$ svcs -d milestone/multi-user
STATE STIME FMRI
disabled Sep_22 svc:/network/smtp:sendmail
online Sep_22 svc:/milestone/name-services:default
online Sep_22 svc:/milestone/single-user:default
online Sep_22 svc:/system/filesystem/local:default
online Sep_22 svc:/network/rpc/bind:default
online Sep_22 svc:/milestone/sysconfig:default
online Sep_22 svc:/system/utmp:default
online Sep_22 svc:/network/inetd:default
online Sep_22 svc:/network/nfs/client:default
online Sep_22 svc:/system/system-log:default
Milestones are checkpoints in the operating system. Before multiuser mode can be online, milestone/name-services, milestone/single-user, rpc/bind, and the other services listed must be online as well. The exception in this listing is network/smtp, which is disabled; presumably it is declared as an optional dependency, so it does not hold the milestone back.
One of the dependent services listed is milestone/single-user, which has its own list of dependencies:
$ svcs -d milestone/single-user
STATE STIME FMRI
disabled Sep_22 svc:/system/metainit:default
online Sep_22 svc:/network/loopback:default
online Sep_22 svc:/milestone/network:default
online Sep_22 svc:/milestone/devices:default
online Sep_22 svc:/system/filesystem/minimal:default
online Sep_22 svc:/system/manifest-import:default
online Feb_21 svc:/system/identity:node
Instead of making all milestones dependent on common services, the milestones are set up as cascading checkpoints. When you change the dependency list for milestone/single-user, you don't need to change the dependencies for milestone/multi-user-server.
To change the milestone level of the host, use the svcadm command:
$ svcadm milestone -d [milestone FMRI]
The -d option lets you set your choice as the default milestone. This setting will persist across reboots.
As far as shutting down the host, the shutdown or init commands are still the preferred methods of performing a safe shutdown or reboot.
Debugging Problems with Services
Sometimes services fail due to unavoidable circumstances. For example, a bad configuration file will prevent the Apache process from starting. If the service fails, it will usually end up in the maintenance state. To correct this, you need to know where to look for problems.
# svcs http
STATE STIME FMRI
maintenance 20:51:31 svc:/application/http:apache2
# svcs -x http
svc:/application/http:apache2 (Apache2 Server)
State: maintenance since Mon Feb 20 20:51:31 2006
Reason: Method failed.
See: http://sun.com/msg/SMF-8000-8Q
See: httpd(8)
See: /var/svc/log/application-http:apache2.log
Impact: This service is not running.
Each service keeps a log with the output from the method script. Most errors will appear in this file, as long as the program writes out errors to stdout or stderr.
# tail /var/svc/log/application-http\:apache2.log
Syntax error on line 23 of /etc/opt/apache2/httpd.conf:
Invalid command 'Kisten', perhaps mis-spelled or defined by a module not included in the server configuration
[ Feb 20 20:50:30 Method "stop" exited with status 0 ]
[ Feb 20 20:51:31 Method or service exit timed out. Killing contract 957 ]
[ Feb 20 20:51:31 Rereading configuration. ]
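Judging from the two log paths shown in this article (network-ssh:default.log and application-http:apache2.log), the log name appears to be the FMRI without its svc:/ prefix and with slashes turned into dashes. The sketch below encodes that inferred pattern; svc_log_path is a hypothetical helper, and the logfile line of svcs -l remains the authoritative source.

```shell
# Hypothetical helper: guess the log file path for a full svc: FMRI,
# based on the naming pattern seen in the svcs -x output above.
svc_log_path() {
    name=${1#svc:/}                                   # drop the scheme prefix
    printf '/var/svc/log/%s.log\n' "$(printf '%s' "$name" | tr '/' '-')"
}

# Possible usage on a Solaris 10 host:
#   tail "$(svc_log_path svc:/application/http:apache2)"
```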
Another option is to check the log of svc.startd, as it is the restarter process for the Apache service.
# tail /var/svc/log/svc.startd.log
Feb 20 20:51:31/3: svc:/application/http:apache2: Method or service exit timed
out. Killing contract 957.
Feb 20 20:51:31/520: application/http:apache2 failed
After you have corrected the error, use the svcadm command to clear the maintenance state.
# svcadm clear application/http:apache2
# svcs -x http
svc:/application/http:apache2 (Apache2 Server)
State: online since Mon Feb 20 21:00:22 2006
See: httpd(8)
See: /var/svc/log/application-http:apache2.log
Impact: None.
The important thing to remember is that the Service Management Facility isn't designed to block normal access to programs or processes. If you really need to perform serious testing of Apache httpd or other programs, it's still possible to invoke these commands from the command line. If a service is in the maintenance state, then go ahead and run httpd -t, or sendmail -bD, or whatever command you need to run. SMF will not interfere with processes that it did not start itself.