sanlock - Man Page

shared storage lock manager

Synopsis

sanlock [COMMAND] [ACTION] ...

Description

sanlock is a lock manager for shared storage environments. It allows applications running on multiple hosts to coordinate access to shared resources, such as data objects on shared storage systems like a SAN, preventing data corruption and ensuring data integrity.

Options

COMMAND can be one of three primary top-level choices:

sanlock daemon    start daemon
sanlock client    send request to daemon (default command if none given)
sanlock direct    access storage directly (no coordination with daemon)

Daemon Command

sanlock daemon [options]

-D     no fork and print all logging to stderr

-Q 0|1 quiet error messages for common lock contention

-R 0|1 renewal debugging, log debug info for each renewal

-L pri write logging at priority level and up to logfile (-1 none)

-S pri write logging at priority level and up to syslog (-1 none)

-U uid user id

-G gid group id

-H num renewal history size

-t num max worker threads

-g sec seconds for graceful recovery

-w 0|1 use watchdog through wdmd

-o sec io timeout

-h 0|1 use high priority (RR) scheduling

-l num use mlockall (0 none, 1 current, 2 current and future)

-b sec seconds a host id bit will remain set in delta lease bitmap

-e str unique local host name used in delta leases as host_id owner

Client Command

sanlock client action [options]

sanlock client status

Print processes, lockspaces, and resources being managed by the sanlock daemon.  Add -D to show extra internal daemon status for debugging. Add -o p to show resources by pid, or -o s to show resources by lockspace.

sanlock client host_status

Print state of host_id leases read during the last renewal.  State of all lockspaces is shown (use -s to select one).  Add -D to show extra internal daemon status for debugging.

sanlock client gets

Print lockspaces being managed by the sanlock daemon.  The LOCKSPACE string will be followed by ADD or REM if the lockspace is currently being added or removed.  Add -h 1 to also show hosts in each lockspace.

sanlock client renewal -s LOCKSPACE

Print a history of renewals with timing details. See the Renewal history section below.

sanlock client log_dump

Print the sanlock daemon internal debug log.

sanlock client shutdown

Ask the sanlock daemon to exit.  Without the force option (-f 0), the command will be ignored if any lockspaces exist.  With the force option (-f 1), any registered processes will be killed, their resource leases released, and lockspaces removed.  With the wait option (-w 1), the command will wait for a result from the daemon indicating that it has shut down and is exiting, or cannot shut down because lockspaces exist (command fails).

sanlock client init -s LOCKSPACE

Tell the sanlock daemon to initialize a lockspace on disk.  The -o option can be used to specify the io timeout to be written in the host_id leases. The -Z and -A options can be used to specify the sector size and align size, and both should be set together.   Use -N 1 to include the NO_TIMEOUT flag in the newly formatted leases.  Use -C 1 to request the use of CAW leases if supported, or -C 0 to not use CAW leases. (Also see sanlock direct init.)

sanlock client init -r RESOURCE

Tell the sanlock daemon to initialize a resource lease on disk. The -Z and -A options can be used to specify the sector size and align size, and both should be set together.  Use -C 1 to request the use of CAW leases if supported, or -C 0 to not use CAW leases. (Also see sanlock direct init.)

sanlock client read -s LOCKSPACE

Tell the sanlock daemon to read a lockspace from disk.  Only the LOCKSPACE path and offset are required.  If host_id is zero, the first record at offset (host_id 1) is used.  The complete LOCKSPACE is printed. Add -D to print other details. (Also see sanlock direct read_leader.)

sanlock client read -r RESOURCE

Tell the sanlock daemon to read a resource lease from disk.  Only the RESOURCE path and offset are required.  The complete RESOURCE is printed. Add -D to print other details. (Also see sanlock direct read_leader.)

sanlock client init_host -s LOCKSPACE

Tell the sanlock daemon to initialize a single host_id lease on disk. The host_id specified in the -s arg will be used, and written as the lease owner. Optionally specify host name with -e, generation with -g, and timestamp with -t.  Use -Z to specify sector size. Use -N 1 to include the NO_TIMEOUT flag in the reformatted lease. Use -C 1 to request the use of CAW leases if supported, or -C 0 to not use CAW leases.  (Also see sanlock direct init_host for more information.)

sanlock client add_lockspace -s LOCKSPACE

Tell the sanlock daemon to acquire the host_id lease for the host_id specified in LOCKSPACE.  This is also referred to as "joining" the lockspace.  With a host_id lease held for the lockspace, the host is then able to acquire resource locks in the lockspace.  Use -o <sec> to specify the io timeout of the acquiring host, which will be written in the host_id lease.

sanlock client inq_lockspace -s LOCKSPACE

Inquire about the state of the lockspace in the sanlock daemon, whether it is being added or removed, or is joined.

sanlock client rem_lockspace -s LOCKSPACE

Tell the sanlock daemon to release the specified host_id in the lockspace. Any processes holding resource leases in this lockspace will be killed, and their resource leases will not be released.

sanlock client command -r RESOURCE -c path args

Register with the sanlock daemon, acquire the specified resource lease, and exec the command at path with args.  When the command exits, the sanlock daemon will release the lease.  -c must be the final option.
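Because -c must be the final option, a wrapper that builds this command line should place every other option first. The following is a minimal Python sketch of that ordering. The helper name command_argv and the resource and path values are hypothetical; only the option order comes from the description above.

```python
def command_argv(resource, path, args=()):
    """Return an argv list for 'sanlock client command'.

    -r is emitted first so that -c, the command path, and its arguments
    are always the final items, as the description above requires.
    """
    if not path:
        raise ValueError("a command path is required after -c")
    return ["sanlock", "client", "command", "-r", resource, "-c", path, *args]


# Hypothetical values, shown only to illustrate the resulting order.
argv = command_argv("LS:RX:/dev/vg/leases:1048576", "/bin/sleep", ["60"])
```

The resulting list could be passed to subprocess.run(argv) on a host where the sanlock daemon is running.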

sanlock client spawn -r RESOURCE -c COUNT CMD [ARG...] [-c COUNT CMD [ARG...]]...

Register with the sanlock daemon, acquire the specified resource lease, fork and exec each command specified by -c sequentially as separate processes, checking the exit status of each process, and only moving to the next process on success. After all processes are successfully executed, or at the first failure, the lease is released explicitly before exiting.  Use -P 1 for persistent locks that will not be dropped if the spawn process dies while a child process is running.  Use -h 1 to report a conflicting lock owner.  Use -O 1 to acquire an orphan lock.  Use -d 1 to skip the on-disk resource lease update when releasing the resource lease after all commands complete successfully (useful when the commands have removed the lease storage.)
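The control flow described above (run each command in turn, stop at the first failure, release the lease either way) can be modeled in a few lines of Python. This is a sketch of the documented behavior only: it does not talk to the sanlock daemon, and run_sequence and release are hypothetical names, with release standing in for the explicit lease release.

```python
import subprocess


def run_sequence(commands, release):
    """Run each argv list in order, stopping at the first non-zero exit.

    release() stands in for the explicit lease release and is called
    exactly once, after the last command that ran. Returns the exit
    status of the last command run (0 if every command succeeded).
    """
    status = 0
    try:
        for argv in commands:
            status = subprocess.run(argv).returncode
            if status != 0:
                break  # do not move on to the next command
    finally:
        release()
    return status
```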

sanlock client acquire -r RESOURCE -p pid
sanlock client release -r RESOURCE -p pid

Tell the sanlock daemon to acquire or release the specified resource lease for the given pid.  The pid must be registered with the sanlock daemon. acquire can optionally take a versioned RESOURCE string RESOURCE:lver, where lver is the version of the lease that must be acquired, or fail. Use -C in place of -p to specify client_id.

sanlock client convert -r RESOURCE -p pid

Tell the sanlock daemon to convert the mode of the specified resource lease for the given pid.  If the existing mode is exclusive (default), the mode of the lease can be converted to shared with RESOURCE:SH.  If the existing mode is shared, the mode of the lease can be converted to exclusive with RESOURCE (no :SH suffix). Use -C in place of -p to specify client_id.

sanlock client inquire -p pid

Print the resource leases held by the given pid.  The format is a versioned RESOURCE string "RESOURCE:lver" where lver is the version of the lease held. Use -C in place of -p to specify client_id.

sanlock client request -r RESOURCE -f force_mode

Request the owner of a resource do something specified by force_mode.  A versioned RESOURCE:lver string must be used with a greater version than is presently held.  Zero lver and force_mode clears the request.

sanlock client examine -r RESOURCE

Examine the request record for the currently held resource lease and carry out the action specified by the requested force_mode.

sanlock client examine -s LOCKSPACE

Examine requests for all resource leases currently held in the named lockspace.  Only lockspace_name is used from the LOCKSPACE argument.

sanlock client set_event -s LOCKSPACE -i host_id -g gen -e num -d num

Set an event for another host.  When the sanlock daemon next renews its host_id lease for the lockspace it will: set the bit for the host_id in its bitmap, and set the generation, event and data values in its own host_id lease.  An application that has registered for events from this lockspace on the destination host will get the event that has been set when the destination sees the event during its next host_id lease renewal.

sanlock client set_config -s LOCKSPACE

Set a configuration value for a lockspace. Only lockspace_name is used from the LOCKSPACE argument. The USED flag has the same effect on a lockspace as a process holding a resource lease that will not exit.  The USED_BY_ORPHANS flag means that an orphan resource lease will have the same effect as the USED flag.  The -o <sec> option can be used to update the lockspace's io timeout.
-u 0|1 Set (1) or clear (0) the USED flag.
-O 0|1 Set (1) or clear (0) the USED_BY_ORPHANS flag.

sanlock client set_host -s LOCKSPACE -i host_id -g gen -F flag_name

When flag_name is DEAD_EXT, the DEAD_EXT flag is set in the host_id lease for the specified host_id.  If the current host_id lease generation does not match the specified generation, then the command will fail.  With DEAD_EXT set, the host_id+generation will be considered dead, and resource locks held by the specified host_id+generation will be free for other hosts to acquire.  DEAD_EXT should only be set for a host if that host can no longer modify the shared resources that were protected by the resource locks in the lockspace.

Direct Command

sanlock direct action [options]

-o sec io timeout in seconds

sanlock direct init -s LOCKSPACE
sanlock direct init -r RESOURCE

Initialize storage for a lockspace or resource.  Use the -Z and -A flags to specify the sector size and align size.  The max hosts that can use the lockspace/resource (and the max possible host_id) is determined by the sector/align size combination.  Possible combinations are: 512/1M, 4096/1M, 4096/2M, 4096/4M, 4096/8M.  Lockspaces and resources both use the same amount of space (align_size) for each combination.  When initializing a lockspace, sanlock initializes host_id leases (delta leases) for max_hosts in the given space.  When initializing a resource, sanlock initializes a single resource lock (paxos lease) in the space.  With -s, the -o option specifies the io timeout to be written in the host_id leases.  With -r, the -z 1 option invalidates the resource lease on disk so it cannot be used until reinitialized normally.  Use -N 1 to include NO_TIMEOUT in newly formatted lockspace host_id leases.  Use -C 1 to request the use of COMPARE AND WRITE (CAW) leases if supported, or -C 0 to not use CAW leases.
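The sector/align combinations listed above can be checked before calling init. The Python sketch below validates a pair and does the related arithmetic. It assumes that "1M" means 1048576 bytes and that lease areas are packed back to back starting at offset 0. The function names are hypothetical. The sector count it returns is only an upper bound on the number of single-sector host_id leases that fit in one area. The actual max_hosts for each combination is not stated here and may be lower.

```python
MIB = 1024 * 1024  # assumption: "1M" in the list above means 1048576 bytes

# (sector_size, align_size) pairs listed for sanlock direct init.
VALID_COMBINATIONS = {
    (512, 1 * MIB),
    (4096, 1 * MIB),
    (4096, 2 * MIB),
    (4096, 4 * MIB),
    (4096, 8 * MIB),
}


def check_combination(sector_size, align_size):
    """True if the pair is one of the listed combinations."""
    return (sector_size, align_size) in VALID_COMBINATIONS


def sectors_per_area(sector_size, align_size):
    """Number of sectors in one align_size area (upper bound on host leases)."""
    if not check_combination(sector_size, align_size):
        raise ValueError(f"unsupported combination {sector_size}/{align_size}")
    return align_size // sector_size


def area_offset(index, align_size):
    """Byte offset of the index-th area, assuming areas start at offset 0."""
    return index * align_size
```

For example, with 512-byte sectors and 1M alignment, the fourth area (index 3) would start at byte offset 3145728 under these assumptions.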

sanlock direct init_host -s LOCKSPACE

Initialize a single host_id lease.  The host_id specified in the -s arg will be used, and written as the lease owner (leader.owner_id). Optionally specify host name (leader.resource_name) with -e, generation number (leader.owner_generation) with -g, and timestamp (leader.timestamp) with -t.  A timestamp value of 1 is special, and causes the current time to be written in the timestamp field.  A timestamp value of 0 means the host_id lease is free, as usual.  The -Z and -o options apply as with direct init. Use -N 1 to include NO_TIMEOUT in the reformatted host_id lease. Use -C 1 to request the use of CAW leases if supported, or -C 0 to not use CAW leases.

sanlock direct read_leader -s LOCKSPACE
sanlock direct read_leader -r RESOURCE

Read a leader record from disk and print the fields.  The leader record is the single sector of a delta lease, or the first sector of a paxos lease.

sanlock direct read -s LOCKSPACE
sanlock direct read -r RESOURCE

Read a complete lockspace or resource from disk and print it.

sanlock direct dump path[:offset[:size]]

Read disk sectors and print leader records for delta or paxos leases.  Add -f 1 to also print the request record values for paxos leases, and the host_ids set in delta lease bitmaps.

LOCKSPACE option string

-s lockspace_name:host_id:path:offset

lockspace_name name of lockspace
host_id local host identifier in lockspace
path path to storage to use for leases
offset offset on path (bytes)

RESOURCE option string

-r lockspace_name:resource_name:path:offset

lockspace_name name of lockspace
resource_name name of resource
path path to storage to use for leases
offset offset on path (bytes)

RESOURCE option string with suffix

-r lockspace_name:resource_name:path:offset:lver

lver leader version

-r lockspace_name:resource_name:path:offset:SH

SH indicates shared mode
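As an illustration of the option string formats above, the following Python sketch assembles LOCKSPACE and RESOURCE strings. The helper names and the example lockspace name, resource name, and device path are hypothetical; only the colon-separated field order comes from this page. The lver and SH suffixes are listed here as separate forms, so the sketch does not combine them.

```python
def lockspace_arg(name, host_id, path, offset=0):
    """Build lockspace_name:host_id:path:offset for the -s option."""
    return f"{name}:{host_id}:{path}:{offset}"


def resource_arg(lockspace, resource, path, offset=0, lver=None, shared=False):
    """Build lockspace_name:resource_name:path:offset[:lver | :SH] for -r."""
    if lver is not None and shared:
        raise ValueError("use either an lver suffix or the SH suffix, not both")
    base = f"{lockspace}:{resource}:{path}:{offset}"
    if lver is not None:
        return f"{base}:{lver}"
    if shared:
        return f"{base}:SH"
    return base


# Hypothetical names and path, for illustration only.
print(lockspace_arg("LS", 1, "/dev/vg/leases"))
print(resource_arg("LS", "RX", "/dev/vg/leases", 1048576, shared=True))
```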

Defaults

sanlock help shows the default values for the options above.

sanlock version shows the build version.

Other

Request/Examine

The first part of making a request for a resource is writing the request record of the resource (the sector following the leader record).  To make a successful request:

  • RESOURCE:lver must be greater than the lver presently held by the other host.  This implies the leader record must be read to discover the lver, prior to making a request.
  • RESOURCE:lver must be greater than or equal to the lver presently written to the request record.  Two hosts may write a new request at the same time for the same lver, in which case both would succeed, but the force_mode from the last would win.
  • The force_mode must be greater than zero.
  • To unconditionally clear the request record (set both lver and force_mode to 0), make request with RESOURCE:0 and force_mode 0.
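The rules above can be written as a small decision function. This Python sketch only restates the listed conditions. It is not sanlock's implementation, and check_request and its parameter names are hypothetical.

```python
def check_request(req_lver, force_mode, held_lver, recorded_lver):
    """Return (outcome, reason) for a request, following the rules above.

    req_lver      -- lver given in RESOURCE:lver
    force_mode    -- requested force_mode
    held_lver     -- lver presently held by the owner (from the leader record)
    recorded_lver -- lver presently written in the request record
    """
    if req_lver == 0 and force_mode == 0:
        return ("clear", "request record is cleared unconditionally")
    if force_mode <= 0:
        return ("reject", "force_mode must be greater than zero")
    if req_lver <= held_lver:
        return ("reject", "lver must be greater than the lver presently held")
    if req_lver < recorded_lver:
        return ("reject", "lver must be >= the lver in the request record")
    return ("ok", "request record would be written")
```

A second host writing the same lver as an existing request passes the third check, which matches the case above where both writes succeed and the last force_mode wins.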

The owner of the requested resource will not know of the request unless it is explicitly told to examine its resources via the "examine" api/command, or otherwise notified.

The second part of making a request is notifying the resource lease owner that it should examine the request records of its resource leases.  The notification will cause the lease owner to automatically run the equivalent of "sanlock client examine -s LOCKSPACE" for the lockspace of the requested resource.

The notification is made using a bitmap in each host_id lease.  Each bit represents one of the possible host_ids (1-2000).  If host A wants to notify host B to examine its resources, A sets the bit in its own bitmap that corresponds to the host_id of B.  When B next renews its host_id lease, it reads the host_id leases for all hosts and checks each bitmap to see if its own host_id has been set.  It finds the bit for its own host_id set in A's bitmap, and examines its resource request records.  (The bit remains set in A's bitmap for set_bitmap_seconds.)
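The notification scheme can be modeled in memory as follows. This Python sketch is illustrative only. The class and function names are hypothetical, and the bit layout (host_id N stored at bit N-1, least significant bit first) is an assumption rather than the on-disk format.

```python
MAX_HOST_ID = 2000


class HostLease:
    """In-memory stand-in for one host_id lease and its bitmap."""

    def __init__(self, host_id):
        self.host_id = host_id
        self.bitmap = bytearray((MAX_HOST_ID + 7) // 8)

    @staticmethod
    def _position(target):
        if not 1 <= target <= MAX_HOST_ID:
            raise ValueError("host_id must be in 1..2000")
        index = target - 1  # assumed layout: host_id N uses bit N-1
        return index // 8, 1 << (index % 8)

    def set_bit(self, target):
        """Owner of this lease asks host 'target' to examine its resources."""
        byte, mask = self._position(target)
        self.bitmap[byte] |= mask

    def test_bit(self, target):
        byte, mask = self._position(target)
        return bool(self.bitmap[byte] & mask)


def notifiers(my_host_id, leases):
    """host_ids of other hosts whose bitmap has my_host_id set."""
    return [l.host_id for l in leases
            if l.host_id != my_host_id and l.test_bit(my_host_id)]
```

In this model, host B would call notifiers(B, all_leases) during each renewal and examine its request records whenever the result is non-empty.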

force_mode determines the action the resource lease owner should take:

  • FORCE (1): kill the process holding the resource lease.  When the process has exited, the resource lease will be released, and can then be acquired by anyone.  The kill signal is SIGKILL (or SIGTERM if SIGKILL is restricted.)
  • GRACEFUL (2): run the program configured by sanlock_killpath against the process holding the resource lease.  If no killpath is defined, then FORCE is used.
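The two force_mode values and the documented fallback from GRACEFUL to FORCE can be expressed as a short Python sketch. The names ForceMode and effective_mode are hypothetical, and the killpath argument stands in for whatever program has been configured with sanlock_killpath.

```python
from enum import IntEnum


class ForceMode(IntEnum):
    FORCE = 1     # kill the process holding the resource lease
    GRACEFUL = 2  # run the configured killpath program against it


def effective_mode(force_mode, killpath=None):
    """Return the mode the lease owner would act on.

    GRACEFUL falls back to FORCE when no killpath is defined.
    Values other than 1 or 2 raise ValueError.
    """
    mode = ForceMode(force_mode)
    if mode is ForceMode.GRACEFUL and not killpath:
        return ForceMode.FORCE
    return mode
```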

Persistent and orphan resource leases

A resource lease can be acquired with the PERSISTENT flag (-P 1).  If the process holding the lease exits, the lease will not be released, but kept on an orphan list.  Another local process can then acquire or release the orphan lease using the ORPHAN flag (-O 1).  All orphan leases can be released by setting the lockspace name (-s lockspace_name) with no resource name.

Renewal history

sanlock saves a limited history of lease renewal information in each lockspace. See sanlock.conf renewal_history_size to set the amount of history or to disable (set to 0).

IO times are measured during delta lease renewals (each delta lease renewal includes one read and one write).

For each successful renewal, a record is saved that includes:

  • the timestamp written in the delta lease by the renewal
  • the time in milliseconds taken by the delta lease read
  • the time in milliseconds taken by the delta lease write

Also counted and recorded are the number of io timeouts and other io errors that occur between successful renewals.

Two consecutive successful renewals would be recorded as:

timestamp=5332 read_ms=482 write_ms=5525 next_timeouts=0 next_errors=0
timestamp=5353 read_ms=99 write_ms=3161 next_timeouts=0 next_errors=0

Those fields are:

  • timestamp is the value written into the delta lease during that renewal.
  • read_ms/write_ms are the milliseconds taken for the renewal read/write ios.
  • next_timeouts is the number of io timeouts that occurred after the renewal recorded on that line, and before the next successful renewal on the following line.
  • next_errors is the number of io errors (not timeouts) that occurred after the renewal recorded on that line, and before the next successful renewal on the following line.
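Records in the key=value form shown above are easy to post-process. The Python sketch below parses such lines and picks out slow renewals. It assumes every field is an integer, as in the two sample records. The function names and the threshold are hypothetical.

```python
def parse_renewal(line):
    """Turn 'timestamp=5332 read_ms=482 ...' into a dict of integers."""
    return {key: int(value)
            for key, value in (field.split("=", 1) for field in line.split())}


def slow_renewals(lines, threshold_ms):
    """Records whose read or write took longer than threshold_ms."""
    records = [parse_renewal(line) for line in lines if line.strip()]
    return [r for r in records
            if r["read_ms"] > threshold_ms or r["write_ms"] > threshold_ms]


sample = [
    "timestamp=5332 read_ms=482 write_ms=5525 next_timeouts=0 next_errors=0",
    "timestamp=5353 read_ms=99 write_ms=3161 next_timeouts=0 next_errors=0",
]
```

With the sample above, slow_renewals(sample, 5000) selects only the first record, whose write took 5525 ms.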

The command 'sanlock client renewal -s lockspace_name' reports the full history of renewals saved by sanlock, which by default is 180 records, about 1 hour of history when using a 20 second renewal interval for a 10 second io timeout.
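The "about 1 hour" figure follows from 180 records at one record per 20 second renewal: 180 * 20 = 3600 seconds. The Python sketch below repeats that arithmetic for other settings. The rule that the renewal interval is twice the io timeout is an assumption extrapolated from the single 20 second / 10 second example above, and both function names are hypothetical.

```python
def history_seconds(records=180, renewal_interval=20):
    """Seconds of history covered by a number of renewal records."""
    return records * renewal_interval


def renewal_interval_for(io_timeout):
    """ASSUMPTION: interval = 2 * io_timeout, as in 20 s for a 10 s timeout."""
    return 2 * io_timeout


default_span = history_seconds()                              # 180 * 20
short_span = history_seconds(180, renewal_interval_for(5))    # 180 * 10
```

Under that assumption, halving the io timeout to 5 seconds would halve the time covered by the same 180 records to 1800 seconds.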

Configurable watchdog timeout

Watchdog devices usually have a 60 second timeout, but some devices have a configurable timeout.  To use a different watchdog timeout, set sanlock.conf watchdog_fire_timeout (in seconds) to a value supported by the device.  The same watchdog_fire_timeout must be configured on all hosts (so all hosts must have watchdog devices that support the same timeout).  Mismatched values will invalidate the lease protection provided by the watchdog.

watchdog_fire_timeout and io_timeout should usually be configured together.  By default, sanlock uses watchdog_fire_timeout=60 with io_timeout=10.  Other combinations to consider are:
watchdog_fire_timeout=30 with io_timeout=5
watchdog_fire_timeout=10 with io_timeout=2

Smaller values make it more likely that a host will be reset by the watchdog while waiting for slow io to complete or for temporary io failures to be resolved.  Spurious watchdog resets will also become more likely due to independent, overlapping lockspace outages, each of which would be inconsequential by itself.
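A deployment script could check the two requirements above before a configuration is rolled out. The Python sketch below reports hosts whose watchdog_fire_timeout differs, and flags pairs that are not among the combinations listed on this page. A flagged pair is not necessarily invalid, only worth reviewing. The function name and the shape of the settings dictionary are hypothetical.

```python
# (watchdog_fire_timeout, io_timeout) pairs mentioned on this page.
LISTED_PAIRS = {(60, 10), (30, 5), (10, 2)}


def check_hosts(settings):
    """settings maps host name -> (watchdog_fire_timeout, io_timeout).

    Returns a list of human-readable findings; empty means nothing to flag.
    """
    findings = []
    fire_values = sorted({fire for fire, _ in settings.values()})
    if len(fire_values) > 1:
        findings.append("watchdog_fire_timeout differs between hosts: "
                        + ", ".join(str(v) for v in fire_values))
    for host, pair in sorted(settings.items()):
        if pair not in LISTED_PAIRS:
            findings.append(f"{host}: {pair[0]}/{pair[1]} is not a listed "
                            "combination, review before use")
    return findings
```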

Files

/etc/sanlock/sanlock.conf

The current settings in use by the sanlock daemon can be seen in the output of 'sanlock status -D'.

See Also

wdmd(8)

Referenced By

lvmlockd(8), sanlock_selinux(8).

2026-02-27