pbs_server_attributes - Man Page

pbs server attributes

Description

Server attributes can be read by any client; privilege is not required. Most server attributes are alterable by a privileged client, run by a user with administrator or operator privilege. Certain attributes require the user to have full administrator privilege. The following is a list of the server attributes.

accounting_keep_days

This defines the number of days that accounting files will be kept. Format: integer; default value: unset (pbs_server will never delete accounting files).

acl_group_hosts

List of group/host combinations that authorize users to submit jobs. Groups are expanded on the server. Format: "group@hostname.domain[,...]"; default value: defer to acl_users.

acl_group_sloppy

This is a default value for the queue attribute of the same name. Format: boolean, "TRUE", "True", "true", "Y", "y", "1", "FALSE", "False", "false", "N", "n", "0"; default value: false = disabled.
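As a rough illustration of the boolean format used throughout this page, the following Python sketch maps the listed literals to a truth value. The helper name parse_pbs_boolean is hypothetical and is not part of Torque; it mirrors only the strings listed above, and rejecting every other string is an assumption.

```python
# Hypothetical helper, not a Torque API: it accepts only the literals
# listed in this man page and treats anything else as an error.
TRUE_STRINGS = {"TRUE", "True", "true", "Y", "y", "1"}
FALSE_STRINGS = {"FALSE", "False", "false", "N", "n", "0"}


def parse_pbs_boolean(text):
    """Return True or False for a listed literal, else raise ValueError."""
    if text in TRUE_STRINGS:
        return True
    if text in FALSE_STRINGS:
        return False
    raise ValueError(f"not a recognized boolean literal: {text!r}")
```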

acl_host_enable

Attribute which when true directs the server to use the acl_hosts access control lists. Requires full manager privilege to set or alter. Format: boolean, "TRUE", "True", "true", "Y", "y", "1", "FALSE", "False", "false", "N", "n", "0"; default value: false = disabled.

acl_hosts

List of hosts which may request services from this server. This list contains the network name of the hosts. Local requests, i.e., from the server's host itself, are always accepted even if the host is not included in the list. See section 10.1, Authorization, in the PBS External Reference Specification. Requires full manager privilege to set or alter. Format: "[+|-]hostname.domain[,...]"; default value: all hosts.

acl_logic_or

This is a default value for the queue attribute of the same name. Format: boolean, "TRUE", "True", "true", "Y", "y", "1", "FALSE", "False", "false", "N", "n", "0"; default value: false = disabled.

acl_user_enable

Attribute which when true directs the server to use the server level acl_users access list. Requires full manager privilege to set or alter. Format: boolean (see acl_host_enable); default value: disabled.

acl_user_hosts

List of user/host combinations that authorize users to submit jobs. Format: "user@hostname.domain[,...]"; default value: defer to acl_users.

acl_users

List of users allowed or denied the ability to make any requests of this  server. See section 10.1, Authorization, in the PBS External Reference Specification.  If acl_user_enable is set to true, only users listed in acl_users may submit to or execute jobs in the queue. Requires full manager privilege to set or alter. Format: "[+|-]user[@host][,...]"; default value: all users allowed.

acl_roots

List of super users who may submit to and execute jobs at this server. If the job execution id would be zero (0), then the job owner, root@host, must be listed in this access control list or the job is rejected. Format: "[+|-]user[@host][,...]"; default value: no root jobs allowed.

allow_node_submit

Allow job submissions from compute nodes regardless of ruserok(). Requires full manager privilege to set or alter. Format: boolean; default value: disabled.

allow_proxy_user

Specifies whether users can proxy from one user to another. Proxy requests will be validated either by ruserok() or by the scheduler. Format: boolean; default value: false.

auto_node_np

Automatically configure a node's np value based on the ncpus value from the status update. Requires full manager privilege to set or alter. Format: boolean; default value: disabled.

automatic_requeue_exit_code

This is an exit code, defined by the admin, that tells pbs_server to requeue the job instead of considering it as completed. This allows the user to add some additional checks that the job can run meaningfully, and if not, then the job script exits with the specified code to be requeued. Format: long; default value: none.

clone_batch_delay

Number of seconds to delay between cloning a batch for a job array. Format: integer; default value: 1.

clone_batch_size

Number of jobs to clone in a batch for a job array. Format: integer; default value: 256.

comment

A text string which may be set by the scheduler or other privileged client to provide information to the batch system users.  Format: any string; default value: none.

copy_on_rerun

When set to true, copy the output and error files over to the user-specified directory when a job is rerun. Format: boolean; default value: false.

cray_enabled

When set to true, specifies that this instance of pbs_server has Cray hardware that reports to it. Format: boolean; default value: false.

default_node

A node specification to use if there is no other supplied specification. This attribute is only used by servers where a nodes file exists in the server_priv directory providing a list of nodes to the server. If the nodes file does not exist, this attribute is not set by default and is ignored if set. The default value allows for jobs to share a single node. Format: a node specification string; default value: 1#shared.

default_queue

The queue which is the target queue when a request does not specify a queue name.  Format: a queue name;  default value: none, must be set to an existing queue.

disable_automatic_requeue

Normally, if a job cannot start due to a transient error, the MOM returns a special exit code to the server so that the job is requeued instead of completed. When this parameter is set, the special exit code is ignored and the job is completed. Format: boolean; default value: false.

display_job_server_suffix

When set to TRUE, Torque will display both the job ID and the host name. When set to FALSE, only the job ID will be displayed. Format: boolean; default value: true.

dont_write_nodes_file

When set to TRUE, the nodes file cannot be overwritten for any reason; qmgr commands to edit nodes will be rejected. Format: boolean; default value: false.

note_append_on_error

Append the error string to the end of the current node's note if MOM reports a message beginning with the string "ERROR". Format: boolean; default value: true.

down_on_error

Set a node's state to "down" if MOM reports a message beginning with the string "ERROR". This might interfere with Moab's node error handling. See the HEALTH CHECK section in pbs_mom(8B). This is an EXPERIMENTAL feature and may be removed in the future. Format: boolean; default value: true.

disable_server_id_check

When set, the user who owns the job does not have to exist on the server. The user must still exist on all the compute nodes or the job will fail when it tries to execute. Format: boolean; default value: false.

extra_resc

Add additional string-type job resources. They have no effect within TORQUE and are only advisory to the scheduler. They cannot be used for resources_default/min/max. Format: list; default value: none.

interactive_jobs_can_roam

By default, interactive jobs run from the login node from which they were submitted. When TRUE, interactive jobs may run on login nodes other than the one from which they were submitted. Only applies when cray_enabled is set. Format: boolean; default value: false.

job_exclusive_on_use

When job_exclusive_on_use is set to TRUE, pbsnodes will show job-exclusive on a node when at least one of its processors is running a job. This differs from the default behavior, which is to show job-exclusive on a node when all of its processors are running a job. Format: boolean; default value: false.
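The two reporting rules can be compared with a small Python sketch. The function shows_job_exclusive and its parameters are hypothetical names chosen for illustration; this is not how pbsnodes is implemented, only a restatement of the rule described above.

```python
def shows_job_exclusive(busy_procs, total_procs, job_exclusive_on_use=False):
    """Decide whether a node would be reported as job-exclusive.

    busy_procs  -- processors on the node currently running a job
    total_procs -- processors configured on the node
    """
    if job_exclusive_on_use:
        # TRUE: a single busy processor is enough.
        return busy_procs >= 1
    # Default: every processor must be running a job.
    return busy_procs == total_procs
```

With 1 of 8 processors busy, the node is reported as job-exclusive only when job_exclusive_on_use is TRUE.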

job_force_cancel_time

If configured, the number of seconds after a delete request at which the job will be purged by the server. If not configured, no purge is performed. Format: integer; default value: not used.

job_full_report_time

Sets the time in seconds that a job should be fully reported after any kind of change to the job, even if condensed output was requested. Format: integer; default value: 300 seconds.

job_log_file_max_size

This specifies a soft limit (in kilobytes) for the job log's maximum size. The file size  is checked every five minutes and if the current day file size is greater than or equal to this value, it is rolled from <filename> to <filename.1> and a new empty log is opened. If the current day file size exceeds the maximum size a second time, the <filename.1> log file is rolled to <filename.2>, the current log is rolled to <filename.1>, and a new empty log is opened. Each new log causes all other logs to roll to an extension that is one greater than its current number. Any value less than 0 is ignored by pbs_server (meaning the log will not be rolled). Format: integer; default value: none.

job_log_file_roll_depth

This sets the maximum number of new log files that are kept in a day if the job_log_file_max_size parameter is set. For example, if the roll depth is set to 3, no file can roll higher than <filename.3>. If a file is already at the specified depth, such as <filename.3>, the file is deleted so it can be replaced by the incoming file roll, <filename.2>. Format: integer; default value: none.
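The interaction between rolling and the roll depth can be sketched in Python as follows. This is a simplified model, not pbs_server's code: the function roll_logs is a hypothetical name, logs are represented as a list of strings rather than files, and the per-day aspect of the job log is ignored.

```python
def roll_logs(logs, roll_depth):
    """Model one roll of the job log.

    logs[0] stands for the current log and logs[k] for <filename.k>.
    Every log moves up by one extension, a new empty current log is
    opened, and anything that would exceed roll_depth is dropped.
    """
    rolled = [""] + list(logs)       # "" is the new empty current log
    return rolled[: roll_depth + 1]  # keep current log plus roll_depth files


# Sample run with roll depth 3: each pass rolls the logs and then
# pretends the new current log fills up with the given content.
logs = ["a"]
for content in ("b", "c", "d", "e"):
    logs = roll_logs(logs, roll_depth=3)
    logs[0] = content
# logs is now ["e", "d", "c", "b"]; "a" was deleted at the final roll
```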

job_log_keep_days

This maintains job logs for the number of days designated. If set to 4, any job log file older than 4 days is deleted. Format: integer; default value: none.

job_nanny

Enables the "job deletion nanny" feature.  All job cancels will create a repeating task that will resend KILL signals if the initial job cancel failed.  Further job cancels will be rejected with the message "job cancel in progress." This is useful for temporary failures with a job's execution node during a job delete request.  It is possible that the job nanny might interfere with job restarts, migrations, and checkpointing.  Format: boolean; default value: false.

job_start_timeout

Specifies the pbs_server to pbs_mom TCP socket timeout in seconds that is used when the pbs_server sends a job start to the pbs_mom.  It is useful when the mom has extra overhead involved in starting jobs.  If not specified then the tcp_timeout value is used.

job_stat_rate

Moderates how often job stat requests will be issued from pbs_server to the MOM daemons. If poll_jobs is unset or false, then all jobs that haven't been updated in job_stat_rate seconds will trigger a stat request. If poll_jobs is true, then all jobs will be updated every job_stat_rate seconds (see poll_jobs). On active clusters, 60 or 120 might be reasonable. Default value: 45 seconds (PBS_RESTAT_JOB in server_limits.h); minimum value: 4 seconds (PBS_JOBSTAT_MIN in server_limits.h).

job_sync_timeout

When a stray job is reported on multiple nodes, the server sends a kill signal to one node at a time. This timeout determines how long the server waits between kills if the job is still being reported on any nodes. Format: integer; default value: 60.

keep_completed

Number of seconds to retain completed jobs in the C state.  This is overridden by the execution queue attribute of the same name. Format: integer; default value: 0.

kill_delay

The time delay between the sending of SIGTERM and SIGKILL when a qdel command is issued against a running job. This is overridden by the execution queue attribute of the same name. Format: integer seconds; default value: 2 seconds.

lock_file

Specifies the name and location of the lock file used to determine which high availability server should be active. If a full path is specified, it is used verbatim by TORQUE. If a relative path is specified, TORQUE will prefix it with $TORQUE_HOME/server_priv. Format: string; default value: $TORQUE_HOME/server_priv/server.lock.
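The path rule for lock_file can be sketched in Python as shown below. The function resolve_lock_file is a hypothetical helper, and the torque_home value used in the usage note is only a sample; neither is taken from Torque itself.

```python
import posixpath


def resolve_lock_file(lock_file, torque_home):
    """Apply the lock_file path rule described above.

    A full (absolute) path is returned verbatim; a relative path is
    prefixed with <torque_home>/server_priv.
    """
    if posixpath.isabs(lock_file):
        return lock_file
    return posixpath.join(torque_home, "server_priv", lock_file)
```

Assuming a sample $TORQUE_HOME of /var/spool/torque, resolve_lock_file("ha.lock", "/var/spool/torque") gives /var/spool/torque/server_priv/ha.lock, while an absolute path such as /shared/ha/server.lock is left unchanged.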

lock_file_update_time

Specifies how often (in seconds) the thread will update the lock file (for threaded high availability). Format: integer; default value: 3.

lock_file_check_time

Specifies how often (in seconds) a high availability server will check to see if it should become active (for threaded high availability). Must be greater than lock_file_update_time. Format: integer; default value: 9.

log_events

A bit string which specifies the types of events which are logged; see the section on Event Logging in chapter 3 of the ERS. Format: integer; default value: 511, all events.

log_file_max_size

If this is set to a value > 0 then pbs_server will roll the current log file to logfile.1 when its size is greater than or equal to the value of log_file_max_size. This value is interpreted as kilobytes.

log_file_roll_depth

If this is set to a value >=1 and log_file_max_size is set then pbs_server will continue rolling the log files to logfile.log_file_roll_depth.

log_keep_days

If this is set, then logs older than the specified number of days will be removed by the server. Format: integer; default value: not enforced.

log_level

Controls the verbosity of server logs.  This value ranges from 0 to 7 with 7 representing maximum verbosity.  Format: integer; default value: 0,  minimum verbosity.

mail_body_fmt

Override the default format for the body of outgoing mail messages. A number  of printf-like format specifiers and escape sequences can be used:

\n

new line

\t

horizontal tab

\\

backslash

\'

single quote

\"

double quote

%d

details concerning the message

%h

PBS host name

%i

PBS job identifier

%j

PBS job name

%m

long reason for message

%r

short reason for message

%%

a single %

Format: a printf-like format string; Default value:  "PBS Job Id: %i\nJob Name:   %j\nExec host:  %h\n%m\n%d\n".
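To make the specifiers and escape sequences concrete, the following Python sketch expands a format string of this kind. It is an illustration only, not pbs_server's mail code: the function expand_mail_format is a hypothetical name, the field values in the usage note are made-up sample data, and copying unrecognized sequences through unchanged is an assumption.

```python
# Backslash escapes listed above, keyed by the character after the backslash.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'", '"': '"'}


def expand_mail_format(fmt, fields):
    """Expand %-specifiers and backslash escapes in fmt.

    fields maps specifier letters (d, h, i, j, m, r) to their text.
    """
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        nxt = fmt[i + 1] if i + 1 < len(fmt) else ""
        if ch == "\\" and nxt in ESCAPES:
            out.append(ESCAPES[nxt])
            i += 2
        elif ch == "%" and nxt == "%":
            out.append("%")
            i += 2
        elif ch == "%" and nxt in fields:
            out.append(fields[nxt])
            i += 2
        else:
            # Assumption: anything unrecognized is copied unchanged.
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, with the sample fields {"i": "123.host", "j": "test"}, expand_mail_format("PBS JOB %i", ...) yields "PBS JOB 123.host", and a literal backslash followed by n in the format becomes a new line.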

mail_domain

Override the default domain for outgoing mail messages.  If set, emails will be addressed to "euser@mail_domain".  If unset, the job's Job_Owner attribute will be used.  If set to "never", TORQUE will never send emails.  Format: a domain name; Default value: none.

mail_from

Set the From: header for all outgoing emails from pbs_server.  Format: an email address; Default value: "adm".

mail_subject_fmt

Override the default format for the subject of outgoing mail messages. A number  of printf-like format specifiers and escape sequences can be used:

\n

new line

\t

horizontal tab

\\

backslash

\'

single quote

\"

double quote

%d

details concerning the message

%h

PBS host name

%i

PBS job identifier

%j

PBS job name

%m

long reason for message

%r

short reason for message

%%

a single %

Format: a printf-like format string; Default value: "PBS JOB %i".

mail_uid

The uid from which server generated mail is sent to users. Format: integer uid; default value: 0 for root.

managers

List of users granted batch administrator privileges. Format: user@host.sub.domain[,user@host.sub.domain...]. The host, sub-domain, or domain name may be wildcarded by the use of an * character; see the description of user access control lists in chapter 10.1.1 of the ERS. Requires full manager privilege to set or alter. Default value: root on the local host.

max_job_array_size

Specifies the maximum number of jobs that can be in any requested job array. Arrays requesting more jobs than configured will be rejected. Format: integer; default value: Unlimited.

max_slot_limit

The maximum slot limit that a job array may request. A slot limit is the maximum number of jobs from an array that can run concurrently. For example, if max_slot_limit is set to 10, no array can request a slot limit greater than 10: slot requests greater than 10 are rejected, and any array that does not request a slot limit receives a slot limit of 10. Format: integer; default value: Unlimited.
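The slot limit rule for max_slot_limit can be restated as a short Python sketch. The function effective_slot_limit is a hypothetical name, and using None to stand for "not set" or "Unlimited" is a modelling choice made here, not Torque behavior.

```python
def effective_slot_limit(requested, max_slot_limit):
    """Return the slot limit an array ends up with.

    requested      -- slot limit requested by the array, or None
    max_slot_limit -- server attribute value, or None when unlimited
    """
    if max_slot_limit is None:
        return requested          # no server limit: the request stands
    if requested is None:
        return max_slot_limit     # no request: array receives the server limit
    if requested > max_slot_limit:
        raise ValueError("slot limit request exceeds max_slot_limit")
    return requested
```

With max_slot_limit set to 10, a request of 5 is kept, a missing request becomes 10, and a request of 11 is rejected.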

max_running

The maximum number of jobs allowed to be selected for execution at any given time. Advisory to the Scheduler, not enforced by the server. Format: integer.

max_user_run

The maximum number of jobs owned by a single user that are allowed to be running from this queue at one time. This attribute is advisory to the Scheduler; it is not enforced by the server. Format: integer; default value: none.

max_group_run

The maximum number of jobs owned by any users in a single group that are allowed to be running from this queue at one time. This attribute is advisory to the Scheduler; it is not enforced by the server. Format: integer; default value: none.

max_threads

This is the maximum number of threads that should exist in the thread pool at any  time. Format: integer; default value: ((2 * the number of procs listed in /proc/cpuinfo) + 1) * 20

max_user_queuable

When set, max_user_queuable places a system-wide limit on the number of jobs that an individual user can queue. Format: integer; default value: unlimited.

min_threads

This is the minimum number of threads that should exist in the thread pool at any time. Format: integer; default value: (2 * the number of procs listed in /proc/cpuinfo) + 1.
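The default thread pool bounds follow directly from the two formulas given for min_threads and max_threads. The Python sketch below computes them for a given processor count; default_thread_limits is a hypothetical name, and the processor count is passed in rather than read from /proc/cpuinfo.

```python
def default_thread_limits(num_procs):
    """Return (min_threads, max_threads) defaults for num_procs processors."""
    min_threads = 2 * num_procs + 1   # (2 * procs) + 1
    max_threads = min_threads * 20    # ((2 * procs) + 1) * 20
    return min_threads, max_threads
```

For a host with 4 processors this gives a minimum of 9 threads and a maximum of 180.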

moab_array_compatible

This parameter places a hold on jobs that exceed the slot limit in a job array. When one of the active jobs is completed or deleted, one of the held jobs goes to a queued state. Format: boolean; default value: true.

mom_job_sync

Enables the "job sync on MOM" feature.   When MOMs send a status update, and it includes a list of jobs, server will issue job deletes for any jobs that don't actually exist.  Format: boolean; default value: true.

next_job_number

This hidden attribute is used to allow a manager to set the value of the next job ID via qmgr. This attribute should rarely be modified. Some sites may find it useful if they need to recreate their pbs_server database (perhaps due to a format change between major TORQUE versions) and they keep a database of job information indexed by the job ID. The manager should be careful to avoid setting the value to something that would allow the next job number to conflict with a job already queued. If a conflict does occur, Torque handles it safely: the job submission is rejected and the next job number is incremented.

net_counter

Lists the 3 numbers representing the number of connections in the last 5 seconds, 30 seconds, and 60 seconds.  This is a read-only attribute. Format: string; default value: none.

no_mail_force

This attribute can be set by a manager via qmgr. When set to true, Torque will not force mail to be sent to the user when the job's mail_points is set to "n". The user will not receive notice of job failures or deletions. Format: boolean; default value: false.

node_check_rate

In OpenPBS, this was the rate at which pbs_server would poll each node. In TORQUE, nodes periodically send updates without solicitation from pbs_server; this attribute is now used as the maximum number of seconds allowed without an update before pbs_server will consider the node down. Format: integer; default value: 150.
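The TORQUE meaning of node_check_rate amounts to a simple staleness test, sketched here in Python. The function node_considered_down and its timestamp parameters are hypothetical, and treating an update exactly node_check_rate seconds old as still acceptable is an assumption about the boundary, not documented behavior.

```python
def node_considered_down(last_update, now, node_check_rate=150):
    """Return True when a node has gone too long without an update.

    last_update and now are timestamps in seconds; node_check_rate is
    the maximum number of seconds allowed without an update.
    """
    return (now - last_update) > node_check_rate
```

With the default of 150, a node whose last update was 151 seconds ago would be considered down, while one updated 60 seconds ago would not.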

node_ping_rate

Specifies the maximum interval (in seconds) between successive pings sent from the pbs_server daemon to the pbs_mom daemon to determine node/daemon health. Format: integer; default value: 300

node_pack

Controls how multiple processor nodes are allocated to jobs. If this attribute is set to true, jobs will be assigned to the multiple processor nodes with the fewest free processors. This packs jobs into the fewest possible nodes, leaving multiple processor nodes free for jobs which need many processors on a node. If set to false, jobs will be scattered across nodes, reducing conflicts over memory between jobs. If unset, the jobs are packed on nodes in the order that the nodes are declared to the server (in the nodes file). Default value: unset (jobs are assigned to nodes in the order in which the nodes were declared).

node_suffix

Adds a domain name to node names before IP lookups. Format: string; default value: none.

np_default

np_default allows the administrator to unify the number of processors (np) on all nodes. The value can be dynamically changed. A value of 0 tells pbs_server to use the value of np found in the nodes file. The maximum value is 32767. Format: integer; default value: not used.

operators

List of users granted batch operator privileges. Format of the list is identical with managers above. Requires full manager privilege to set or alter. Default value: root on the local host.

owner_purge

Allows job owners to forcibly purge their own jobs from the server. This short-circuits the normal flow of events and is highly discouraged. Default value: unset.

pass_cpuclock

If set to TRUE, the pbs_server daemon passes the option and its value to the pbs_mom daemons for direct implementation by the daemons, making the CPU frequency adjustable as part of a resource request by a job submission. If set to FALSE, the pbs_server daemon creates and passes a PBS_CPUCLOCK job environment variable to the pbs_mom daemons that contains the value of the cpuclock attribute used as part of a resource request by a job submission. The CPU frequencies on the MOMs are not adjusted. The environment variable is for use by prologue and epilogue scripts, enabling administrators to log and research when users are making cpuclock requests, as well as researchers and developers to perform CPU clock frequency changes using a method outside of that employed by the Torque pbs_mom daemons. Format: boolean; default value: true.

poll_jobs

Controls how pbs_server will send job status requests to MOMs. When unset or false, statjob requests from clients (i.e., qstat(1B) or the scheduler) may trigger job status requests to MOMs and must wait until the MOMs have replied; this is suitable for small to medium sized clusters. When set to true, pbs_server will send periodic job status requests; this is suitable for busy clusters with lots of jobs or lots of clients, or where qstat(1B) is too slow or the scheduler times out (see job_stat_rate). Default value: TRUE.

query_other_jobs

The setting of this attribute controls whether general users, other than the job owner, are allowed to query the status of or select the job. Requires full manager privilege to set or alter. Format: boolean (see acl_host_enable); default value: false (users may not query or select jobs owned by other users).

record_job_info

This must be set to TRUE in order for job logging to be enabled. Format: boolean;  default value: false.

record_job_script

If set to TRUE, this adds the contents of the script executed by a job to the log. For record_job_script to take effect, record_job_info must be set to TRUE. Format: boolean;  default value: false.

resources_available

The list of resources and amounts available to jobs run by this server. The sum of the resources of each type used by all jobs run by this server cannot exceed the total amount listed here. Advisory to the Scheduler, not enforced by the server. Format: "resources_available.resource_name=value[,...]".

resources_cost

The cost factors of various types of resources. These values are used in determining the order of releasing members of synchronous job sets; see the section on Synchronize Job Starts. For the most part, these values are purely arbitrary and have meaning only in the relative values between systems. The cost of the resources requested by a job is the sum of the products of the various resources_cost values and the amount of each resource requested by the job. It is not necessary to assign a cost for each possible resource, only those which the site wishes to be considered in synchronous job scheduling. Format: "resources_cost.resource_name=value[,...]"; default value: none, cost of resource is not computed.
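The sum-of-products rule for resources_cost can be written out as a short Python sketch. The function job_resource_cost is a hypothetical name, the resource names and numbers in the usage note are sample values, and plain numbers are used for amounts where real resources carry units.

```python
def job_resource_cost(requested, resources_cost):
    """Sum cost factor * requested amount over the costed resources.

    requested      -- mapping of resource name to requested amount
    resources_cost -- mapping of resource name to cost factor
    Resources with no assigned cost factor contribute nothing.
    """
    return sum(
        resources_cost[name] * amount
        for name, amount in requested.items()
        if name in resources_cost
    )
```

For a sample job requesting ncpus=4 and mem=8 with cost factors ncpus=2.0 and mem=0.5, the cost is 4 * 2.0 + 8 * 0.5 = 12.0; a requested resource without a cost factor does not change the result.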

resources_default

The list of default resource values that are set as limits for a job  executing on this server when the job does not specify a limit, and there is no queue default. Format: "resources_default.resource_name=value[,...]"; default value: no limit.

resources_max

The maximum amount of each resource which can be requested by a single job executing on this server if there is not a resources_max value defined for the queue in which the job resides. Format: "resources_max.resource_name=value[,...]"; default value: infinite usage.

sched_version

A string specifying the scheduler version.  Schedulers should check this string when starting and not become active if the wrong string is found.  This is ignored by pbs_server.

scheduler_iteration

Has no effect.

scheduling

Controls whether the server will request job scheduling by the PBS job scheduler. If true, the scheduler will be called as required; if false, the scheduler will not be called and no job will be placed into execution unless the server is directed to do so by an operator or administrator. Setting or resetting this attribute to true results in an immediate call to the scheduler. For more information, see the section Scheduler - Server Interaction in the PBS Administrator Guide. Format: boolean (see acl_host_enable); default value: the value of the -a option when the server is invoked. If -a is not specified, the value is recovered from the prior server run. If it has never been set, the value is "false".

server_name

The name of the server, which is the same as the host name. If the hostname resolves to an external IP address, then set this to a name that resolves to the internal IP.

submit_hosts

A list of hostnames allowed to submit jobs to this batch server regardless of ruserok().

system_cost

An arbitrary value factored into the resource cost of any job managed by this server for the purpose of selecting which member of a synchronous set is released first; see resources_cost and section 3.2.2, Synchronize Job Starts. Default value: none, cost of resource is not computed.

tcp_timeout

Specifies the pbs_server to pbs_mom TCP socket timeout in seconds. Format: integer; default value: 6.

thread_idle_seconds

This is the number of seconds a thread can be idle in the thread pool before it is deleted. If threads should not be deleted, set to -1. Torque will always maintain at least min_threads number of threads, even if all are idle. Format: integer; default value: 300.

timeout_for_job_delete

The specific timeout used when deleting jobs because the node they are executing on is being deleted. Format: integer; default value: 120.

timeout_for_job_requeue

The specific timeout used when requeuing jobs because the node they are executing on is being deleted. Format: integer; default value: 120.

use_jobs_subdirs

Lets an administrator direct the way pbs_server will store its job-related files. When use_jobs_subdirs is unset (or set to FALSE), job and job array files will be stored directly under $PBS_HOME/server_priv/jobs and $PBS_HOME/server_priv/arrays. When use_jobs_subdirs is set to TRUE, job and job array files will be distributed over 10 subdirectories under their respective parent directories. This method helps to keep a smaller number of files in a given directory. Format: boolean; default value: false.

The following attributes are read-only; they are maintained by the server and cannot be changed by a client.

pbs_version

The release version number of the server.

resources_assigned

The total amount of certain types of resources allocated to running jobs.

server_state

The current state of the server:

Active

The server is running and will invoke the job scheduler as required to schedule jobs for execution.

Idle

The server is running but will not invoke the job scheduler.

Scheduling

The server is running and there is an outstanding request to the job scheduler.

Terminating

The server is terminating.  No additional jobs will be scheduled.

Terminating, Delayed

The server is terminating in delayed mode. The server will not run any new jobs and will shut down when the last currently executing job completes.

state_count

The total number of jobs managed by the server currently in each state.

total_jobs

The total number of jobs currently managed by the server.

See Also

the PBS ERS, qmgr(1B), pbs_resources(7B)

Info

Local PBS