xs man page

xs — extensible shell

Synopsis

xs [-silevxnpo] [-c command | file] [arguments]

Description

Xs is a command interpreter and programming language which combines the features of other Unix shells and the features of a functional programming language such as Scheme. The syntax is derived from rc(1). Xs is an extension of the es(1) shell and shares this goal. Xs is intended for use both as an interactive shell and as a programming language for scripts.

Xs is an extremely customizable language. The semantics can be altered radically by redefining functions that are called to implement internal operations. This manual page describes the default, initial configuration. See the section entitled Hook Functions for details on entry points which can be redefined to give the shell extended semantics.

Language

Xs is an interpreter which reads commands and executes them. The simplest form of command in xs is a sequence of words separated by white space (space and tab) characters. A word is either a string or a program fragment (see below). The first word is the command to be executed; the remaining words are passed as arguments to that command. If the first word is a string, it is interpreted as the name of a program or shell function to run. If the name is the name of a shell function, that function is executed. Otherwise, the name is used as the name of an executable file. If the name begins with /, ./, or ../, then it is used as the absolute path name of a file; if not, xs looks for an executable file in the directories named by $path.

Commands are terminated by newline or semicolon (;). A command may also be terminated by an ampersand (&), which causes the command to be run in the background: the shell does not wait for the command to finish before continuing execution. Background processes have an implicit redirection of /dev/null as their standard input that may be overridden by an explicit redirection.

Quoting

Xs gives several characters special meaning; special characters automatically terminate words. The following characters, along with space, tab, and newline, are special:

# $ & ' ( ) ; < : = > \ ^ ` { | }

The single quote (') prevents special treatment of any character other than itself. Any characters between single quotes, including newlines, backslashes, and control characters, are treated as an uninterpreted string. A quote character itself may be quoted by placing two quotes in a row. A single quote character is therefore represented by the sequence ''''. The empty string is represented by ''. Thus:

echo 'What''s the plan, Stan?'

prints out

What's the plan, Stan?

The backslash (\) quotes the immediately following character, if it is one of the special characters, except for newline. In addition, xs recognizes backslash sequences similar to those used in C strings:

\a
alert (bell)
\b
backspace
\e
escape
\f
form-feed
\n
newline
\r
carriage return
\t
tab
\xnn
hexadecimal character nn
\nnn
octal character nnn

Comments

The number sign (#) begins a comment in xs. All characters up to but not including the next newline are ignored.

Line continuation

A long logical line may be continued over several physical lines by terminating each line (except the last) with a backslash (\). The backslash-newline sequence is treated as a space. Note that line continuation does not work in comments, where the backslash is treated as part of the comment, or inside quoted strings, where the backslash and newline are quoted.

Lists

The primary data structure in xs is the list, which is a sequence of words. Parentheses are used to group lists. The empty list is represented by (). Lists have no hierarchical structure; a list inside another list is expanded so that the outer list contains all the elements of the inner list. (This is the same as perl's "list interpolation".) Thus, the following are all equivalent:

one two three
(one two three)
((one) () ((two three)))

Note that the null string, '', and the empty list, (), are two very different things. Assigning the null string to a variable is a valid operation, but it does not remove the variable's definition.
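
The flattening rule can be modelled outside the shell. The following Python sketch is only an illustration and is not part of xs; the helper name flatten, and the use of nested Python lists to stand for parenthesized groups, are assumptions made for the example.

```python
def flatten(item):
    """Expand nested groups so the result has no hierarchy, as xs lists do."""
    if isinstance(item, list):
        out = []
        for sub in item:
            out.extend(flatten(sub))
        return out
    return [item]  # a single word is a list of length one


# ((one) () ((two three))) has the same elements as (one two three)
nested = [["one"], [], [["two", "three"]]]
print(flatten(nested))

# The null string is an element; the empty list contributes nothing.
print(flatten([""]), flatten([]))
```

The second print shows why '' and () differ: the first yields a list of length one, the second a list of length zero.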

Since lists can span multiple lines without explicit line continuations, they are ideal for long commands. For example:

switch $x \
 error { result 1 } \
 warning { break } \
 good { echo no problem }

can be written without explicit continuations as

switch $x (
 error { result 1 }
 warning { break }
 good { echo no problem }
)

Finally, note that some uses of parentheses do not follow the ordinary list rules: in let/local/%closure bindings and in assignments.

Concatenation

Two lists may be joined by the concatenation operator (^). A single word is a list of length one, so

echo foo^bar

produces the output

foobar

For lists of more than one element, concatenation produces the cross (Cartesian) product of the elements in both lists:

echo (a- b- c-)^(1 2)

produces the output

a-1 a-2 b-1 b-2 c-1 c-2
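
As a rough model of the rule above (a Python sketch, not xs code; the function name concat is an assumption for illustration), concatenation joins every element of the left list with every element of the right list, left element varying slowest:

```python
def concat(left, right):
    """Model of ^: join each left element with each right element."""
    return [l + r for l in left for r in right]


print(" ".join(concat(["foo"], ["bar"])))
print(" ".join(concat(["a-", "b-", "c-"], ["1", "2"])))
```

The two calls correspond to the two echo commands shown above.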

Variables

A list may be assigned to a variable, using the notation

var = list

Unfortunately, whitespace is required around the assignment operator; this allows = to be treated as an ordinary string character everywhere else. The same is true of assignments in let forms and the like. Any sequence of non-special characters, except a sequence including only digits, may be used as a variable name. Xs exports all user-defined variables into the environment unless it is explicitly told not to.

The value of a variable is referenced with the notation:

$var

Any variable which has not been assigned a value returns the empty list when referenced. In addition, multiple references are allowed:

a = foo
b = a
echo $$b

prints

foo

A variable's definition may also be removed by assigning the empty list to a variable:

var =

Multiple variables may be assigned with a single assignment statement. The left hand side of the assignment operation consists of a list of variables which are assigned, one by one, to the values in the list on the right hand side. If there are more variables than values in the list, the empty list is assigned to the remaining variables. If there are fewer variables than elements in the list, the last variable is bound to all the remaining list values.

For example,

(a b) = 1 2 3

has the same effect as

a = 1
b = 2 3

and

(a b c) = 1 2

is the same as

a = 1
b = 2
c =

Note that when assigning values to more than one variable, the list of variables must be enclosed in parentheses.
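
The binding rule for multiple assignment can be sketched in Python as follows. This is an illustrative model only; the function name bind and the dictionary result are assumptions, not anything xs provides.

```python
def bind(names, values):
    """Pair variable names with values following the rules described above."""
    bound = {}
    for i, name in enumerate(names):
        if i == len(names) - 1:
            bound[name] = values[i:]       # last variable takes all remaining values
        else:
            bound[name] = values[i:i + 1]  # one value, or the empty list if none are left
    return bound


print(bind(["a", "b"], ["1", "2", "3"]))
print(bind(["a", "b", "c"], ["1", "2"]))
```

The two calls mirror (a b) = 1 2 3 and (a b c) = 1 2 from the examples above.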

For “free careting” (see below) to work correctly, xs must make certain assumptions about what characters may appear in a variable name. Xs assumes that a variable name consists only of alphanumeric characters, percent (%), star (*), dash (-), and underscore (_). To reference a variable with other characters in its name, quote the variable name. Thus:
Thus:

echo $'we$IrdVariab!le'

A variable name produced by some complex operation, such as concatenation, should be enclosed in parentheses:

$(var)

Thus:

Good-Morning = Bonjour
Guten = Good
Morgen = Morning
echo $($Guten^-^$Morgen)

prints

Bonjour

Each element of the list in parentheses is treated as an independent variable and expanded separately. Thus, given the above definitions,

echo $(Guten Morgen)

prints

Good Morning

To count the number of elements in a variable, use

$#var

This returns a single-element list with the number of elements in $var.

Subscripting

Variables may be indexed with the notation

$var(n)

where n is a list of integers or ranges. Subscript indexes are based at one. The list of subscripts need not be in order or even unique. Thus, if

a = one two three

then

echo $a(3 3 3)

prints

three three three

Subscript indices which refer to nonexistent elements expand to the empty list. Thus, given the definition above

echo $a(3 1 4 1 5 9 2 6 5)

prints

three one one two

Subscript ranges are of the form lo...hi and refer to all the elements between lo and hi. If lo is omitted, then 1 is used as a default value; if hi is omitted, the length of the list is used. Thus

* = $*(2 ...)

removes the first element of *, similar to the effect of shift in rc(1) or sh(1).
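
The subscripting rules can be modelled with a short Python sketch. It is an illustration under stated assumptions: the names subscript and index_range are hypothetical, and ranges are expanded into plain index lists before selection.

```python
def subscript(values, indexes):
    """Select elements by one-based index; out-of-range indexes yield nothing."""
    out = []
    for i in indexes:
        if 1 <= i <= len(values):
            out.append(values[i - 1])
    return out


def index_range(values, lo=None, hi=None):
    """Expand lo...hi, defaulting lo to 1 and hi to the length of the list."""
    lo = 1 if lo is None else lo
    hi = len(values) if hi is None else hi
    return list(range(lo, hi + 1))


a = ["one", "two", "three"]
print(" ".join(subscript(a, [3, 1, 4, 1, 5, 9, 2, 6, 5])))
print(" ".join(subscript(a, index_range(a, lo=2))))  # like $a(2 ...)
```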

The notation $n, where n is an integer, is a shorthand for $*(n). Thus, xs's arguments may be referred to as $1, $2, and so on.

Note that the list of subscripts may be given by any xs expression, so

$var(`{awk 'BEGIN{for(i=1;i<=10;i++)print i;exit }'})

returns the first 10 elements of $var.

Free Carets

Xs inserts carets (concatenation operators) for free in certain situations, in order to save some typing on the user's behalf. For example, the following are all equivalent:

cc -O -g -c malloc.c alloca.c
cc -^(O g c) (malloc alloca)^.c
opts = O g c; files = malloc alloca; cc -$opts $files.c

Xs inserts a free-caret between the “-” and $opts, as well as between $files and .c. The rule for free carets is as follows: if a word or keyword is immediately followed by another word, keyword, dollar-sign or backquote without any intervening spaces, then xs inserts a caret between them.

Flattened Lists

To create a single-element list from a multi-element list, with the components space-separated, use

$^var

Flattening is useful when the normal list concatenation rules need to be bypassed. For example, to append a single period at the end of $path, use:

echo $^path.

Wildcard Expansion

Xs expands wildcards in filenames if possible. When the characters *, [ or ? occur in an argument or command, xs looks at the argument as a pattern for matching against files. (Contrary to the behavior some other shells exhibit, xs will only perform pattern matching if a metacharacter occurs unquoted and literally in the input. Thus,

foo = '*'
echo $foo

will always echo just a star. In order for non-literal metacharacters to be expanded, an eval statement must be used in order to rescan the input.) Pattern matching occurs according to the following rules: a * matches any number (including zero) of characters. A ? matches any single character, and a [ followed by a number of characters followed by a ] matches a single character in that class. The rules for character class matching are the same as those for ed(1), with the exception that character class negation is achieved with the tilde (~), not the caret (^), since the caret already means something else in xs. The filename component separator, slash (/), must appear explicitly in patterns. * and ? do not match a dot character (.) at the beginning of a filename component.

A tilde (~) as the first character of an argument is used to refer to home directories. A tilde alone or followed by a slash (/) is replaced by the value of $home, which is usually the home directory of the current user. A tilde followed by a username is replaced with the home directory of that user, according to getpwent(3).

Pattern Matching

The tilde (~) operator is used in xs for matching strings against wildcard patterns. The command

~ subject pattern pattern ...

returns a true value if and only if the subject matches any of the patterns. The matching follows the same rules as wildcard expansion, except that slashes (/) are not considered significant, leading dots (.) do not have to be matched explicitly, and home directory expansion does not occur. Thus

~ foo f*

returns zero (true), while

~ (bar baz) f*

returns one (false). The null list is matched by the null list, so

~ $foo ()

checks to see whether $foo is empty or not. This may also be achieved by the test

~ $#foo 0

Note that inside a ~ command xs does not match patterns against file names, so it is not necessary to quote the characters *, [ and ?. However, xs does expand the subject against filenames if it contains metacharacters. Thus, the command

~ * ?

returns true if any of the files in the current directory have a single-character name. Note that if the ~ command is given a list as its first argument, then a successful match against any of the elements of that list will cause ~ to return true. For example:

~ (foo goo zoo) z*

is true.
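
The any-subject, any-pattern rule can be approximated in Python with the standard fnmatch module. This is a sketch, not the xs matcher: the function name matches is hypothetical, the treatment of an empty subject list is an assumption drawn from the text above, and fnmatch negates character classes with ! rather than the tilde that xs uses.

```python
from fnmatch import fnmatchcase


def matches(subjects, patterns):
    """True if any subject matches any pattern, as the ~ command reports."""
    if not subjects:
        # Assumption: the null list is matched only by the null list.
        return not patterns
    return any(fnmatchcase(s, p) for s in subjects for p in patterns)


print(matches(["foo"], ["f*"]))
print(matches(["bar", "baz"], ["f*"]))
print(matches(["foo", "goo", "zoo"], ["z*"]))
print(matches([], []))
```

Like the ~ command, fnmatchcase gives no special meaning to slashes or leading dots.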

Pattern Extraction

The double-tilde (~~) operator is used in xs for extracting the parts of strings that match patterns. The command

~~ subject pattern pattern ...

returns the parts of each matching subject which correspond to the wildcards.

Each subject is checked in order against each pattern; if it matches the pattern, the parts of the subject which matched each *, ?, or [] character range are extracted, and processing moves on to the next subject. If the subject does not match, the next pattern is tried.

For example, the result of the extraction operation

~~ (foo.c foo.x bar.h) *.[ch]

is the list (foo c bar h).
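
One way to picture extraction is to translate each pattern into a regular expression with one capture group per wildcard. The Python sketch below does that under simplifying assumptions: the names glob_to_regex and extract are hypothetical, character classes are taken to be simple sets such as [ch] (no negation or escapes), and * is matched greedily.

```python
import re


def glob_to_regex(pattern):
    """Translate a wildcard pattern, capturing the text matched by each wildcard."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            parts.append("(.*)")
        elif ch == "?":
            parts.append("(.)")
        elif ch == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            parts.append("([" + pattern[i + 1:end] + "])")
            i = end  # skip past the closing bracket
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def extract(subjects, patterns):
    """For each subject, collect the wildcard matches of the first pattern that fits."""
    out = []
    for subject in subjects:
        for pattern in patterns:
            m = glob_to_regex(pattern).fullmatch(subject)
            if m:
                out.extend(m.groups())
                break  # move on to the next subject
    return out


print(extract(["foo.c", "foo.x", "bar.h"], ["*.[ch]"]))
```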

Arithmetic Substitution

A single list element can be formed from an infix arithmetical expression like so:

:( expression )

The expression can use any of the +, -, * and / operators, and use variable substitution of the form $var. Parentheses can be used in normal infix fashion for order of evaluation. Numbers can be entered in floating-point or integer, but are always decimal. Calculation rules are similar to C: any operation involving floats produces a float, and any operation with integers produces an integer. This includes division, so

echo :(1 / 2)
echo :(1.0 / 2)

produce 0 and 0.5 (with trailing zeros) respectively. Integers will wrap around at the same point as the native platform's int, and floats at the same point as the native platform's double.
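
The integer versus floating-point distinction for division can be modelled as in the Python sketch below. The function name divide is hypothetical, and rounding toward zero for negative integer operands is an assumption carried over from C rather than something stated here; wraparound and output formatting are not modelled.

```python
def divide(a, b):
    """Divide as described above: two integers give an integer, otherwise a float."""
    if isinstance(a, int) and isinstance(b, int):
        q = abs(a) // abs(b)  # magnitude of the quotient
        # Assumption: truncate toward zero, as C does.
        return q if (a < 0) == (b < 0) else -q
    return a / b


print(divide(1, 2))
print(divide(1.0, 2))
```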

Command Substitution

A list may be formed from the output of a command by using backquote substitution:

`{ command }

returns a list formed from the standard output of the command in braces. The characters stored in the variable $ifs (for “input field separator”) are used to split the output into list elements. By default, $ifs has the value space-tab-newline. The braces may be omitted if the command is a single word. Thus `ls may be used instead of `{ls}. This last feature is useful when defining functions that expand to useful argument lists. A frequent use is:

fn src { echo *.[chy] }

followed by

wc `src

(This will print out a word-count of all C and Yacc source files in the current directory.)

In order to override the value of $ifs for a single command substitution, use:

`` ifs-list { command }

$ifs will be temporarily ignored and the command's output will be split as specified by the list following the double backquote. For example:

`` :\n {cat /etc/passwd}

splits up /etc/passwd into fields.
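
Field splitting can be pictured with the Python sketch below. It is a model only: the function name split_output is hypothetical, and dropping empty fields (so that a run of separators counts as one) is an assumption, not something this page specifies.

```python
import re


def split_output(output, ifs):
    """Split command output on any character found in the ifs list."""
    seps = "".join(ifs)
    if not seps:
        return [output] if output else []
    fields = re.split("[" + re.escape(seps) + "]", output)
    return [f for f in fields if f]  # assumption: empty fields are dropped


# Default separators: space, tab, newline
print(split_output("alpha beta\tgamma\n", [" ", "\t", "\n"]))

# Separators as in `` :\n {...}: colon and newline
print(split_output("root:x:0\nbin:x:1\n", [":\n"]))
```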

Return Values

The return value of a command is obtained with the construct

<={ command }

The return value of an external program is its exit status (which in other shells can be found in special variables such as $? or $status), as either a small integer or the name of a signal. Thus

echo <={test -f /etc/motd} <={test -w /vmunix} <=a.out

might produce the output

0 1 sigsegv+core

along with any output or error messages from the programs.

Xs functions and primitives can produce “rich return values,” that is, arbitrary lists as return values.

When return values are interpreted as truth values, an extension of the normal shell conventions applies. If any element of a list is not equal to “0” (or the empty string), that list is considered false.
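
Put another way, a return value is true only when every element is 0 or the empty string. A minimal Python sketch of that test follows; the function name is_true is hypothetical, and treating the empty list as true follows from the rule rather than from an explicit statement here.

```python
def is_true(result):
    """A result list is true unless some element is neither "0" nor ""."""
    return all(word in ("0", "") for word in result)


print(is_true(["0"]))
print(is_true(["1"]))
print(is_true(["0", "sigsegv+core"]))
print(is_true([]))
```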

The return value of an assignment operation is the assigned value.

Logical Operators

There are a number of operators in xs which depend on the exit status of a command.

command1 && command2

executes the first command and then executes the second command if and only if the first command has a “true” return value.

command1 || command2

executes the first command and then executes the second command if and only if the first command has a “false” return value.

! command

inverts the truth value of the exit status of a command.

Input and output

The standard output of a command may be redirected to a file with

command > file

and the standard input may be taken from a file with

command < file

File descriptors other than 0 and 1 may be specified also. For example, to redirect standard error to a file, use:

command >[2] file

In order to duplicate a file descriptor, use >[n=m]. Thus to redirect both standard output and standard error to the same file, use

command > file >[2=1]

To close a file descriptor that may be open, use >[n=]. For example, to close file descriptor 7:

command >[7=]

In order to place the output of a command at the end of an already existing file, use:

command >> file

If the file does not exist, then it is created.

To open a file for reading and writing, use the <> redirection operator; for reading and appending, use <>>. Both of these operators use file descriptor 0 (standard input) by default. Similarly, >< truncates a file and opens it for reading and writing, and >>< opens a file for reading and appending; these operators use file descriptor 1 by default.

“Here documents” are supported as in sh(1) with the use of

command << 'eof-marker'

If the end-of-file marker is quoted, then no variable substitution occurs inside the here document. Otherwise, every variable is substituted by its space-separated-list value (see Flattened Lists, above), and if a ^ character follows a variable name, it is deleted. This allows the unambiguous use of variables adjacent to text, as in

$variable^follow

To include a literal $ in a here document created with an unquoted end-of-file marker, use $$.

Additionally, xs supports “here strings”, which are like here documents, except that input is taken directly from a string on the command line. Its use is illustrated here:

cat <<< 'this is a here string' | wc

(This feature enables xs to export functions that use here documents.)

Pipes

Two or more commands may be combined in a pipeline by placing the vertical bar (|) between them. The standard output (file descriptor 1) of the command on the left is tied to the standard input (file descriptor 0) of the command on the right. The notation |[n=m] indicates that file descriptor n of the left process is connected to file descriptor m of the right process. |[n] is a shorthand for |[n=0]. As an example, to pipe the standard error of a command to wc(1), use:

command |[2] wc

The exit status of a pipeline is considered true if and only if every command in the pipeline exits true.

Input/Output Substitution

Some commands, like cmp(1) or diff(1), take their input from named files on the command line, and do not use standard input. It is convenient sometimes to build nonlinear pipelines so that a command like cmp can read the output of two commands at once. Xs does it like this:

cmp <{command1} <{command2}

compares the output of the two commands. Note: on some systems, this form of redirection is implemented with pipes, and since one cannot lseek(2) on a pipe, commands that use lseek will hang. For example, most versions of diff seek on their inputs.

Data can be sent down a pipe to several commands using tee(1) and the output version of this notation:

echo hi there | tee >{sed 's/^/p1 /'} >{sed 's/^/p2 /'}

Program Fragments

Xs allows the intermixing of code with strings. A program fragment, which is a group of commands enclosed in braces ({ and }), may be used anywhere a word is expected, and is treated as an indivisible unit. For example, a program fragment may be passed as an argument, stored in a variable, or written to a file or pipe. If a program fragment appears as the first word in a command, it is executed, and any arguments are ignored. Thus the following all produce the same output:

{ echo hello, world }
{ echo hello, world } foo bar
xs -c { echo hello, world }
x = { echo hello, world }; $x
echo { echo hello, world } | xs
echo { echo hello, world } > foo; xs < foo

Since program fragments in the first position in a command are executed, braces may be used as a grouping mechanism for commands. For example, to run several commands, with output from all of them redirected to the same file, one can do

{ date; ps agux; who } > snapshot

In addition, program fragments can continue across multiple physical lines without explicit line continuations, so the above command could also be written:

{
 date
 ps agux
 who
} > snapshot

A lambda is a variant on a program fragment which takes arguments. A lambda has the form

{ | parameters | commands }

The parameters are one or more variable names, to which arguments of the lambda are assigned while the commands are run. The first argument is assigned to the first variable, the second to the second, and so on. If there are more arguments than parameters, the last named variable is assigned all the remaining arguments; if there are fewer, the parameters for which there are no arguments are bound to the empty list.

Lambdas, like other program fragments, can appear anywhere in a list. For example:

{ |cmd arg| $cmd $arg } { |*| echo $* } hi

This command executes a lambda which runs its first argument, named cmd, using its second argument, named arg, as the argument for the first. The first argument of this function is another lambda, which echoes its arguments, and the second argument is the word hi.

These lambda expressions

{ |a b c| echo $c $b $a } 1 2
{ |a b c| echo $c $b $a } 1 2 3 4 5

produce this output:

2 1
3 4 5 2 1

Functions

A function in xs is introduced with the syntax

fn name parameters { commands }

If the function name appears as the first word of a command, the commands are run, with the named parameters bound to the arguments to the function.

The similarity between functions and lambdas is not coincidental. A function in xs is a variable of the form fn-name. If a name for which the appropriate fn- variable exists is found in the first position of a command, the value of the variable is substituted for the first word. The above syntax for creating functions is equivalent to the variable assignment

fn-name = { | parameters | commands }

Functions may be deleted with the syntax

fn name

which is equivalent to the assignment

fn-name =

If, as the most common case, a function variable is bound to a lambda, when the function is invoked, the variable $0 is bound (dynamically, see below) to the name of the function.

Lambdas are just another form of code fragment, and, as such, can be exported in the environment, passed as arguments, etc. The central difference between the two forms is that lambdas bind their arguments, while simple brace-enclosed groups just ignore theirs.

Local Variables

Variable assignments may be made local to a set of commands with the local construct:

local (var = value; var = value ...) command

The command may be a program fragment, so for example:

local (path = /bin /usr/bin; ifs = ) {
 ...
}

sets path to a minimal useful path and removes ifs for the duration of one long compound command.

Local-bound variables are exported into the environment, and will invoke appropriately named settor functions (see below).

Lexically Scoped Variables

In addition to local variables, xs supports a different form of temporary variable binding, using let-bound, or “lexically scoped,” variables. (Lexical scoping is the form of binding used by most compiled programming languages, such as C or Scheme.) A lexically scoped variable is introduced with a let statement:

let (var = value; var = value ...) command

(Also, the "= value" can be left out, with the same effect as "var =".)

All references to any of the variables defined in a let statement by any code located lexically (that is, textually) within the command portion of the statement will refer to the let-bound variable rather than any environment or local-bound variable; the immediate text of the let statement is the complete extent of that binding. That is, lexically bound variables surrounding code fragments follow those code fragments around.

An example best shows the difference between let and local (also known as “dynamic”) binding: (note that “; ” is xs's default prompt.)

; x = foo
; let (x = bar) {
 echo $x
 fn lexical { echo $x }
}
bar
; local (x = baz) {
 echo $x
 fn dynamic { echo $x }
}
baz
; lexical
bar
; dynamic
foo
;

Lexically bound variables are not exported into the environment, and never cause the invocation of settor functions. Function (lambda) parameters are lexically bound to their values.
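
The difference between the two kinds of binding can also be modelled in Python, where closures are lexical and a shared dictionary can stand in for the dynamic environment. This is an analogy only; the names dynamic_env, make_lexical and with_local are hypothetical.

```python
dynamic_env = {"x": "foo"}  # top-level, dynamically bound x


def make_lexical():
    x = "bar"  # like let (x = bar): the closure keeps this binding
    return lambda: x


def with_local(name, value, body):
    """Bind name dynamically while body runs, then restore the old value."""
    saved = dynamic_env.get(name)
    dynamic_env[name] = value
    try:
        return body()
    finally:
        if saved is None:
            dynamic_env.pop(name, None)
        else:
            dynamic_env[name] = saved


lexical = make_lexical()
dynamic = lambda: dynamic_env["x"]  # looks x up when called

print(with_local("x", "baz", dynamic))  # baz, while the local binding is active
print(lexical())                        # bar
print(dynamic())                        # foo
```

As in the transcript above, the lexical function carries its binding with it, while the dynamic one sees whatever value is current when it runs.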

For loops

The command

for var list { command }

runs the command once for each element of the list, with the named variable bound lexically to each element of the list, in order. Note that if list consists of more than a single term, for example (a b c), it must be parenthesized.

If multiple bindings are given in the for statement, the looping occurs in parallel and stops when all lists are exhausted. When one list is finished before the others, the corresponding variable is bound to the empty list for the remaining iterations. Thus the loop

for i (a b c); j (x y) { echo $#i $i $#j $j}

produces the output

1 a 1 x
1 b 1 y
1 c 0
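
The parallel iteration rule can be sketched in Python with itertools.zip_longest. The generator name parallel_for is hypothetical; each loop variable is modelled as a list of zero or one elements so that an exhausted list binds the empty list.

```python
from itertools import zip_longest


def parallel_for(*lists):
    """Yield one tuple of bindings per iteration; exhausted lists bind ()."""
    for row in zip_longest(*lists, fillvalue=None):
        yield tuple([] if item is None else [item] for item in row)


for i, j in parallel_for(["a", "b", "c"], ["x", "y"]):
    # Mirrors: echo $#i $i $#j $j
    print(" ".join([str(len(i))] + i + [str(len(j))] + j))
```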

Settor Functions

A settor function is a variable of the form set-var, which is typically bound to a lambda. Whenever a value is assigned to the named variable, the lambda is invoked with its arguments bound to the new value. While the settor function is running, the variable $0 is bound to the name of the variable being assigned. The result of the settor function is used as the actual value in the assignment.

For example, the following settor function is used to keep the shell variables home and HOME synchronized.

set-HOME = { |*|
 local (set-home = )
 home = $*
 result $*
}

This settor function is called when any assignment is made to the variable HOME. It assigns the new value to the variable home, but disables any settor function for home to prevent an infinite recursion. Then it returns its argument unchanged for use in the actual assignment to HOME.

Settor functions do not apply to lexically bound variables.
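
The interaction between two mutually synchronizing settors can be modelled with the Python sketch below. It is a toy, not how xs is implemented: the class Variables, its methods, and the path /home/alice are all hypothetical.

```python
class Variables:
    """Toy variable store with settor hooks."""

    def __init__(self):
        self.values = {}
        self.settors = {}  # variable name -> function(new value) -> stored value

    def assign(self, name, value):
        settor = self.settors.get(name)
        if settor is not None:
            value = settor(value)  # the settor's result is the stored value
        self.values[name] = value

    def mirror(self, source, target):
        """Install a settor on source that copies each new value to target."""
        def settor(value):
            saved = self.settors.pop(target, None)  # like local (set-target = )
            try:
                self.assign(target, value)
            finally:
                if saved is not None:
                    self.settors[target] = saved
            return value  # like result $*
        self.settors[source] = settor


env = Variables()
env.mirror("HOME", "home")
env.mirror("home", "HOME")
env.assign("HOME", ["/home/alice"])
print(env.values["home"])
```

Temporarily removing the other variable's settor is what keeps the two hooks from calling each other forever.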

Primitives

Primitives are internal xs operations that cannot or should not (for reasons of performance) be written in the interpreter's language. The set of primitives makes up the run-time library for xs.

Primitives can be used with the syntax

$&name

A primitive can be used anywhere a lambda is expected. The list of primitives is returned as the result of running the primitive $&primitives.

For details on specific primitives, see the section entitled Primitives below.

Exceptions

Exceptions in xs are used for many forms of non-structured control flow, notably error reporting, signals, and flow of control constructs such as escape.

Exceptions are passed up the call chain to catching routines. A catcher may decide to intercept an exception, retry the code that caused the exception, or pass the exception along. There can only be one exception raised at any time.

Exceptions are represented by lists. The first word of an exception is, by convention, the type of exception being raised. The following exceptions are known:

eof
Raised by %parse when the end of input is reached.
error source message
A run-time error. Almost all shell errors are reported with the error exception. The default interactive loop and the outermost level of the interpreter catch this exception and print the message. Source is the name of the routine (typically a primitive) which raised the error.
retry
When raised from a signal catcher, causes the body of the catch clause to be run again.
signal signame
Raised when the shell itself receives a signal, and the signal is listed in the variable signals. Signame is the name of the signal that was raised.

See the builtin commands catch and throw for details on how to manipulate exceptions.

Special Variables

Several variables are known to xs and are treated specially. Redefining these variables can change interpreter semantics. Note that only dynamically bound (top-level or local-bound) variables are interpreted in this way; the names of lexically bound variables are unimportant.

*
The argument list of xs. $1, $2, etc. are the same as $*(1), $*(2), etc.
$0
Holds the value of argv[0] with which xs was invoked. Additionally, $0 is set to the name of a function for the duration of the execution of that function, and $0 is also set to the name of the file being interpreted for the duration of a . command.
apid
The process ID of the last process started in the background.
history
The name of a file to which commands are appended as xs reads them. This facilitates the use of a stand-alone history program (such as history(1)) which parses the contents of the history file and presents them to xs for reinterpretation. If history is not set (the default), then xs does not append commands to any file.
home
The current user's home directory, used in tilde (~) expansion, as the default directory for the builtin cd command, and as the directory in which xs looks to find its initialization file, .xsrc, if xs has been started up as a login shell. Like path and PATH, home and HOME are aliased to each other.
ifs
The default input field separator, used for splitting up the output of backquote commands for digestion as a list. The initial value of ifs is space-tab-newline.
noexport
A list of variables which xs will not export. All variables except for the ones on this list and lexically bound variables are exported.
path
This is a list of directories to search in for commands. The empty string stands for the current directory. Note also that an assignment to path causes an automatic assignment to PATH, and vice-versa. If neither path nor PATH is set at startup time, path assumes a default value suitable for your system. This is typically /usr/ucb /usr/bin /bin ''.
pid
The process ID of the currently running xs.
prompt
This variable holds the two prompts (in list form) that xs prints. $prompt(1) is printed before each command is read, and $prompt(2) is printed when input is expected to continue on the next line. (See %parse for details.) xs sets $prompt to ('; ' '') by default. The reason for this is that it enables an xs user to grab commands from previous lines using a mouse, and to present them to xs for re-interpretation; the semicolon prompt is simply ignored by xs. The null $prompt(2) also has its justification: an xs script, when typed interactively, will not leave $prompt(2)'s on the screen, and can therefore be grabbed by a mouse and placed directly into a file for use as a shell script, without further editing being necessary.
signals

Contains a list of the signals which xs traps. Any signal name which is added to this list causes that signal to raise an xs exception. For example, to run some commands and make sure some cleanup routine is called even if the user interrupts or disconnects during the script, one can use the form:

local (signals = $signals sighup sigint) {
 catch { |e|
 cleanup
 throw $e
 } {
 ...
 }
}

A signal name prefixed by a hyphen (-) causes that signal to be ignored by xs and all of its child processes, unless one of them resets its handler. A signal prefixed by a slash (/) is ignored in the current shell, but retains default behavior in child processes. In addition, the signal sigint may be preceded by the prefix (.) to indicate that normal shell interrupt processing (i.e., the printing of an extra newline) occurs. By default xs starts up with the values

.sigint /sigquit /sigterm

in $signals; other values will be on the list if the shell starts up with some signals ignored.

The values of path and home are derived from the environment values of PATH and HOME if those values are present. This is for compatibility with other Unix programs, such as sh(1). $PATH is assumed to be a colon-separated list.

Syntactic Sugar

xs internally rewrites much of the syntax presented thus far in terms of calls to shell functions. Most features of xs that resemble traditional shell features are included in this category. This rewriting occurs at parse time, as commands are recognized by the interpreter. The shell functions that are the results of rewriting are some of the hook functions documented below.

The following tables list all of the major rewriting which xs does, with the forms typically entered by the user on the left and their internal form on the right. There is no reason for the user to avoid using the right-hand side forms, except that they are usually less convenient. To see the internal form of a specific command, a user can run xs with the -n and -x options; when invoked in this way, the shell prints the internal form of its commands rather than executing them.

Control Flow

! cmd	%not {cmd}
cmd &	%background {cmd}
cmd1 ; cmd2	%seq {cmd1} {cmd2}
cmd1 && cmd2	%and {cmd1} {cmd2}
cmd1 || cmd2	%or {cmd1} {cmd2}
fn name args { cmd }	fn-^name = { |args| cmd}

Input/Output Commands

cmd < file	%open 0 file {cmd}
cmd > file	%create 1 file {cmd}
cmd >[n] file	%create n file {cmd}
cmd >> file	%append 1 file {cmd}
cmd <> file	%open-write 0 file {cmd}
cmd <>> file	%open-append 0 file {cmd}
cmd >< file	%open-create 1 file {cmd}
cmd >>< file	%open-append 1 file {cmd}
cmd >[n=]	%close n {cmd}
cmd >[m=n]	%dup m n {cmd}
cmd << tag input tag	%here 0 input {cmd}
cmd <<< string	%here 0 string {cmd}
cmd1 | cmd2	%pipe {cmd1} 1 0 {cmd2}
cmd1 |[m=n] cmd2	%pipe {cmd1} m n {cmd2}
cmd1 >{ cmd2 }	%writeto var {cmd2} {cmd1 $var}
cmd1 <{ cmd2 }	%readfrom var {cmd2} {cmd1 $var}
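
The file redirection rows of the table above can be restated as data. The Python sketch below is only a model, not xs code: the rewrite helper is hypothetical, and the mode column is an assumption inferred from the hook descriptions in the Hook Functions section and the fopen(3) mode meanings; the table itself does not state it.

```python
# (operator, hook function, default file descriptor, assumed fopen(3) mode)
REDIRECTIONS = [
    ("<",   "%open",        0, "r"),
    (">",   "%create",      1, "w"),
    (">>",  "%append",      1, "a"),
    ("<>",  "%open-write",  0, "r+"),
    ("<>>", "%open-append", 0, "a+"),
    ("><",  "%open-create", 1, "w+"),
    (">><", "%open-append", 1, "a+"),
]


def rewrite(operator, filename, cmd, fd=None):
    """Build the internal form of `cmd OP file` as a string.

    fd models the optional [n] after the operator; when it is omitted,
    the operator's default file descriptor from the table is used.
    """
    for op, hook, default_fd, _mode in REDIRECTIONS:
        if op == operator:
            n = default_fd if fd is None else fd
            return "%s %d %s {%s}" % (hook, n, filename, cmd)
    raise ValueError("unknown redirection operator: " + operator)
```

For example, rewrite(">", "err.out", "make", fd=2) models the row for cmd >[n] file with n set to 2.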

Expressions

$#var	<={%count $var}
$^var	<={%flatten ' ' $var}
`{cmd args}	<={%backquote <={%flatten '' $ifs} {cmd args}}
``ifs {cmd args}	<={%backquote <={%flatten '' ifs} {cmd args}}
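
As a rough model of the two list operations used in these rewritings, the following Python sketch treats an xs list as a Python list of strings. The function names are stand-ins for %count and %flatten, and the sample ifs value (space, tab, newline) is an assumption made for illustration.

```python
def count(lst):
    """Model of %count: the number of elements in the list."""
    return len(lst)


def flatten(separator, lst):
    """Model of %flatten: join the elements with the separator string."""
    return separator.join(lst)


var = ["a", "b", "c"]

# $#var corresponds to count(var); $^var to flatten(" ", var).
n = count(var)
joined = flatten(" ", var)

# The backquote form flattens $ifs with an empty separator, so a list of
# separator characters becomes one string handed to %backquote.
ifs = [" ", "\t", "\n"]   # assumed value, for illustration only
separator_string = flatten("", ifs)
```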

Builtins

Builtin commands are shell functions that exist at shell startup time. Most builtins are indistinguishable from external commands, except that they run in the context of the shell itself rather than as a child process. Many builtins are implemented with primitives (see below).

Some builtin functions have names that begin with a percent character (%). These are commands with some special meaning to the shell, or are meant for use only by users customizing the shell. (This distinction is somewhat fuzzy, and the decisions about which functions have %-names are somewhat arbitrary.)

All builtins can be redefined and extended by the user.

Builtin Commands

. [-einvx] file [args ...]
Reads file as input to xs and executes its contents. The options are a subset of the invocation options for the shell (see below).
access [-n name] [-1e] [-rwx] [-fdcblsp] path ...

Tests if the named paths are accessible according to the options presented. Normally, access returns zero (true) for files which are accessible and a printable error message (which evaluates as false, according to shell rules) for files which are not accessible. If the -1 option is used, the name of the first file which the test succeeds for is returned; if the test succeeds for no file, the empty list is returned. However, if the -e option was used, access raises an error exception. If the -n option is used, the pathname arguments are treated as a list of directories, and the name option argument is used as a file in those directories (i.e., -n is used for path searching).

The default test is whether a file exists. These options change the test:

-r
Is the file readable (by the current user)?
-w
Is the file writable?
-x
Is the file executable?
-f
Is the file a plain file?
-d
Is the file a directory?
-c
Is the file a character device?
-b
Is the file a block device?
-l
Is the file a symbolic link?
-s
Is the file a socket?
-p
Is the file a named pipe (FIFO)?
alias alias-name expansion...

Defines a new function, alias-name, which calls expansion. The first command in expansion is replaced with its whatis value to prevent recursion. This serves a purpose similar to alias in bash. For example, the following forces ls to use color, and makes l run ls in long form, with color (because of the previous alias):

alias ls ls --color=yes
alias l ls -l
break value
Exits the current loop. Value is used as the return value for the loop command.
catch catcher body
Runs body. If it raises an exception, catcher is run and passed the exception as an argument.
cd [directory]
Changes the current directory to directory. With no argument, cd changes the current directory to $home.
echo [-n] [--] args ...
Prints its arguments to standard output, terminated by a newline. Arguments are separated by spaces. If the first argument is -n no final newline is printed. If the first argument is --, then all other arguments are echoed literally; this is used for echoing a literal -n.
escape lambda

Runs lambda with one argument, an escape block which, when evaluated, returns to the point after this escape. This is more formally referred to as an escape continuation. In fact, its behaviour is a simple subset of exceptions, and it is implemented fairly simply using catch. Escape is useful for replacing return- and break-like constructs; for example

fn f { escape |fn-return| {
    ...; return 0;
    ...
} }
will exit the function with result 0 when it reaches the return.
eval list
Concatenates the elements of list with spaces and feeds the resulting string to the interpreter for rescanning and execution.
exec cmd

Replaces xs with the given command. If the exec contains only redirections, then these redirections apply to the current shell and the shell does not exit. For example,

exec {>[2] err.out}
places further output to standard error in the file err.out. Unlike some other shells, xs requires that redirections in an exec be enclosed in a program fragment.
exit [status]
Causes the current shell to exit with the given exit status. If no argument is given, zero (true) is used. (This is different from other shells, which often use the status of the last command executed.)
false
Always returns a false (non-zero) return value.
forever cmd
Runs the command repeatedly, until the shell exits or the command raises an exception. This is equivalent to a while {true} {cmd} loop except that forever does not catch any exceptions, including break.
fork cmd

Runs a command in a subshell. This insulates the parent shell from the effects of state changing operations such as cd and variable assignments. For example:

fork {cd ..; make}
runs make(1) in the parent directory (..), but leaves the shell in the current directory.
if [test then-action] [else else-action]

Evaluates the command test. If the result is true, then-action is run and if completes. If the result of the test is false, else-action is run. The else-action doesn't require braces no matter the number of actions, so one can write code like:

... } else if {~ $a $b} { ... }

Note that:

...}
else if {~ $a $b} { ... }

with the else on a separate line, will only work if the if command has parentheses wrapping its body and else statements.

limit [-h] [resource [value]]

Similar to the csh(1) limit builtin, this command operates upon the resource limits of a process. With no arguments, limit prints all the current limits; with one argument, limit prints the named limit; with two arguments, it sets the named limit to the given value. The -h flag displays/alters the hard limits. The resources which can be shown or altered are cputime, filesize, datasize, stacksize, coredumpsize and memoryuse. For example:

limit coredumpsize 0
disables core dumps.
The limit values must either be the word “unlimited” or a number with an optional suffix indicating units. For size limits, the suffixes k (kilobytes), m (megabytes), and g (gigabytes) are recognized. For time limits, s (seconds), m (minutes), and h (hours) are known; in addition, times of the form hh:mm:ss and mm:ss are accepted. See getrlimit(2) for details on resource limit semantics.
map action list
Calls action with a single argument for each element of list. Since lists auto-expand, list contains the rest of the arguments to the command. Returns the list of results of each action. If action returns a list, it is expanded inside the returned list.
newpgrp
Puts xs into a new process group. This builtin is useful for making xs behave like a job-control shell in a hostile environment. One example is the NeXT Terminal program, which implicitly assumes that each shell it forks will put itself into a new process group. Note that the controlling tty for the process must be on standard error (file descriptor 2) when this operation is run.
omap action list
Like map, but returns the list of the outputs of action, in the same form as if `` '' action were called.
popd
Changes to the directory that was last pushed by pushd, popping that directory off an internal stack. If the internal stack is empty (for example, if pushd has not been called), popd stays in the current directory. Also prints the stack, starting from the top.
pushd [dir]
Adds the absolute path of dir to the stack used by popd. Also prints the stack.
result value ...
Returns its arguments. This is xs's identity function.
switch value [case1 action1]...[default-action]
Go through the list of cases, testing if they are equal to value. The matching action of the first case which matches is executed. The break exception can be used to manually exit the switch, but is not necessary to signify the end of an action (unlike in C).
until test body
Identical to while, except that test is negated.
throw exception arg ...
Raise the named exception, passing all of the arguments to throw to the enclosing exception handler.
time cmd arg ...
Prints, on the shell's standard error, the real, user, and system time consumed by executing the command.
true
Always returns a true (zero) return value.
umask [mask]
Sets the current umask (see umask(2)) to the octal mask. If no argument is present, the current mask value is printed.
unwind-protect body cleanup
Runs body and, when it completes or raises an exception, runs cleanup.
var var ...
Prints definitions of the named variables, suitable for being used as input to the shell.
vars [-vfs] [-epi]

Prints all shell variables, functions, and settor functions (in a form suitable for use as shell input), which match the criteria specified by the options.

-v
variables (that are not functions or settor functions)
-f
functions
-s
settor functions
-e
exported values
-p
private (not exported) values
-i
internal (predefined and builtin) values
-a
all of the above
If none of -v, -f, or -s are specified, -v is used. If none of -e, -p, or -i are specified, -e is used.
wait [pid]
Waits for the specified pid, which must have been started by xs. If no pid is specified, waits for any child process to exit.
whatis program ...
For each named program, prints the pathname, primitive, lambda, or code fragment which would be run if the program appeared as the first word of a command.
while test body
Evaluates the test and, if it is true, runs the body and repeats.
%read
Reads from standard input and returns either the empty list (in the case of end-of-file) or a single element string with up to one line of data, including possible redirections. This function reads one character at a time in order to not read more data out of a pipe than it should. The terminating newline (if present) is not included in the returned string.
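
The value syntax accepted by limit, described earlier in this list, can be sketched as two small parsers. This Python code is an illustration only, not the xs implementation: the function names are hypothetical, and it assumes that size suffixes are binary multiples (k = 1024 bytes), that an unsuffixed size is in bytes, and that an unsuffixed time is in seconds. Sizes and times are parsed separately because the suffix m means megabytes for one and minutes for the other.

```python
SIZE_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}  # assumed multiples
TIME_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_size(text):
    """Return a size in bytes, or None for the word "unlimited"."""
    if text == "unlimited":
        return None
    if text[-1] in SIZE_UNITS:
        return int(text[:-1]) * SIZE_UNITS[text[-1]]
    return int(text)


def parse_time(text):
    """Return a time in seconds, or None for the word "unlimited"."""
    if text == "unlimited":
        return None
    if ":" in text:
        # hh:mm:ss or mm:ss; each field is worth 60 of the next one
        seconds = 0
        for field in text.split(":"):
            seconds = seconds * 60 + int(field)
        return seconds
    if text[-1] in TIME_UNITS:
        return int(text[:-1]) * TIME_UNITS[text[-1]]
    return int(text)
```

Under these assumptions, the value 0 in limit coredumpsize 0 parses to zero bytes.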

Hook Functions

A subset of the %-named functions are known as “hook functions.” The hook functions are called to implement some internal shell operations, and are available as functions in order that their values can be changed. Typically, a call to a hook function is from code generated by the syntactic sugar rewritings.

%and cmd ...
Runs the commands in order, stopping after the first one that has a false return value. Returns the result of the last command run.
%append fd file cmd
Runs the command with file descriptor fd set up to append to the file.
%background cmd
Runs the command in the background. The shell variable apid contains the process ID of the background process, which is printed if the shell is interactive (according to %is-interactive).
%backquote separator cmd
Runs the command in a child process and returns its standard output as a list, separated (with the same rules used in %split) into elements according to separator.
%batch-loop
Parses commands from the current input source and passes the commands to the function %dispatch, which is usually a dynamically bound identifier. This function catches the exception eof which causes it to return. This function is invoked by the shell on startup and from the dot (.) and eval commands, when the input source is not interactive. (See also %interactive-loop.)
%close fd cmd
Runs the command with the given file descriptor closed.
%count list
Returns the number of arguments to the primitive.
%create fd file cmd
Runs the command with file descriptor fd set up to write to the file.
%dup newfd oldfd cmd
Runs the command with the file descriptor oldfd copied (via dup(2)) to file descriptor newfd.
%eval-noprint cmd
Run the command. (Passed as the argument to %batch-loop and %interactive-loop.)
%eval-print cmd
Print and run the command. (Passed as the argument to %batch-loop and %interactive-loop when the -x option is used.)
%exec-failure file argv0 args ...
This function, if it exists, is called in the context of a child process if an executable file was found but execve(2) could not run it. If the function returns, an error message is printed and the shell exits, but the function can exec a program if it thinks it knows what to do. Note that the name of the program appears twice in the arguments to %exec-failure, once as a filename and once as the first element of the argv array; in some cases the two will be identical, but in others the former will be a full pathname and the latter will just be the basename. Some versions of xs may provide a builtin version of this function to handle #!-style shell scripts if the kernel does not.
%exit-on-false cmd
Runs the command, and exits if any command (except those executing as the tests of conditional statements) returns a non-zero status. (This function is used as an argument to %batch-loop and %interactive-loop when the shell is invoked with the -e option.)
%flatten separator list
Concatenate the elements of list into one string, separated by the string separator.
%here fd word ... cmd
Runs the command with the words passed as input on file descriptor fd.
%home [user]
Returns the home directory of the named user, or $home if there are no arguments.
%interactive-loop
Prompts, parses commands from the current input source and passes the commands to the function %dispatch, which is usually a dynamically bound identifier. This function catches the exception eof which causes it to return. This function is invoked by the shell on startup and from the dot (.) command, when the input source is interactive. (See also %batch-loop.)
%noeval-noprint cmd
Do nothing. (Passed as the argument to %batch-loop and %interactive-loop when the -n option is used.)
%noeval-print cmd
Print but don't run the command. (Passed as the argument to %batch-loop and %interactive-loop when the -x and -n options are used.)
%not cmd
Runs the command and returns false if its exit status was true, otherwise returns true.
%one list
If list is one element long, %one returns its value; otherwise it raises an exception. %one is used to ensure that redirection operations get passed exactly one filename.
%open fd file cmd
Runs the command with file open for reading on file descriptor fd.
%open-append fd file cmd
Runs the command with file open for reading and appending on file descriptor fd.
%open-create fd file cmd
Runs the command with file open for reading and writing on file descriptor fd. If the file already exists, it is truncated.
%open-write fd file cmd
Runs the command with file open for reading and writing on file descriptor fd.
%openfile mode fd file cmd
Runs the command with file opened according to mode on file descriptor fd. The modes (r, w, a, r+, w+, and a+) have the same meanings in %openfile as they do in fopen(3). %openfile is invoked by the redirection hook functions: %append, %create, %open, %open-append, %open-create, and %open-write.
%or cmd ...
Runs the commands in order, stopping after the first one that has a true return value. Returns the result of the last command run.
%parse prompt1 prompt2
Reads input from the current input source, printing prompt1 before reading anything and prompt2 before reading continued lines. Returns a code fragment suitable for execution. Raises the exception eof on end of input.
%pathsearch program
Looks for an executable file named program in the directories listed in $path. If such a file is found, it is returned; if one is not found, an error exception is raised.
%pipe cmd [outfd infd cmd] ...
Runs the commands, with the file descriptor outfd in the left-hand process connected by a pipe to the file descriptor infd in the right-hand process. If there are more than two commands, a multi-stage pipeline is created.
%prompt
Called by %interactive-loop before every call to %parse. This function allows the user to provide any actions that he or she may wish to have executed before being prompted (e.g., updating the value of the prompt variable to contain all or part of the current working directory).
%readfrom var input cmd
Runs cmd with the variable var locally bound to the name of a file which contains the output of running the command input.
%seq cmd ...
Runs the commands, in order.
%whatis program ...
For each named program, returns the pathname, primitive, lambda, or code fragment which would be run if the program appeared as the first word of a command.
%writeto var output cmd
Runs cmd with the variable var locally bound to the name of a file which is used as the input for the command output.
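
The short-circuit behavior of %and and %or described above can be modeled in a few lines. In this Python sketch, which is not xs code, a command is a callable returning an exit status and zero means true, as elsewhere in this manual. The helper names and the results chosen for an empty argument list are assumptions.

```python
def hook_and(*cmds):
    """Model of %and: stop after the first false (non-zero) result."""
    result = 0  # assumption: with no commands the result is true
    for cmd in cmds:
        result = cmd()
        if result != 0:
            break
    return result


def hook_or(*cmds):
    """Model of %or: stop after the first true (zero) result."""
    result = 1  # assumption: with no commands the result is false
    for cmd in cmds:
        result = cmd()
        if result == 0:
            break
    return result


calls = []  # records which commands actually ran


def make(name, status):
    """Build a fake command that logs its name and returns status."""
    def cmd():
        calls.append(name)
        return status
    return cmd


# Models: a && b && c, where b fails with status 2; c never runs.
status = hook_and(make("a", 0), make("b", 2), make("c", 0))
ran = list(calls)
```

In both models the value returned is that of the last command actually run, matching the descriptions of %and and %or.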

Utility Functions

These functions are useful for people customizing the shell, may be used by other builtin commands, and probably don't make much sense to replace, though that is always possible.

%apids
Returns the process IDs of all background processes that the shell has not yet waited for.
%fsplit separator [args ...]
Splits its arguments into separate strings at every occurrence of any of the characters in the string separator. Repeated instances of separator characters cause null strings to appear in the result. (This function is used by some builtin settor functions.)
%is-interactive
Returns true if the current interpreter context is interactive; that is, if shell command input is currently coming from an interactive user. More precisely, this is true if the innermost enclosing read-eval-print loop is %interactive-loop rather than %batch-loop.
%newfd
Returns a file descriptor that the shell thinks is not currently in use.
%run program argv0 args ...
Runs the named program, which is not searched for in $path, with the argument vector set to the remaining arguments. This builtin can be used to set argv[0] (by convention, the name of the program) to something other than the file name.
%split separator [args ...]
Splits its arguments into separate strings at every occurrence of any of the characters in the string separator. Repeated instances of separator characters are coalesced. Backquote substitution splits with the same rules.
%var var ...
For each named variable, returns a string which, if interpreted by xs, would assign to the variable its current value.
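
The difference between %fsplit and %split is easiest to see side by side. The Python sketch below models the two descriptions above and is not the xs implementation. It assumes that each argument is split independently, and that coalescing in %split also drops empty strings at the start and end of an argument.

```python
def fsplit(separator, *args):
    """Model of %fsplit: split at every separator character.

    Repeated separator characters produce empty strings in the result.
    """
    out = []
    for arg in args:
        field = ""
        for ch in arg:
            if ch in separator:
                out.append(field)
                field = ""
            else:
                field += ch
        out.append(field)
    return out


def split(separator, *args):
    """Model of %split: like fsplit, but with empty strings removed."""
    return [field for field in fsplit(separator, *args) if field != ""]


path_value = "/bin::/usr/bin"
with_nulls = fsplit(":", path_value)
coalesced = split(":", path_value)
```

Keeping the empty string matters for values such as a colon-separated $PATH, which is presumably why the settor functions use %fsplit rather than %split.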

Primitives

Primitives exist in xs so that, in the presence of spoofing and redefinitions, there is a way to refer to built-in behaviors. This ability is necessary for the shell to be able to unambiguously refer to itself, but is also useful for users who have otherwise made their environment unusable but don't want to kill the current shell.

Primitives are referenced with the

$&name

notation. In this section, the “$&” prefixes will be omitted when primitive names are mentioned. Note that, by convention, primitive names follow C identifier names where xs variable and function names often contain “%” and “-” characters.

The following primitives directly implement the builtin functions with the same names:

access	forever	throw
catch	fork	umask
echo	if	wait
exec	newpgrp	
exit	result

In addition, the primitive dot implements the “.” builtin function.

The cd primitive is used in the implementation of the cd builtin, but does not treat a missing argument as meaning $home. The vars and internals primitives are used by the implementation of the vars builtin.

The following primitives implement the hook functions of the same names, with “%” prefixes:

apids	here	read
close	home	run
count	newfd	seq
dup	openfile	split
flatten	parse	var
fsplit	pipe	whatis

The following primitives implement the similarly named hook functions, with “%” prefixes and internal hyphens:

batchloop	exitonfalse	isinteractive

The background primitive is used to implement the %background hook function, but does not print the process ID of the background process or set $apid. The backquote primitive is used to implement the %backquote hook function, but returns the exit status of the child as the first value of its result instead of setting $bqstatus to it.

The following primitives implement the similarly named settor functions:

sethistory	setnoexport	setsignals

Some primitives are included in xs conditionally, based on compile-time configuration options. Those primitives, and the functions to which they are bound, are

execfailure	%exec-failure
limit	limit
readfrom	%readfrom
time	time
writeto	%writeto

The primitive resetterminal is included if xs is compiled with support for the readline or editline libraries. It is used in the implementation of the settor functions of the TERM and TERMCAP variables to notify the line editing packages that the terminal configuration has changed.

Several primitives are not directly associated with other functions. They are:

$&collect
Invokes the garbage collector. The garbage collector in xs runs rather frequently; there should be no reason for a user to issue this command.
$&noreturn lambda args ...
Call the lambda, but in such a way that it does not catch the return exception. This primitive exists in order that some control-flow operations in xs (e.g., while and &&) can be implemented as lambdas rather than primitives.
$&primitives
Returns a list of the names of xs primitives.
$&version
Returns the current version number and release date for xs.

Options

-c
Run the given command, placing the rest of the arguments to xs in $*.
-s
Read commands from standard input; i.e., put the first argument to xs in $* rather than using it as the name of a file to source.
-i
Force xs to be an interactive shell. Normally xs is only interactive if it is run with commands coming from standard input and standard input is connected to a terminal.
-l
Run $home/.xsrc on startup, i.e., be a login shell. -l is implied if the name the shell was run under (that is, argv[0]) starts with a dash (-).
-e
Exit if any command (except those executing as the tests of conditional statements) returns a non-zero status.
-v
Echo all input to standard error.
-x
Print commands to standard error before executing them.
-n
Turn off execution of commands. This can be used for checking the syntax of scripts. When combined with -x, xs prints the entered command based on the internal (parsed) representation.
-p
Don't initialize functions from the environment. This is used to help make scripts that don't break unexpectedly when the environment contains functions that would override commands used in the script.
-o
Don't open /dev/null on file descriptors 0, 1, and 2, if any of those descriptors are inherited closed.
-d
Don't trap SIGQUIT or SIGTERM. This is used for debugging.

Misc Notes

As with any other shell scripting language, process forking takes up the majority of time:

x = 0; while {!~ $x 4000} { x = :(x + 1) }

is several orders of magnitude faster than:

x = 0; while {test $x -ne 4000} { x = :(x + 1) }

Even though xs's arithmetic code is rather slow, in this case the cost of fork+exec far outweighs it.

Elaborate tricks involving stringifying closures and unstringifying them later will probably not work. In general, trying to manipulate the scope of variables through similar techniques will probably not do what one expects.

Files

$home/.xsrc, /dev/null

Bugs

The interpreter should be properly tail recursive; that is, tail calls should not consume stack space.

break and return should have lexical scope.

Woe betide the environment string set by some other program to contain either the character control-a or the sequence control-b followed by control-a or control-b.

-x is not nearly as useful as it should be.

Too many creatures have fept in.

Please send bug reports to fkfire@gmail.com.

See Also

history(1), es(1), rc(1), sh(1), execve(2), getrlimit(2), fopen(3), getpwent(3)

Paul Haahr and Byron Rakitzis, Es — A shell with higher-order functions, Proceedings of the Winter 1993 Usenix Conference, San Diego, CA.

Tom Duff, Rc — A Shell for Plan 9 and UNIX Systems, Unix Research System, 10th Edition, Volume 2. (Saunders College Publishing)

Info

5 March 1992