- Query .csv file by specifying the delimiter as ',':
q -d',' "SELECT * from path/to/file"
- Query .tsv file:
q -t "SELECT * from path/to/file"
- Query file with header row:
q -ddelimiter -H "SELECT * from path/to/file"
- Read data from stdin; '-' in the query represents the data from stdin:
output | q "select * from -"
- Join two files (aliased as f1 and f2 in the example) on column c1, a common column:
q "SELECT * FROM path/to/file f1 JOIN path/to/other_file f2 ON (f1.c1 = f2.c1)"
- Format output using an output delimiter with an output header line (note: command will output column names based on the input file header or the column aliases overridden in the query):
q -Ddelimiter -O "SELECT column as alias from path/to/file"
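To make the join example above concrete, here is a minimal sketch of the same query run from Python with the standard sqlite3 and csv modules. It is not how q is invoked. The helper load_table and the sample rows are made up for illustration; only the aliases f1 and f2 and the column name c1 are taken from the example.

```python
import csv
import io
import sqlite3

def load_table(conn, name, text, delimiter=","):
    """Load delimited text into a table whose columns are named c1, c2, ..."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    width = len(rows[0])
    cols = ", ".join(f"c{i + 1}" for i in range(width))
    marks = ", ".join("?" for _ in range(width))
    conn.execute(f"CREATE TABLE {name} ({cols})")
    conn.executemany(f"INSERT INTO {name} VALUES ({marks})", rows)

conn = sqlite3.connect(":memory:")
# Hypothetical sample data standing in for the two input files.
load_table(conn, "f1", "1,apple\n2,pear\n")
load_table(conn, "f2", "1,red\n3,blue\n")

# Same join condition as in the example: rows match on the common column c1.
joined = conn.execute(
    "SELECT f1.c1, f1.c2, f2.c2 FROM f1 JOIN f2 ON (f1.c1 = f2.c1)"
).fetchall()
print(joined)
```

Only the row whose c1 value occurs in both inputs survives the join.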
q [ options ] [ file | - ] [ argument ... ]
qc [ options ] [ file | - ] ...
These programs are used to compile and execute scripts written in the Q programming language. Q is an interpreted, dynamically typed functional programming language based on term rewriting which allows you to define functions using symbolic equations.
For instance, here is a little Q script featuring a recursive definition of the well-known factorial function:
fact 0 = 1;
fact N = N*fact(N-1) if N>0;
This definition tells the interpreter that the term (function application) `fact 0' should evaluate to the integer constant 1, while any other term `fact N' with N>0 evaluates to the value of the expression N*fact(N-1).
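For readers more used to conventional languages, the two equations can be read as guarded cases. The following Python sketch mirrors them one to one; it only illustrates the reading given above and is not Q code. Raising an error for negative arguments is a choice made for this sketch, not a description of what the Q interpreter does when no equation applies.

```python
def fact(n: int) -> int:
    """Mirror the two equations of the Q script as guarded cases."""
    if n == 0:                      # fact 0 = 1;
        return 1
    if n > 0:                       # fact N = N*fact(N-1) if N>0;
        return n * fact(n - 1)
    # Neither equation covers negative arguments; raising is specific to this sketch.
    raise ValueError("no equation applies to fact(%d)" % n)

print(fact(5))  # 120
```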
A closer description of the language is well outside the scope of this manual page, but you can find some further notes about Q below, and you should also take a look at the Q info file (available online using info qdoc or with the help command of the interpreter) for details and many more examples.
The primary interface to the Q language is the interpreter program q. The qc program is a compiler for Q scripts which is usually invoked automatically by the interpreter to translate the source script to a bytecode format which is suitable for efficient execution. To run a script stored in a file foo.q you usually invoke the interpreter just as:
(The script name can also be followed by other parameters which are passed to the script and can be accessed through the built-in ARGS variable of the interpreter.)
You can also execute compiler and interpreter separately, like this:
qc foo.q
q q.out
The compiler will then compile the source script foo.q to the bytecode file q.out which can be loaded by the interpreter. Note that if you run a source script through the interpreter, then the compilation step is handled automatically and the bytecode file is removed automatically as soon as it has been loaded by the interpreter.
Both compiler and interpreter can also be invoked without arguments, or with an empty script name, in which case only the built-in functions and definitions in the script prelude.q (which by default includes the standard Q library) are loaded. The automatic inclusion of the prelude script can also be suppressed with the --no-prelude compiler option.
The script name can also be a single hyphen `-' to indicate that the script should be read from standard input.
Script and bytecode files are searched for on the “Q library path” which usually defaults to something like
You can override this default by setting the QPATH environment variable, by using the -p command line option, and with the path command of the interpreter.
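As an illustration of how a colon-separated search path such as QPATH is commonly resolved, here is a small Python sketch. It assumes that directories are tried from left to right and that the first match wins, which is the usual convention but is not confirmed here for the Q interpreter. The function find_on_path and the temporary directories are hypothetical.

```python
import os
import tempfile

def find_on_path(name, search_path):
    """Return the first existing file called `name` in a colon-separated path."""
    for directory in search_path.split(":"):
        # An empty entry is treated as the current directory (an assumption).
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate):
            return candidate
    return None

with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    # Put a script into the second directory only.
    script = os.path.join(second, "foo.q")
    with open(script, "w") as handle:
        handle.write("fact 0 = 1;\n")
    search_path = first + ":" + second
    found = find_on_path("foo.q", search_path)
    missing = find_on_path("bar.q", search_path)
    print(found == script, missing)
```

The sketch assumes UNIX-style paths, where directory names do not themselves contain colons.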
Compiler and interpreter support both short and long (GNU style) options. A brief descriptive message showing the version number can be obtained with the --version or -V option. You can also invoke compiler and interpreter with the --help or -h option to print a summary of the command line syntax and the available options. Other important options are listed below (see the Q info file for more).
Stops option processing (remaining parameters will be treated as ordinary command arguments even if they start with `-').
- -c command
Execute the given interpreter command (batch mode).
- -d, --debug
Invoke the symbolic debugger built into the interpreter.
- -i
Run interactively (print sign-on and prompt) even when input or output is redirected. Also causes any -c and -s options to be ignored.
- -o output-file
Specify the name of the bytecode file created by the compiler (default is q.out).
Quiet startup (suppress the sign-on message).
- -s command-file
Source file with interpreter commands (batch mode).
- -w
Prints warnings about possibly undefined function symbols. This gives you a moderate level of confidence for small or medium-sized programs. -w2 or --pedantic prints even more diagnostics, and might be useful for larger projects. -w3 or --paranoid prints an excessive amount of diagnostics even for perfectly legal scripts. This is not intended to be used regularly, but may occasionally be useful to check your script for missing declarations or mistyped identifiers.
The interpreter starts up in interactive mode when it is invoked with the -i option or when neither -c nor -s is specified. In this mode the user is repeatedly prompted to enter an expression to be evaluated, and the interpreter answers with the corresponding “normal form.” If the interpreter runs in interactive mode and is connected to a tty, it supports command line editing and history using the GNU readline library. The quit function exits the interpreter. End-of-file and Ctrl-C are also handled (more or less) gracefully.
On the interactive command line, the value of the last expression can be referred to using the “anonymous” variable, denoted by an underscore (`_'). Moreover, the interpreter understands a number of special commands which allow you to define variables, inspect and adjust various system parameters, edit and run scripts and command source files, read online info, load and save variable values, etc. Please refer to the Q info page for a description of these. You can also put such commands into the .qinitrc and .qexitrc files in your home directory, which are sourced when the interpreter starts up and exits in interactive mode, respectively. This provides a convenient means, e.g., to customize parameters of the interpreter according to your taste, and to automatically reload and save variable values.
On UNIX systems, you can also run Q scripts directly from the shell using the “shebang” #! to specify the q program as a command language processor. For instance, use the following as the first line of your script to invoke q with the option -cfoo which causes the function foo to be evaluated at startup:
Such lines will be treated as comments by the compiler and interpreter. It is also possible to specify compiler and interpreter options at the beginning of the main script using the notation `#! option'. For instance:
#!/usr/local/bin/q
#! -w
#! -cfoo
Instead of directly running the script file, you can also use the qcwrap program to translate the script to a C file. This is useful if your shell does not support the #! notation, or if the script is to be distributed in a self-contained, binary form. The qcwrap program is available as an optional addon, see the Q info file for details.
The --debug or -d option causes activation of a symbolic debugger built into the interpreter. The debugger can also be invoked interactively and you can set breakpoints using the debug and break commands on the command line. The debugger allows you to trace the reductions performed by the Q interpreter in the course of an expression evaluation. You can also step over reductions, abort the evaluation, and print a list of activated rules. Use the command ? or help in the debugger to print a list of debugger commands.
A colon-separated list of directories to be searched for source and code files.
The default warning level (overridden with the -wn option; zero if not set).
Editor used by the built-in edit command of the interpreter (default: vi).
Program used to read online documentation with the built-in help command of the interpreter (default: info).
Program used to communicate with emacs(1) when running as a client of gnuserv(1), which is triggered with the interpreter's --gnuclient option.
Default code file name.
Default file name for loading and saving variable definitions (load and save commands).
File used to store the command history when the interpreter is run interactively.
Initialization file containing interpreter commands to be executed at startup when running interactively.
Termination file containing interpreter commands to be executed when the interpreter exits.
Q may have started out as an academic research project, but it should not be mistaken for a toy language. Q has a modern syntax featuring both user-definable infix operators and curried function applications, and provides many goodies of modern-style functional languages, such as higher-order functions (including lambdas), support for both eager and lazy evaluation, and OOP-style polymorphic algebraic types. The Q interpreter goes to great lengths to implement term rewriting in an efficient manner, so that Q programs are executed reasonably fast, more or less comparable to other interpreted languages. Moreover, Q comes with an extensive software library, which makes it a practical programming tool and in many areas surpasses what is available for its bigger cousins like ML and Haskell.
The Q interpreter is extensible using “external” modules written in C or C++ (which are loaded at runtime, if possible), and can itself be used as an extension language for other C/C++ applications. Q has a fairly complete POSIX system interface and a comprehensive collection of addon modules which interface to various popular open source software packages including, e.g., GNU Octave, various GUI, graphics, multimedia, database and web-related libraries, and a module for the Apache web server. There is a language mode for emacs, which provides a convenient environment for editing and running Q scripts, and syntax files for the vim and kate text editors are also available. All this is described in much more detail in the Q info file and in the other documentation available on the Q website.
Caveats and Bugs
The only major issue I am aware of is memory requirements. The actual data of an expression node is only 12 bytes, but memory management, type tags and other book-keeping information sum up to another 12 bytes. There is no easy way around this in the current implementation, so don't expect this to change anytime soon. Fortunately, main memory gets cheaper and bigger all the time, so this should rarely be a problem in practice.
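To put these figures in perspective, the following Python sketch adds up the two 12-byte parts and scales them to a node count. The constants come from the paragraph above; the function estimated_bytes and the node count of one million are chosen purely for illustration, and the estimate ignores any other memory the interpreter uses.

```python
DATA_BYTES = 12       # payload of one expression node, per the text
OVERHEAD_BYTES = 12   # memory management, type tags, book-keeping, per the text
NODE_BYTES = DATA_BYTES + OVERHEAD_BYTES

def estimated_bytes(node_count: int) -> int:
    """Rough lower bound on the memory taken by `node_count` expression nodes."""
    return node_count * NODE_BYTES

print(estimated_bytes(1_000_000))  # 24000000
```

An expression of one million nodes would thus need about 24 MB by this estimate.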
The Q interpreter uses a special pattern-matching technique to determine matching equations quickly and during a single left-to-right scan of each potential redex. This usually works very well, but there are some pathological configurations of left-hand sides of equations which cause an exponential blow-up of the tables of the pattern-matching automaton; fortunately, they are rare. You can tell that you have run into such a situation when the interpreter needs a long time to start up or appears to hang during bytecode compilation. The only way around this currently is to rewrite your script so that the amount of overlap between equations is reduced.
Another limitation of the current implementation is that special argument patterns and paths to left-hand side variables are currently encoded as bit vectors to save memory space. Thus functions cannot be declared with more than 32 special parameters, and the left-hand side of a rule or local variable definition may not be more than 32 levels deep. There are also some hardcoded limits in the compiler for the sizes of the expression and code table for a single rule. The default table sizes are fairly large and, so far, this has never been a problem in practice. If you do run into an “expression too complex” or “code table overflow” error then it is probably time to restructure your program anyway. ;-)
The Q Programming Language, by Albert Graef, Johannes Gutenberg-University Mainz, Germany. (Also available online, see info qdoc.)