
mlr - Man Page

like awk, sed, cut, join, and sort for name-indexed data such as CSV and tabular JSON.

Examples (TL;DR)

Pretty-print a CSV file in a tabular format: mlr --icsv --opprint cat example.csv
Receive JSON data and pretty-print the output: echo '{"hello":"world"}' | mlr --ijson --opprint cat
Sort alphabetically on a field: mlr --icsv --opprint sort -f field example.csv
Sort in descending numerical order on a field: mlr --icsv --opprint sort -nr field example.csv
Convert CSV to JSON, performing a calculation and displaying the result: mlr --icsv --ojson put '$newField1 = $oldFieldA/$oldFieldB' example.csv
Receive JSON and format the output as vertical JSON: echo '{"hello":"world", "foo":"bar"}' | mlr --ijson --ojson --jvstack cat
Filter lines of a compressed CSV file, treating numbers as strings: mlr --prepipe 'gunzip' --csv filter -S '$fieldName =~ "regular_expression"' example.csv.gz

Synopsis

Usage: mlr [I/O options] {verb} [verb-dependent options ...] {zero or more file names}

Miller operates on key-value-pair data while the familiar Unix tools operate on integer-indexed fields: if the natural data structure for the latter is the array, then Miller's natural data structure is the insertion-ordered hash map. This encompasses a variety of data formats, including but not limited to the familiar CSV, TSV, and JSON. (Miller can handle positionally-indexed data as a special case.) This manpage documents Miller v5.10.1.
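The contrast between integer-indexed fields and an insertion-ordered hash map can be modeled in a few lines of Python. This is only an illustrative sketch of the two data structures, not Miller's implementation; the function names and the sample line contents are made up for the example.

```python
# Illustrative model only: function names and sample data are hypothetical.

def split_positional(line, sep=" "):
    """Unix-toolkit view: a line becomes an array indexed by position."""
    return line.split(sep)

def split_key_value(line, pair_sep=",", kv_sep="="):
    """Miller-style view: a line becomes an insertion-ordered map of names to values."""
    record = {}  # Python dicts preserve insertion order (3.7+)
    for pair in line.split(pair_sep):
        key, value = pair.split(kv_sep, 1)
        record[key] = value
    return record

fields = split_positional("web01 12345 up")
record = split_key_value("hostname=web01,upsec=12345,status=up")

print(fields[1])        # positional access: the reader must know which column holds what
print(record["upsec"])  # named access: independent of column position
print(list(record))     # keys come back in the order they were inserted
```

With the array, reordering the columns silently changes what `fields[1]` means; with the map, `record["upsec"]` keeps working.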

Examples

Command-Line Syntax

mlr --csv cut -f hostname,uptime mydata.csv
mlr --tsv --rs lf filter '$status != "down" && $upsec >= 10000' *.tsv
mlr --nidx put '$sum = $7 < 0.0 ? 3.5 : $7 + 2.1*$8' *.dat
grep -v '^#' /etc/group | mlr --ifs : --nidx --opprint label group,pass,gid,member then sort -f group
mlr join -j account_id -f accounts.dat then group-by account_name balances.dat
mlr --json put '$attr = sub($attr, "([0-9]+)_([0-9]+)_.*", "\1:\2")' data/*.json
mlr stats1 -a min,mean,max,p10,p50,p90 -f flag,u,v data/*
mlr stats2 -a linreg-pca -f u,v -g shape data/*
mlr put -q '@sum[$a][$b] += $x; end {emit @sum, "a", "b"}' data/*
mlr --from estimates.tbl put '
  for (k,v in $*) {
    if (is_numeric(v) && k =~ "^[t-z].*$") {
      $sum += v; $count += 1
    }
  }
  $mean = $sum / $count # no assignment if count unset'
mlr --from infile.dat put -f analyze.mlr
mlr --from infile.dat put 'tee > "./taps/data-".$a."-".$b, $*'
mlr --from infile.dat put 'tee | "gzip > ./taps/data-".$a."-".$b.".gz", $*'
mlr --from infile.dat put -q '@v=$*; dump | "jq .[]"'
mlr --from infile.dat put  '(NR % 1000 == 0) { print > stderr, "Checkpoint ".NR}'
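For readers new to the put DSL, the multi-line estimates.tbl example above can be read as the following Python sketch: it sums the numeric fields whose names begin with t through z, counts them, and assigns a mean only when at least one field matched. This is an approximation under stated assumptions: `is_numeric` here uses Python's `float()` rather than Miller's own type inference, the record is assumed to have no pre-existing sum or count fields, and the function names are hypothetical.

```python
import re

def is_numeric(text):
    """Approximation of the DSL's is_numeric using Python's float parser."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def add_mean(record):
    """Add sum, count and mean fields for numeric fields named t* through z*."""
    total, count = 0.0, 0
    # Iterate over a snapshot so that adding fields does not disturb the loop.
    for key, value in list(record.items()):
        if is_numeric(value) and re.search(r"^[t-z].*$", key):
            total += float(value)
            count += 1
    # Mirrors the comment in the example: nothing is assigned if no field matched.
    if count > 0:
        record["sum"] = total
        record["count"] = count
        record["mean"] = total / count
    return record

result = add_mean({"a": "10", "t": "1", "u": "3", "x": "abc"})
print(result)
```

Here only `t` and `u` qualify: `a` is numeric but falls outside the t-z range, and `x` is in range but not numeric.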

Data Formats

  DKVP: delimited key-value pairs (Miller default format)
  +---------------------+
  | apple=1,bat=2,cog=3 | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
  | dish=7,egg=8,flint  | Record 2: "dish" => "7", "egg" => "8", "3" => "flint"
  +---------------------+
  NIDX: implicitly numerically indexed (Unix-toolkit style)
  +---------------------+
  | the quick brown     | Record 1: "1" => "the", "2" => "quick", "3" => "brown"
  | fox jumped          | Record 2: "1" => "fox", "2" => "jumped"
  +---------------------+
  CSV/CSV-lite: comma-separated values with separate header line
  +---------------------+
  | apple,bat,cog       |
  | 1,2,3               | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
  | 4,5,6               | Record 2: "apple" => "4", "bat" => "5", "cog" => "6"
  +---------------------+
  Tabular JSON: nested objects are supported, although arrays within them are not:
  +---------------------+
  | {                   |
  |  "apple": 1,        | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
  |  "bat": 2,          |
  |  "cog": 3           |
  | }                   |
  | {                   |
  |   "dish": {         | Record 2: "dish:egg" => "7", "dish:flint" => "8", "garlic" => ""
  |     "egg": 7,       |
  |     "flint": 8      |
  |   },                |
  |   "garlic": ""      |
  | }                   |
  +---------------------+
  PPRINT: pretty-printed tabular
  +---------------------+
  | apple bat cog       |
  | 1     2   3         | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
  | 4     5   6         | Record 2: "apple" => "4", "bat" => "5", "cog" => "6"
  +---------------------+
  XTAB: pretty-printed transposed tabular
  +---------------------+
  | apple 1             | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
  | bat   2             |
  | cog   3             |
  |                     |
  | dish 7              | Record 2: "dish" => "7", "egg" => "8"
  | egg  8              |
  +---------------------+
  Markdown tabular (supported for output only):
  +-----------------------+
  | | apple | bat | cog | |
  | | ---   | --- | --- | |
  | | 1     | 2   | 3   | | Record 1: "apple" => "1", "bat" => "2", "cog" => "3"
  | | 4     | 5   | 6   | | Record 2: "apple" => "4", "bat" => "5", "cog" => "6"
  +-----------------------+
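Three of the mappings in the diagrams above are easy to miss: a DKVP field without a key is filed under its position ("3" => "flint"), NIDX keys are simply the positions, and nested JSON objects are flattened into colon-joined keys ("dish:egg"). The Python sketch below reproduces those three record layouts. It is a model of the diagrams, not Miller's parser; the function names are hypothetical, and values are converted to strings only to match how the diagrams display them.

```python
import json

def parse_dkvp(line, ifs=",", ips="="):
    """DKVP: key=value pairs; a field lacking a key is keyed by its 1-based position."""
    record = {}
    for position, field in enumerate(line.split(ifs), start=1):
        if ips in field:
            key, value = field.split(ips, 1)
        else:
            key, value = str(position), field
        record[key] = value
    return record

def parse_nidx(line):
    """NIDX: whitespace-separated values keyed by 1-based position."""
    return {str(i): word for i, word in enumerate(line.split(), start=1)}

def flatten(obj, prefix="", sep=":"):
    """Tabular JSON: nested objects become keys joined with a separator."""
    flat = {}
    for key, value in obj.items():
        name = prefix + sep + key if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = str(value)  # stringified to mirror the diagram
    return flat

dkvp = parse_dkvp("dish=7,egg=8,flint")
nidx = parse_nidx("fox jumped")
flat = flatten(json.loads('{"dish": {"egg": 7, "flint": 8}, "garlic": ""}'))

print(dkvp)
print(nidx)
print(flat)
```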

Options

In the following option flags, the version prefixed with "i" applies to the input stream, the version prefixed with "o" applies to the output stream, and the version without a prefix sets the option for both streams. For example, --irs sets the input record separator, --ors sets the output record separator, and --rs sets both to the given value.
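The prefix convention can be summarized as a small resolution routine. The Python sketch below is hypothetical and covers only the record-separator (rs) and field-separator (fs) families; it also assumes that a later flag overrides an earlier one, which this page does not state.

```python
# Hypothetical model of the i/o prefix rule; not Miller's option parser.
BASE_OPTIONS = {"rs", "fs"}

def resolve(flags):
    """flags: (flag, value) pairs in command-line order, e.g. ("--irs", "lf")."""
    settings = {"input": {}, "output": {}}
    for flag, value in flags:
        name = flag[2:] if flag.startswith("--") else flag
        if name in BASE_OPTIONS:
            # No prefix: applies to both streams.
            settings["input"][name] = value
            settings["output"][name] = value
        elif name[:1] == "i" and name[1:] in BASE_OPTIONS:
            settings["input"][name[1:]] = value
        elif name[:1] == "o" and name[1:] in BASE_OPTIONS:
            settings["output"][name[1:]] = value
        else:
            raise ValueError("unrecognized flag in this sketch: " + flag)
    return settings

both = resolve([("--rs", "lf")])
split = resolve([("--ifs", ":"), ("--ofs", ",")])

print(both)
print(split)
```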
