ch-grow - Man Page

Build and manage images; completely unprivileged

Synopsis

$ ch-grow [...] build [-t TAG] [-f DOCKERFILE] [...] CONTEXT
$ ch-grow [...] list
$ ch-grow [...] pull [...] IMAGE_REF [IMAGE_DIR]
$ ch-grow [...] storage-path
$ ch-grow { --help | --version | --dependencies }

Description

ch-grow is a tool for building and manipulating container images, but not running them (for that you want ch-run). It is completely unprivileged, with no setuid/setgid/setcap helpers.

Options that print brief information and then exit:

-h, --help

Print help and exit successfully.

--dependencies

Report dependency problems on standard output, if any, and exit. If all is well, there is no output and the exit is successful; in case of problems, the exit is unsuccessful.

--version

Print version number and exit successfully.

Common options placed before the sub-command:

--no-cache

Download everything needed, ignoring the cache.

-s, --storage DIR

Set the storage directory (see below for important details).

-v, --verbose

Print extra chatter; can be repeated.

Storage Directory

ch-grow maintains state using normal files and directories, including unpacked container images, located in its storage directory. There is no notion of storage drivers, graph drivers, etc., to select and/or configure. In descending order of priority, this directory is located at:

-s, --storage DIR

Command line option.

$CH_GROW_STORAGE

Environment variable.

/var/tmp/$USER/ch-grow

Default.
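
The lookup order above can be sketched in plain shell. This is only an illustration: resolve_storage is a hypothetical helper, not part of ch-grow, and treating an empty $CH_GROW_STORAGE the same as an unset one is an assumption.

```shell
# Hypothetical helper illustrating the documented priority order.
resolve_storage() {
    opt_dir=$1    # value given to -s/--storage, or empty if not given
    if [ -n "$opt_dir" ]; then
        printf '%s\n' "$opt_dir"                 # 1. command line option
    elif [ -n "${CH_GROW_STORAGE:-}" ]; then
        printf '%s\n' "$CH_GROW_STORAGE"         # 2. environment variable
    else
        # 3. default; fall back to id(1) if $USER is not set
        printf '%s\n' "/var/tmp/${USER:-$(id -un)}/ch-grow"
    fi
}

resolve_storage ""
```

To see the directory ch-grow itself would use, run the storage-path subcommand rather than relying on a sketch like this one.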

The storage directory can reside on any filesystem. However, it contains many small files, and metadata traffic can be intense. For example, the Charliecloud test suite uses approximately 400,000 files and directories in the storage directory as of this writing. Place it on a filesystem appropriate for this; a tmpfs such as /var/tmp is a good choice if you have enough RAM (/tmp is not recommended because ch-run bind-mounts it into containers by default).

While you can currently poke around in the storage directory and find unpacked images runnable with ch-run, this is not a supported use case. The supported workflow uses ch-builder2tar or ch-builder2squash to obtain a packed image; see the tutorial for details.

WARNING:

Network filesystems, especially Lustre, are typically bad choices for the storage directory. This is a site-specific question and your local support will likely have strong opinions.

Subcommands

build

Build an image from a Dockerfile and put it in the storage directory. RUN instructions are executed using ch-run(1).

Required argument:

CONTEXT

Path to context directory; this is the root of COPY and ADD instructions in the Dockerfile.

Options:

--build-arg KEY[=VALUE]

Set build-time variable KEY defined by an ARG instruction to VALUE. If VALUE is not specified, use the value of environment variable KEY.
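
As a rough illustration of the KEY[=VALUE] form, the following shell sketch splits an argument the way the description reads. parse_build_arg is a hypothetical helper, not ch-grow code, and yielding an empty value when the environment variable is unset is an assumption; ch-grow may behave differently in that case.

```shell
# Hypothetical helper; not how ch-grow is implemented.
parse_build_arg() {
    arg=$1
    case $arg in
        *=*)
            key=${arg%%=*}      # text before the first "="
            value=${arg#*=}     # text after the first "="
            ;;
        *)
            key=$arg
            # No VALUE given: fall back to the environment variable KEY.
            # Assumption: an unset variable yields an empty value here.
            value=$(printenv "$key" || true)
            ;;
    esac
    printf '%s=%s\n' "$key" "$value"
}

parse_build_arg foo=bar     # prints foo=bar
```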

-f, --file DOCKERFILE

Use DOCKERFILE instead of CONTEXT/Dockerfile. Specify a single hyphen (-) to use standard input; note that in this case, the context directory is still provided, which matches docker build -f - behavior.

-n, --dry-run

Do not actually execute any Dockerfile instructions.

--parse-only

Stop after parsing the Dockerfile.

-t, --tag TAG

Name of image to create. If not specified, use the final component of path CONTEXT. Append :latest if no colon present.
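
The default naming rule can be sketched as follows. infer_tag is a hypothetical helper written for illustration only; it does not normalize paths such as "." the way a real implementation presumably would.

```shell
# Hypothetical helper illustrating the documented default image name.
infer_tag() {
    context=$1
    tag=${2:-}                        # value given to -t/--tag, or empty
    if [ -z "$tag" ]; then
        tag=$(basename "$context")    # final component of path CONTEXT
    fi
    case $tag in
        *:*) ;;                       # a colon is present: leave unchanged
        *)   tag=${tag}:latest ;;     # otherwise append :latest
    esac
    printf '%s\n' "$tag"
}

infer_tag ./foo/bar            # prints bar:latest
infer_tag ./foo/bar baz:1.0    # prints baz:1.0
```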

storage-path

Print the storage directory path and exit.

pull

Pull the image described by the image reference IMAGE_REF from a repository by HTTPS. See the FAQ for the gory details on specifying image references.

This script does a fair amount of validation and fixing of the layer tarballs before flattening in order to support unprivileged use despite image problems we frequently see in the wild. For example, device files are ignored, and directory and file permissions are increased to a minimum of rwx------ and rw-------, respectively. Note, however, that symlinks pointing outside the image are permitted, because they are not resolved until runtime within a container.
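
The permission fix can be pictured with ordinary tools. The sketch below applies a comparable change to an already-unpacked tree, giving the owner at least read, write, and search permission on directories and at least read and write permission on regular files. fix_perms is a hypothetical helper; ch-grow works on the tarball members rather than on files on disk, so this shows the effect, not the implementation.

```shell
# Hypothetical helper showing the effect of the minimum-permission rule.
fix_perms() {
    root=$1
    # Directories: owner gets at least rwx.
    find "$root" -type d -exec chmod u+rwx {} \;
    # Regular files: owner gets at least rw; other bits are left alone.
    find "$root" -type f -exec chmod u+rw {} \;
}
```

Because chmod only adds bits here, permissions that are already more generous are not reduced.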

Destination argument:

IMAGE_DIR

If specified, place the unpacked image at this path; it is then ready for use by ch-run or other tools. The storage directory will not contain a copy of the image, i.e., it is only unpacked once.

Options:

--parse-only

Parse IMAGE_REF, print a parse report, and exit successfully without talking to the internet or touching the storage directory.

Compatibility with Other Dockerfile Interpreters

ch-grow is an independent implementation and shares no code with other Dockerfile interpreters. It uses a formal Dockerfile parsing grammar developed from the Dockerfile reference documentation and miscellaneous other sources, which you can examine in the source code.

We believe this independence is valuable for several reasons. First, it helps the community examine Dockerfile syntax and semantics critically, think rigorously about what is really needed, and build a more robust standard. Second, it yields disjoint sets of bugs (note that Podman, Buildah, and Docker all share the same Dockerfile parser). Third, because it is a much smaller code base, it illustrates how Dockerfiles work more clearly. Finally, it allows straightforward extensions if needed to support scientific computing.

ch-grow tries hard to be compatible with Docker and other interpreters, though as an independent implementation, it is not bug-compatible.

This section describes differences from the Dockerfile reference that we expect to be approximately permanent. For an overview of features we have not yet implemented and our plans, see our road map on GitHub. Plain old bugs are in our GitHub issues.

None of these are set in stone. We are very interested in feedback on our assessments and open questions. This helps us prioritize new features and revise our thinking about what is needed for HPC containers.

Quirks of a fully unprivileged build

ch-grow is fully unprivileged. It runs all instructions as the normal user who invokes it, does not use any setuid or setcap helper programs, and does not use /etc/subuid or /etc/subgid, in contrast to the “rootless” mode of some competing builders.

RUN instructions are executed with ch-run --uid=0 --gid=0, i.e., host EUID and EGID both mapped to zero inside the container, and only one UID (zero) and GID (zero) are available inside the container. Also, /etc/passwd and /etc/group are bind-mounted from temporary files outside the container and can’t be written. (Strictly speaking, the files themselves are read-write, but because they are bind-mounted, the common pattern of writing a new file and moving it on top of the existing one fails.)

This has two consequences: the shell and its children appear to be running as root but only some privileged system calls are available, and manipulating users and groups will fail. This confuses some programs, which fail with “permission denied” and related errors; for example, chgrp(1) often appears in Debian package post-install scripts. We have worked around some of these problems, but many remain. Another manual workaround is to install fakeroot in the Dockerfile and prepend fakeroot to problem commands.
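
A minimal sketch of the fakeroot workaround follows. The base image and the package being installed are arbitrary choices made for illustration, and the Dockerfile has not been verified against any particular ch-grow version; depending on the distribution, the package manager may need further configuration before it works in an unprivileged build.

```shell
# Write an illustrative Dockerfile into a scratch context directory.
demo=$(mktemp -d)
cat > "$demo/Dockerfile" <<'EOF'
FROM debian:buster
RUN apt-get update && apt-get install -y fakeroot
# Prepend fakeroot to a command whose scripts change ownership or groups.
# openssh-client is only an example package.
RUN fakeroot apt-get install -y openssh-client
EOF
echo "context directory: $demo"
# To try it: ch-grow build -t fakeroot-demo "$demo"
```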

NOTE:

Most of these issues affect any fully unprivileged container build, not just ch-grow. We are working to better characterize the problems and add automatic workarounds.

Context directory

The context directory is bind-mounted into the build, rather than copied as Docker does. Thus, the size of the context is immaterial, and the build reads directly from storage like any other local process would. However, you still can’t access anything outside the context directory.

Authentication

ch-grow can authenticate using one-time passwords, e.g. those provided by a security token. Unlike docker login, it does not assume passwords are persistent.

Environment variables

Variable substitution happens for all instructions, not just the ones listed in the Dockerfile reference.

ARG and ENV cause cache misses upon definition, in contrast with Docker where these variables miss upon use, except for certain cache-excluded variables that never cause misses, listed below.

Like Docker, ch-grow pre-defines the following proxy variables, which do not require an ARG instruction. However, unlike in Docker, they are available whenever the same-named environment variable is defined; --build-arg is not required. Changes to these variables do not cause a cache miss.

HTTP_PROXY
http_proxy
HTTPS_PROXY
https_proxy
FTP_PROXY
ftp_proxy
NO_PROXY
no_proxy

The following variables are also pre-defined:

PATH=/ch/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
TAR_OPTIONS=--no-same-owner

Note that ARG and ENV have different syntax despite very similar semantics.

COPY

Especially for people used to UNIX cp(1), the semantics of the Dockerfile COPY instruction can be confusing.

Most notably, when a source of the copy is a directory, the contents of that directory, not the directory itself, are copied. This is documented, but it’s a real gotcha because that’s not what cp(1) does, and it means that many things you can do in one cp(1) command require multiple COPY instructions.
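
The difference can be reproduced with cp(1) alone. In this sketch, which uses made-up file names in a scratch directory, the first command shows what cp(1) users expect and the second shows the cp(1) spelling that matches what COPY does with a directory source.

```shell
work=$(mktemp -d)
mkdir -p "$work/ctx/src" "$work/cp-style" "$work/copy-style"
: > "$work/ctx/src/a.txt"

# cp(1): the directory itself lands inside the existing destination.
cp -R "$work/ctx/src" "$work/cp-style/"
ls "$work/cp-style"        # prints src

# Like "COPY src /dst": only the contents of src are copied.
# "src/." is how cp(1) spells "the contents of src".
cp -R "$work/ctx/src/." "$work/copy-style/"
ls "$work/copy-style"      # prints a.txt
```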

Also, the reference documentation is incomplete. In our experience, Docker also behaves as follows; ch-grow does the same in an attempt to be bug-compatible for the COPY instruction.

  1. You can use absolute paths in the source; the root is the context directory.
  2. Destination directories are created if they don’t exist in the following situations:

    1. If the destination path ends in slash. (Documented.)
    2. If the number of sources is greater than 1, either by wildcard or explicitly, regardless of whether the destination ends in slash. (Not documented.)
    3. If there is a single source and it is a directory. (Not documented.)
  3. Symbolic links are particularly messy (this is not documented):

    1. If named in sources either explicitly or by wildcard, symlinks are dereferenced, i.e., the result is a copy of the symlink target, not the symlink itself. Keep in mind that directory contents are copied, not directories.
    2. If within a directory named in sources, symlinks are copied as symlinks.
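
The symlink rules above resemble cp(1) with -R and -H, which follows symlinks named on the command line but copies symlinks found during traversal as symlinks. The sketch below uses that as an analogy only; the file names are made up, and it is not a claim about how ch-grow or Docker implement COPY.

```shell
work=$(mktemp -d)
mkdir -p "$work/ctx/dir" "$work/dst1" "$work/dst2"
echo hello > "$work/ctx/target.txt"
ln -s target.txt "$work/ctx/link"            # symlink named directly as a source
ln -s ../target.txt "$work/ctx/dir/inner"    # symlink inside a source directory

# Like "COPY link /dst1/": the link is dereferenced, so a regular file results.
cp -RH "$work/ctx/link" "$work/dst1/"

# Like "COPY dir /dst2/": the symlink inside dir stays a symlink.
cp -RH "$work/ctx/dir/." "$work/dst2/"

ls -l "$work/dst1" "$work/dst2"
```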

We expect the following differences to be permanent:

  • Wildcards use Python glob semantics, not the Go semantics.
  • COPY --chown is ignored, because it doesn’t make sense in an unprivileged build.

Features we do not plan to support

  • Parser directives are not supported. We have not identified a need for any of them.
  • EXPOSE: Charliecloud does not use the network namespace, so containerized processes can simply listen on a host port like other unprivileged processes.
  • HEALTHCHECK: This instruction’s main use case is monitoring server processes rather than applications. Also, implementing it requires a container supervisor daemon, which we have no plans to add.
  • MAINTAINER is deprecated.
  • STOPSIGNAL requires a container supervisor daemon process, which we have no plans to add.
  • USER does not make sense for unprivileged builds.
  • VOLUME: This instruction is not currently supported. Charliecloud has good support for bind mounts; we anticipate that it will continue to focus on that and will not introduce the volume management features that Docker has.

Environment Variables

CH_LOG_FILE

If set, append log chatter to this file, rather than standard error. This is useful for debugging situations where standard error is consumed or lost.

Also sets verbose mode if not already set (equivalent to --verbose).

CH_LOG_FESTOON

If set, prepend PID and timestamp to logged chatter.

Examples

build

Build image bar using ./foo/bar/Dockerfile and context directory ./foo/bar:

$ ch-grow build -t bar -f ./foo/bar/Dockerfile ./foo/bar
[...]
grown in 4 instructions: bar

Same, but infer the image name and Dockerfile from the context directory path:

$ ch-grow build ./foo/bar
[...]
grown in 4 instructions: bar

pull

Download the Debian Buster image and place it in the storage directory:

$ ch-grow pull debian:buster
pulling image:   debian:buster

manifest: downloading
layer 1/1: d6ff36c: downloading
layer 1/1: d6ff36c: listing
validating tarball members
resolving whiteouts
flattening image
layer 1/1: d6ff36c: extracting
done

Same, except place the image in /tmp/buster:

$ ch-grow pull debian:buster /tmp/buster
[...]
$ ls /tmp/buster
bin   dev  home  lib64  mnt  proc  run   srv  tmp  var
boot  etc  lib   media  opt  root  sbin  sys  usr

Reporting Bugs

If Charliecloud was obtained from your Linux distribution, use your distribution’s bug reporting procedures.

Otherwise, report bugs to: <https://github.com/hpc/charliecloud/issues>

See Also

charliecloud(1)

Full documentation at: <https://hpc.github.io/charliecloud>

2020-09-22 00:00 Coordinated Universal Time 0.19 Charliecloud