mount_namespaces man page
mount_namespaces — overview of Linux mount namespaces
For an overview of namespaces, see namespaces(7).
Mount namespaces provide isolation of the list of mount points seen by the processes in each namespace instance. Thus, the processes in each of the mount namespace instances will see distinct single-directory hierarchies.
The views provided by the /proc/[pid]/mounts, /proc/[pid]/mountinfo, and /proc/[pid]/mountstats files (all described in proc(5)) correspond to the mount namespace in which the process with the PID [pid] resides. (All of the processes that reside in the same mount namespace will see the same view in these files.)
When a process creates a new mount namespace using clone(2) or unshare(2) with the CLONE_NEWNS flag, the mount point list for the new namespace is a copy of the caller's mount point list. Subsequent modifications to the mount point list (mount(2) and umount(2)) in either mount namespace will not (by default) affect the mount point list seen in the other namespace (but see the following discussion of shared subtrees).
Restrictions on mount namespaces
Note the following points with respect to mount namespaces:
- A mount namespace has an owner user namespace. A mount namespace whose owner user namespace is different from the owner user namespace of its parent mount namespace is considered a less privileged mount namespace.
- When creating a less privileged mount namespace, shared mounts are reduced to slave mounts. (Shared and slave mounts are discussed below.) This ensures that mappings performed in less privileged mount namespaces will not propagate to more privileged mount namespaces.
- Mounts that come as a single unit from more privileged mount are locked together and may not be separated in a less privileged mount namespace. (The unshare(2) CLONE_NEWNS operation brings across all of the mounts from the original mount namespace as a single unit, and recursive mounts that propagate between mount namespaces propagate as a single unit.)
- The mount(2) flags MS_RDONLY, MS_NOSUID, MS_NOEXEC, and the "atime" flags (MS_NOATIME, MS_NODIRATIME, MS_RELATIME) settings become locked when propagated from a more privileged to a less privileged mount namespace, and may not be changed in the less privileged mount namespace.
A file or directory that is a mount point in one namespace that is not a mount point in another namespace, may be renamed, unlinked, or removed (rmdir(2)) in the mount namespace in which it is not a mount point (subject to the usual permission checks).
Previously, attempting to unlink, rename, or remove a file or directory that was a mount point in another mount namespace would result in the error EBUSY. That behavior had technical problems of enforcement (e.g., for NFS) and permitted denial-of-service attacks against more privileged users. (i.e., preventing individual files from being updated by bind mounting on top of them).
Mount namespaces first appeared in Linux 2.4.19.
Namespaces are a Linux-specific feature.
The propagation type assigned to a new mount point depends on the propagation type of the parent mount. If the mount point has a parent (i.e., it is a non-root mount point) and the propagation type of the parent is MS_SHARED, then the propagation type of the new mount is also MS_SHARED. Otherwise, the propagation type of the new mount is MS_PRIVATE.
Notwithstanding the fact that the default propagation type for new mount points is in many cases MS_PRIVATE, MS_SHARED is typically more useful. For this reason, systemd(1) automatically remounts all mount points as MS_SHARED on system startup. Thus, on most modern systems, the default propagation type is in practice MS_SHARED.
Since, when one uses unshare(1) to create a mount namespace, the goal is commonly to provide full isolation of the mount points in the new namespace, unshare(1) (since util-linux version 2.27) in turn reverses the step performed by systemd(1), by making all mount points private in the new namespace. That is, unshare(1) performs the equivalent of the following in the new mount namespace:
mount --make-rprivate /
To prevent this, one can use the --propagation unchanged option to unshare(1).
For a discussion of propagation types when moving mounts (MS_MOVE) and creating bind mounts (MS_BIND), see Documentation/filesystems/sharedsubtree.txt.
unshare(1), clone(2), mount(2), setns(2), umount(2), unshare(2), proc(5), namespaces(7), user_namespaces(7), findmnt(8)
Documentation/filesystems/sharedsubtree.txt in the kernel source tree.
This page is part of release 5.01 of the Linux man-pages project. A description of the project, information about reporting bugs, and the latest version of this page, can be found at https://www.kernel.org/doc/man-pages/.
clone(2), core(5), mount(2), namespaces(7), nsenter(1), pid_namespaces(7), proc(5), systemd.exec(5), umount(2), unshare(1), unshare(2).