See https://www.redhat.com/sysadmin/fedora-31-control-group-v2 for
details on why this is desirable, and how it impacts containers.
Users that need to keep using the old cgroup hierarchy can re-enable it
by setting `systemd.unifiedCgroupHierarchy` to `false`.
Well-known candidates not supporting that hierarchy, like docker and
hidepid=… will disable it automatically.
Fixes #73800
This is analogous to #70447.
With security.lockKernelModules=true, docker commands result in the following
error without at least loading veth:
$ docker run hello-world
/nix/store/mr50kaan2vs4gc40ymwncb2vci25aq7z-docker-19.03.2/libexec/docker/docker: Error response from daemon: failed to create endpoint epic_kare on network bridge: failed to add the host (veth8b381f3) <=> sandbox (veth348e197) pair interfaces: operation not supported.
ERRO[0003] error waiting for container: context canceled
This allows to run the prune job periodically on a machine.
By default the if enabled the job is run once a week.
The structure is similar to how system.autoUpgrade works.
Docker socket is world writable. This means any user on the system is
able to invoke docker command. (Which is equal to having a root access
to the machine.)
This commit makes socket group-writable and owned by docker group.
Inspired by
https://github.com/docker/docker/blob/master/contrib/init/systemd/docker.socket
All the new options in detail:
Enable docker in multi-user.target make container created with restart=always
to start. We still want socket activation as it decouples dependencies between
the existing of /var/run/docker.sock and the docker daemon. This means that
services can rely on the availability of this socket. Fixes #11478#21303
wantedBy = ["multi-user.target"];
This allows us to remove the postStart hack, as docker reports on its own when
it is ready.
Type=notify
The following will set unset some limits because overhead in kernel's ressource
accounting was observed. Note that these limit only apply to containerd.
Containers will have their own limit set.
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Upgrades may require schema migrations. This can delay the startup of dockerd.
TimeoutStartSec=0
Allows docker to create its own cgroup subhierarchy to apply ressource limits on
containers.
Delegate=true
When dockerd is killed, container should be not affected to allow
`live restore` to work.
KillMode=process
- logDriver option, use journald for logging by default
- keep storage driver intact by default, as docker has sane defaults
- do not choose storage driver in tests, docker will choose by itself
- use dockerd binary as "docker daemon" command is deprecated and will be
removed
- add overlay2 to list of storage drivers
We need to use wrapped modprobe, so that it finds the right
modules. Docker needs modprobe to load overlay kernel module
for example.
This fixes an an error starting docker if the booted system's kernel
version is different from the /run/current-system profile's one.
The docker module used different code for socket-activated docker daemon than for the non-socket activated daemon.
In particular, if the socket-activated daemon is used, then modprobe wasn't set up to be usable and in PATH for
the docker daemon, which resulted in a failure to start the daemon with overlayfs as storageDriver if the
`overlay` kernel module wasn't already loaded. This commit fixes that bug (which only appears if socket
activation is used), and also reduces the duplication between code paths so that it's easier to keep
both in sync in future.
Commit 9bfe92ecee ("docker: Minor improvements, fix failing test") added
the services.docker.storageDriver option, made it mandatory but didn't
give it a default value. This results in an ugly traceback when users
enable docker, if they don't pay enough attention to also set the
storageDriver option. (An attempt was made to add an assertion, but it
didn't work, possibly because of how "mkMerge" works.)
The arguments against a default value were that the optimal value
depends on the filesystem on the host. This is, AFAICT, only in part
true. (It seems some backends are filesystem agnostic.) Also, docker
itself uses a default storage driver, "devicemapper", when no
--storage-driver=x options are given. Hence, we use the same value as
default.
Add a FIXME comment that 'devicemapper' breaks NixOS VM tests (for yet
unknown reasons), so we still run those with the 'overlay' driver.
Closes #10100 and #10217.
When using the ZFS storagedriver in docker, it shells out for the ZFS
commands. The path configuration for the systemd task does not include
ZFS, so if the driver is set to ZFS, add ZFS utilities to the PATH.
This will resolve https://github.com/NixOS/nixpkgs/issues/10127
[Bjørn: prefix commit message with "nixos/docker:", remove extra space
before ';']
This version of module has disabled socketActivation, because until
nixos upgrade systemd to at least 214, systemd does not support
SocketGroup. So socket is created with "root" group when
socketActivation enabled. Should be fixed as soon as systemd upgraded.
Includes changes from #3015 and supersedes #3028