Related to #85746 which addresses documentation issue,
digging deeper for a reason why this was disabled
was simply because it wasn't working which is not the case anymore.
- Actually use the zfsSupport option
- Add documentation URI to lxd.service
- Add lxd.socket to enable socket activatation
- Add proper dependencies and remove systemd-udev-settle from lxd.service
- Set up /var/lib/lxc/rootfs using systemd.tmpfiles
- Configure safe start and shutdown of lxd.service
- Configure restart on failures of lxd.service
Launching a container with a private network requires creating a
dedicated networking interface for it; name of that interface is derived
from the container name itself - e.g. a container named `foo` gets
attached to an interface named `ve-foo`.
An interface name can span up to IFNAMSIZ characters, which means that a
container name must contain at most IFNAMSIZ - 3 - 1 = 11 characters;
it's a limit that we validate using a build-time assertion.
This limit has been upgraded with Linux 5.8, as it allows for an
interface to contain a so-called altname, which can be much longer,
while remaining treated as a first-class citizen.
Since altnames have been supported natively by systemd for a while now,
due diligence on our side ends with dropping the name-assertion on newer
kernels.
This commit closes#38509.
systemd/systemd#14467systemd/systemd#17220https://lwn.net/Articles/794289/
Ensure that the HyperV keyboard driver is available in the early
stages of the boot process. This allows the user to enter a disk
encryption passphrase or repair a boot problem in an interactive
shell.
We now set the hooks dir correctly if the OCI hook is enabled. CRI-O
supports this specific hook from v1.20.0.
Signed-off-by: Sascha Grunert <mail@saschagrunert.de>
Reintroduce the `fetch-ssh-keys` service so that GCE images that work
with NixOps can once again be built. Also, reformat the code a bit.
The service was removed in 88570538b3,
likely due to a comment saying it should be removed. It was still
needed for images to work with NixOps, however, and probably needed to
be replaced or rewritten rather than removed.
The socketActivation option was removed, but later on socket activation
was added back without the option to disable it. The description now reflects
that socket activation is used unconditionally in the current setup.
Since the introduction of option `containers.<name>.pkgs`, the
`nixpkgs.*` options (including `nixpkgs.pkgs`, `nixpkgs.config`, ...) were always
ignored in container configs, which broke existing containers.
This was due to `containers.<name>.pkgs` having two separate effects:
(1) It sets the source for the modules that are used to evaluate the container.
(2) It sets the `pkgs` arg (`_module.args.pkgs`) that is used inside the container
modules.
This happens even when the default value of `containers.<name>.pkgs` is unchanged, in which
case the container `pkgs` arg is set to the pkgs of the host system.
Previously, the `pkgs` arg was determined by the `containers.<name>.config.nixpkgs.*` options.
This commit reverts the breaking change (2) while adding a backwards-compatible way to achieve (1).
It removes option `pkgs` and adds option `nixpkgs` which implements (1).
Existing users of `pkgs` are informed by an error message to use option
`nixpkgs` or to achieve only (2) by setting option `containers.<name>.config.nixpkgs.pkgs`.
It's been 8.5 years since NixOS used mingetty, but the option was
never renamed (despite the file definining the module being renamed in
9f5051b76c ("Rename mingetty module to agetty")).
I've chosen to rename it to services.getty here, rather than
services.agetty, because getty is implemantation-neutral and also the
name of the unit that is generated.
... build-vm-with-bootloader" for EFI systems
This reverts commit 20257280d9, reversing
changes made to 926a1b2094.
It broke nixosTests.installer.simpleUefiSystemdBoot
and right now channel is lagging behing for two weeks.
`nixos-rebuild build-vm-with-bootloader` currently fails with the
default NixOS EFI configuration:
$ cat >configuration.nix <<EOF
{
fileSystems."/".device = "/dev/sda1";
boot.loader.systemd-boot.enable = true;
boot.loader.efi.canTouchEfiVariables = true;
}
EOF
$ nixos-rebuild build-vm-with-bootloader -I nixos-config=$PWD/configuration.nix -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/nixos-20.09.tar.gz
[...]
insmod: ERROR: could not insert module /nix/store/1ibmgfr13r8b6xyn4f0wj115819f359c-linux-5.4.83/lib/modules/5.4.83/kernel/fs/efivarfs/efivarfs.ko.xz: No such device
mount: /sys/firmware/efi/efivars: mount point does not exist.
[ 1.908328] reboot: Power down
builder for '/nix/store/dx2ycclyknvibrskwmii42sgyalagjxa-nixos-boot-disk.drv' failed with exit code 32
[...]
Fix it by setting virtualisation.useEFIBoot = true in qemu-vm.nix, when
efi is needed.
And remove the now unneeded configuration in
./nixos/tests/systemd-boot.nix, since it's handled globally.
Before:
* release-20.03: successful build, unsuccessful run
* release-20.09 (and master): unsuccessful build
After:
* Successful build and run.
Fixes https://github.com/NixOS/nixpkgs/issues/107255
Fixes that `containers.<name>.extraVeths.<name>` configuration was not
always applied.
When configuring `containers.<name>.extraVeths.<name>` and not
configuring one of `containers.<name>.localAddress`, `.localAddress6`,
`.hostAddress`, `.hostAddress6` or `.hostBridge` the veth was created,
but otherwise no configuration (i.e. no ip) was applied.
nixos-container always configures the primary veth (when `.localAddress`
or `.hostAddress` is set) to be the containers default gateway, so
this fix is required to create a veth in containers that use a different
default gateway.
To test this patch configure the following container and check if the
addresses are applied:
```
containers.testveth = {
extraVeths.testveth = {
hostAddress = "192.168.13.2";
localAddress = "192.168.13.1";
};
config = {...}:{};
};
```
The metadata fetcher scripts run each time an instance starts, and it
is not safe to assume that responses from the instance metadata
service (IMDS) will be as they were on first boot.
Example: an EC2 instance can have its user data changed while
the instance is stopped. When the instance is restarted, we want to
see the new user data applied.
According to Freenode's ##AWS, the metadata server can sometimes
take a few moments to get its shoes on, and the very first boot
of a machine can see failed requests for a few moments.
AWS's metadata service has two versions. Version 1 allowed plain HTTP
requests to get metadata. However, this was frequently abused when a
user could trick an AWS-hosted server in to proxying requests to the
metadata service. Since the metadata service is frequently used to
generate AWS access keys, this is pretty gnarly. Version two is
identical except it requires the caller to request a token and provide
it on each request.
Today, starting a NixOS AMI in EC2 where the metadata service is
configured to only allow v2 requests fails: the user's SSH key is not
placed, and configuration provided by the user-data is not applied.
The server is useless. This patch addresses that.
Note the dependency on curl is not a joyful one, and it expand the
initrd by 30M. However, see the added comment for more information
about why this is needed. Note the idea of using `echo` and `nc` are
laughable. Don't do that.
See https://www.redhat.com/sysadmin/fedora-31-control-group-v2 for
details on why this is desirable, and how it impacts containers.
Users that need to keep using the old cgroup hierarchy can re-enable it
by setting `systemd.unifiedCgroupHierarchy` to `false`.
Well-known candidates not supporting that hierarchy, like docker and
hidepid=… will disable it automatically.
Fixes#73800
This reverts commit fb6d63f3fd.
I really hope this finally fixes#99236: evaluation on Hydra.
This time I really did check basically the same commit on Hydra:
https://hydra.nixos.org/eval/1618011
Right now I don't have energy to find what exactly is wrong in the
commit, and it doesn't seem important in comparison to nixos-unstable
channel being stuck on a commit over one week old.
This adds the pinns path to the configuration let CRI-O start properly.
We also change the configuration to the new drop-in syntax.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
The toplevel derivations of systems that have `networking.hostName`
set to `""` (because they want their hostname to be set by DHCP) used
to be all named
`nixos-system-unnamed-${config.system.nixos.label}`.
This makes them hard to distinguish.
A similar problem existed in NixOS tests where `vmName` is used in the
`testScript` to refer to the VM. It defaulted to the
`networking.hostName` which when set to `""` won't allow you to refer
to the machine from the `testScript`.
This commit makes the `system.name` configurable. It still defaults to:
```
if config.networking.hostName == ""
then "unnamed"
else config.networking.hostName;
```
but in case `networking.hostName` needs to be to `""` the
`system.name` can be set to a distinguishable name.
This is analogous to #70447.
With security.lockKernelModules=true, docker commands result in the following
error without at least loading veth:
$ docker run hello-world
/nix/store/mr50kaan2vs4gc40ymwncb2vci25aq7z-docker-19.03.2/libexec/docker/docker: Error response from daemon: failed to create endpoint epic_kare on network bridge: failed to add the host (veth8b381f3) <=> sandbox (veth348e197) pair interfaces: operation not supported.
ERRO[0003] error waiting for container: context canceled
This is required by (among others) Podman to run containers in rootless mode.
Other distributions such as Fedora and Ubuntu already set up these mappings.
The scheme with a start UID/GID offset starting at 100000 and increasing in 65536 increments is copied from Fedora.
Per upstream:
> libvirtd-tcp.socket - the unit file corresponding to the TCP 16509
> port for non-TLS remote access. This socket should not be configured
> to start on boot until the administrator has configured a suitable
> authentication mechanism.
Without this, systemd-boot does not add an EFI boot entry for itself.
The reason it worked before this fix is because it would fall back to
the default installed \EFI\BOOT\BOOTX64.EFI
boot.loader.grub.device` was hardcoded to `bootDevice`, which is
wrong, because that's the device for `/`, and with `useBootLoader`
the boot loader is not on that device.
This bug probably came into existence because of bad naming;
`virtualisation.bootDevice` has description
"The disk to be used for the root filesystem", which is very confusing;
it should be `.rootDevice` then!
Unfortunately, the description is right and the attribute name is wrong,
so it is not easy to change this without deprecation.
This commit ensures that even if you use `useBootLoader` and
`diskInterface == "scsi"`, the created VM can boot through, and can run
`nixos-rebuild afterwards.
It also adds extra commentary to explain what's going on in this module
in general in relation to `useBootLoader`.
VMSGVA is recommended by virtualbox for Linux clients.
Compared to VBoxVGA and VBoxSVGA it also supports 3D acceleration.
Adding the driver makes nixos work with all three supported graphics card
types.
xchg is advertised as a bidirectional exchange dir, but file content
transfer from host to VM fails due to caching:
If a file is read in the VM and then modified on the host, subsequent
re-reads in the VM can yield old, cached data.
This is caused by the use of 9p's cache=loose mode that is explicitly
meant for read-only mounts.
9p doesn't provide any suitable cache modes, so fix this by disabling
caching.
Also, remove a now unnecessary sync in the test driver.
- Update the default pause image
- Set the cgroup manager to systemd
- Enable `manage_ns_lifecycle` instead of the deprecated
`manage_network_ns_lifecycle` option
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
This follows upstreams change in documentation. While the `[DHCP]`
section might still work it is undocumented and we should probably not
be using it anymore. Users can just upgrade to the new option without
much hassle.
I had to create a bit of custom module deprecation code since the usual
approach doesn't support wildcards in the path.
What's happening now is that both cri-o and podman are creating
/etc/containers/policy.json.
By splitting out the creation of configuration files we can make the
podman module leaner & compose better with other container software.
Many options define their example to be a Nix value without using
literalExample. This sometimes gets rendered incorrectly in the manual,
causing confusion like in https://github.com/NixOS/nixpkgs/issues/25516
This fixes it by using literalExample for such options. The list of
option to fix was determined with this expression:
let
nixos = import ./nixos { configuration = {}; };
lib = import ./lib;
valid = d: {
# escapeNixIdentifier from https://github.com/NixOS/nixpkgs/pull/82461
set = lib.all (n: lib.strings.escapeNixIdentifier n == n) (lib.attrNames d) && lib.all (v: valid v) (lib.attrValues d);
list = lib.all (v: valid v) d;
}.${builtins.typeOf d} or true;
optionList = lib.optionAttrSetToDocList nixos.options;
in map (opt: {
file = lib.elemAt opt.declarations 0;
loc = lib.options.showOption opt.loc;
}) (lib.filter (opt: if opt ? example then ! valid opt.example else false) optionList)
which when evaluated will output all options that use a Nix identifier
that would need escaping as an attribute name.
extraModprobeConfig could be applied too late i.e. if the driver has been
loaded in initrd, while the harddrive is still encrypted.
Using a kernelParams works in all cases however.
This commit fixes#76620. It moves ExecStartPre and ExecStopPost to
preStart and postStop, as these options are composable. It thus allows
adding additional initialisation scripts or cleanup scripts to the systemd
unit of the docker container.
This option allows the user to control whether or not the docker container is
automatically started on boot. The previous default behavior (true) is preserved
NixOS has `virtualisation.docker.autoPrune.enable` for this
functionality; we should not do it every time a container starts up.
(also, some trivial documentation fixes)
- the `imageFile` option allows to load an image from a derivation
- the `dependsOn` option can be used to specify dependencies between container systemd units.
Co-authored-by: Christian Höppner <mkaito@users.noreply.github.com>
Previously, we were storing the leader pid in a runtime file and
signalled SIGRTMIN+4 manually.
In systemd 219, the `machinectl poweroff` command was introduced, which
does that for us.
The missing `\n` in the printf format string prevented multiple channels from
being logged.
The missing `nixpkgs=` in the `NIX_PATH` prevented `nixos-rebuild` from working
if the system configuration has any reference to `nixpkgs`.
Additionally:
* Use process substitution instead of piping printf to avoid creating a subshell.
* Set an empty `IFS` to avoid word splitting.
* Add the `-r` flag to `read` to avoid mangling backslashes.
Currently, LXD always use pkgs.zfs, even if boot.zfs.enableUnstable is set. This
change provides the option to change the LXC, LXD and ZFS packages, and
determines the default ZFS package based on zfs.enableUnstable.
Systemd dependencies for scripted mode
were refactored according to analysis in #34586.
networking.vswitches can now be used with systemd-networkd,
although they are not supported by the daemon, a nixos receipe
creates the switch and attached required interfaces (just like
the scripted version).
Vlans and internal interfaces are implemented following the
template format i.e. each interface is
described using an attributeSet (vlan and type at the moment).
If vlan is present, then interface is added to the vswitch with
given tag (access mode). Type internal enabled vswitch to create
interfaces (see openvswitch docs).
Added configuration for configuring supported openFlow version on
the vswitch
This commit is a split from the original PR #35127.
This makes ~2.5x speed up of an empty container instantiate, hence reduces
rebuild time of system with many declarative containers.
Note that this doesn't affect production systems much, becaseu those most
likely already include `minimal.nix` profile.
A centralized list for these renames is not good because:
- It breaks disabledModules for modules that have a rename defined
- Adding/removing renames for a module means having to find them in the
central file
- Merge conflicts due to multiple people editing the central file
As part of the networking.* name space cleanup, connman should be moved
to services.connman. The same will happen for example with
networkmanager in a separate PR.
While switching NixOS configurations with both
networking.useNetworkd = true;
virtualisation.virtualbox.host.enable;
You often end up waiting for systemd-networkd-wait-online.service.
This happens because the vboxnet0 device doesn't have a carrier until
virtualbox machines are started, so networkd gets stuck in
"Configuring":
⇒ networkctl list
IDX LINK TYPE OPERATIONAL SETUP
1 lo loopback carrier unmanaged
2 wlp2s0 wlan routable unmanaged
3 vboxnet0 ether no-carrier configuring
This updates the NixOS virtualbox host module to include a
RequiredForOnline=no statement in the generated 40-vboxnet0.network
file, so networkd doesn't consider it necessary for
systemd-networkd-wait-online.service to finish.
List all modules that *may* be required depending on individual container
configurations; don't expect that further modules can be loaded after boot.
Fixes https://github.com/NixOS/nixpkgs/issues/38676
Openvswitch was upgraded to the latest
stable version (currenty 2.12.0). This remove ovs-monitor-ipsec
commands.
LTS version is still available using
`config.virtualisation.vswitch.package = pkgs.openvswitch-lts`
it has been upgraded to 2.5.6.
This commit is a split from the original PR #35127.
This fixes the warning being emitted by nixos-rebuild switch:
building Nix...
building the system configuration...
trace: warning: types.string is deprecated because it quietly concatenates strings
It started emitting a warning in #66346.
Since https://github.com/NixOS/nixpkgs/pull/61321, local-fs.target is
part of sysinit.target again, meaning units without
DefaultDependencies=no will automatically depend on it, and the manual
set dependencies can be dropped.
With local-fs.target part of sysinit.target
(https://github.com/NixOS/nixpkgs/pull/61321), we don't need to add it
explicitly to certain units anymore, and can change dependencies like
they are in other distros (I picked from Google's official CentOS 7
image here).
Like them, use StandardOutput=journal+console to pipe google-*.service
output to the serial console as well.
This adds a new ``onBoot`` option that allows specifying the action taken on
guests when the host boots. Specifying "start" ensures all guests that were
running prior to shutdown are started, regardless of their autostart settings.
Specifying "ignore" will make libvirtd ignore such guests. Any guest marked as
autostart will still be automatically started by libvirtd.
systemd provides two sysctl snippets, 50-coredump.conf and
50-default.conf.
These enable:
- Loose reverse path filtering
- Source route filtering
- `fq_codel` as a packet scheduler (this helps to fight bufferbloat)
This also configures the kernel to pass coredumps to `systemd-coredump`.
These sysctl snippets can be found in `/etc/sysctl.d/50-*.conf`,
and overridden via `boot.kernel.sysctl`
(which will place the parameters in `/etc/sysctl.d/60-nixos.conf`.
Let's start using these, like other distros already do for quite some
time, and remove those duplicate `boot.kernel.sysctl` options we
previously did set.
In the case of rp_filter (which systemd would set to 2 (loose)), make
our overrides to "1" more explicit.
We might be inside a NixOS container on a non-NixOS host, so instead of not
running at all inside a container, check if the nix-daemon socket is writable as
it will tell us if the store is managed from here or outside.
Fixes#63578
See https://github.com/NixOS/nixpkgs/issues/15747. Previously this module was called `<unknown-file>`
in error messages, now it is called a bit more close to real:
```
module at /home/danbst/dev/nixpkgs/nixos/modules/virtualisation/containers.nix:470
```
Fix#61859.
Assertion fails when a Google Compute Engine image is built, because
now choices of filesystem types are restricted to `f2fs` and `ext` family if
auto-resizing is enabled.
This change will pin the filesystem used on such an image to be `ext4` for now.
A new internal option `hardware.opengl.setLdLibraryPath` is added which controls if `LD_LIBRARY_PATH` should be set to `/run/opengl-driver(-32)/lib`. It is false by default and is meant to be set to true by any driver which requires it. If this option is false, then `opengl.nix` and `xserver.nix` will not set `LD_LIBRARY_PATH`.
Currently Mesa and NVidia drivers don't set `setLdLibraryPath` because they work with libglvnd and do not override libraries, while `amdgpu-pro`, `ati` and `parallels-guest` set it to true (the former two really need it, the last one doesn't build so is presumed to).
Additionally, the `libPath` attribute within entries of `services.xserver.drivers` is removed. This made `xserver.nix` add the driver path directly to the `LD_LIBRARY_PATH` for the display manager (including X server). Not only is it redundant when the driver is added to `hardware.opengl.package` (assuming that `hardware.opengl.enable` is true), in fact all current drivers except `ati` set it incorrectly to the package path instead of package/lib.
This removal of `LD_LIBRARY_PATH` could break certain packages using CUDA, but only those that themselves load `libcuda` or other NVidia driver libraries using `dlopen` (not if they just use `cudatoolkit`). A few have already been fixed but it is practically impossible to test all because most packages using CUDA are libraries/frameworks without a simple way to test.
Fixes#11434 if only Mesa or NVidia graphics drivers are used.
Quite some fixing was needed to get this to work.
Changes in VirtualBox and additions:
- VirtualBox is no longer officially supported on 32-bit hosts so i686-linux is removed from platforms
for VirtualBox and the extension pack. 32-bit additions still work.
- There was a refactoring of kernel module makefiles and two resulting bugs affected us which had to be patched.
These bugs were reported to the bug tracker (see comments near patches).
- The Qt5X11Extras makefile patch broke. Fixed it to apply again, making the libraries logic simpler
and more correct (it just uses a different base path instead of always linking to Qt5X11Extras).
- Added a patch to remove "test1" and "test2" kernel messages due to forgotten debugging code.
- virtualbox-host NixOS module: the VirtualBoxVM executable should be setuid not VirtualBox.
This matches how the official installer sets it up.
- Additions: replaced a for loop for installing kernel modules with just a "make install",
which seems to work without any of the things done in the previous code.
- Additions: The package defined buildCommand which resulted in phases not running, including RUNPATH
stripping in fixupPhase, and installPhase was defined which was not even run. Fixed this by
refactoring using phases. Had to set dontStrip otherwise binaries were broken by stripping.
The libdbus path had to be added later in fixupPhase because it is used via dlopen not directly linked.
- Additions: Added zlib and libc to patchelf, otherwise runtime library errors result from some binaries.
For some reason the missing libc only manifested itself for mount.vboxsf when included in the initrd.
Changes in nixos/tests/virtualbox:
- Update the simple-gui test to send the right keys to start the VM. With VirtualBox 5
it was enough to just send "return", but with 6 the Tools thing may be selected by
default. Send "home" to reliably select Tools, "down" to move to the VM and "return"
to start it.
- Disable the VirtualBox UART by default because it causes a crash due to a regression
in VirtualBox (specific to software virtualization and serial port usage). It can
still be enabled using an option but there is an assert that KVM nested virtualization
is enabled, which works around the problem (see below).
- Add an option to enable nested KVM virtualization, allowing VirtualBox to use hardware
virtualization. This works around the UART problem and also allows using 64-bit
guests, but requires a kernel module parameter.
- Add an option to run 64-bit guests. Tested that the tests pass with that. As mentioned
this requires KVM nested virtualization.
Regression introduced by c94005358c.
The commit introduced declarative docker containers and subsequently
enables docker whenever any declarative docker containers are defined.
This is done via an option with type "attrsOf somesubmodule" and a check
on whether the attribute set is empty.
Unfortunately, the check was whether a *list* is empty rather than
wether an attribute set is empty, so "mkIf (cfg != [])" *always*
evaluates to true and thus subsequently enables docker by default:
$ nix-instantiate --eval nixos --arg configuration {} \
-A config.virtualisation.docker.enable
true
Fixing this is simply done by changing the check to "mkIf (cfg != {})".
Tested this by running the "docker-containers" NixOS test and it still
passes.
Signed-off-by: aszlig <aszlig@nix.build>
Cc: @benley, @danbst, @Infinisil, @nlewo
This otherwise does not eval `:tested` any more, which means no nixos
channel updates.
Regression comes from 0eb6d0735f (#57751)
which added an assertion stopping the use of `autoResize` when the
filesystem cannot be resized automatically.
* WIP: Run Docker containers as declarative systemd services
* PR feedback round 1
* docker-containers: add environment, ports, user, workdir options
* docker-containers: log-driver, string->str, line wrapping
* ExecStart instead of script wrapper, %n for container name
* PR feedback: better description and example formatting
* Fix docbook formatting (oops)
* Use a list of strings for ports, expand documentation
* docker-continers: add a simple nixos test
* waitUntilSucceeds to avoid potential weird async issues
* Don't enable docker daemon unless we actually need it
* PR feedback: leave ExecReload undefined