This is a good example of a package/module that should be distributed
externally (e.g. as a flake [1]): it's not stable yet so anybody who
seriously wants to use it will want to use the upstream repo. Also,
it's highly specialized so NixOS is not really the right place at the
moment (every NixOS module slows down NixOS evaluation for everybody).
[1] https://github.com/edolstra/jormungandr/tree/flake
The recent custom endpoint addition allows us to directly point
certbot to the custom Pebble directory endpoint.
Thanks to that, we can ditch the Pebble patch we were using so far;
making this test maintenance easier.
* lm_sensors: add fancontrol module + nixos test
fancontrol is a small script that checks temperature sensors and adapts
fan speeds accordingly. It reads a text config file that can be
auto-generated by running the pwmconfig wizard on the live system.
Rename the old ceph test to ceph-single-node and add a new test
ceph-multi-node. The ceph-single-node represents a dev cluster whereas
ceph-multi-node is closer to a prod cluster.
Let's encrypt bumped ACME to V2. We need to update our nixos test to
be compatible with this new protocol version.
We decided to drop the Boulder ACME server in favor of the more
integration test friendly Pebble.
- overriding cacert not necessary
- this avoids rebuilding lots of packages needlessly
- nixos/tests/acme: use pebble's ca for client tests
- pebble always generates its own ca which has to be fetched
TODO: write proper commit msg :)
Default getfacl behavior is to remove leading slash on absolute
paths in its header printed to stdout.
Before the header it will also print a message about it...
Switches -p -or --absolute-names can turn this off
and remove some noise from our tests logs.
The networking.virtual test does not work with networkd yet, for
multiple reasons:
- network-online.target is not reached, because tun0 and tap0 are
considered as required for online but _not_ brought up or assigned
the configured addresses
- the commands later in the test rely on some units from the scripted
network setup
cc @fpletz networkd exper
cc @globin we looked at this together
The test has recently been failing due to the IPv6 address
on the server still being in the tentative state, when the
client sends its first request. The server will not start
using the IPv6 address until DAD has completed.
Scripted networking seems not to wait for DAD completion
before completing network-online.target, so let's switch
to networkd instead, which does.
During the last update, `hydra-notify` was rewritten as a daemon which
listens to postgresql notifications for each build[1]. The module
uses the `hydra-notify.service` unit from upstream's Hydra module and
the VM test ensures that email notifications are sent properly.
Also updated `hydra-init.service` to install `pg_trgm` on a local
database if needed[2].
[1] c7861b85c4
[2] 8a0a5ec3a3
We ship `https://cache.nixos.org` as binary cache by default which
automatically substitutes the test derivation used inside the Hydra
test. However it needs to be built locally to confirm that
`hydra-queue-runner` works properly.
Also inherited the platform name for the test derivation from `system`
to ensure that the build can be tested on each supported platform.
ZHF #68361
It turned out that /dev/snd/* always exists even if there are no sound
drivers loaded at all. Loading `snd` and `snd_timer` fixes that
situation. It is probably fair to assume someone that wants to use sound
also enables that in the NixOS configuration.
Add support for storing secrets in files outside the nix store, since
files in the nix store are world-readable and secrets therefore can't
be stored safely there.
The old string options are kept, since they can potentially be handy
for testing purposes, but their descriptions now state that they
shouldn't be used in production. The manual section is updated to use
the file options rather than the string options and the tests now test
both.
Always enable the UART because the VirtualBug bug that required running without the UART was fixed in 6.0.10. Stop using an old kernel version because the tests work with the default kernel.
(cherry picked from commit ae93571e8d04cebd69491a789d902d6481e05d3f)
In c814d72b51, a bunch of packages were
changed to use the pname attribute, among them were the quake3-demodata
and quake3-pointrelease which we use for the quake3 test.
Fortunately, having pname available means that we no longer need to
match using a prefix, so fixing this eval error also simplifies our
matching.
I directly pushed this to master because the change is non-controversial
and we can't break things that are already broken :-)
Signed-off-by: aszlig <aszlig@nix.build>
* remove kinetic
* release note
* add johanot as maintainer
nixos/ceph: create option for mgr_module_path
- since the upstream default is no longer correct in v14
* fix module, default location for libexec has changed
* ceph: fix test
* maintain only one version
* ceph-client: init
* include ceph-volume python tool in output
nixos/ceph: extraConfig, fix test, wait for ceph-mgr to become active
* run ceph with disk group permission
* add extraConfig option for the global section
needed per cluster
* clear up how ceph.conf is generated
* fix ceph testcase
Since https://github.com/NixOS/nixpkgs/pull/61321, local-fs.target is
part of sysinit.target again, meaning units without
DefaultDependencies=no will automatically depend on it, and the manual
set dependencies can be dropped.
It turns out that checking for the last mount time of an ext4 file
system isn't a very reliable way to check whether the file system was
properly unmounted.
When creating that test in the first place (88530e02b6),
I was reluctant to inspect the file system when the VM is down and was
searching for a way to check for a clean unmount *after* the file system
was mounted again to make sure we don't need to create a 512 MB raw
image on the host.
Fortunately however, when converting from qcow2, qemu-img actually
writes a sparse file, so for most file systems (that is, file systems
supporting sparse files) this shouldn't waste a lot of disk space.
So when investigating the flakiness, I found that whenever the test is
failing, the unmount of /test-x-initrd-mount was done *before* the final
step during which systemd remounts+unmounts all the remaining file
systems.
I haven't investigated why this is the case, but the test is a
regression test for https://github.com/NixOS/nixpkgs/issues/35268, which
actually didn't unmount the file system *at* *all*, so really all we
need to take care here is whether the unmount has happened and not
*how*.
To make sure that checking the filesystem state is enough for this, I
temporarily replaced the $machine->shutdown call with $machine->crash
and verified that the file system state is "not clean".
Signed-off-by: aszlig <aszlig@nix.build>
Fixes: https://github.com/NixOS/nixpkgs/issues/67555
* nixos/acme: Fix ordering of cert requests
When subsequent certificates would be added, they would
not wake up nginx correctly due to target units only being triggered
once. We now added more fine-grained systemd dependencies to make sure
nginx always is aware of new certificates and doesn't restart too early
resulting in a crash.
Furthermore, the acme module has been refactored. Mostly to get
rid of the deprecated PermissionStartOnly systemd options which were
deprecated. Below is a summary of changes made.
* Use SERVICE_RESULT to determine status
This was added in systemd v232. we don't have to keep track
of the EXITCODE ourselves anymore.
* Add regression test for requesting mutliple domains
* Deprecate 'directory' option
We now use systemd's StateDirectory option to manage
create and permissions of the acme state directory.
* The webroot is created using a systemd.tmpfiles.rules rule
instead of the preStart script.
* Depend on certs directly
By getting rid of the target units, we make sure ordering
is correct in the case that you add new certs after already
having deployed some.
Reason it broke before: acme-certificates.target would
be in active state, and if you then add a new cert, it
would still be active and hence nginx would restart
without even requesting a new cert. Not good! We
make the dependencies more fine-grained now. this should fix that
* Remove activationDelay option
It complicated the code a lot, and is rather arbitrary. What if
your activation script takes more than activationDelay seconds?
Instead, one should use systemd dependencies to make sure some
action happens before setting the certificate live.
e.g. If you want to wait until your cert is published in DNS DANE /
TLSA, you could create a unit that blocks until it appears in DNS:
```
RequiredBy=acme-${cert}.service
After=acme-${cert}.service
ExecStart=publish-wait-for-dns-script
```