nixos/tests/initrd-openvpn: Add test for openvpn in the initramfs
The module in this commit adds new options that allows the
integration of an OpenVPN client into the initrd.
This can be used e.g. to remotely unlock LUKS devices.
This commit also adds two tests for `boot.initrd.network.openvpn`.
The first one is a basic test to validate that a failing connection
does not prevent the machine from booting.
The second test validates that this module actually creates a valid
openvpn connection.
For this, it spawns three nodes:
- The client that uses boot.initrd.network.openvpn
- An OpenVPN server that acts as gateway and forwards a port
to the client
- A node that is external to the OpenVPN network
The client connects to the OpenVPN server and spawns a netcat instance
that echos a value to every client.
Afterwards, the external node checks if it receives this value over the
forwarded port on the OpenVPN gateway.
test failed because gnutls-cli does not properly report connection
errors any more, fixed by increasing the debug level for gnutls-cli
Fixes: #84507
Closes: #90718
This option exposes the prefconfigured nextcloud-occ
program. nextcloud-occ can then be used in other systemd services or
added in environment.systemPackages.
The nextcloud test shows how it can be add in
environment.systemPackages.
Done by setting `autopilot.min_quorum = 3`.
Techncially, this would have been required to keep the test correct since
Consul's "autopilot" "Dead Server Cleanup" was enabled by default (I believe
that was in Consul 0.8). Practically, the issue only occurred with our NixOS
test with releases >= `1.7.0-beta2` (see #90613). The setting itself is
available since Consul 1.6.2.
However, this setting was not documented clearly enough for anybody to notice,
and only the upstream issue https://github.com/hashicorp/consul/issues/8118
I filed brought that to light.
As explained there, the test could also have been made pass by applying the
more correct rolling reboot procedure
-m.wait_until_succeeds("[ $(consul members | grep -o alive | wc -l) == 5 ]")
+m.wait_until_succeeds(
+ "[ $(consul operator raft list-peers | grep true | wc -l) == 3 ]"
+)
but we also intend to test that Consul can regain consensus even if
the quorum gets temporarily broken.
The systemd socket unit files now more precisely track the IPFS
configuration, by including any multaddr they can make a `ListenStream`
for. (The daemon doesn't currently support anything which would use
`ListDatagram`, so we don't need to worry about that.)
The tests use some of these features.
Reads a bit more naturally, and now the changes to the
acme-${cert}.service actually reflect what would be needed were you to
do the same in production.
e.g. "for dns-01, your service that needs the cert needs to pull in the
cert"
Refactor the systemd service definition for the haproxy reverse proxy,
using the upstream systemd service definition. This allows the service
to be reloaded on changes, preserving existing server state, and adds
some hardening options.
NixOS currently has issues with setting the FQDN of a system in a way
where standard tools work. In order to help with experimentation and
avoid regressions, add a test that checks that the hostname is
reported as the user wanted it to be.
Co-authored-by: Michael Weiss <dev.primeos@gmail.com>
Favor the configuration in "configFile" over "config" to allow
"configFile" to override "config" without a system rebuild.
Add a "persistentKeys" option to generate keys and addresses that
persist across service restarts. This is useful for self-configuring
boot media.
This adds a simple test running GNU Hello cross-compiled for armv7l and
aarch64 inside a x86_64 VM with configured binfmt.
We already build the cross toolchains in other invocations, and building
hello itself is small.
This test is sometimes flaky on hydra as at the time of the `git clone`
the network isn't really configured yet[1]. That problem doesn't seem to
occur locally but if you run it on a machine with high enough load (such
as hydra build machines). Hopefully this will make the test not flaky
anymore.
[1] https://hydra.nixos.org/build/118710378/nixlog/21/raw
This seems to have worked in 15f105d41f (5
months ago) but broke somewhere in the meantime.
The current module doesn't seem to be underdocumented and might need a
serious refactor. It requires quite some hacks to get it to work (see
https://github.com/NixOS/nixpkgs/issues/86305#issuecomment-621129942),
or how the ldap.nix test used systemd.services.openldap.preStart and
made quite some assumptions on internals.
Mic92 agreed on being added as a maintainer for the module, as he uses
it a lot and can possibly fix eventual breakages. For the most basic
startup breakages, the remaining openldap.nix test might suffice.
`doas` is a lighter alternative to `sudo` that "provide[s] 95% of the
features of `sudo` with a fraction of the codebase" [1]. I prefer it to
`sudo`, so I figured I would add a NixOS module in order for it to be
easier to use. The module is based off of the existing `sudo` module.
[1] https://github.com/Duncaen/OpenDoas
This is a follow-up to the PR #82026 that contains the promised tests.
In this test I am testing if we can properly propagate prefixes received
via DHCPv6 PD with the networkd options in our module system.
The comments in the test should be sufficient to follow the idea and
what is going on.
Setting up a XMPP chat server is a pretty deep rabbit whole to jump in
when you're not familiar with this whole universe. Your experience
with this environment will greatly depends on whether or not your
server implements the right set of XEPs.
To tackle this problem, the XMPP community came with the idea of
creating a meta-XEP in charge of listing the desirable XEPs to comply
with. This meta-XMP is issued every year under an new XEP number. The
2020 one being XEP-0423[1].
This prosody nixos module refactoring makes complying with XEP-0423
easier. All the necessary extensions are enabled by default. For some
extensions (MUC and HTTP_UPLOAD), we need some input from the user and
cannot provide a sensible default nixpkgs-wide. For those, we guide
the user using a couple of assertions explaining the remaining manual
steps to perform.
We took advantage of this substential refactoring to refresh the
associated nixos test.
Changelog:
- Update the prosody package to provide the necessary community
modules in order to comply with XEP-0423. This is a tradeoff, as
depending on their configuration, the user might end up not using them
and wasting some disk space. That being said, adding those will
allow the XEP-0423 users, which I expect to be the majority of
users, to leverage a bit more the binary cache.
- Add a muc submodule populated with the prosody muc defaults.
- Add a http_upload submodule in charge of setting up a basic http
server handling the user uploads. This submodule is in is
spinning up an HTTP(s) server in charge of receiving and serving the
user's attachments.
- Advertise both the MUCs and the http_upload endpoints using mod disco.
- Use the slixmpp library in place of the now defunct sleekxmpp for
the prosody NixOS test.
- Update the nixos test to setup and test the MUC and http upload
features.
- Add a couple of assertions triggered if the setup is not xep-0423
compliant.
[1] https://xmpp.org/extensions/xep-0423.html
When testing WireGuard updates, I usually run the VM-tests with
different kernels to make sure we're not introducing accidental
regressions for e.g. older kernels.
I figured that we should automate this process to ensure continuously
that WireGuard works fine on several kernels.
For now I decided to test the latest LTS version (5.4) and
the latest kernel (currently 5.6). We can add more kernels in the
future, however this seems to significantly slow down evaluation and
time.
The list can be customized by running a command like this:
nix-build nixos/tests/wireguard --arg kernelVersionsToTest '["4.19"]'
The `kernelPackages` argument in the tests is null by default to make
sure that it's still possible to invoke the test-files directly. In that
case the default kernel of NixOS (currently 5.4) is used.
The elasticsearch-curator was not deleting indices because the indices
had ILM policies associated with them. This is now fixed by
configuring the elasticsearch-curator with `allow_ilm_indices: true`.
Also see: https://github.com/elastic/curator/issues/1490
Enables multi-site configurations.
This break compatibility with prior configurations that expect options
for a single dokuwiki instance in `services.dokuwiki`.
Shimming out the Let's Encrypt domain name to reuse client configuration
doesn't work properly (Pebble uses different endpoint URL formats), is
recommended against by upstream,[1] and is unnecessary now that the ACME
module supports specifying an ACME server. This commit changes the tests
to use the domain name acme.test instead, and renames the letsencrypt
node to acme to reflect that it has nothing to do with the ACME server
that Let's Encrypt runs. The imports are renamed for clarity:
* nixos/tests/common/{letsencrypt => acme}/{common.nix => client}
* nixos/tests/common/{letsencrypt => acme}/{default.nix => server}
The test's other domain names are also adjusted to use *.test for
consistency (and to avoid misuse of non-reserved domain names such
as standalone.com).
[1] https://github.com/letsencrypt/pebble/issues/283#issuecomment-545123242
Co-authored-by: Yegor Timoshenko <yegortimoshenko@riseup.net>
This was added in aade4e577b, but the
implementation of the ACME module has been entirely rewritten since
then, and the test seems to run fine on AArch64.
linux-hardened sets kernel.unprivileged_userns_clone=0 by default; see
anthraxx/linux-hardened@104f44058f.
This allows the Nix sandbox to function while reducing the attack
surface posed by user namespaces, which allow unprivileged code to
exercise lots of root-only code paths and have lead to privilege
escalation vulnerabilities in the past.
We can safely leave user namespaces on for privileged users, as root
already has root privileges, but if you're not running builds on your
machine and really want to minimize the kernel attack surface then you
can set security.allowUserNamespaces to false.
Note that Chrome's sandbox requires either unprivileged CLONE_NEWUSER or
setuid, and Firefox's silently reduces the security level if it isn't
allowed (see about:support), so desktop users may want to set:
boot.kernel.sysctl."kernel.unprivileged_userns_clone" = true;
* nixos/k3s: simplify config expression
* nixos/k3s: add config assertions and trim unneeded bits
* nixos/k3s: add a test that k3s works; minor module improvements
This is a single-node test. Eventually we should also have a multi-node
test to verify the agent bit works, but that one's more involved.
* nixos/k3s: add option description
* nixos/k3s: add defaults for token/serveraddr
Now that the assertion enforces their presence, we dont' need to use the typesystem for it.
* nixos/k3s: remove unneeded sudo in test
* nixos/k3s: add to test list
For reasons yet unknown, the vxlan backend doesn't work (at least inside
the qemu networking), so this is moved to the udp backend.
Note changing the backend apparently also changes the interface name,
it's now `flannel0`, not `flannel.1`
fixes #74941
This was whitespace-sensitive, kept fighting with my editor and broke
the tests easily. To fix this, let python convert the output to
individual lines, and strip whitespace from them before comparing.
Also removed `pkgs.hydra-flakes` since flake-support has been merged
into master[1]. Because of that, `pkgs.hydra-unstable` is now compiled
against `pkgs.nixFlakes` and currently requires a patch since Hydra's
master doesn't compile[2] atm.
[1] https://github.com/NixOS/hydra/pull/730
[2] https://github.com/NixOS/hydra/pull/732
Upgrades Hydra to the latest master/flake branch. To perform this
upgrade, it's needed to do a non-trivial db-migration which provides a
massive performance-improvement[1].
The basic ideas behind multi-step upgrades of services between NixOS versions
have been gathered already[2]. For further context it's recommended to
read this first.
Basically, the following steps are needed:
* Upgrade to a non-breaking version of Hydra with the db-changes
(columns are still nullable here). If `system.stateVersion` is set to
something older than 20.03, the package will be selected
automatically, otherwise `pkgs.hydra-migration` needs to be used.
* Run `hydra-backfill-ids` on the server.
* Deploy either `pkgs.hydra-unstable` (for Hydra master) or
`pkgs.hydra-flakes` (for flakes-support) to activate the optimization.
The steps are also documented in the release-notes and in the module
using `warnings`.
`pkgs.hydra` has been removed as latest Hydra doesn't compile with
`pkgs.nixStable` and to ensure a graceful migration using the newly
introduced packages.
To verify the approach, a simple vm-test has been added which verifies
the migration steps.
[1] https://github.com/NixOS/hydra/pull/711
[2] https://github.com/NixOS/nixpkgs/pull/82353#issuecomment-598269471
While our ETag patch works pretty fine if it comes to serving data off
store paths, it unfortunately broke something that might be a bit more
common, namely when using regexes to extract path components of
location directives for example.
Recently, @devhell has reported a bug with a nginx location directive
like this:
location ~^/\~([a-z0-9_]+)(/.*)?$" {
alias /home/$1/public_html$2;
}
While this might look harmless at first glance, it does however cause
issues with our ETag patch. The alias directive gets broken up by nginx
like this:
*2 http script copy: "/home/"
*2 http script capture: "foo"
*2 http script copy: "/public_html/"
*2 http script capture: "bar.txt"
In our patch however, we use realpath(3) to get the canonicalised path
from ngx_http_core_loc_conf_s.root, which returns the *configured* value
from the root or alias directive. So in the example above, realpath(3)
boils down to the following syscalls:
lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/home/$1", 0x7ffd08da6f60) = -1 ENOENT (No such file or directory)
During my review[1] of the initial patch, I didn't actually notice that
what we're doing here is returning NGX_ERROR if the realpath(3) call
fails, which in turn causes an HTTP 500 error.
Since our patch actually made the canonicalisation (and thus additional
syscalls) necessary, we really shouldn't introduce an additional error
so let's - at least for now - silently skip return value if realpath(3)
has failed.
However since we're using the unaltered root from the config we have
another issue, consider this root:
/nix/store/...-abcde/$1
Calling realpath(3) on this path will fail (except if there's a file
called "$1" of course), so even this fix is not enough because it
results in the ETag not being set to the store path hash.
While this is very ugly and we should fix this very soon, it's not as
serious as getting HTTP 500 errors for serving static files.
I added a small NixOS VM test, which uses the example above as a
regression test.
It seems that my memory is failing these days, since apparently I *knew*
about this issue since digging for existing issues in nixpkgs, I found
this similar pull request which I even reviewed:
https://github.com/NixOS/nixpkgs/pull/66532
However, since the comments weren't addressed and the author hasn't
responded to the pull request, I decided to keep this very commit and do
a follow-up pull request.
[1]: https://github.com/NixOS/nixpkgs/pull/48337
Signed-off-by: aszlig <aszlig@nix.build>
Reported-by: @devhell
Acked-by: @7c6f434c
Acked-by: @yorickvP
Merges: https://github.com/NixOS/nixpkgs/pull/80671
Fixes: https://github.com/NixOS/nixpkgs/pull/66532
Dropbear lags behind OpenSSH significantly in both support for modern
key formats like `ssh-ed25519`, let alone the recently-introduced
U2F/FIDO2-based `sk-ssh-ed25519@openssh.com` (as I found when I switched
my `authorizedKeys` over to it and promptly locked myself out of my
server's initrd SSH, breaking reboots), as well as security features
like multiprocess isolation. Using the same SSH daemon for stage-1 and
the main system ensures key formats will always remain compatible, as
well as more conveniently allowing the sharing of configuration and
host keys.
The main reason to use Dropbear over OpenSSH would be initrd space
concerns, but NixOS initrds are already large (17 MiB currently on my
server), and the size difference between the two isn't huge (the test's
initrd goes from 9.7 MiB to 12 MiB with this change). If the size is
still a problem, then it would be easy to shrink sshd down to a few
hundred kilobytes by using an initrd-specific build that uses musl and
disables things like Kerberos support.
This passes the test and works on my server, but more rigorous testing
and review from people who use initrd SSH would be appreciated!
This mirrors the behaviour of systemd - It's udev that parses `.link`
files, not `systemd-networkd`.
This was originally applied in 36ef112a47,
but was reverted due to 1115959a8d causing
evaluation errors on hydra.
...even when networkd is disabled
This reverts commit ce78f3ac70, reversing
changes made to dc34da0755.
I'm sorry; Hydra has been unable to evaluate, always returning
> error: unexpected EOF reading a line
and I've been unable to reproduce the problem locally. Bisecting
pointed to this merge, but I still can't see what exactly was wrong.
- Fix misspelled option. mkRenamedOptionModule is not used because the
option hasn't really worked before.
- Add missing cfg.telemetryPath arg to ExecStart.
- Fix mkdir invocation in test.
Add a cage module to nixos. This can be used to make kiosk-style
systems that boot directly to a single application. The user (demo by
default) is automatically logged in by this service and the
program (xterm by default) is automatically started.
This is useful for some embedded, single-user systems where we want
automatic booting. To keep the system secure, the user should have
limited privileges.
Based on the service provided in the Cage wiki here:
https://github.com/Hjdskes/cage/wiki/Starting-Cage-on-boot-with-systemd
Co-Authored-By: Florian Klink <flokli@flokli.de>
* prometheus-nginx-exporter: 0.5.0 -> 0.6.0
* nixos/prometheus-nginx-exporter: update for 0.6.0
Added new option constLabels and updated virtualHost name in the
exporter's test.
This is useful when buildLayeredImage is called in a generic way
that should allow simple (base) images to be built, which may not
reference any store paths.
I am not sure how this ever passed on hydra but 30s is barely enough to
pass the configure phase of opensmtpd. It is likely the package was
built as part of another jobset. Whenever it is built as part of the
test execution the timeout propagates and 30s is clearly not enough for
that.
The subtest was mainly written to demonstrate the VRF-issues with a
5.x-kernel. However this breaks the entire test now as we have 5.4 as
default kernel. Disabling the test for now, I still need to find some
time to investigate.
Depending on the network management backend being used, if the interface
configuration in stage 1 is not cleared, there might still be some old
addresses or routes from stage 1 present in stage 2 after network
configuration has finished.
This makes predictable interfaces names available as soon as possible
with udev by adding the default network link units to initrd which are read
by udev. Also adds some udev rules that are needed but which would normally
loaded from the udev store path which is not included in the initrd.
invalid test was introduced in 297d1598ef
and it is disabled in the shipped daemon.conf.
I forgot to reflect that in the module, which caused the daemon to print the following on start-up:
FuEngine invalid has incorrect built version invalid
and the command to warn:
WARNING: The daemon has loaded 3rd party code and is no longer supported by the upstream developers!
To reduce the change of this happening in the future, I moved the list of default disabled plug-ins to the package expression.
I also set the value of the NixOS module option in the config section of the module instead of the default value used previously,
which will allow users to not care about these plug-ins.
In 87a19e9048 I merged staging-next into master using the GitHub gui as intended.
In ac241fb7a5 I merged master into staging-next for the next staging cycle, however, I accidentally pushed it to master.
Thinking this may cause trouble, I reverted it in 0be87c7979. This was however wrong, as it "removed" master.
This reverts commit 0be87c7979.
I merged master into staging-next but accidentally pushed it to master.
This should get us back to 87a19e9048.
This reverts commit ac241fb7a5, reversing
changes made to 76a439239e.
The old Quake3 NixOS test was removed in
50ea99cbc1 which served as a nice demo to
showcase what NixOS tests are capable of. This commit adds the same
functionality to run real openarena clients.
This module allows root autoLogin, so we would break that for users, but
they shouldn't be using it anyways. This gives the impression like auto
is some special display manager, when it's just lightdm and special pam
rules to allow root autoLogin. It was created for NixOS's testing
so I believe this is where it belongs.
- the `imageFile` option allows to load an image from a derivation
- the `dependsOn` option can be used to specify dependencies between container systemd units.
Co-authored-by: Christian Höppner <mkaito@users.noreply.github.com>
* nixos/buildkite: drop user option
This reverts 8c6b1c3eaa.
Turns out, buildkite-agent has logic to write .ssh/known_hosts files and
only really works when $HOME and the user homedir are in sync.
On top of that, we provision ssh keys in /var/lib/buildkite-agent, which
doesn't work if that other users' homedir points elsewhere (we can cheat
by setting $HOME, but then getent and $HOME provide conflicting
results).
So after all, it's better to only run the system-wide buildkite agent as
the "buildkite-agent" user only - if one wants to run buildkite as
different users, systemd user services might be a better fit.
* nixosTests.buildkite-agent: add node with separate user and no ssh key
The current nixpkgs use elasticsearch-curator 5.8.1. As of version 5.7.0,
elasticsearch-curator supports elasticsearch 7, thus this change enables tests
with ES 7.
Updates required:
- Use vpc image format (new default, supported by Amazon)
- Pass full image filename to makeEc2Test
- Increase memory allocation for nixos-rebuild
- Set a networking.hostName for services.httpd
- Add appropriate escaping in literal userdata
While I'm here, try to make it fail fast.
This fixes the patch for nginx to clear the Last-Modified header if a
static file is served from the Nix store.
So far we only used the ETag from the store path, but if the
Last-Modified header is always set to "Thu, 01 Jan 1970 00:00:01 GMT",
Firefox and Chrome/Chromium seem to ignore the ETag and simply use the
cached content instead of revalidating.
Alongside the fix, this also adds a dedicated NixOS VM test, which uses
WebDriver and Firefox to check whether the content is actually served
from the browser's cache and to have a more real-world test case.
This is what I've suspected a while ago[1]:
> Heads-up everyone: After testing this in a few production instances,
> it seems that some browsers still get cache hits for new store paths
> (and changed contents) for some reason. I highly suspect that it might
> be due to the last-modified header (as mentioned in [2]).
>
> Going to test this with last-modified disabled for a little while and
> if this is the case I think we should improve that patch by disabling
> last-modified if serving from a store path.
Much earlier[2] when I reviewed the patch, I wrote this:
> Other than that, it looks good to me.
>
> However, I'm not sure what we should do with Last-Modified header.
> From RFC 2616, section 13.3.4:
>
> - If both an entity tag and a Last-Modified value have been
> provided by the origin server, SHOULD use both validators in
> cache-conditional requests. This allows both HTTP/1.0 and
> HTTP/1.1 caches to respond appropriately.
>
> I'm a bit nervous about the SHOULD here, as user agents in the wild
> could possibly just use Last-Modified and use the cached content
> instead.
Unfortunately, I didn't pursue this any further back then because
@pbogdan noted[3] the following:
> Hmm, could they (assuming they are conforming):
>
> * If an entity tag has been provided by the origin server, MUST
> use that entity tag in any cache-conditional request (using If-
> Match or If-None-Match).
Since running with this patch in some deployments, I found that both
Firefox and Chrome/Chromium do NOT re-validate against the ETag if the
Last-Modified header is still the same.
So I wrote a small NixOS VM test with Geckodriver to have a test case
which is closer to the real world and I indeed was able to reproduce
this.
Whether this is actually a bug in Chrome or Firefox is an entirely
different issue and even IF it is the fault of the browsers and it is
fixed at some point, we'd still need to handle this for older browser
versions.
Apart from clearing the header, I also recreated the patch by using a
plain "git diff" with a small description on top. This should make it
easier for future authors to work on that patch.
[1]: https://github.com/NixOS/nixpkgs/pull/48337#issuecomment-495072764
[2]: https://github.com/NixOS/nixpkgs/pull/48337#issuecomment-451644084
[3]: https://github.com/NixOS/nixpkgs/pull/48337#issuecomment-451646135
Signed-off-by: aszlig <aszlig@nix.build>
When using format-strings, curly brackets need to be escaped using `{{`
to avoid errors from python.
And apparently, Perl's `==` is used to compare substrings[1] which is why
the translation to `assert http_code == "304"` failed as the string
contains several headers from curl.
[1] Just check `perl <(echo 'die "alarm" if "foo\n304" == 304')`
We've rewritten it use GDM, and we can now autologin
to the X11 session because of the accountsservice preStart
script for autologin. It should work similar to how the wayland
test works, just with a few nuanced differences for xorg.