In Linux 4.19 there has been a major rework of the overlayfs
implementation and it now opens files in lowerdir with O_NOATIME, which
in turn caused issues in our VM tests because the process owner of QEMU
doesn't match the file owner of the lowerdir.
The crux here is that 9p propagates the O_NOATIME flag to the host and
the guest kernel has no way of verifying whether that flag will lead to
any problems beforehand.
There is ongoing work to possibly fix this in the kernel, but it will
take a while until there is a working patch and consensus.
So in order to bring our default kernel back to 4.19 and of course make
it possible to run newer kernels in VM tests, I'm merging a small QEMU
patch as an interim solution, which we can drop once we have a working
fix in the next round of stable kernels.
Now we already had Linux 4.19 set as the default kernel, but that was
subsequently reverted in 048c36ccaa
because the patch we have used was the revert of the commit I bisected a
while ago.
This patch broke overlayfs in other ways, so I'm also merging in a VM
test by @bachp, which only tests whether overlayfs is working, just to
be on the safe side that something like this won't happen in the future.
Even though this change could be considered a moderate mass-rebuild at
least for GNU/Linux, I'm merging this to master, mainly to give us some
time to get it into the current 19.03 release branch (and subsequent
testing window) once we got no new breaking builds from Hydra.
Cc: @samueldr, @lheckemann
Fixes: https://github.com/NixOS/nixpkgs/issues/54509
Fixes: https://github.com/NixOS/nixpkgs/issues/48828
Merges: https://github.com/NixOS/nixpkgs/pull/57641
Merges: https://github.com/NixOS/nixpkgs/pull/54508
After working on the last wireguard bump (#57534), we figured that it's
probably a good idea to have a basic test which confirms that a simple
VPN with wireguard still works.
This test starts two peers with a `wg0` network interface and adds a v4
and a v6 route that goes through `wg0`.
From @edolstra at [1]:
BTW we probably should take the closure of the whole unit rather than
just the exec commands, to handle things like Environment variables.
With this commit, there is now a "fullUnit" option, which can be enabled
to include the full closure of the service unit into the chroot.
However, I did not enable this by default, because I do disagree here
and *especially* things like environment variables or environment files
shouldn't be in the closure of the chroot.
For example if you have something like:
{ pkgs, ... }:
{
systemd.services.foobar = {
serviceConfig.EnvironmentFile = ${pkgs.writeText "secrets" ''
user=admin
password=abcdefg
'';
};
}
We really do not want the *file* to end up in the chroot, but rather
just the environment variables to be exported.
Another thing is that this makes it less predictable what actually will
end up in the chroot, because we have a "globalEnvironment" option that
will get merged in as well, so users adding stuff to that option will
also make it available in confined units.
I also added a big fat warning about that in the description of the
fullUnit option.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Another thing requested by @edolstra in [1]:
We should not provide a different /bin/sh in the chroot, that's just
asking for confusion and random shell script breakage. It should be
the same shell (i.e. bash) as in a regular environment.
While I personally would even go as far to even have a very restricted
shell that is not even a shell and basically *only* allows "/bin/sh -c"
with only *very* minimal parsing of shell syntax, I do agree that people
expect /bin/sh to be bash (or the one configured by environment.binsh)
on NixOS.
So this should make both others and me happy in that I could just use
confinement.binSh = "${pkgs.dash}/bin/dash" for the services I confine.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Quoting @edolstra from [1]:
I don't really like the name "chroot", something like "confine[ment]"
or "restrict" seems better. Conceptually we're not providing a
completely different filesystem tree but a restricted view of the same
tree.
I already used "confinement" as a sub-option and I do agree that
"chroot" sounds a bit too specific (especially because not *only* chroot
is involved).
So this changes the module name and its option to use "confinement"
instead of "chroot" and also renames the "chroot.confinement" to
"confinement.mode".
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Currently, if you want to properly chroot a systemd service, you could
do it using BindReadOnlyPaths=/nix/store (which is not what I'd call
"properly", because the whole store is still accessible) or use a
separate derivation that gathers the runtime closure of the service you
want to chroot. The former is the easier method and there is also a
method directly offered by systemd, called ProtectSystem, which still
leaves the whole store accessible. The latter however is a bit more
involved, because you need to bind-mount each store path of the runtime
closure of the service you want to chroot.
This can be achieved using pkgs.closureInfo and a small derivation that
packs everything into a systemd unit, which later can be added to
systemd.packages. That's also what I did several times[1][2] in the
past.
However, this process got a bit tedious, so I decided that it would be
generally useful for NixOS, so this very implementation was born.
Now if you want to chroot a systemd service, all you need to do is:
{
systemd.services.yourservice = {
description = "My Shiny Service";
wantedBy = [ "multi-user.target" ];
chroot.enable = true;
serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
};
}
If more than the dependencies for the ExecStart* and ExecStop* (which
btw. also includes "script" and {pre,post}Start) need to be in the
chroot, it can be specified using the chroot.packages option. By
default (which uses the "full-apivfs"[3] confinement mode), a user
namespace is set up as well and /proc, /sys and /dev are mounted
appropriately.
In addition - and by default - a /bin/sh executable is provided as well,
which is useful for most programs that use the system() C library call
to execute commands via shell. The shell providing /bin/sh is dash
instead of the default in NixOS (which is bash), because it's way more
lightweight and after all we're chrooting because we want to lower the
attack surface and it should be only used for "/bin/sh -c something".
Prior to submitting this here, I did a first implementation of this
outside[4] of nixpkgs, which duplicated the "pathSafeName" functionality
from systemd-lib.nix, just because it's only a single line.
However, I decided to just re-use the one from systemd here and
subsequently made it available when importing systemd-lib.nix, so that
the systemd-chroot implementation also benefits from fixes to that
functionality (which is now a proper function).
Unfortunately, we do have a few limitations as well. The first being
that DynamicUser doesn't work in conjunction with tmpfs, because it
already sets up a tmpfs in a different path and simply ignores the one
we define. We could probably solve this by detecting it and try to
bind-mount our paths to that different path whenever DynamicUser is
enabled.
The second limitation/issue is that RootDirectoryStartOnly doesn't work
right now, because it only affects the RootDirectory option and not the
individual bind mounts or our tmpfs. It would be helpful if systemd
would have a way to disable specific bind mounts as well or at least
have some way to ignore failures for the bind mounts/tmpfs setup.
Another quirk we do have right now is that systemd tries to create a
/usr directory within the chroot, which subsequently fails. Fortunately,
this is just an ugly error and not a hard failure.
[1]: https://github.com/headcounter/shabitica/blob/3bb01728a0237ad5e7/default.nix#L43-L62
[2]: https://github.com/aszlig/avonc/blob/dedf29e092481a33dc/nextcloud.nix#L103-L124
[3]: The reason this is called "full-apivfs" instead of just "full" is
to make room for a *real* "full" confinement mode, which is more
restrictive even.
[4]: https://github.com/aszlig/avonc/blob/92a20bece4df54625e/systemd-chroot.nix
Signed-off-by: aszlig <aszlig@nix.build>
by adding targets and curl wait loops to services to ensure services
are not started before their depended services are reachable.
Extra targets cfssl-online.target and kube-apiserver-online.target
syncronize starts across machines and node-online.target ensures
docker is restarted and ready to deploy containers on after flannel
has discussed the network cidr with apiserver.
Since flannel needs to be started before addon-manager to configure
the docker interface, it has to have its own rbac bootstrap service.
The curl wait loops within the other services exists to ensure that when
starting the service it is able to do its work immediately without
clobbering the log about failing conditions.
By ensuring kubernetes.target is only reached after starting the
cluster it can be used in the tests as a wait condition.
In kube-certmgr-bootstrap mkdir is needed for it to not fail to start.
The following is the relevant part of systemctl list-dependencies
default.target
● ├─certmgr.service
● ├─cfssl.service
● ├─docker.service
● ├─etcd.service
● ├─flannel.service
● ├─kubernetes.target
● │ ├─kube-addon-manager.service
● │ ├─kube-proxy.service
● │ ├─kube-apiserver-online.target
● │ │ ├─flannel-rbac-bootstrap.service
● │ │ ├─kube-apiserver-online.service
● │ │ ├─kube-apiserver.service
● │ │ ├─kube-controller-manager.service
● │ │ └─kube-scheduler.service
● │ └─node-online.target
● │ ├─node-online.service
● │ ├─flannel.target
● │ │ ├─flannel.service
● │ │ └─mk-docker-opts.service
● │ └─kubelet.target
● │ └─kubelet.service
● ├─network-online.target
● │ └─cfssl-online.target
● │ ├─certmgr.service
● │ ├─cfssl-online.service
● │ └─kube-certmgr-bootstrap.service
to protect services from crashing and clobbering the logs when
certificates are not in place yet and make sure services are activated
when certificates are ready.
To prevent errors similar to "kube-controller-manager.path: Failed to
enter waiting state: Too many open files"
fs.inotify.max_user_instances has to be increased.
+ isolate etcd on the master node by letting it listen only on loopback
+ enabling kubelet on master and taint master with NoSchedule
The reason for the latter is that flannel requires all nodes to be "registered"
in the cluster in order to setup the cluster network. This means that the
kubelet is needed even at nodes on which we don't plan to schedule anything.
- All kubernetes components have been seperated into different files
- All TLS-enabled ports have been deprecated and disabled by default
- EasyCert option added to support automatic cluster PKI-bootstrap
- RBAC has been enforced for all cluster components by default
- NixOS kubernetes test cases make use of easyCerts to setup PKI
The `| tee` invocation always masked the return value of the
switch-to-configuration test.
```
~ $ false | tee && echo "oh no"
oh no
```
The added wrapper script will still output everything to stderr, while
passing failures to the test harness.
trace: warning: config.services.gitea.database.password will be stored as plaintext
in the Nix store. Use database.passwordFile instead.
(Arguably, this shouldn't be a warning at all. But making it happy is
easier than having a debate on the value of this warning.)
trace: warning: The options services.ndppd.interface and services.ndppd.network will probably be removed soon,
please use services.ndppd.proxies.<interface>.rules.<network> instead.
trace: warning: The option `services.rspamd.bindUISocket' defined in `<unknown-file>' has been renamed to `services.rspamd.workers.controller.bindSockets'.
trace: warning: The option `services.rspamd.bindSocket' defined in `<unknown-file>' has been renamed to `services.rspamd.workers.normal.bindSockets'.
trace: warning: The option `services.rspamd.workers.”rspamd_proxy".type` defined in `<unknown-file>' has enum value `proxy` which has been renamed to `rspamd_proxy`
With this option it's possible to specify a custom expression for
`roundcube`, i.e. a roundcube environment with third-party plugins as
shown in the testcase.
* pr-55320:
nixos/release-notes: mention breaking changes with matrix-synapse update
nixos/matrix-synapse: reload service with SIGHUP
nixos/tests/matrix-synapse: generate ca and certificates
nixos/matrix-synapse: use python to launch synapse
pythonPackages.pymacaroons-pynacl: remove unmaintained fork
matrix-synapse: 0.34.1.1 -> 0.99.0
pythonPackages.pymacaroons: init at 0.13.0
Hydra should support multiple Nix versions (and currently contains fixes
to work with Nix 2.0 and higher).
Further Nix versions can be added to the `hydraPkgs` expression in the
test case which lists all supported Nix versions for Hydra.
* redmine: 3.4.8 -> 4.0.1
* nixos/redmine: update nixos test to run against both redmine 3.x and 4.x series
* nixos/redmine: default new installs from 19.03 onward to redmine 4.x series, while keeping existing installs on redmine 3.x series
* nixos/redmine: add comment about default redmine package to 19.03 release notes
* redmine: add aandersea as a maintainer
This is just a set of globs to remove from the active plugins directory
after autoconfiguration is complete.
I also removed the hard-coded disabling of "diskstats", since it seems
to work just fine now.
This allows the VM to provide a `configuration.nix` file to the VM.
The test doesn't work in sandbox because it needs Internet (however it
works interactively).
The Openstack metadata service exposes the EC2 API. We use the
existing `ec2.nix` module to configure the hostname and ssh keys of an
Openstack Instance.
A test checks the ssh server is well configured.
This is mainly to reduce the size of the image (700MB). Also,
declarative features provided by cloud-init are not really useful
since we would prefer to use our `configuration.nix` file instead.
Don't add the testing "webcam" device,
which is unexpected to see when querying
what devices fwupd believes exist :).
Won't change behavior for anyone defining
the blacklistPlugin option already,
but doesn't seem worth making more complicated.
postgis: cleanup
Another part of https://github.com/NixOS/nixpkgs/pull/38698, though I did cleanup even more.
Moving docs to separate output should save another 30MB.
I did pin poppler to 0.61 just to be sure GDAL doesn't break again next
time poppler changes internal APIs.
* postgresql: reorganize package and it's extensions
Extracts some useful parts of https://github.com/NixOS/nixpkgs/pull/38698,
in particular, it's vision that postgresql plugins should be namespaced.
He prefers to contribute to his own nixpkgs fork triton.
Since he is still marked as maintainer in many packages
this leaves the wrong impression he still maintains those.
For large setups it is useful to list all databases explicit
(for example if temporary databases are also present) and store them in extra
files.
For smaller setups it is more convenient to just backup all databases at once,
because it is easy to forget to update configuration when adding/renaming
databases. pg_dumpall also has the advantage that it backups users/passwords.
As a result the module becomes easier to use because it is sufficient
in the default case to just set one option (services.postgresqlBackup.enable).
Although this can be added to `extraOptions` I figured that it makes
sense to add an option to explicitly promote this feature in our
documentation since most of the self-hosted gitea instances won't be
intended for common use I guess.
Also added a notice that this should be added after the initial deploy
as you have to register yourself using that feature unless the install
wizard is used.
Nexus increased their default minimum disk space requirement to 4GB:
```
com.orientechnologies.orient.core.exception.OLowDiskSpaceException: Error occurred while executing a
write operation to database 'OSystem' due to limited free space on the disk (1823 MB). The database
is now working in read-only mode. Please close the database (or stop OrientDB), make room on your hard
drive and then reopen the database. The minimal required space is 4096 MB. Required space is now
set to 4096MB (you can change it by setting parameter storage.diskCache.diskFreeSpaceLimit) .
server# [ 72.560866] zqnav3mg7m6ixvdcacgj7p5ibijpibx5-unit-script-nexus-start[627]: DB name="OSystem"
```
Including the rest on the VM 8GB should be the most suitable solution.
As the installer test also takes 8GB of disk size this should still be
in an acceptable range.
According to systemd-nspawn(1), --network-bridge implies --network-veth,
and --port option is supported only when private networking is enabled.
Fixes #52417.
Introduces the option security.protectKernelImage that is intended to control
various mitigations to protect the integrity of the running kernel
image (i.e., prevent replacing it without rebooting).
This makes sense as a dedicated module as it is otherwise somewhat difficult
to override for hardened profile users who want e.g., hibernation to work.
Although the package itself builds fine, the module fails because it
tries to log into a non-existant file in `/var/log` which breaks the
service. Patching to default config to log to stdout by default fixes
the issue. Additionally this is the better solution as NixOS heavily
relies on systemd (and thus journald) for logging.
Also, the runtime relies on `/etc/localtime` to start, as it's not
required by the module system we set UTC as sensitive default when using
the module.
To ensure that the service's basic functionality is available, a simple
NixOS test has been added.
pkgs.owncloud still pointed to owncloud 7.0.15 (from May 13 2016)
Last owncloud server update in nixpkgs was in Jun 2016.
At the same time Nextcloud forked away from it, indicating users
switched over to that.
cc @matej (original maintainer)
The intention of the previous change was to move krb5-config to .dev (it
gives the locations of headers), but it grabbed all of the user-facing
binaries too. This puts them back.
Allow switching out kerberos server implementation.
Sharing config is probably sensible, but implementation is different enough to
be worth splitting into two files. Not sure this is the correct way to split an
implementation, but it works for now.
Uses the switch from config.krb5 to select implementation.
They consistently fail since openjdk bump with some out-of-space errors.
That's not a problem by itself, but each test instance ties a build slot
for many hours and consequently they also delay channels as those wait
for all builds to finish.
Feel free to re-enable when fixed, of course.
The test now runs wayland, which means we can no longer use X11 style testing.
Instead we get gnome shell to execute javascript through its dbus interface.
Since 83b27f60ce, the tests were moved
into all-tests.nix and some of the tooling has changed so that
subattributes of test expressions are now recursively evaluated until a
derivation with a .test attribute has been found.
Unfortunately this isn't the case for all of the tests and the
runInMachine doesn't use the makeTest function other tests are using but
instead uses runInMachine, which doesn't generate a .test attribute.
Whener a .test attribute wasn't found by the new handleTest function, it
recurses down again until there is no value left that is an attribute
set and subsequently returns its unchanged value. This however has the
drawback that instead of getting different attributes for each
architecture we only get the last architecture in the supportedSystems
list.
In the case of the release.nix, the last architecture in
supportedSystems is "aarch64-linux", so the runInMachine test is always
built on that architecture.
In order to work around this, I changed runInMachine to emit a .test
attribute so that it looks to handleTest like it was a test created via
makeTest.
Signed-off-by: aszlig <aszlig@nix.build>
Docker images used to be, essentially, a linked list of layers. Each
layer would have a tarball and a json document pointing to its parent,
and the image pointed to the top layer:
imageA ----> layerA
|
v
layerB
|
v
layerC
The current image spec changed this format to where the Image defined
the order and set of layers:
imageA ---> layerA
|--> layerB
`--> layerC
For backwards compatibility, docker produces images which follow both
specs: layers point to parents, and images also point to the entire
list:
imageA ---> layerA
| |
| v
|--> layerB
| |
| v
`--> layerC
This is nice for tooling which supported the older version and never
updated to support the newer format.
Our `buildImage` code only supported the old version, so in order for
`buildImage` to properly generate an image based on another image
with `fromImage`, the parent image's layers must fully support the old
mechanism.
This is not a problem in general, but is a problem with
`buildLayeredImage`.
`buildLayeredImage` creates images with newer image spec, because
individual store paths don't have a guaranteed parent layer. Including
a specific parent ID in the layer's json makes the output less likely
to cache hit when published or pulled.
This means until now, `buildLayeredImage` could not be the input to
`buildImage`.
The changes in this PR change `buildImage` to only use the layer's
manifest when locating parent IDs. This does break buildImage on
extremely old Docker images, though I do wonder how many of these
exist.
This work has been sponsored by Target.
GitLab 11.5.1 dropped the dependency to posix_spawn, which is broken on
32bit. (See https://gitlab.com/gitlab-org/gitlab-ce/issues/53525)
The only part missing is decreasing virtualisation.memorySize to
something that a 32 bit qemu still executes.
The maximum seems to be 2047, and tests passed with that value for me.
This also includes a full end-to-end CockroachDB clustering test to
ensure everything basically works. However, this test is not currently
enabled by default, though it can be run manually. See the included
comments in the test for more information.
Closes #51306. Closes #38665.
Co-authored-by: Austin Seipp <aseipp@pobox.com>
Signed-off-by: Austin Seipp <aseipp@pobox.com>