Setting nixosVersion to something custom is useful for meaningful GRUB
menus and /nix/store paths, but actuallly changing it rebulids the
whole system path (because of `nixos-version` script and manual
pages). Also, changing it is not a particularly good idea because you
can then be differentitated from other NixOS users by a lot of
programs that read /etc/os-release.
This patch introduces an alternative option that does all you want
from nixosVersion, but rebuilds only the very top system level and
/etc while using your label in the names of system /nix/store paths,
GRUB and other boot loaders' menus, getty greetings and so on.
This hopefully fixes intermittent initrd failures where udevd cannot
create a Unix domain socket:
machine# running udev...
machine# error getting socket: Address family not supported by protocol
machine# error initializing udev control socket
machine# error getting socket: Address family not supported by protocol
The "unix" kernel module is supposed to be loaded automatically, and
clearly that works most of the time, but maybe there is a race
somewhere. In any case, no sane person would run a kernel without Unix
domain sockets, so we may as well make it builtin.
http://hydra.nixos.org/build/30001448
Two concurrent tarsnap backups cannot be run at the same time with the
same keys - completely separate sets of keys must be generated for each
archive in this case, if you want backups to overlap.
This extends the archives attrset to support a 'keyfile' option, which
defaults to /root/tarsnap.key like the top-level attribute.
With this change, if you generate two keys with tarsnap-keygen(1) and
use each of those separately for each archive, you can backup
concurrently.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Tarsnap locks the cachedir during backup, meaning if you specify
multiple backups with a shared cache that might overlap (for example,
one backup may take an hour), secondary backups will fail. This isn't
very nice behavior for the obvious reasons.
This splits the cache dirs for each archive appropriately. Note that
this will require a rebuild of your archive caches (although if you were
only using one archive for your whole system, you can just move the
directory).
Signed-off-by: Austin Seipp <aseipp@pobox.com>
A machine may not always be active (or online!) when a backup timer
triggers, meaning backups can be missed - now we properly set the
tarsnap timer's Persistent option so systemd will run the command even
when the machine wasn't online at that exact time.
However, we also need to make sure that we can contact the tarsnap
server reliably before we start the backup. So, we attempt to ping the
access endpoint in a loop with a sleep, before continuing.
This fixes#8823.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Tarsnap locks the cachedir during backup, meaning if you specify
multiple backups with a shared cache that might overlap (for example,
one backup may take an hour), secondary backups will fail. This isn't
very nice behavior for the obvious reasons.
This splits the cache dirs for each archive appropriately. Note that
this will require a rebuild of your archive caches (although if you were
only using one archive for your whole system, you can just move the
directory).
Signed-off-by: Austin Seipp <aseipp@pobox.com>
The Bitmessage protocol v3 became mandatory on 16 Nov 2014 and notbit does not support it, nor has there been any activity in the project repository since then.
Part of the way towards #11864. We still don't have the auditd
userland logging daemon, but journald also tracks audit logs so we
can already use this.
This hopefully fixes intermittent test failures like
http://hydra.nixos.org/build/29962437
router# [ 240.128835] INFO: task mke2fs:99 blocked for more than 120 seconds.
router# [ 240.130135] Not tainted 3.18.25 #1-NixOS
router# [ 240.131110] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
assuming that these are caused by high load on the host.
... because we make it built-in by default.
I can't imagine anyone who wanted to purge this module from his/her system,
so let's keep it simple, at least for now.
This commit adds the options --build-host and --target-host to nixos-rebuild.
--build-host instructs nixos-rebuild to perform all nix builds on the
specified host (via ssh). Build results are then copied back to the
local machine and used when activating the system.
--build-target instructs nixos-rebuild to activate the configuration
not on the local machine but on the specified remote host. Build
results are copied to the target machine and then activated there (via ssh).
It is possible to combine the usage of --build-host and --target-host,
in which case you can perform the build on one remote machine and deploy
the configuration to another remote machine. The only requirement is that
the build host has a working ssh connection to the target host (if the
target is not local), and that the local machine can connect to both
the target and the build host. Also, your user must be allowed to copy
nix closures between the local machine and the target and host machines.
At no point in time are the configuration sources (the nix files) copied
anywhere. Instead, nix evaluation always happens locally
(with nix-instantiate). The drv-file is then copied and realised remotely
(with nix-store).
As a convenience, if only --target-host is specified, --build-host is
implicitly set to that host too. So if you want to build locally and deploy
remotely you have to explicitly set "--build-host localhost".
To activate (test, boot or switch) you need to have root access to the
target host. You can specify this by "--target-host root@myhost".
I have tested the obvious scenarios and they are working. Some of the
combinations of --build-host and --target-host and the various actions might
not make much sense, and should maybe be forbidden (like setting a remote
target host when building a VM), and some combinations might not work at all.
Previously this barfed with:
updating GRUB 2 menu...
fileparse(): need a valid pathname at /nix/store/zldbbngl0f8g5iv4rslygxwp0dbg1624-install-grub.pl line 391.
warning: error(s) occured while switching to the new configuration
* Patched fish to load /etc/fish/config.fish if it exists (by default,
it only loads config relative to itself)
* Added fish-foreign-env package to parse the system environment
closes#5331
We seem to be in an unfortunate situation: booting without 'nomodeset'
causes hangs when booting on some NVIDIA cards (6948c3ab80), but on the
other hand adding 'nomodeset' prevents X from starting on other hardware
(e.g. issue #10381 and my Thinkpad X250 with an integrated Broadwell GPU).
Attempt to remedy this situation a bit by adding a separate entry in the
ISOLINUX menu (with the non-'nomodeset' being the default).
The docker module used different code for socket-activated docker daemon than for the non-socket activated daemon.
In particular, if the socket-activated daemon is used, then modprobe wasn't set up to be usable and in PATH for
the docker daemon, which resulted in a failure to start the daemon with overlayfs as storageDriver if the
`overlay` kernel module wasn't already loaded. This commit fixes that bug (which only appears if socket
activation is used), and also reduces the duplication between code paths so that it's easier to keep
both in sync in future.
I think the name 'listenAddress' is more descriptive. Other NixOS
modules that define 'host' either use it as listen address or as address
a client connects to. listenAddress is unambiguous.
The addition of 'host' was added earlier today[1], so not bothering with
./nixos/modules/rename.nix.
[1]: 44ea184997 ("jenkins ci enhancement: add port and prefix option")
As named these options enable to specify a bind host and url prefix
to be used by jenkins. Adding these options in the config rather than
using extra arguments allows us to re-use those information in other
services using jenkins such as jenkins-job-builder or a reverse proxy.
Add new option declarations to control what information is published
by the avahi daemon. The default values are chosen to respect the
privacy of the user over the connectivity of the system.
The kernel default for `link_power_management_policy` is `"max_performance"`.
This commit:
f169f60575
set the NixOS default to `"min_performance"`.
This issue (https://github.com/NixOS/nixpkgs/issues/11276) details my long
journey to discover this after several file system failures incorrectly
attributed to `TRIM` and `NCQ` settings.
I think we should use the kernel default of `"max_performance"` to assure
the best experience for new users with SSDs and to conform to the defaults of
the kernel and other distros.
The three KDE package sets now have circular dependencies between them,
so they can only be built if they are merged into a single package set
during evaluation.
- if xserver.tty and/or display are set to null, then don't specify
them, or the -logfile argument in the xserverArgs
- For lightdm, we set default tty and display to null and we determine
those at runtime based on arguments passed. This is necessary because
we run multiple X servers so they can't all be on the same display
This reverts commit 02b568414d.
With a5bc11f and 6353f58 in place, we really don't need this anymore.
After running about 500 VM tests on my Hydra, it still didn't improve
very much.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
As @domenkozar noted in #10828, cache=writeback seems to do more harm
than good:
https://github.com/NixOS/nixpkgs/issues/10828#issuecomment-164426821
He has tested it using the openstack NixOS tests and found that
cache=none significantly improves startup performance.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This seems to be the root cause of the random page allocation failures
and @wizeman did a very good job on not only finding the root problem
but also giving a detailed explanation of it in #10828.
Here is an excerpt:
The problem here is that the kernel is trying to allocate a contiguous
section of 2^7=128 pages, which is 512 KB. This is way too much:
kernel pages tend to get fragmented over time and kernel developers
often go to great lengths to try allocating at most only 1 contiguous
page at a time whenever they can.
From the error message, it looks like the culprit is unionfs, but this
is misleading: unionfs is the name of the userspace process that was
running when the system ran out of memory, but it wasn't unionfs who
was allocating the memory: it was the kernel; specifically it was the
v9fs_dir_readdir_dotl() function, which is the code for handling the
readdir() function in the 9p filesystem (the filesystem that is used
to share a directory structure between a qemu host and its VM).
If you look at the code, here's what it's doing at the moment it tries
to allocate memory:
buflen = fid->clnt->msize - P9_IOHDRSZ;
rdir = v9fs_alloc_rdir_buf(file, buflen);
If you look into v9fs_alloc_rdir_buf(), you will see that it will try
to allocate a contiguous buffer of memory (using kzalloc(), which is a
wrapper around kmalloc()) of size buflen + 8 bytes or so.
So in reality, this code actually allocates a buffer of size
proportional to fid->clnt->msize. What is this msize? If you follow
the definition of the structures, you will see that it's the
negotiated buffer transfer size between 9p client and 9p server. On
the client side, it can be controlled with the msize mount option.
What this all means is that, the reason for running out of memory is
that the code (which we can't easily change) tries to allocate a
contiguous buffer of size more or less equal to "negotiated 9p
protocol buffer size", which seems to be way too big (in our NixOS
tests, at least).
After that initial finding, @lethalman tested the gnome3 gdm test
without setting the msize parameter at all and it seems to have resolved
the problem.
The reason why I'm committing this without testing against all of the
NixOS VM test is basically that I think we can only go better but not
worse than the current state.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
- Add new service for `clamd`, the ClamAV daemon.
- Replace the old upstart "jobs" section with systemd.services
- Remove unnecessary config options.
- Use `mkEnableOption`
We hit page allocation failures a lot at random for VM tests, in case of
my own Hydra when it comes to the installer tests. The reason for this
is that once the memory of the VM gets heavily fragmented the kernel is
unable to allocate new pages.
Setting vm.min_free_kbytes to 16MB forces the kernel to keep a minimum
of 16 MB free.
I've done some testing accross repeated runs of the installer tests with
and without vm.min_free_kbytes set. So accross 30 test runs for each
settings, all of the tests with the option being set passed while 14
tests without that sysctl option triggered page allocation failures.
Sure, running 30 tests is not a guarantee that 16MB is enough, but we'll
see how it turns out in the long run across all VM tests.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>