Unfortunately, pkill doesn't distinguish between kernel and user space
processes, so we need to make sure we don't accidentally kill kernel
threads.
Normally, a kernel thread ignores all signals, but there are a few that
do. A quick grep on the kernel source tree (as of kernel 4.6.0) shows
the following source files which use allow_signal():
drivers/isdn/mISDN/l1oip_core.c
drivers/md/md.c
drivers/misc/mic/cosm/cosm_scif_server.c
drivers/misc/mic/cosm_client/cosm_scif_client.c
drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
drivers/staging/rtl8188eu/core/rtw_cmd.c
drivers/staging/rtl8712/rtl8712_cmd.c
drivers/target/iscsi/iscsi_target.c
drivers/target/iscsi/iscsi_target_login.c
drivers/target/iscsi/iscsi_target_nego.c
drivers/usb/atm/usbatm.c
drivers/usb/gadget/function/f_mass_storage.c
fs/jffs2/background.c
fs/lockd/clntlock.c
fs/lockd/svc.c
fs/nfs/nfs4state.c
fs/nfsd/nfssvc.c
While not all of these are necessarily kthreads and some functionality
may still be unimpeded, it's still quite harmful and can cause
unexpected side-effects, especially because some of these kthreads are
storage-related (which we obviously don't want to kill during bootup).
During discussion at #15226, @dezgeg suggested the following
implementation:
for pid in $(pgrep -v -f '@'); do
if [ "$(cat /proc/$pid/cmdline)" != "" ]; then
kill -9 "$pid"
fi
done
This has a few downsides:
* User space processes which use an empty string in their command line
won't be killed.
* It results in errors during bootup because some shell-related
processes are already terminated (maybe it's pgrep itself, haven't
checked).
* The @ is searched within the full command line, not just at the
beginning of the string. Of course, we already had this until now, so
it's not a problem of his implementation.
I posted an alternative implementation which doesn't suffer from the
first point, but even that one wasn't sufficient:
for pid in $(pgrep -v -f '^@'); do
readlink "/proc/$pid/exe" &> /dev/null || continue
echo "$pid"
done | xargs kill -9
This one spawns a subshell, which would be included in the processes to
kill and actually kills itself during the process.
So what we have now is even checking whether the shell process itself is
in the list to kill and avoids killing it just to be sure.
Also, we don't spawn a subshell anymore and use /proc/$pid/exe to
distinguish between user space and kernel processes like in the comments
of the following StackOverflow answer:
http://stackoverflow.com/a/12231039
We don't need to take care of terminating processes, because what we
actually want IS to terminate the processes.
The only point where this (and any previous) approach falls short if we
have processes that act like fork bombs, because they might spawn
additional processes between the pgrep and the killing. We can only
address this with process/control groups and this still won't save us
because the root user can escape from that as well.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Fixes: #15226
A user noticed the example for `hosts`, took the `mode` permissions literally, and ended up with surprising behavior on their system. Updating the documentation to not reference a real config file which might have real permissions requirements.
Continuation of 79c3c16dcb. Systemd 229
sets the default RLIMIT_CORE to infinity, causing systems to be
littered with core dumps when systemd.coredump.enable is disabled.
This restores the 15.09 soft limit of 0 and hard limit of infinity.
This reverts commit 45c218f893.
Busybox's modprobe causes numerous "Unknown symbol" errors in the
kernel log, even though the modules do appear to load correctly.
Systemd 229 sets kernel.core_pattern to "|/bin/false" by default,
unless systemd-coredump is enabled. Revert back to the default of
writing "core" in the current directory.
Also add required systemd services for starting/stopping mdmon.
Closes #13447.
abbradar: fixed `mdadmShutdown` service name according to de facto conventions.
Since we don't restart sysinit.service in switch-to-configuration, this
additionally overrides systemd-binfmt.service to depend on
proc-sys-fs-binfmt_misc.automount, which is normally provided by
sysinit.service.
- Enforce that an option declaration has a "defaultText" if and only if the
type of the option derives from "package", "packageSet" or "nixpkgsConfig"
and if a "default" attribute is defined.
- Enforce that the value of the "example" attribute is wrapped with "literalExample"
if the type of the option derives from "package", "packageSet" or "nixpkgsConfig".
- Warn if a "defaultText" is defined in an option declaration if the type of
the option does not derive from "package", "packageSet" or "nixpkgsConfig".
- Warn if no "type" is defined in an option declaration.
This reverts commit b861bf8ddf, because according to @mdorman [1] this
change rendered his NixOS systems unbootable, and we probably don't want that.
[1] b861bf8ddf (commitcomment-16058598)
Allow usage of list of strings instead of a comma-separated string
for filesystem options. Deprecate the comma-separated string style
with a warning message; convert this to a hard error after 16.09.
15.09 was just released, so this provides a deprecation period during
the 16.03 release.
closes #10518
Signed-off-by: Robin Gloster <mail@glob.in>
It's not entirely clear why this happens, but sometimes /proc/1/exe
returns a bogus value, like
/ar3a3j6b9livhy5fcfv69izslhgk4gcz-systemd-217/lib/systemd/systemd. In
any case, we can just conservatively assume that we need to restart
systemd when this happens.
Fixes #10261.
Fixes references coming from the mdadm udev rules.
This addresses #12722 (mdadm udev rules have references to /usr/bin) but
still won't fix the warning, though (if we want to fix the warnings, we
will have to patch the udev rules generater in services/hardware/udev).
For common mdraid functionality, this shouldn't fix anything, because
the wrong references seem to only apply to containers, see these
(wrapped) lines from ${mdadm}/lib/udev/rules.d/63-md-raid-arrays.rules:
# Tell systemd to run mdmon for our container, if we need it.
ENV{MD_LEVEL}=="raid[1-9]*",
ENV{MD_CONTAINER}=="?*",
PROGRAM="/usr/bin/readlink $env{MD_CONTAINER}",
ENV{MD_MON_THIS}="%c"
ENV{MD_MON_THIS}=="?*",
PROGRAM="/usr/bin/basename $env{MD_MON_THIS}",
ENV{SYSTEMD_WANTS}+="mdmon@%c.service"
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Partially reverts commit 901163c0c7.
This has broken remote SSH into initrd because ${cfg.shell} is not
expanded. Also, nsswitch is useless without libnss_files.so which
are installed by initrd-ssh.
- add missing types in module definitions
- add missing 'defaultText' in module definitions
- wrap example with 'literalExample' where necessary in module definitions
Setting nixosVersion to something custom is useful for meaningful GRUB
menus and /nix/store paths, but actuallly changing it rebulids the
whole system path (because of `nixos-version` script and manual
pages). Also, changing it is not a particularly good idea because you
can then be differentitated from other NixOS users by a lot of
programs that read /etc/os-release.
This patch introduces an alternative option that does all you want
from nixosVersion, but rebuilds only the very top system level and
/etc while using your label in the names of system /nix/store paths,
GRUB and other boot loaders' menus, getty greetings and so on.
This hopefully fixes intermittent initrd failures where udevd cannot
create a Unix domain socket:
machine# running udev...
machine# error getting socket: Address family not supported by protocol
machine# error initializing udev control socket
machine# error getting socket: Address family not supported by protocol
The "unix" kernel module is supposed to be loaded automatically, and
clearly that works most of the time, but maybe there is a race
somewhere. In any case, no sane person would run a kernel without Unix
domain sockets, so we may as well make it builtin.
http://hydra.nixos.org/build/30001448
This hopefully fixes intermittent test failures like
http://hydra.nixos.org/build/29962437
router# [ 240.128835] INFO: task mke2fs:99 blocked for more than 120 seconds.
router# [ 240.130135] Not tainted 3.18.25 #1-NixOS
router# [ 240.131110] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
assuming that these are caused by high load on the host.
... because we make it built-in by default.
I can't imagine anyone who wanted to purge this module from his/her system,
so let's keep it simple, at least for now.
Previously this barfed with:
updating GRUB 2 menu...
fileparse(): need a valid pathname at /nix/store/zldbbngl0f8g5iv4rslygxwp0dbg1624-install-grub.pl line 391.
warning: error(s) occured while switching to the new configuration
The most complex problems were from dealing with switches reverted in
the meantime (gcc5, gmp6, ncurses6).
It's likely that darwin is (still) broken nontrivially.
Clearly it would be the best if we'd directly generate mount units
instead of converting /etc/fstab. But in order to do that we need to
test it throughly so this approach is for the next stable release.
This fix however is intended for inclusion into release-14.12 and
release-15.09.
Using a simple regular expression unfortunately isn't sufficient for
proper mount unit name quoting/escaping and there is a utility in
systemd called systemd-escape which does nothing less than that.
Of course, using an external program to escape the unit name is way more
expensive and causes us to fork for each mount point.
But given that we already do quite a lot of forks just for unit starting
and stopping, I think it doesn't matter that much. Well, except if you
have a whole bunch of mount points.
However, if the latter is the case and you have thousands of mount
points, you probably have stumbled over this already if your mount point
contains a dash.
As for my motivation to fix this: I've stumbled on this while trying to
fix the "none" backend test for NixOps (see NixOS/nixops#350), where the
target machines use /nix/.ro-store and /nix/.rw-store as mount points.
The implementation we had so far did improperly escape it so those mount
points got the following unit files:
* nix-.ro-store.mount
* nix-.rw-store.mount
The correct names for these units are however:
* nix-.ro\x2dstore.mount
* nix-.rw\x2dstore.mount
So using systemd-escape now properly generates these names.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
mmc_block and sdhci_acpi are both necessary for a Bay Trail Chromebook with an
internal eMMC drive. The sdhci_acpi module is detectable but I can not figure
out a way to check whether the mmc_block module is needed by just looking at
/sys/
Option aliases/deprecations can now be declared in any NixOS module,
not just in nixos/modules/rename.nix. This is more modular (since it
allows for example grub-related aliases to be declared in the grub
module), and allows aliases outside of NixOS (e.g. in NixOps modules).
The syntax is a bit funky. Ideally we'd have something like:
options = {
foo.bar.newOption = mkOption { ... };
foo.bar.oldOption = mkAliasOption [ "foo" "bar" "newOption" ];
};
but that's not possible because options cannot define values in
*other* options - you need to have a "config" for that. So instead we
have functions that return a *module*: mkRemovedOptionModule,
mkRenamedOptionModule and mkAliasOptionModule. These can be used via
"imports", e.g.
imports = [
(mkAliasOptionModule [ "foo" "bar" "oldOption" ] [ "foo" "bar" "newOption" ]);
];
As an added bonus, deprecation warnings now show the file name of the
offending module.
Fixes #10385.
- systemd puts all into one output now (except for man),
because I wasn't able to fix all systemd/udev refernces
for NixOS to work well
- libudev is now by default *copied* into another path,
which is what most packages will use as build input :-)
- pkgs.udev = [ libudev.out libudev.dev ]; because there are too many
references that just put `udev` into build inputs (to rewrite them all),
also this made "${udev}/foo" fail at *evaluation* time
so it's easier to catch and change to something more specific
This way the binary gets stripped & rpath-shrinked etc. as usual.
We'd seem to get a runtime reference to gcc otherwise.
TODO: Maybe we should be able to set e.g. 'dontUnpack = true;'
to make this more pretty.
Adding the configuration option 'systemd.generators' to
specify systemd system-generators. The option allows to
either add new system-generators to systemd, or to over-
ride or disable the system-generators provided by systemd.
Internally, the configuration option 'systemd.generators'
maps onto the 'environment.etc' configuration option.
Having a convenience wrapper around 'environment.etc' helps
to group the systemd system-generator configuration more
easily with other 'systemd...' configurations.
$d should be $sd, this causes resume from hibernate to fail if
resumeDevice is not explicitly set in config. Introduced in commit:
'stage-1: Shut up warnings about swap devices that don't exist yet'
(cherry picked from commit 2a31397f53)
$d should be $sd, this causes resume from hibernate to fail if
resumeDevice is not explicitly set in config. Introduced in commit:
'stage-1: Shut up warnings about swap devices that don't exist yet'
When using extlinux-conf-builder in a nix build using chroots, the
following error message could be seen:
/nix/store/XXX-extlinux-conf-builder.sh: line 121: cd: /nix/var/nix/profiles: No such file or directory
To avoid this, just skip the code path parsing /nix/var/nix/profiles
when $numGenerations (passed from the command line) is 0 (which is the
only legal value of $numGenerations in a nix build context).
Without a menu title, U-Boot's distro scripts just autoboot the first
entry by default.
When I initially wrote this, my board wasn't apparently running stock
U-Boot but had some local hacks saved in the U-Boot's environment
which made it always display the prompt.
When calling addEntry inside a subshell, the filesCopied array would
be updated only in the subshell's environment. This would only cause an
issue if no -g flag was passed to the script, causing no kernels
to be copied.
This fixes a failing assert in systemd-timesyncd (issue #5913) as it
expects the directory /run/systemd/netif/links/ to exist, and nothing in
NixOS currently creates it.
Also we get a net reduction in our code as rules for /run/utmp and
/var/log/journal are also provided by the same upstream file.
(cherry picked from commit a278a9224a)
This shuts up this error from dbus:
May 11 13:52:16 machine dbus-daemon[259]: Unknown username "systemd-network" in message bus configuration file
May 11 13:52:16 machine dbus-daemon[259]: Unknown username "systemd-resolve" in message bus configuration file
which happens because the D-Bus config for networkd/resolved is
enabled unconditionally, and we don't have an easy way to turn it off.
(cherry picked from commit f19b58fb6a)
Some filesystems like fat32 don't support symlinking and need to be
supported on /boot as an efi system partition. Instead of creating the symlink directly in boot, create the symlink in
a temporary directory which has to support symlinking.
This solves the problem that modprobe does not know about $MODULE_DIR
when run via sudo, and instead wrongly tries to read /lib/modules/:
$ sudo strace -efile modprobe foo |& grep modules
open("/lib/modules/3.14.37/modules.softdep", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/modules/3.14.37/modules.dep.bin", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/modules/3.14.37/modules.dep.bin", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/modules/3.14.37/modules.alias.bin", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
Without this patch, one would have to use sudo -E (preserves environment
vars). But that option is reserved for sudo users with extra rights
(SETENV), so it's not a solution.
environment.sessionVariables are set by PAM, so they are included in the
environment used by sudo.
This fixes a failing assert in systemd-timesyncd (issue #5913) as it
expects the directory /run/systemd/netif/links/ to exist, and nothing in
NixOS currently creates it.
Also we get a net reduction in our code as rules for /run/utmp and
/var/log/journal are also provided by the same upstream file.
This module generates a /boot/extlinux/extlinux.conf bootloader
configuration file that is supported by e.g. U-Boot:
http://git.denx.de/?p=u-boot.git;a=blob;f=doc/README.distro;hb=refs/heads/master
With this, all ARM boards supported by U-Boot can be booted in a common
way (a single boot file generator, all boards booting via initrd like
x86) and with same boot menu functionality as GRUB has.
-- sample extlinux.conf file --
# Generated file, all changes will be lost on nixos-rebuild!
# Change this to e.g. nixos-42 to temporarily boot to an older configuration.
DEFAULT nixos-default
TIMEOUT 50
LABEL nixos-default
MENU LABEL NixOS - Default
LINUX ../nixos/n7vxfk60nb5h0mcbhkwwxhcz2q2nvxzv-linux-4.1.0-rc3-cpufreq-zImage
INITRD ../nixos/0ss2zs8sb6d1qn4gblxpwlxkfjsgs5f0-initrd-initrd
FDTDIR ../nixos/n7vxfk60nb5h0mcbhkwwxhcz2q2nvxzv-linux-4.1.0-rc3-cpufreq-dtbs
APPEND systemConfig=/nix/store/469qvr43ln8bfsnk5lzcz6m6jfcgdd4r-nixos-15.06.git.0b7a7a6M init=/nix/store/469qvr43ln8bfsnk5lzcz6m6jfcgdd4r-nixos-15.06.git.0b7a7a6M/init loglevel=8 console=ttyS0,115200n8 drm.debug=0xf
LABEL nixos-71
MENU LABEL NixOS - Configuration 71 (2015-05-17 21:32 - 15.06.git.0b7a7a6M)
LINUX ../nixos/n7vxfk60nb5h0mcbhkwwxhcz2q2nvxzv-linux-4.1.0-rc3-cpufreq-zImage
INITRD ../nixos/0ss2zs8sb6d1qn4gblxpwlxkfjsgs5f0-initrd-initrd
FDTDIR ../nixos/n7vxfk60nb5h0mcbhkwwxhcz2q2nvxzv-linux-4.1.0-rc3-cpufreq-dtbs
APPEND systemConfig=/nix/store/469qvr43ln8bfsnk5lzcz6m6jfcgdd4r-nixos-15.06.git.0b7a7a6M init=/nix/store/469qvr43ln8bfsnk5lzcz6m6jfcgdd4r-nixos-15.06.git.0b7a7a6M/init loglevel=8 console=ttyS0,115200n8 drm.debug=0xf
With systemd 219, this is fine because systemd will cause the new
journald to re-use the file descriptors of the old one. So existing
connections to the journal are unaffected.
This shuts up this error from dbus:
May 11 13:52:16 machine dbus-daemon[259]: Unknown username "systemd-network" in message bus configuration file
May 11 13:52:16 machine dbus-daemon[259]: Unknown username "systemd-resolve" in message bus configuration file
which happens because the D-Bus config for networkd/resolved is
enabled unconditionally, and we don't have an easy way to turn it off.
This reverts commit d170c98d13.
niksnut argues that we need smaller system closures, not bigger.
So users facing the trouble of getting gcc rebuilds after nix-collect-garbage
for any minimal nixos configuration change should use other means of
not losing the stdenv output.
One way is to keep one somewhere: nix-build -A stdenv -o stdenv '<nixpkgs>'.
Another may be to use nix.conf options like gc-keep-outputs, gc-keep-derivations
or env-keep-derivations.
This will help a lot on ARM, where nix-collect-garbage erases gcc; then, any
change to a small system config file requires rebuilding gcc again.
I don't know why it does not happen on x86. Maybe it just pulls the gcc from
hydra, if garbage is collected.
It boots, but some things still don't work:
1) Installation of DTBs
2) Boot of initrd
Booting still needs a proper config.txt in /boot, which could probably be
managed by NixOS.
During the refactor of the networkd stuff in f8dbe5f, a lot of the
options are now needed by systemd.nix as well as networkd.nix but
weren't moved by that commit as well.
For now, this fixes all networkd VM tests except for the macvlan one and
thus it should fix #7505 for at least DHCP-based configuration.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
In f8dbe5f, the default value for networking unit "enabled" option
suddenly flipped to false. I have no idea of whether this happened by
accident, but I'm setting it to true again, because it essentially
breaks systemd networking support and we have systemd.network.enable to
have a "turn the world off" switch.
And of course, because the mentioned commit obviously wasn't done with
even a run of the simplest run of one of the network VM tests, we now
get an evaluation error if we switch useNetworkd to true.
Fixes the core issue of #7505.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Otherwise, the enabled -> disabled transition won't be handled
correctly (switch-to-configuration currently assumes that if a unit is
running and exists, it should be restarted).
Many bus clients get hopelessly confused when dbus-daemon is
restarted. So let's not do that.
Of course, this is not ideal either, because we end up stuck with a
possibly outdated dbus-daemon. But that issue will become irrelevant
in the glorious kdbus-based future.
Hopefully this also gets rid of systemd getting stuck after
dbus-daemon is restarted:
Apr 01 15:37:50 mandark systemd[1]: Failed to register match for Disconnected message: Connection timed out
Apr 01 15:37:50 mandark systemd[1]: Looping too fast. Throttling execution a little.
Apr 01 15:37:51 mandark systemd[1]: Looping too fast. Throttling execution a little.
...
Since we restart all active target units (of which there are many),
it's hard to see the units that actually matter. So don't print that
we're starting target units that are already active.
‘nixos-rebuild dry-activate’ builds the new configuration and then
prints what systemd services would be stopped, restarted etc. if the
configuration were actually activated. This could be extended later to
show other activation actions (like uids being deleted).
To prevent confusion, ‘nixos-rebuild dry-run’ has been renamed to
‘nixos-rebuild dry-build’.
The PID 1 shell is executed as the last command in a sh invocation. Some
shells implicitly use exec for that, but the current busybox ash does not,
so the shell gets a wrong PID. Spell out the exec.
When gummiboot.timeout == null, the menu will still be skipped.
When gummiboot.timeout == 0, the menu will also be skipped.
The only way to show the menu 'indefinitely' is to show it a long time.
Now that dbus reload has been moved before restarting units,
the reload may fail if dbus has been stopped before.
The reload-or-restart will reload dbus if it's active,
otherwise start it.
During install, the bootloader script gets run inside a chroot after the
/etc/group bind-mount is unmounted. Since we're not doing any building,
this should be safe, but really nix should just not care if the group
does not exist when no build is needed.
Fixes #5494
The old boot.spl.hostid option was not working correctly due to an
upstream bug.
Instead, now we will create the /etc/hostid file so that all applications
(including the ZFS kernel modules, ZFS user-space applications and other
unrelated programs) pick-up the same system-wide host id. Note that glibc
(and by extension, the `hostid` program) also respect the host id configured in
/etc/hostid, if it exists.
The hostid option is now mandatory when using ZFS because otherwise, ZFS will
require you to force-import your ZFS pools if you want to use them, which is
undesirable because it disables some of the checks that ZFS does to make sure it
is safe to import a ZFS pool.
The /etc/hostid file must also exist when booting the initrd, before the SPL
kernel module is loaded, so that ZFS picks up the hostid correctly.
The complexity in creating the /etc/hostid file is due to having to
write the host ID as a 32-bit binary value, taking into account the
endianness of the machine, while using only shell commands and/or simple
utilities (to avoid exploding the size of the initrd).
The gummiboot-builder.py script is expecting the @timeout@ metavar to be
substituted for either an empty string (in the case where a user has
left the timeout unset) or the actual value set in the system
configuration.
However, the config.boot.loader.gummiboot.timeout option defaults to
'null', and due to the way pkgs.substituteAll works, the substitution
for '@timeout@' is _never_ set to the empty string. This causes the
builder script to put a bogus line into /boot/loader/loader.conf:
timeout @timeout@
Fix this by explicitly setting 'timeout' to the empty string when it's
unset in the system configuration.
Signed-off-by: Josh Cartwright <joshc@eso.teric.us>