Previously, netdev units for network interfaces defined in the nixos
configurations would bindTo the systemd device unit of the interface if
not in a container.
In situations where you switch to a new nixos configration with changes
to network-setup.service (like nameservers) and have stacked interfaces
like vlans on a bond, it would fail to propagate restarts to the netdevs
correctly resulting with broken networking. The bond would be present
but no vlan interfaces rendering the machine unreachable.
My fear is that the udev events fail to propagate correctly while a systemd
transaction that is also restarting the triggered netdev service is running.
This commit changes this behaviour so netdev services bindTo other netdev
services if present and otherwise fall back to the previous behaviour.
We also noticed that stacked interfaces would sometimes seemingly be stopped
in the wrong order. For instance in the above example, the bond interface
would be deleted before the vlan interfaces resulting in the vlan interfaces
not being present when their service is being stopped. This would cause the
systemd transaction to fail and thus break networking. Their postStop hooks
are now allowed to fail as we have reached the desired state.
The GeoIP databases from MaxMind have no stable URLs and change every
month (or so). Our current method of packaging these database in Nix and
playing catch-up with ever-changing file hashes is a bad idea. For
instance, it makes it impossible to realize old NixOS configurations.
This patch adds a NixOS service that periodically updates the GeoIP
databases in /var/lib/geoip-databases. Moving NixOS modules over can be
done in later patches.
I tried adding MD5 check, but not all databases have them, so i skipped
it. We are downloading over HTTPS though, it should be good. I also
tried adding zip support, but the first zip file I extracted had a
different filename inside than the archive name, which breaks an
assumption in this service, so I skipped that too.
Changes v9 -> v10:
- Pass "--max-time" to curl to set upper bound on downloads (ensures
no indefinite hanging if there's problem with networking).
Timeout for network connectivity check: 60s.
Timeout for geoip database (each): 15m.
Changes v8 -> v9:
- Mention the random timer delay in the documentation for the
'interval' option.
Changes v7 -> v8:
- Add "RemainAfterExit=true" for the setup service, so it won't be
restarted needlessly. (Thanks @danbst!)
Changes v6 -> v7:
- Add --skip-existing flag to geoip-updater, which skips updating
existing database files. Pass that flag when we run the service on
boot (and on any NixOS configuration change).
(IMHO, this is somewhat a workaround for systemd persistent timers
not being triggered immediately when a timer has never expired
before. But it does have the nice side effect of ensuring that the
installed databases always correspond to the configured ones, since
the service is now always run after configuration changes.)
Changes v5 -> v6:
- Update database files atomically (per DB)
- If a database is removed from the configuration, it'll be removed
from /var/lib/geoip-databases too (on next run).
- Add NixOS module assertion so that if user inputs non- .gz or .xz
file there will be a build time error instead of runtime.
- Run updater as user "nobody" instead of "root".
- Rename NixOS service from "geoip-databases" to "geoip-updater".
- Drop RemainAfterExit, or else the timer won't trigger the unit.
- Bring back "curl --fail", or else we won't catch and log curl
failures.
Changes v4 -> v5:
- Add "GeoLite2-City.mmdb.gz" to default database list.
Changes v3 -> v4:
- Remove unneeded geoip-updater-setup.service after adding
'wantedBy = [ "multi-user.target" ]' directly to
geoip-updater.service
- Drop unneeded "Service" name from service descriptions.
Changes v2 -> v3:
- Network may be down when starting from a cold boot, so try a few
times. Possibly, if using systemd-networkd, it'll pass on the first
try. But with default DHCP on NixOS, the service is started before
hostnames can be resolved and thus we need a few extra seconds.
- Add error handling and mark service as failed if fatal error.
- Add proper syslog log levels.
- Add RandomizedDelaySec=3600 to the timer to not put high load on the
MaxMind servers. Suggested by @Mic92.
- Set RemainAfterExit on geoip-updater.service instead of
geoip-updater-setup.service. (The latter is only a proxy that pulls
in the former service).
Changes v1 -> v2:
From Данило Глинський (Danylo Hlynskyi) <abcz2.uprola@gmail.com>:
nixos/geoip-databases: add `databases` option and fix initial setup
There were two great issues when using this service:
- When you just enable service, databases aren't downloaded, they are
downloaded when timer triggers. Fixed this with automatic download on
first system activation.
- When there is no internet, updater outputs nothing to logs, which is
IMO misbehavior. Fixed this with removing `--fail` option, better be
explicit here.
The Raspberry Pi boot loader was deleting all xx-initrd text files
(which simply contain the path to the actual initrd files) just after
having created them. The code was actually trying to delete real,
obsolete initrd files, which are named <hash>-initrd-initrd (after path
cleaning), but the glob was catching the other files as well.
Turns out all variants of start.elf and fixup.dat are needed (depending
on what's in config.txt). I was under the mistaken impression that you
were supposed to rename one of the variants to switch using them, but
nope.
A very simple skeleton for now that doesn't attempt to model any of
the agent configuration, but we can grow it later. Tested and works
on an EC2 instance with ECS.
Recent versions of libreswan seem to omit this file, but it may be added/changed in the future. It is silly to have the service fail because a file is missing that only enriches the environment.
This fixes an issue where `nixops deploy` wouldn't restart the chrony
service when the chrony configuration changed, because it wouldn't
detect that `/etc/chrony.conf` was a dependency of the chrony service.
The tests have failed because Chromium has started up displaying the
following error message in a dialog window:
Chromium can not be run as root.
Please start Chromium as a normal user. If you need to run as root for
development, rerun with the --no-sandbox flag.
So let's run as user "alice" and pass all commands using the small
helper function "ru" (to keep it short, it's for "Run as User").
Tested it by running the "stable" test on x86_64-linux.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Reported-by: @globin
Regression introduced by 0cb487ee04.
This changed the result for defaultGateway to be a submodule instead of
just a plain string, so instead of using just cfg.defaultGateway we need
to pass cfg.defaultGateway.address now.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Cc: @abbradar
From Postfix documentation:
With this setting, the Postfix SMTP server will not reject mail with "User
unknown in local recipient table". Don't do this on systems that receive mail
directly from the Internet. With today's worms and viruses, Postfix will become
a backscatter source: it accepts mail for non-existent recipients and then
tries to return that mail as "undeliverable" to the often forged sender
address.
20e81f7c0d prevented key generation in
`preStart`, leaving the service broken for the case where the user has
no pre-existing key.
Eventually, we ought to store the state elsewhere so that `/etc` can be
read-only but for now we fix this the easy way.