The current weekly setting causes every NixOS server to try to renew
its certificate at midnight on the dot on Monday. This contributes to
the general problem of periodic load spikes for Let's Encrypt; NixOS
is probably not a major contributor to that problem, but we can lead by
example by picking good defaults here.
The values here were chosen after consulting with @yuriks, an SRE at
Let's Encrypt:
* Randomize the time certificates are renewed within a 24 hour period.
* Check for renewal every 24 hours, to ensure the certificate is always
renewed before an expiry notice is sent out.
* Increase the AccuracySec (thus lowering the accuracy(!)), so that
systemd can coalesce the renewal with other timers being run.
(You might be worried that this would defeat the purpose of the time
skewing, but systemd is documented as avoiding this by picking a
random time.)
lego already bundles the chain with the certificate,[1] so the current
code, designed for simp_le, was resulting in duplicate certificate
chains, manifesting as "Chain issues: Incorrect order, Extra certs" on
the Qualys SSL Server Test.
cert.pem stays around as a symlink for backwards compatibility.
[1] 5cdc0002e9/acme/api/certificate.go (L40-L44)
Lego allows users to use the DNS-01 challenge to validate their
certificates. It is mostly backwards compatible, with a few
caveats.
- extraDomains can no longer have different webroots to the
main webroot for the cert.
- An email address is now mandatory for account creation
The following other changes were required:
- Deprecate security.acme.certs.<name>.plugins, as this was
specific to simp-le
- Rename security.acme.validMin to validMinDays, to avoid
confusion and errors. Lego requires the TTL to be specified in
days
- Add options to cover DNS challenge (dnsProvider,
credentialsFile, dnsPropagationCheck)
- A shared state directory is now used (/var/lib/acme/.lego)
to avoid account creation rate limits and share credentials
between certs
In 5532065d06, acme was changed to be
RemainAfterExit=true, but `postRun` commands are implemented as
`ExecStopPost`. Systemd now considers the service to be still running
after simp_le is finished, so won't run these commands (e.g. to reload
certificates in a webserver). Change `postRun` to use `ExecStartPost` to
ensure the commands are run in a timely manner.
A centralized list for these renames is not good because:
- It breaks disabledModules for modules that have a rename defined
- Adding/removing renames for a module means having to find them in the
central file
- Merge conflicts due to multiple people editing the central file
Fixes https://github.com/NixOS/nixpkgs/issues/75075.
To summarize the report in the aforementioned issue, at a glance,
it's a different default than what upstream polkit has. Apparently
for 8+ years polkit defaults admin identities as members of
the wheel group [0]. This assumption would be appropriate on NixOS, where
every member of group 'wheel' is necessarily privileged.
[0]: 763faf434b
Change order of pam_mount.conf.xml so that users can override the preset configs.
My use case is to mount a gocryptfs (a fuse program) volume. I can not do that in current order.
Because even if I change the `<fusermount>` and `<fuserumount>` by add below to extraVolumes
```
<fusemount>${pkgs.fuse}/bin/mount.fuse %(VOLUME) %(MNTPT) "%(before=\"-o \" OPTIONS)"</fusemount>
<fuseumount>${pkgs.fuse}/bin/fusermount -u %(MNTPT)</fuseumount>
```
mount.fuse still does not work because it can not find `fusermount`. pam_mount will told stat /bin/fusermount failed.
Fine, I can add a `<path>` section to extraVolumes
```
<path>${pkgs.fuse}/bin:${pkgs.coreutils}/bin:${pkgs.utillinux}/bin</path>
```
but then the `<path>` section is overridden by the hardcoded `<path>${pkgs.utillinux}/bin</path>` below. So it still does not work.
Add a new option permitting to point certbot to an ACME Directory
Resource URI other than Let's Encrypt production/staging one.
In the meantime, we are deprecating the now useless Let's Encrypt
production flag.
Previously setting `allowKeysForGroup = true; group = "foo"` would not
apply the group permission change of the certificates until the service
gets restarted. This commit fixes this by making systemd restart the
service every time it changes.
Note that applying this commit to a system with an already running acme
systemd service doesn't fix this immediately and you still need to wait
for the next refresh (or call `systemctl restart acme-<domain>`). Once
everybody's service has restarted once this should be a problem of the
past.
Let's encrypt bumped ACME to V2. We need to update our nixos test to
be compatible with this new protocol version.
We decided to drop the Boulder ACME server in favor of the more
integration test friendly Pebble.
- overriding cacert not necessary
- this avoids rebuilding lots of packages needlessly
- nixos/tests/acme: use pebble's ca for client tests
- pebble always generates its own ca which has to be fetched
TODO: write proper commit msg :)
Updating:
- nixos module to use the new `account_reg.json` file.
- use nixpkgs pebble for integration tests.
Co-authored-by: Florian Klink <flokli@flokli.de>
Replace certbot-embedded pebble
In #68792 it was discovered that /dev/fuse doesn't have
wordl-read-writeable permissions anymore. The cause of this is that the
tmpfiles examples in systemd were reorganized and split into more files.
We thus lost some of the configuration we were depending on.
In this commit some of the new tmpfiles configuration that are
applicable to us are added which also makes wtmp/lastlog in the pam
module not necessary anymore.
Rationale for the new tmpfile configs:
- `journal-nowcow.conf`: Contains chattr +C for journald logs which
makes sense on copy-on-write filesystems like Btrfs. Other filesystems
shouldn't do anything funny when that flag is set.
- `static-nodes-permissions.conf`: Contains some permission overrides
for some device nodes like audio, loop, tun, fuse and kvm.
- `systemd-nspawn.conf`: Makes sure `/var/lib/machines` exists and old
snapshots are properly removed.
- `systemd-tmp.conf`: Removes systemd services related private tmp
folders and temporary coredump files.
- `var.conf`: Creates some useful directories in `/var` which we would
create anyway at some point. Also includes
`/var/log/{wtmp,btmp,lastlog}`.
Fixes #68792.