In the process I also found that the CapabilityBoundingSet
was restricting the service from listening on port 80, and
the AmbientCapabilities was ineffective. Fixed appropriately.
- Added defaultText for all inheritable options.
- Add docs on using new defaults option to configure
DNS validation for all domains.
- Update DNS docs to show using a service to configure
rfc2136 instead of manual steps.
Allows configuring many default settings for certificates,
all of which can still be overridden on a per-cert basis.
Some options have been moved into .defaults from security.acme,
namely email, server, validMinDays and renewInterval. These
changes will not break existing configurations thanks to
mkChangedOptionModule.
With this, it is also now possible to configure DNS-01 with
web servers whose virtualHosts utilise enableACME. The only
requirement is you set `acmeRoot = null` for each vhost.
The test suite has been revamped to cover these additions
and also to generally make it easier to maintain. Test config
for apache and nginx has been fully standardised, and it
is now much easier to add a new web server if it follows
the same configuration patterns as those two. I have also
optimised the use of switch-to-configuration which should
speed up testing.
Closes#129838
It is possible for the CA to revoke a cert that has not yet
expired. We must run lego to validate this before expiration,
but we must still ignore failures on unexpired certs to retain
compatibility with #85794
Also changed domainHash logic such that a renewal will only
be attempted at all if domains are unchanged, and do a full
run otherwises. Resolves#147540 but will be partially
reverted when go-acme/lego#1532 is resolved + available.
ClosesNixOS/nixpkgs#108237
When a user first adds an ACME cert to their configuration,
it's likely to fail to renew due to DNS misconfig. This is
non-fatal for other services since selfsigned certs are
(usually) put in place to let dependant services start.
Tell the user about this in the logs, and exit 2 for
differentiation purposes.
selfsignedDeps is already appended to the after and wants
of a cert's renewal service, making these redundant.
You can see this if you run the following command:
systemctl list-dependencies --all --reverse acme-selfsigned-mydomain.com.service
This is horrible if you want to debug failures that happened during
system switches but your 30-ish acme clients spam the log with the same
messages over and over again.
ClosesNixOS/nixpkgs#147348
I was able to reproduce this intermittently in the
test suite during the tests for HTTPd. Adding
StartLimitIntervalSec=0 to disable rate limiting
for these services works fine. I added it anywhere
there was a ConditionPathExists.
Since 7a10478ea7, all /var except
/var/lib/acme gets mounted in a read-only fashion. This behavior
breaks the existing acme deployments having a webroot set outside of
/var/lib/acme.
Collecting the webroots and adding them to the paths read/write
mounted to the systemd service runtime tree.
Fixes#139310
Currently, we hardcode the use of --http.webroot, even if no webroot is
configured. This has the effect of disabling the built-in server.
Co-authored-by: Chris Forno <jekor@jekor.com>
Reusing the same private/public key on renewal has two issues:
- some providers don't accept to sign the same public key
again (Buypass Go SSL)
- keeping the same private key forever partly defeats the purpose of
renewing the certificate often
Therefore, let's remove this option. People wanting to keep the same
key can set extraLegoRenewFlags to `[ --reuse-key ]` to keep the
previous behavior. Alternatively, we could put this as an option whose
default value is true.
As per #121293, I ensured the UMask is set correctly
and removed any unnecessary chmod/chown/chgrp commands.
The test suite already partially covered permissions
checking but I added an extra check for the selfsigned
cert permissions.
With the UMask set to 0023, the
mkdir -p command which creates the webroot
could end up unreadable if the web server
changes, as surfaced by the test suite in #114751
On top of this, the following commands
to chown the webroot + subdirectories was
mostly unnecessary. I stripped it back to
only fix the deepest part of the directory,
resolving #115976, and reintroduced a
human readable error message.
I found a logical error in the bash script, but during
debugging I enabled command echoing and realised it
would be a good idea to have it enabled all the time for
ease of bug reporting.
- Added an ExecPostStart to acme-$cert.service when webroot is defined to create the acme-challenge
directory and fix required permissions. Lego always tries to create .well-known and acme-challenge,
thus if any permissions in that tree are wrong it will crash and break cert renewal.
- acme-fixperms now configured with acme User and Group, however the script still runs as root. This
ensures the StateDirectories are owned by the acme user.
- Switched to list syntax for systemd options where multiple values are specified.
Closes#106603
Some webservers (lighttpd) require that the
files they are serving are world readable. We
do our own chmods in the scripts anyway, and
lego has sensible permissions on its output
files, so this change is safe enough.
systemd-tmpfiles is no longer required for
most of the critical paths in the module. The
only one that remains is the webroot
acme-challenge directory since there's no
other good place for this to live and forcing
users to do the right thing alone will only
create more issues.
Closes#106565
When generating multiple certificates which all
share the same server + email, lego will attempt
to create an account multiple times. By adding an
account creation target certificates which share
an account will wait for one service (chosen at
config build time) to complete first.
This means that all systems running from master will trigger
new certificate creation on next rebuild. Race conditions around
multiple account creation are fixed in #106857, not this commit.
When using the ACME DNS-01 challenge, there is a possibility of a
failure to resolve the challenge if the record is not propagated
fast enough. To circumvent this generic DNS problem, this adds
a setting to explicitly tell the ACME provider to use a certain DNS
resolver to lookup the challenge.
Signed-off-by: Jeroen Simonetti <jeroen@simonetti.nl>
This should hopefully solve races with DNS servers (such as unbound)
during the activation of a new generation. Previously unbound could
still be unavailable and thus the acme script would fail.
Attempting to reuse keys on a basis different to the cert (AKA,
storing the key in a directory with a hashed name different to
the cert it is associated with) was ineffective since when
"lego run" is used it will ALWAYS generate a new key. This causes
issues when you revert changes since your "reused" key will not
be the one associated with the old cert. As such, I tore out the
whole keyDir implementation.
As for the race condition, checking the mtime of the cert file
was not sufficient to detect changes. In testing, selfsigned
and full certs could be generated/installed within 1 second of
each other. cmp is now used instead.
Also, I removed the nginx/httpd reload waiters in favour of
simple retry logic for the curl-based tests
- Use an acme user and group, allow group override only
- Use hashes to determine when certs actually need to regenerate
- Avoid running lego more than necessary
- Harden permissions
- Support "systemctl clean" for cert regeneration
- Support reuse of keys between some configuration changes
- Permissions fix services solves for previously root owned certs
- Add a note about multiple account creation and emails
- Migrate extraDomains to a list
- Deprecate user option
- Use minica for self-signed certs
- Rewrite all tests
I thought of a few more cases where things may go wrong,
and added tests to cover them. In particular, the web server
reload services were depending on the target - which stays alive,
meaning that the renewal timer wouldn't be triggering a reload
and old certs would stay on the web servers.
I encountered some problems ensuring that the reload took place
without accidently triggering it as part of the test. The sync
commands I added ended up being essential and I'm not sure why,
it seems like either node.succeed ends too early or there's an
oddity of the vm's filesystem I'm not aware of.
- Fix duplicate systemd rules on reload services
Since useACMEHost is not unique to every vhost, if one cert
was reused many times it would create duplicate entries in
${server}-config-reload.service for wants, before and
ConditionPathExists