Instead of setting User/Group only when DynamicUser is disabled, the
previous version of the code set it only when it was enabled. This
caused services with DynamicUser enabled to actually run as nobody, and
services without DynamicUser enabled to run as root.
Regression from fbb7e0c82f.
This commit adds an assertion that checks that either `configFile` or
`configuration` is configured for alertmanager. The alertmanager config
can not be an empty attributeset. The check executed with `amtool` fails
before the service even has the chance to start. We should probably not
allow a broken alertmanager configuration anyway.
This also introduces a test for alertmanager configuration that piggy
backs on the existing prometheus tests.
Otherwise netdata will not find python modules.
To make sure netdata still pick up our setuid version of apps.plugin
we rename the original executable.
By using types.lines for 'config', we can specify monit configurations
in lots of modules and they can all be automatically combined together
with newlines. This is desireable because different modules might want
to each specify the small monitoring task specific to their service.
This commit also updates the module to use current idioms.
With `promtool` we can check the validity of a configuration before
deploying it. This avoids situations where you would end up with a
broken monitoring system without noticing it - since the monitoring
broke down. :-)
as using /var/run now emits a warning by systemd's tmpfiles.d.
As /var/run is already a symlink to /run, this can't break anything, and
data does not need to be migrated.
Previously it was only possible to use very simple Riemann config.
For more complicated scenarios you need a directory of clojure
files and the config file that riemann starts with should be in this
directory.
Introduces an option `services.datadog-agent.extraIntegrations` that
can be set to include additional Datadog agent integrations from the
integrations-core repository.
Documentation and an example is provided with the change.
Relates to NixOS/nixpkgs#40399
Refactors the datadog-agent (i.e. V6) module to let users configure
arbitrary checks, not just a limited set, without having to resort to
linking the files manually and updating the systemd unit.
Checks are now configured via a `services.datadog-agent.checks` option
which takes an attribute set in which the keys refer directly to
Datadog check names, and the values are attribute sets representing
Datadog's configuration structure.
With this mechanism users can configure arbitrary integrations, for
example for the `ntp`-check, simply by saying:
services.datadog-agent.checks.ntp = {
init_config = null;
# ... other check configuration options as per Datadog
# documentation
};
The previous check-specific configuration options for non-default
checks have been removed. Disk & network check configuration options
have been kept rather than making them a `default`-value of the
`checks`-option because they will be overridden by user-configurations
in that case.
Relates to NixOS/nixpkgs#40399.
The web_access.patch would no longer apply.
It disabled a check that required the static files
for the web UI to be owned by the user the daemon runs as
(not root, so it doesn't work well with nix).
Besides updating netdata, this commit removes that patch,
changes the netdata service config to set the "web files owner/group"
option to "root" and adds a test that checks that the web UI is being served.
This allows the web files to be owned by root without patching.
Use nixos-fw chain instead of INPUT so that the rules don't keep
stacking everytime the firewall is reloaded.
This also adds a comment to each rule about the associated exporter.
Fixes #30891
* Upgrade `graphite-web`, `carbon` and `whisper` from 1.0.2 -> 1.1.3.
* Replaced the deprecated `pythonPackages.graphite_influxdb` with
`pythonPackages.influxgraph.`
* Renamed `pythonPackages.graphite_web` to `pythonPackages.graphite-web`
to be consistent with the Python package name.
* Replaced the unmaintained `pythonPackages.graphite_pager` with
`pythonPackages.graphitepager`
* Moved all new packages from `python-packages.nix` to
`pkgs/development/python-modules`
The Datadog agent requires `gohai` to be available on its `$PATH` in
order to collect certain metrics.
It would previously start up and collect certain types of metrics, but
log errors related to the missing gohai binary.
This commit configures the systemd-unit to make gohai available at
runtime.
This fixes #39810.
- prometheus exporters are now configured with
`services.prometheus.exporters.<name>`
- the exporters are now defined by attribute sets
from which the options for each exporter are generated
- most of the exporter definitions are used unchanged,
except for some changes that should't have any impact
on the functionality.
Alertmanager 0.13.0 doesn't support single dash long options, so '-config.file'
for example is parsed as '-c', which leads to the service not starting.
apps.plugin requires capabilities for full process monitoring. with
1.9.0, netdata allows multiple directories to search for plugins and the
setuid directory can be specified here.
the module is backwards compatible with older configs. a test is
included that verifies data gathering for the elevated privileges. one
additional attribute is added to make configuration more generic than
including configuration in string form.
These packages will be placed into an environment using
`backendsToPackages`. This function explicitly maps backends to
`pkgs.nodePackages.${type}` unless it's a builtin. This ensures that only
valid backends that work on NixOS are used (if not, the build already
breaks at evaluation time).
The log will be redirected to `stdout` to be able to watch the entire
output using `journalctl`.
Configuration parameters for the backends need to be set using
`services.statsd.extraConfig` as each backend has its own options and
all of them shouldn't be validated and checked explicitly and manually.
The munin-node service used wrapProgram to inject environment variables.
This doesn't work because munin plugins depend on argv[0], which is
overwritten when the executable is a script with a shebang line (example
below).
This commit removes the wrappers and instead passes the required
environment variables to munin-node.
Eliminating the wrappers resulted in some broken plugins, e.g., meminfo
and hddtemp_smartctl. That was fixed with the per-plugin configuration.
Example:
The plugin if_eth0 is a symlink to /.../plugins/if_, which uses $0
to determine that it should monitor traffic on the eth0 interface.
if_ is a wrapped program, and runs `exec -a "$0" .if_-wrapped`
.if_-wrapped has a "#!/nix/.../bash" line, which results in bash
changing $0, and as a result the plugin thinks my interface
is called "-wrapped".
The behaviour have changed again. Listed collectors are now enabled in
addition to the default one.
Also run as DynmicUser instead of user nobody as the exporter doesn't need
any state.
* prometheus-collectd-exporter service: init module
Supports JSON and binary (optional) protocol
of collectd.
* nixos/prometheus-collectd-exporter: submodule is not needed for collectdBinary
"Builder called die: Cannot wrap
/nix/store/XXX-munin-available-plugins/plugin.sh because it is not an
executable file"
[Bjørn: Keep DRY, quote "$file".]
* removed pid-file support, it is needless to run collectd as systemd service
* removed static user id, as all the files reowned on the service start
* added ambient capabilities for ping and smart (hdd health) functions
While systemd suggests using the pre-defined graphical-session user
target, I found that this interface is difficult to use. Additionally,
no other major distribution, even in their unstable versions, currently
use this mechanism.
The window or desktop manager is supposed to run in a systemd user service
which activates graphical-session.target and the user services that are
binding to this target. The issue is that we can't elegantly pass the
xsession environment to the window manager session, in particular
whereas the PassEnvironment option does work for DISPLAY, it for some
mysterious reason won't for PATH.
This commit implements a new graphical user target that works just like
default.target. Services which should be run in a graphical session just
need to declare wantedBy graphical.target. The graphical target will be
activated in the xsession before executing the window or display manager.
Fixes #17858.
to /etc/dd-agent/conf.d by default, and make sure
/etc/dd-agent/conf.d is used.
Before NixOS 17.03, we were using dd-agent 5.5.X which
used configuration from /etc/dd-agent/conf.d
In NixOS 17.03 the default conf.d location is first used relative,
meaning that $out/agent/conf.d was used without NixOS overrides.
This change implements similar functionality as PR #25288, without
breaking backwards compatibility.
(cherry picked from commit 77c85b0ecb)
Adds services.longview.{apiKeyFile,mysqlPasswordFile} options as
alternatives to apiKey and mysqlPassword, which still work, but are
deprecated with a warning message.
Related to #24288.
The reason being less mental overhead when reading upstream
documentation. Examples can be pasted right into the configuration
instead of translating to Nix attrset first.
The current default value of listenAddress = null blows up:
$ nixos-rebuild build
error: cannot coerce null to a string, at
.../nixpkgs/nixos/modules/services/monitoring/prometheus/alertmanager.nix:97:16
With listenAddress = "" we use the same default as upstream and there is
no blow up :-)
...by providing a default value of "no labels" (an empty attrset).
Without this change we get
$ nixos-rebuild test -I nixpkgs=.
building Nix...
building the system configuration...
error: The option `services.prometheus.scrapeConfigs.[definition 1-entry 1].static_configs.[definition 1-entry 1].labels' is used but not defined.
which is unneeded, because labels _are_ optional.
The structured options are incomplete compared to upstream and I think
it will be a maintenance burden to try to keep up. Instead, provide an
option for the raw config file contents (prometheus.yml).
The collectd service runs as an unprivileged user by default, so it does
not leak more information to its data directory than any user can obtain
elsewhere by other means.
If people are running it as root and are worried about information leak,
we can add collectd group and set perms to 750.
CC @offlinehacker.
Fixes #21198.
* influxdb module: add postStart
* cadvisor module: increase TimeoutStartSec
Under high load, the cadvisor module can take longer than the default 90
seconds to start. This change should hopefully fix the test on Hydra.
Systemd upstream provides targets for networking. This also includes a target network-online.target.
In this PR I remove / replace most occurrences since some of them were even wrong and could delay startup.
Every period, sa1 collects and stores data.
Every 24 hours, sa2 aggregates the previous day's data in to a
report.
Timers and unit configurations were lifted from Fedora's default
units.
This commit implements the changes necessary to start up a graphite carbon Cache
with twisted and start the corresponding graphiteWeb service.
Dependencies need to be included via python buildEnv to include all recursive
implicit dependencies.
Additionally cairo is a requirement of graphiteWeb and pycairo is not a standard
python package (buildPythonPackage) and therefore cannot be included via
buildEnv. It also needs cairo in the Library PATH.
Accidentally broken by 4fede53c09
("nixos manuals: bring back package references").
Without this fix, grafana won't start:
$ systemctl status grafana
...
systemd[1]: Starting Grafana Service Daemon...
systemd[1]: Started Grafana Service Daemon.
grafana[666]: 2016/03/06 19:57:32 [log.go:75 Fatal()] [E] Failed to detect generated css or javascript files in static root (%!s(MISSING)), have you executed default grunt task?
systemd[1]: grafana.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: grafana.service: Unit entered failed state.
systemd[1]: grafana.service: Failed with result 'exit-code'.
This reverts most of 89e983786a, as those references are sanitized now.
Fixes #10039, at least most of it.
The `sane` case wasn't fixed, as it calls a *function* in pkgs to get
the default value.
- add missing types in module definitions
- add missing 'defaultText' in module definitions
- wrap example with 'literalExample' where necessary in module definitions
This reverts most of 89e983786a, as those references are sanitized now.
Fixes #10039, at least most of it.
The `sane` case wasn't fixed, as it calls a *function* in pkgs to get
the default value.
The advantage of putting the PID file under the ephemeral /run is that
when the machine crashes /run gets cleared allowing graphite to start
once the machine is rebooted.
We also set the PIDFile systemd option so that systemd knows the correct
PID and enables systemd to remove the file after service shut down.
* package statsd node packages separatly since they actually require
nodejs-0.10 or nodejs-0.12 to work (which is ... well old)
* remove statsd packages and its backends from "global" node-packages.json.
i did not rebuild it since for some reason npm2nix command fails. next time
somebody will rerun npm2nix statsd packages are going to be removed.
* statsd service: backends are now provided as strings and not anymore as
packages.
The most complex problems were from dealing with switches reverted in
the meantime (gcc5, gmp6, ncurses6).
It's likely that darwin is (still) broken nontrivially.