If the host network stack is slow to start, the alertmanager fails to
start with this error message:
caller=main.go:256 msg="unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: No private IP address found, and explicit IP not provided"
This bug can be reproduced by shutting down the network stack and
restarting the alertmanager.
Note I don't know why I didn't hit this issue with previous
alertmanager releases.
- Fix misspelled option. mkRenamedOptionModule is not used because the
option hasn't really worked before.
- Add missing cfg.telemetryPath arg to ExecStart.
- Fix mkdir invocation in test.
The allowed values have changed in bd3319d28c.
0.15:
--log.level="info" Only log messages with the given severity or above. Valid levels: [debug, info, warn, error, fatal]
--log.format="logger:stderr"
Set the log target and format. Example: "logger:syslog?appname=bob&local=7" or "logger:stdout?json=true"
0.17:
--log.level=info Only log messages with the given severity or above. One of: [debug, info, warn, error]
--log.format=logfmt Output format of log messages. One of: [logfmt, json]
* prometheus-nginx-exporter: 0.5.0 -> 0.6.0
* nixos/prometheus-nginx-exporter: update for 0.6.0
Added new option constLabels and updated virtualHost name in the
exporter's test.
* structured config for main config file allows to launch nagios in
debug mode without having to write the whole config file by hand
* build time syntax check
* all options have types, one more example
* I find it misleading that the main nagios config file is linked in
/etc but that if you change the link in /etc/ and restart nagios, it
has no effect. Have nagios use /etc/nagios.cfg
* fix paths in example nagios config files, which allows to reuse it:
services.nagios.objectDefs =
(map (x: "${pkgs.nagios}/etc/objects/${x}.cfg")
[ "templates" "timeperiods" "commands" ]) ++ [ ./main.cfg ]
* for the above reason, add mailutils to default plugins
Co-Authored-By: Aaron Andersen <aaron@fosslib.net>
A centralized list for these renames is not good because:
- It breaks disabledModules for modules that have a rename defined
- Adding/removing renames for a module means having to find them in the
central file
- Merge conflicts due to multiple people editing the central file
osquery was marked as broken since April.
If somebody steps up to fix it, we can always revive it from the
histroy, but there's not much value in shipping completely broken things
in current master.
cc @ma27
This fixes an issue with a recent addition of a config file
check in c28ded36ef.
Previously it was possible to supply a path as a string
to `configFile`. Now it will fail checking the config file
during evaluation of the module due to sandboxing.
A toggle to disable the check, more informative log messages
and handling for various configFile values are added.
This option was added by mistake since `listenAddress` exists by default
for each prometheus-exporter. Using
`services.prometheus.exporters.wireguard.addr` will now cause a warning,
but doesn't break eval.
We don't want to ignore config that can mess up machines. In general
this should always fail evaluation, as you think you are changing
behaviour and don't, which can easily create run-time errors we can
catch early.
https://github.com/MindFlavor/prometheus_wireguard_exporter/releases/tag/3.1.1
This release adds a flag `-l` which takes an address where the exporter
is available. The default is `0.0.0.0` (previously, `0.0.0.0` was used
by default).
Please note that there are no dependency changes in Cargo and therefore
the cargo hash didn't change.
Prometheus2 does no longer support the command-line flag to specify
an alertmanager. Instead it now supports both service discovery and
configuration of alertmanagers in the alerting config section.
Simply mapping the previous option to an entry in the new alertmanagers
section is not enough to allow for complete configurations of an
alertmanager.
Therefore the option alertmanagerURL is no longer used and instead
a full alertmanager configuration is expected.
Since version 3.0 all allowed IPs and subnets are exposed by the
exporter. With `-s` set on the CLI, instead of a comma-separated list,
each allowed IP and subnet will be in a single field with the schema
`allowed_ip_<index>`.
The override that builds the custom python for integrations-core was
overriding python, but pythonPackages was still being inherited from a
call to `datadog-integrations-core {}`, causing
service.datadog-agent.extraIntegrations to be ignored.
This is a simple exporter which exports the information
provided by `wg show all dump` to prometheus.
Co-authored-by: Franz Pletz <fpletz@fnordicwalking.de>
https://humdi.net/vnstat/CHANGES
* enable tests
* add hardening options from upstream's
example service
* fix "documentation" setting in service:
either needs to be `unitConfig.Documentation`
(uppercase) or lowercase but not within unitConfig.
This makes sure that when a user hasn't set a Prometheus option it
won't show up in the prometheus.yml configuration file. This results
in smaller and easier to understand configuration files.
We previously filtered out the `_module` attribute in a NixOS
configuration by filtering it using the option's `apply` function.
This meant that every option that had a submodule type needed to have
this apply function. Adding this function is easy to forget thus this
mechanism is error prone.
We now recursively filter out the `_module` attributes at the place we
construct the Prometheus configuration file. Since we now do the filtering
centrally we don't have to do it per option making it less prone to errors.
This results in a smaller prometheus.yml config file.
It also allows us to use the same options for both prometheus-1 and
prometheus-2 since the new options for prometheus-2 default to null
and will be filtered out if they are not set.
This results in a simpler service unit which doesn't first have to
start a shell:
> cat /nix/store/s95nsr8zbkblklanqpkiap49mkwbaq45-unit-alertmanager.service/alertmanager.service
...
ExecStart=/nix/store/4g784lwcy7kp69hg0z2hfwkhjp2914lr-alertmanager-0.16.2-bin/bin/alertmanager \
--config.file /nix/store/p2c7fyi2jkkwq04z2flk84q4wyj2ggry-checked-config \
--web.listen-address [::1]:9093 \
--log.level warn
...
As the configuration for the exporters and alertmanager is unchanged
between the two major versions this patch tries to minimize
duplication while at the same time as there's no upgrade path from 1.x
to 2.x, it allows running the two services in parallel. See also #56037
https://github.com/golang/go/issues/9334 describes how net.Listen (as used by Alertmanager):
* listens on 127.0.0.1 if the listenAddress is "localhost"
* listens on all interfaces if the listenAddress is ""
munin_update relies on a stats file that exists, but isn't found in the
default location on NixOS; the appropriate plugin configuration is
added.
munin_stats relies on munin-cron writing a logfile, which the NixOS
build of munin does not. (This is probably fixable in the munin package,
but I don't have time to dig into that right now.)
This permits custom styling of the generated HTML without needing to
build your own Munin package from source. Also comes with an example
that works as a passable dark theme for Munin.
extraAutoPlugins lets you list plugins and plugin directories to be
autoconfigured, and extraPlugins lets you enable plugins on a one-by-one
basis. This can be used to enable plugins from contrib (although you'll
need to download and check out contrib yourself, then point these
options at it), or plugins you've written yourself.
munin-graph is hardcoded to use DejaVu Mono for the graph legends; if it
can't find it, there's no guarantee it finds a monospaced font at all,
and if it can't find a monospaced font the legends come out badly
misformatted.
This is just a set of globs to remove from the active plugins directory
after autoconfiguration is complete.
I also removed the hard-coded disabling of "diskstats", since it seems
to work just fine now.
Since this module was written, Munin has moved their documentation from
munin-monitoring.org/wiki to guide.munin-monitoring.org. Most of the
links were broken, and the ones that weren't went to "please use the new
site" pages.
New option `extraPluginPaths' that allows users to supply additional
paths for netdata plugins. Very useful for when you want to use
custom collection scripts.
`collectd' might fail because of a failure in any of numerous plugins.
For example `virt' plugin sometimes fails if `collectd' is started before `libvirtd'
Instead of setting User/Group only when DynamicUser is disabled, the
previous version of the code set it only when it was enabled. This
caused services with DynamicUser enabled to actually run as nobody, and
services without DynamicUser enabled to run as root.
Regression from fbb7e0c82f.
This commit adds an assertion that checks that either `configFile` or
`configuration` is configured for alertmanager. The alertmanager config
can not be an empty attributeset. The check executed with `amtool` fails
before the service even has the chance to start. We should probably not
allow a broken alertmanager configuration anyway.
This also introduces a test for alertmanager configuration that piggy
backs on the existing prometheus tests.
Otherwise netdata will not find python modules.
To make sure netdata still pick up our setuid version of apps.plugin
we rename the original executable.
By using types.lines for 'config', we can specify monit configurations
in lots of modules and they can all be automatically combined together
with newlines. This is desireable because different modules might want
to each specify the small monitoring task specific to their service.
This commit also updates the module to use current idioms.
With `promtool` we can check the validity of a configuration before
deploying it. This avoids situations where you would end up with a
broken monitoring system without noticing it - since the monitoring
broke down. :-)
as using /var/run now emits a warning by systemd's tmpfiles.d.
As /var/run is already a symlink to /run, this can't break anything, and
data does not need to be migrated.
Previously it was only possible to use very simple Riemann config.
For more complicated scenarios you need a directory of clojure
files and the config file that riemann starts with should be in this
directory.
Introduces an option `services.datadog-agent.extraIntegrations` that
can be set to include additional Datadog agent integrations from the
integrations-core repository.
Documentation and an example is provided with the change.
Relates to NixOS/nixpkgs#40399
Refactors the datadog-agent (i.e. V6) module to let users configure
arbitrary checks, not just a limited set, without having to resort to
linking the files manually and updating the systemd unit.
Checks are now configured via a `services.datadog-agent.checks` option
which takes an attribute set in which the keys refer directly to
Datadog check names, and the values are attribute sets representing
Datadog's configuration structure.
With this mechanism users can configure arbitrary integrations, for
example for the `ntp`-check, simply by saying:
services.datadog-agent.checks.ntp = {
init_config = null;
# ... other check configuration options as per Datadog
# documentation
};
The previous check-specific configuration options for non-default
checks have been removed. Disk & network check configuration options
have been kept rather than making them a `default`-value of the
`checks`-option because they will be overridden by user-configurations
in that case.
Relates to NixOS/nixpkgs#40399.
The web_access.patch would no longer apply.
It disabled a check that required the static files
for the web UI to be owned by the user the daemon runs as
(not root, so it doesn't work well with nix).
Besides updating netdata, this commit removes that patch,
changes the netdata service config to set the "web files owner/group"
option to "root" and adds a test that checks that the web UI is being served.
This allows the web files to be owned by root without patching.
Use nixos-fw chain instead of INPUT so that the rules don't keep
stacking everytime the firewall is reloaded.
This also adds a comment to each rule about the associated exporter.
Fixes#30891
* Upgrade `graphite-web`, `carbon` and `whisper` from 1.0.2 -> 1.1.3.
* Replaced the deprecated `pythonPackages.graphite_influxdb` with
`pythonPackages.influxgraph.`
* Renamed `pythonPackages.graphite_web` to `pythonPackages.graphite-web`
to be consistent with the Python package name.
* Replaced the unmaintained `pythonPackages.graphite_pager` with
`pythonPackages.graphitepager`
* Moved all new packages from `python-packages.nix` to
`pkgs/development/python-modules`
The Datadog agent requires `gohai` to be available on its `$PATH` in
order to collect certain metrics.
It would previously start up and collect certain types of metrics, but
log errors related to the missing gohai binary.
This commit configures the systemd-unit to make gohai available at
runtime.
This fixes#39810.
- prometheus exporters are now configured with
`services.prometheus.exporters.<name>`
- the exporters are now defined by attribute sets
from which the options for each exporter are generated
- most of the exporter definitions are used unchanged,
except for some changes that should't have any impact
on the functionality.
Alertmanager 0.13.0 doesn't support single dash long options, so '-config.file'
for example is parsed as '-c', which leads to the service not starting.
apps.plugin requires capabilities for full process monitoring. with
1.9.0, netdata allows multiple directories to search for plugins and the
setuid directory can be specified here.
the module is backwards compatible with older configs. a test is
included that verifies data gathering for the elevated privileges. one
additional attribute is added to make configuration more generic than
including configuration in string form.
These packages will be placed into an environment using
`backendsToPackages`. This function explicitly maps backends to
`pkgs.nodePackages.${type}` unless it's a builtin. This ensures that only
valid backends that work on NixOS are used (if not, the build already
breaks at evaluation time).
The log will be redirected to `stdout` to be able to watch the entire
output using `journalctl`.
Configuration parameters for the backends need to be set using
`services.statsd.extraConfig` as each backend has its own options and
all of them shouldn't be validated and checked explicitly and manually.
The munin-node service used wrapProgram to inject environment variables.
This doesn't work because munin plugins depend on argv[0], which is
overwritten when the executable is a script with a shebang line (example
below).
This commit removes the wrappers and instead passes the required
environment variables to munin-node.
Eliminating the wrappers resulted in some broken plugins, e.g., meminfo
and hddtemp_smartctl. That was fixed with the per-plugin configuration.
Example:
The plugin if_eth0 is a symlink to /.../plugins/if_, which uses $0
to determine that it should monitor traffic on the eth0 interface.
if_ is a wrapped program, and runs `exec -a "$0" .if_-wrapped`
.if_-wrapped has a "#!/nix/.../bash" line, which results in bash
changing $0, and as a result the plugin thinks my interface
is called "-wrapped".
The behaviour have changed again. Listed collectors are now enabled in
addition to the default one.
Also run as DynmicUser instead of user nobody as the exporter doesn't need
any state.