Report valid configs in the respective metrics from the beginning

In #7399, an early validity check of the config was introduced to
prevent the scenario where an invalid config is only detected after a
possibly very long startup procedure. However, the respective success
metrics are not updated after the initial validation, so they stay at
zero and suggest an invalid config. If the startup procedure, such as
replaying the WAL, takes very long, alerts about an invalid config
will trigger.

This commit sets the success metrics after initial validation. They
will be set again after the "real" config (re-)load, but that
shouldn't be a problem. The metric now truthfully represents whenever
the config was successfully loaded, no matter if the result was then
thrown away (because it was just for validation) or actually used.

Signed-off-by: beorn7 <beorn@grafana.com>
beorn7 2020-10-12 21:30:59 +02:00
parent 90680b092c
commit 0f3c1bf6cf
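
The ordering described in the commit message can be sketched in isolation. This is a minimal illustration, not the Prometheus code: the `gauge` type, `validateConfig`, and `startup` are hypothetical stand-ins that use only the standard library, and the real change simply calls `configSuccess.Set(1)` and `configSuccessTime.SetToCurrentTime()` inside `main()`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// gauge is a hypothetical stand-in for a Prometheus gauge. It is not the
// client_golang type, only enough of one to show the ordering.
type gauge struct{ value float64 }

// Set stores an arbitrary value.
func (g *gauge) Set(v float64) { g.value = v }

// SetToCurrentTime stores the current Unix time in seconds.
func (g *gauge) SetToCurrentTime() { g.value = float64(time.Now().Unix()) }

// validateConfig is a hypothetical placeholder for the early config check.
func validateConfig(raw string) error {
	if raw == "" {
		return errors.New("empty config")
	}
	return nil
}

// startup validates the config, marks the success metrics immediately, and
// only then runs the slow startup work (for example, WAL replay).
func startup(raw string, success, successTime *gauge, slowStartup func()) error {
	if err := validateConfig(raw); err != nil {
		return err // metrics stay at zero for an invalid config
	}
	// Set the metrics now so they do not sit at zero during slow startup.
	success.Set(1)
	successTime.SetToCurrentTime()

	slowStartup()

	// The "real" load sets the same metrics again; repeating is harmless.
	success.Set(1)
	successTime.SetToCurrentTime()
	return nil
}

func main() {
	success, successTime := &gauge{}, &gauge{}
	err := startup("global: {}", success, successTime, func() {
		// Observed mid-startup: the metric already reports success.
		fmt.Println("during startup, success =", success.value)
	})
	fmt.Println("err =", err, "success =", success.value)
}
```

The point of the sketch is only the position of the first `Set(1)`: an alert evaluated while `slowStartup` is still running sees a successful config rather than the zero value.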


@@ -290,6 +290,14 @@ func main() {
level.Error(logger).Log("msg", fmt.Sprintf("Error loading config (--config.file=%s)", cfg.configFile), "err", err)
os.Exit(2)
}
// Now that the validity of the config is established, set the config
// success metrics accordingly, although the config isn't really loaded
// yet. This will happen later (including setting these metrics again),
// but if we don't do it now, the metrics will stay at zero until the
// startup procedure is complete, which might take long enough to
// trigger alerts about an invalid config.
configSuccess.Set(1)
configSuccessTime.SetToCurrentTime()
cfg.web.ReadTimeout = time.Duration(cfg.webTimeout)
// Default -web.route-prefix to path of -web.external-url.