This fixes the case where a target provider closes the update
channel and exits before the context is canceled.
This should only be true for the static provider but it's safer
to generally handle this case.
I needed this today for debugging. It can certainly be improved, but
it's already quite helpful.
I refactored the reading of heads.db files out of persistence, which
is an improvement, too.
I made minor changes to the cli package to allow outputting via the
io.Writer interface.
This requires all the panic calls upon unexpected data to be converted
into errors returned. This pollute the function signatures quite
lot. Well, this is Go...
The ideas behind this are the following:
- panic only if it's a programming error. Data corruptions happen, and
they are not programming errors.
- If we detect a data corruption, we "quarantine" the series,
essentially removing it from the database and putting its data into
a separate directory for forensics.
- Failure during writing to a series file is not considered corruption
automatically. It will call setDirty, though, so that a
crashrecovery upon the next restart will commence and check for
that.
- Series quarantining and setDirty calls are logged and counted in
metrics, but are hidden from the user of the interfaces in
interface.go, whith the notable exception of Append(). The reasoning
is that we treat corruption by removing the corrupted series, i.e. a
query for it will return no results on its next call anyway, so
return no results right now. In the case of Append(), we want to
tell the user that no data has been appended, though.
Minor side effects:
- Now consistently using filepath.* instead of path.*.
- Introduced structured logging where I touched it. This makes things
less consistent, but a complete change to structured logging would
be out of scope for this PR.
This considers static labels in the equality of alerts to
avoid falsely copying state from a different alert definition with
the same name across reloads.
To be safe, it also copies the state map rather than just its pointer
so that remaining collisions disappear after one evaluation interval.