Commit graph

166 commits

Author SHA1 Message Date
Julius Volz d868264bb8 Improve UI of /alerts page.
Changes to the UI:
- "Active Since" timestamps are now human-readable.
- Alerting rules are now pretty-printed better.
- Labels are no longer just strings, but alert bubbles (like we do on
  the status page for base labels).
- Alert states and target health states are now capitalized in the
  presentation layer rather than at the source.
2015-06-23 18:48:45 +02:00
Fabian Reinartz 53b9d5917d web: improve target URL handling and display. 2015-06-23 13:45:15 +02:00
Fabian Reinartz dc7d27ab9a retrieval: add honor label handling and parametrized querying.
This commit adds the honor_labels and params arguments to the scrape
config. This allows specifying query parameters used by the scrapers
and handling scraped labels with precedence.
2015-06-23 13:45:14 +02:00
Fabian Reinartz 459d18cf18 Merge pull request #812 from Marmelatze/consul_services
Use Consul ServiceAddress instead of Address when set
2015-06-17 20:10:52 +02:00
Florian Pfitzer 0ac7e7217e Use Consul ServiceAddress instead of Address when set 2015-06-17 15:39:42 +02:00
Brian Brazil 4d895242f9 Add support for Zookeeper Serversets for SD.
It can discover an entire tree of serversets, or just one.
2015-06-16 11:02:08 +01:00
Brian Brazil 0dbae36d36 Allow ingested metrics to be relabeled.
The main purpose of this is to allow for blacklisting
of expensive metrics as a tactical option.
It could also find uses for renaming and removing labels
from federation.
2015-06-13 15:18:27 +01:00
Brian Brazil 58ceae82bc Revert "Allow ingested metrics to be relabeled."
This reverts commit f2f26ca08f.

Was accidentally pushed to master instead of a branch for PR.
2015-06-12 22:12:26 +01:00
Brian Brazil f2f26ca08f Allow ingested metrics to be relabeled.
The main purpose of this is to allow for blacklisting
of expensive metrics as a tactical option.
It could also find uses for renaming and removing labels
from federation.
2015-06-12 22:06:30 +01:00
Fabian Reinartz b5fe2e9afe Merge pull request #773 from prometheus/fabxc/simple-cfg
config: simplify default config handling.
2015-06-08 16:22:06 +02:00
Brian Brazil b8b1d3cbac Web: Add pre-relabel labels to status page.
Figuring out what's going on with the new service discovery
and labels is difficult. Add a popover with the labels
to the target table to make things simpler, and help
discovery of potentially useful labels.
2015-06-08 12:19:01 +01:00
Fabian Reinartz 0af1cff8af config: simplify default config handling. 2015-06-06 09:04:04 +02:00
Fabian Reinartz 8214b4ee78 retrieval/discovery: surround __meta_consul_tags value with tag separators. 2015-06-05 19:18:34 +02:00
Fabian Reinartz 280d11dca8 main: exit on invalid rule files on startup. 2015-06-02 18:44:41 +02:00
Fabian Reinartz 0de6edbdfc Move pkg/ to util/ 2015-06-01 21:12:32 +02:00
Fabian Reinartz dfaf31a1da Move web/httputils to pkg/httputil and add DeadlineClient to it 2015-06-01 21:12:31 +02:00
Fabian Reinartz a4f179230a Merge pull request #744 from prometheus/fabxc/fix-labels
Fix discarding of labels in file target groups
2015-05-27 19:57:15 +02:00
Fabian Reinartz e9b344abee Fix discarding of labels in file target groups 2015-05-27 18:52:44 +02:00
Fabian Reinartz 8b7e5f9184 Stop holding TargetManager lock when stopping components.
TargetProviders may flush some last changes to the target manager
before actually stopping. To properly read those from the channel
the target manager must not be locked while stopping a provider.
2015-05-27 12:41:37 +02:00
Brian Brazil f34de493d5 Add increase() function, to replace delta(..., 1).
This calculates how much a counter increases over
a given period of time, which is the area under the curve
of its rate.

increase(x[5m]) is equivalent to rate(x[5m]) * 300.
2015-05-26 22:49:21 +01:00
Fabian Reinartz efb39cfd4e Fix file SD test 2015-05-23 21:20:39 +02:00
Julius Volz 267fd34156 Switch Prometheus to use github.com/prometheus/log.
This change is conceptually very simple, although the diff is large. It
switches logging from "github.com/golang/glog" to
"github.com/prometheus/log", while not actually changing any log
messages. V(1)-style logging has been changed to be log.Debug*().
2015-05-20 18:19:32 +02:00
Fabian Reinartz 7143dff02f Add initial implementation for SD via Consul.
This commit adds service discovery using Consul's HTTP API and watches
(long polling) to retrieve target updates.
2015-05-20 11:46:24 +02:00
Fabian Reinartz b0c181dc0d Add Consul SD configuration. 2015-05-20 11:46:24 +02:00
Fabian Reinartz ff832d2e03 Attach __meta_filepath label to file SD targets. 2015-05-19 15:49:38 +02:00
Fabian Reinartz 8de50619f1 Increase target test wait times
On slow systems such as Travis CI occasionally the tests fail
because the wait times are too short.
2015-05-19 12:06:52 +02:00
Fabian Reinartz 385919a65a Avoid inter-component blocking if ingestion/scraping blocks.
Appending to the storage can block for a long time. Timing out
scrapes can also cause longer blocks. This commit keeps those
blocks from affecting components other than the target itself.
Also the Target interface was removed.
2015-05-18 17:58:51 +02:00
Fabian Reinartz 1a2d57b45c Move template functionality out of target.
The target implementation and interface contain methods only serving a
specific purpose of the templates. They were moved to the template
as they operate on more fundamental target data.
2015-05-18 13:35:43 +02:00
Fabian Reinartz dbc08d390e Move target status data into its own object 2015-05-18 11:15:42 +02:00
Fabian Reinartz 9ca47869ed Provide full SD configs to discovery constructors.
Some SD configs may have many options. To be readable and consistent, make
all discovery constructors receive the full config rather than the separate
arguments.
2015-05-15 14:54:29 +02:00
Fabian Reinartz 93548a8882 Add initial file based service discovery.
This commit adds file based service discovery which reads target
groups from specified files. It detects changes based on file watches
and regular refreshes.
2015-05-15 14:44:54 +02:00
Fabian Reinartz d5aa012fd0 Make HTTP basic auth configurable for scrape targets. 2015-05-15 12:47:50 +02:00
Fabian Reinartz bb540fd9fd Implement config reloading on SIGHUP.
With this commit, sending SIGHUP to the Prometheus process will reload
and apply the configuration file. The different components attempt
to handle failing changes gracefully.
2015-05-13 16:49:46 +02:00
Fabian Reinartz 86087120dd Replace example config with new YAML format. 2015-05-11 18:14:07 +02:00
Fabian Reinartz 5fbde88919 Switch config to YAML format. 2015-05-07 16:52:14 +02:00
Fabian Reinartz b5a8f7b8fa Cleanup, test, and document config. 2015-04-30 21:17:19 +02:00
Fabian Reinartz 945c49a2dd Add relabelling to target management.
This commit adds a relabelling stage on the set of base
labels from which a target is created. It allows dropping
targets and rewriting any regular or internal label.
2015-04-30 18:46:33 +02:00
Fabian Reinartz 0b619b46d6 Change JobConfig to ScrapeConfig.
This commit changes the configuration interface from job configs to scrape
configs. This includes allowing multiple ways of target definition at once
and moving DNS SD to its own config message. DNS SD can now contain multiple
DNS names per configured discovery.
2015-04-28 23:18:55 +02:00
Fabian Reinartz 5015c2a0e8 Make target manager source based.
This commit shifts responsibility for maintaining targets from providers and
pools to the target manager. Target groups have a source name that identifies
them for updates.
2015-04-24 15:49:35 +02:00
Fabian Reinartz 4f8673aa88 Simplify update sync for targets, format config fixtures. 2015-04-19 10:36:26 +02:00
Fabian Reinartz 36184f3530 Show correct error on wrong DNS response. 2015-04-11 16:14:38 +02:00
beorn7 fa1935a644 Remove /api/targets call and do not show job and instance labels on status.
/api/targets was undocumented and never used and also broken.

Showing instance and job labels on the status page (next to targets)
does not make sense as those labels are set in an obvious way.

Also add a doc comment to TargetStateToClass.
2015-03-18 18:53:43 +01:00
beorn7 be11cb2b07 Remove the sample ingestion channel.
The one central sample ingestion channel has caused a variety of
trouble. This commit removes it. Targets and rule evaluation call an
Append method directly now. To incorporate multiple storage backends
(like OpenTSDB), storage.Tee forks the Append into two different
appenders.

Note that the tsdb queue manager had its own queue anyway. It was a
queue after a queue... Much queue, so overhead...

Targets have their own little buffer (implemented as a channel) to
avoid stalling during an http scrape. But a new scrape will only be
started once the old one is fully ingested.

The contraption of three pipelined ingesters was removed. A Target is
an ingester itself now. Despite more logic in Target, things should be
less confusing now.

Also, remove lint and vet warnings in ast.go.
2015-03-15 14:08:22 +01:00
Julius Volz 140eede5e0 Rename UNREACHABLE to UNHEALTHY.
The current wording suggests that a target is not reachable at all,
although it might also get set when the target was reachable, but there
was some other error during the scrape (invalid headers or invalid
scrape content). UNHEALTHY is a more general wording that includes all
these cases.

For consistency, ALIVE is also renamed to HEALTHY.
2015-03-07 23:18:18 +01:00
Sergiusz 'q3k' Bazański 0d0bb3c030 Change instance identifiers to be host:port
This changes the PublicURL function into InstanceIdentifier, which now
returns a simple <host>:<port> string instead of a full URL.
2015-02-20 16:21:13 +01:00
Sergiusz 'q3k' Bazański bb69a3d284 Hide HTTP auth parts from URL
This introduces an extra function in the Target interface (PublicURL)
which is used to populate the instance field in scraped metrics.
2015-02-19 18:58:47 +01:00
Julius Volz af627bb2b9 Copy vendored deps manually instead of using Godeps.
We were using Godep incorrectly (cloning repos from the internet during
build time instead of including Godeps/_workspace in the GOPATH via
"godep go"). However, to avoid even having to fetch "godeps" from the
internet during build, this now just copies the vendored files into the
GOPATH.

Also, the protocol buffer library moved from Google Code to GitHub,
which is reflected in these updates.

This fixes https://github.com/prometheus/prometheus/issues/525
2015-02-17 02:08:56 +01:00
beorn7 11b3c2387c Improvements after review.
- Increase samplesQueueCapacity.

- Improve docstring for the above.

- Accept a short waiting period for the ingest channel to become
  ready. This should depend on the http timeout, but 100ms is probably
  good enough to cushion bursts bigger than samplesQueueCapacity,
  while it is unlikely that anybody ever will set an HTTP timeout
  similarly short.
2015-02-10 14:58:46 +01:00
beorn7 0f191629c6 Next try to deal with backed-up ingestion.
This is now not even trying to throttle in a benign way, but creates a
fully-fledged error. Advantage: It shows up very visibly on the status
page. Disadvantage: The server does not really adjust to a lower
scraping rate. However, if your ingestion backs up, you are in a very
irregular state; I'd say it _should_ be considered an error and not
dealt with in a more graceful way.

In different news: I'll work on optimizing ingestion so that we will
not as easily run into that situation in the first place.
2015-02-09 17:32:47 +01:00
beorn7 16a1a6d324 Add another check for stopped scraper. 2015-02-06 18:30:33 +01:00