Commit graph

3153 commits

Author SHA1 Message Date
Matt Bostock e618af5d0b Storage: Add crash recovery metric 'started_dirty'
...to indicate when crash recovery was invoked during Prometheus
startup.

Fixes #1918.
2016-08-27 21:41:06 +02:00
Björn Rabenstein 0ac2dbe6aa Merge pull request #1917 from prometheus/beorn7/promql
promql: Fix (and simplify) populating iterators
2016-08-26 08:42:44 +02:00
Björn Rabenstein b23169d86f Merge pull request #1913 from dmilstein/fix-remote-storage-test
Fix one of the tests for a remote storage QueueManager
2016-08-24 18:56:22 +02:00
beorn7 71571a8ec4 promql: Fix (and simplify) populating iterators
This was only relevant so far for the benchmark suite as it would
recycle Expr for repetitions. However, the append is unnecessary as
each node is only inspected once when populating iterators, and
population must always start from scratch.

This also introduces error checking during benchmarks and fixes the so
far undetected test errors during benchmarking.

Also, remove a style nit (two fewer golint warnings…).
2016-08-24 18:37:09 +02:00
Dan Milstein 764ceaa939 Add timeout to test, cap waiting at 1 second 2016-08-24 11:30:38 -04:00
Björn Rabenstein 4b8f963847 Merge pull request #1915 from prometheus/release-1.0
Forward-merge the bug fix from release-1.0
2016-08-24 13:04:45 +02:00
Björn Rabenstein d9b40793ac Merge pull request #1914 from prometheus/beorn7/release
Cut v1.0.2
2016-08-24 13:03:00 +02:00
beorn7 acd8937bab Cut v1.0.2 2016-08-24 12:04:10 +02:00
Björn Rabenstein 30cfee81a1 Merge pull request #1907 from prometheus/beorn7/sd
retrieval: Clean up target group map on config reload
2016-08-24 12:00:30 +02:00
Dan Milstein 007907b410 Fix one of the tests for a remote storage QueueManager
Specifically, TestSpawnNotMoreThanMaxConcurrentSendsGoroutines was failing on a fresh checkout of master.

The test had a race condition -- it would only pass if one of the
spawned goroutines happened to very quickly pull a set of samples off an
internal queue.

This patch rewrites the test so that it deterministically waits until
all samples have been pulled off that queue.  In case of errors, it also
now reports on the difference between what it expected and what it found.

I verified that, if the code under test is deliberately broken, the test
successfully reports on that.
2016-08-23 16:26:33 -04:00
beorn7 e2b3626e0c retrieval: Clean up target group map on config reload
Also, remove unused `providers` field in targetSet.

If the config file changes, we recreate all providers (by calling
`providersFromConfig`) and retrieve all targets anew from the newly
created providers. From that perspective, it cannot harm to clean up
the target group map in the targetSet. Not doing so (as was the
case so far) keeps stale targets around. This mattered if an existing
key in the target group map was not overwritten in the initial fetch
of all targets from the providers. Examples where that mattered:

```
scrape_configs:
- job_name: "foo"
  static_configs:
  - targets: ["foo:9090"]
  - targets: ["bar:9090"]
```
updated to:
```
scrape_configs:
- job_name: "foo"
  static_configs:
  - targets: ["foo:9090"]
```

`bar:9090` would still be monitored. (The static provider just
enumerates the target groups. If the number of target groups
decreases, the old ones stay around.)

```
scrape_configs:
- job_name: "foo"
  dns_sd_configs:
  - names:
    - "srv.name.one.example.org"
```
updated to:
```
scrape_configs:
- job_name: "foo"
  dns_sd_configs:
  - names:
    - "srv.name.two.example.org"
```

Now both SRV records are still monitored. The SRV name is part of the
key in the target group map, thus the new one is just added and the
old one stays around.

Obviously, this should have tests, and should have had tests before, not
only for this case. This is the quick fix. I have created
https://github.com/prometheus/prometheus/issues/1906 to track test
creation.

Fixes https://github.com/prometheus/prometheus/issues/1610 .
2016-08-22 19:25:33 +02:00
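
The stale-target behavior described in the commit above can be sketched roughly as follows. This is a simplified illustration and not the actual Prometheus code: `groupMap`, `applyStatic`, `targets`, and the `static/<index>` key format are hypothetical stand-ins for the target group map and the static provider. The sketch assumes, as the commit message states, that the static provider simply enumerates its target groups.

```go
package main

import (
	"fmt"
	"sort"
)

// groupMap is a hypothetical stand-in for the target group map: it maps a
// provider-specific key to the targets of one group.
type groupMap map[string][]string

// applyStatic mimics a static provider that keys its groups by index.
// With reset=false, entries left over from an earlier config stay in place.
// With reset=true, the map is rebuilt from scratch, as on a config reload.
func applyStatic(m groupMap, groups [][]string, reset bool) groupMap {
	if reset {
		m = groupMap{}
	}
	for i, g := range groups {
		m[fmt.Sprintf("static/%d", i)] = g
	}
	return m
}

// targets returns all targets in the map, sorted for stable output.
func targets(m groupMap) []string {
	var out []string
	for _, g := range m {
		out = append(out, g...)
	}
	sort.Strings(out)
	return out
}

func main() {
	oldCfg := [][]string{{"foo:9090"}, {"bar:9090"}}
	newCfg := [][]string{{"foo:9090"}}

	// Without cleanup, key "static/1" is never overwritten, so bar:9090 stays.
	stale := applyStatic(applyStatic(groupMap{}, oldCfg, false), newCfg, false)
	fmt.Println(targets(stale))

	// With cleanup on reload, only the targets of the new config remain.
	clean := applyStatic(applyStatic(groupMap{}, oldCfg, false), newCfg, true)
	fmt.Println(targets(clean))
}
```

The same reasoning applies to the DNS example: if the SRV name is part of the key, a renamed record produces a new key and the old entry is never overwritten unless the map is cleared.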
Tobias Schmidt 16d70a8b6b Merge pull request #1901 from amorken/master
Run scrape loop with interval 1 instead of 0
2016-08-18 13:36:05 -04:00
Anders Daljord Morken 95cadd0702 Run scrape loop with interval 1 instead of 0
0 is considered an invalid interval by time.NewTicker() and will cause a
panic if control reaches that point. Given the vagaries of timekeeping,
this may occasionally happen and make this test unstable.
2016-08-18 09:39:11 +02:00
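
A minimal sketch of the panic mentioned in the commit above: `time.NewTicker` panics when given a non-positive interval. The helper `tickerPanics` is hypothetical and exists only to make the behavior observable.

```go
package main

import (
	"fmt"
	"time"
)

// tickerPanics reports whether time.NewTicker panics for the given interval.
// It is a hypothetical helper for illustration only.
func tickerPanics(d time.Duration) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	t := time.NewTicker(d)
	t.Stop() // release the ticker right away; we only care about creation
	return false
}

func main() {
	fmt.Println(tickerPanics(0)) // interval 0 is rejected
	fmt.Println(tickerPanics(1)) // interval of 1ns is accepted
}
```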
Fabian Reinartz fcac52ebbf Merge pull request #1899 from amorken/master
Trim stray whitespace from bearer token file
2016-08-17 16:01:58 +02:00
Anders Daljord Morken 8633ac180e Strip stray whitespace from bearer token file
Apart from not trying to send a newline in an HTTP header,
this also allows Prometheus to build and pass tests with Go 1.7,
which features stricter checking of HTTP headers.
2016-08-17 15:36:18 +02:00
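
The idea behind the commit above might look like the following sketch. The function `bearerHeader` is a hypothetical helper, not the code from the commit. It assumes the token file ends with a trailing newline, as files written by most editors do.

```go
package main

import (
	"fmt"
	"strings"
)

// bearerHeader builds an Authorization header value from the raw contents of
// a token file. It is a hypothetical helper: surrounding whitespace, such as
// a trailing newline, is removed so it never ends up in the header.
func bearerHeader(fileContents []byte) string {
	token := strings.TrimSpace(string(fileContents))
	return "Bearer " + token
}

func main() {
	fmt.Printf("%q\n", bearerHeader([]byte("s3cr3t\n")))
}
```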
Fabian Reinartz 54c0b10abb Merge pull request #1896 from amorken/master
Bugfix: Avoid divide-by-zero panic on query_range?step=0
2016-08-16 15:27:35 +02:00
Anders Daljord Morken e9885ecb94 Bugfix: Avoid divide-by-zero panic on query_range?step=0 2016-08-16 15:20:34 +02:00
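
The commit above gives no detail beyond its title, so the following is only a sketch of the general technique: validate the step before using it as a divisor. The function `numSteps`, its integer-seconds parameters, and the error text are assumptions made for illustration, not the actual API handler.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadStep is a hypothetical error returned for unusable step values.
var errBadStep = errors.New("step must be greater than zero")

// numSteps returns how many evaluation points a range query would cover.
// It is a hypothetical helper; all arguments are in seconds. Integer
// division by zero panics in Go, so the step is checked first.
func numSteps(start, end, step int64) (int64, error) {
	if step <= 0 {
		return 0, errBadStep
	}
	return (end-start)/step + 1, nil
}

func main() {
	fmt.Println(numSteps(0, 60, 15))
	fmt.Println(numSteps(0, 60, 0))
}
```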
Fabian Reinartz 6bfd30269a Merge pull request #1894 from d-ulyanov/annotations-formatting
Added toUpper and toLower formatting to templates
2016-08-15 18:19:05 +02:00
Dmitry Ulianov a8619111f3 Added toUpper and toLower formatting to templates 2016-08-15 14:00:22 +03:00
Tobias Schmidt 289e299eb5 Merge pull request #1890 from prometheus/fix-applyconfig-error
Fix ApplyConfig() error handling
2016-08-13 18:55:35 -04:00
Julius Volz 4a866c13be Fix ApplyConfig() error handling
Currently, Prometheus starts up without any error when there is an
invalid rule file :-/
2016-08-13 00:59:02 +02:00
Julius Volz 80b0e1b74c Merge pull request #1892 from fstab/assume-counters-start-at-zero-after-reset
Assume counters start at zero after reset.
2016-08-13 00:14:07 +02:00
Julius Volz fe7b8b7fd1 Add missing license header to alerting_test.go 2016-08-13 00:11:52 +02:00
Fabian Stäber 08b6556ee6 Assume counters start at zero after reset. 2016-08-12 20:21:04 +02:00
Brian Brazil d118acef96 Merge pull request #1889 from prometheus/fix-rule-escaping
Fix rule HTML escaping issues
2016-08-12 08:58:10 +01:00
Julius Volz da7206ec29 Fix rule HTML escaping issues
This was mentioned as part of https://github.com/prometheus/alertmanager/issues/452
2016-08-12 02:59:41 +02:00
Fabian Reinartz be596f82b4 Merge pull request #1783 from knyar/json
Allow URLs in targets defined via a JSON file
2016-08-10 09:42:17 +02:00
Fabian Reinartz 76edb86e86 Merge pull request #1878 from brancz/relabel-alerts
allow relabeling of alerts
2016-08-09 14:50:08 +02:00
Frederic Branczyk b655aa002f introduce top level alerting config node 2016-08-09 14:19:25 +02:00
Frederic Branczyk 7714b9c781 move relabeling functionality to its own package
also remove the returned error as it was always nil
2016-08-09 14:19:20 +02:00
Frederic Branczyk 679d225c8d allow relabeling of alerts
in case of dropping don't even enqueue them
2016-08-09 14:18:31 +02:00
Fabian Reinartz c9b58d3e27 Merge pull request #1877 from prometheus/fabxc-fix-joblink
web/ui: fix job link
2016-08-09 09:42:50 +02:00
Fabian Reinartz df22684b5b web/ui: fix job link 2016-08-08 19:03:51 +02:00
Fabian Reinartz 98c0d33567 Merge pull request #1875 from brancz/idelta-function
add idelta function
2016-08-08 12:33:07 +02:00
Frederic Branczyk f02df4138c refactor duplication of irate and idelta functions implementations 2016-08-08 10:52:00 +02:00
Fabian Reinartz 32fad9fbb4 Merge pull request #1874 from prometheus/fabxc-joblink
Add HTML link for job name on target page
2016-08-08 10:48:34 +02:00
Fabian Reinartz cfe5c5fa15 Merge branch 'master' of https://github.com/cambridge-university-press/prometheus into cambridge-university-press-master 2016-08-08 10:46:36 +02:00
Fabian Reinartz d3aa6c0133 Merge pull request #1872 from grandbora/ui-url-params
Use query parameters in the /graph page
2016-08-08 10:43:06 +02:00
Frederic Branczyk dbf83666bb add idelta function
Similar to the irate function, the idelta function calculates the
delta from the last two values.
2016-08-08 10:40:50 +02:00
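
As a rough illustration of the description above, the difference between the last two values can be computed as follows. This is a sketch on a plain slice of floats, not the PromQL implementation. The function name `idelta` is reused here only for readability, and the boolean result for "too few samples" is an assumption.

```go
package main

import "fmt"

// idelta returns the difference between the last two values of a series.
// The second result is false when fewer than two values are available.
// This is an illustrative sketch, not the PromQL function itself.
func idelta(values []float64) (float64, bool) {
	n := len(values)
	if n < 2 {
		return 0, false
	}
	return values[n-1] - values[n-2], true
}

func main() {
	fmt.Println(idelta([]float64{1, 4, 9}))
	fmt.Println(idelta([]float64{1}))
}
```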
Frederic Branczyk 0ce5e7fe6d move legacy test for delta function 2016-08-08 10:02:58 +02:00
Bora Tunca 12bcc92311 Generate bindata.go 2016-08-08 09:52:14 +02:00
Bora Tunca fc6cdd0611 Update backend helpers and templates to new url schema 2016-08-08 09:52:14 +02:00
Bora Tunca 445fac56e0 Refactor graph.js 2016-08-08 09:52:13 +02:00
Bora Tunca 3e18d86d8a Use query parameters in the url 2016-08-06 17:28:18 +02:00
Bora Tunca 3da825fc76 Point to correct place for GraphLinkForExpression 2016-08-06 17:28:18 +02:00
Julius Volz d770783777 Neurotic cleanups to graph.js 2016-08-05 23:35:11 +02:00
Fabian Reinartz 70490fe568 Merge pull request #1805 from prometheus/higher-level-storage-interface
Make the storage interface higher-level.
2016-08-05 16:17:14 +02:00
Fabian Reinartz 806571074a Merge pull request #1869 from prometheus/fabxc-patch-1
Clarify comment on rule evaluation
2016-08-03 10:47:28 +02:00
Fabian Reinartz 9a269b5507 Clarify comment on rule evaluation
Fixes #1866
2016-08-03 08:29:51 +02:00
Fabian Reinartz a4ee5b14d5 Merge pull request #1865 from prometheus/alert-name
Remove __name__ from alerts sent to AM.
2016-08-01 23:34:19 -07:00