Brian Brazil
f421ce0636
Remove label from prometheus_target_skipped_scrapes_total (#2289)
...
This avoids it not being initialised, and breaking out by
interval wasn't particularly useful.
Fixes #2269
2016-12-16 18:00:52 +00:00
Brian Brazil
30448286c7
Add sample_limit to scrape config.
...
This imposes a hard limit on the number of samples ingested from the
target. This is counted after metric relabelling, to allow dropping of
problematic metrics.
This is intended as a very blunt tool to prevent overload due to
misbehaving targets that suddenly jump in sample count (e.g. adding
a label containing email addresses).
Add metric to track how often this happens.
Fixes #2137
2016-12-16 15:10:09 +00:00
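The ordering described in 30448286c7 (count samples only after metric relabelling, then enforce the limit) can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: the `sample` type, `dropDebug`, `applyLimit` and `errSampleLimit` are hypothetical names, and treating a limit of 0 as "no limit" and rejecting the whole batch when the limit is exceeded are assumptions of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// sample is a hypothetical, simplified scraped sample.
type sample struct {
	name  string
	value float64
}

// errSampleLimit is a hypothetical error for a scrape that exceeds the limit.
var errSampleLimit = errors.New("sample limit exceeded")

// dropDebug stands in for a metric relabelling rule that drops
// every metric whose name starts with "debug_".
func dropDebug(s sample) bool {
	return !strings.HasPrefix(s.name, "debug_")
}

// applyLimit first filters the samples with keep (the relabelling step)
// and only then compares the remaining count against limit.
// Assumption of this sketch: a limit of 0 means "no limit".
func applyLimit(in []sample, keep func(sample) bool, limit int) ([]sample, error) {
	out := make([]sample, 0, len(in))
	for _, s := range in {
		if keep(s) {
			out = append(out, s)
		}
	}
	if limit > 0 && len(out) > limit {
		return nil, errSampleLimit
	}
	return out, nil
}

func main() {
	scraped := []sample{{"http_requests_total", 10}, {"up", 1}, {"debug_heap", 42}}

	// Three samples were scraped, but only two survive relabelling,
	// so a limit of 2 is not exceeded.
	kept, err := applyLimit(scraped, dropDebug, 2)
	fmt.Println(len(kept), err) // 2 <nil>

	_, err = applyLimit(scraped, dropDebug, 1)
	fmt.Println(err) // sample limit exceeded
}
```

Counting after the filter is what lets a relabelling rule bring a misbehaving target back under the limit.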
Björn Rabenstein
f3f798fbcf
Merge pull request #2283 from tcolgate/ignoredots
...
ignore dotfiles in data directory
2016-12-15 13:32:03 +01:00
Tristan Colgate
30be8e0b8a
ignore dotfiles in data directory
2016-12-15 11:48:23 +00:00
Tristan Colgate-McFarlane
4d9134e6d8
Add labeldrop and labelkeep actions. (#2279)
...
Introduce two new relabel actions: labeldrop and labelkeep.
These can be used to filter the set of labels by matching a regex:
- labeldrop: drops all labels that match the regex
- labelkeep: drops all labels that do not match the regex
2016-12-14 10:17:42 +00:00
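The two actions from 4d9134e6d8 are complements of each other, which a small sketch makes concrete. This is not the code from the commit: `anchored`, `labelDrop` and `labelKeep` are hypothetical helpers, and the sketch assumes the regex is matched against label names and is fully anchored.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchored compiles expr so that it must match the whole label name.
// Assumption of this sketch: relabel regexes are fully anchored.
func anchored(expr string) *regexp.Regexp {
	return regexp.MustCompile("^(?:" + expr + ")$")
}

// labelDrop returns a copy of labels without the labels whose name matches re.
func labelDrop(labels map[string]string, re *regexp.Regexp) map[string]string {
	out := make(map[string]string, len(labels))
	for name, value := range labels {
		if !re.MatchString(name) {
			out[name] = value
		}
	}
	return out
}

// labelKeep returns a copy of labels with only the labels whose name matches re.
func labelKeep(labels map[string]string, re *regexp.Regexp) map[string]string {
	out := make(map[string]string, len(labels))
	for name, value := range labels {
		if re.MatchString(name) {
			out[name] = value
		}
	}
	return out
}

func main() {
	labels := map[string]string{"job": "api", "tmp_a": "1", "tmp_b": "2"}
	re := anchored("tmp_.*")

	fmt.Println(labelDrop(labels, re)) // map[job:api]
	fmt.Println(labelKeep(labels, re)) // map[tmp_a:1 tmp_b:2]
}
```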
Björn Rabenstein
45570e5972
Merge pull request #2277 from prometheus/beorn7/storage2
...
storage: Sanity-check number of loaded chunk descs
2016-12-14 02:59:10 +01:00
beorn7
253be23c00
storage: Sanity-check number of loaded chunk descs
...
Two cases:
- An unarchived metric must have at least one chunk desc loaded upon
unarchival. Otherwise, the file is gone or has size 0, which is an
inconsistency (because the series is still indexed in the archive
index). Hence, quarantining is triggered.
- If loading the chunk descs of a series with a known chunkDescsOffset
(i.e. != -1), the number of chunks loaded must be equal to
chunkDescsOffset. If not, there is a data corruption. An error is
returned, which leads to quarantining.
In any case, there is a guard added to not access the 1st element of
an empty chunkDescs slice. (That's what triggered the crashes in issue
2249.) A time series with unknown chunkDescsOffset and no chunks in
memory and no chunks on disk either could trigger that case. I would
assume such a "null series" doesn't exist, but it's not entirely
unthinkable that one could occur (perhaps in future uses of the
storage), e.g. a series is created and then something tries to preload
chunks before the first sample is added.
2016-12-13 23:19:39 +01:00
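The two checks and the guard described in 253be23c00 might look roughly like this. It is a sketch under assumptions, not the storage code itself: `chunkDesc`, `checkLoaded`, `firstTime` and the error values are hypothetical, and only the name chunkDescsOffset and its "unknown" value of -1 come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkDesc is a hypothetical, stripped-down chunk descriptor.
type chunkDesc struct {
	firstTime int64
}

// unknownOffset mirrors the "-1 means unknown" convention from the commit message.
const unknownOffset = -1

var (
	errNoChunkDescs   = errors.New("unarchived series has no chunk descs")
	errOffsetMismatch = errors.New("loaded chunk descs do not match chunkDescsOffset")
)

// checkLoaded applies the two sanity checks. In the commit, a failing
// check leads to quarantining; here it simply returns an error.
func checkLoaded(loaded []chunkDesc, chunkDescsOffset int, justUnarchived bool) error {
	// Case 1: an unarchived series must have at least one chunk desc.
	if justUnarchived && len(loaded) == 0 {
		return errNoChunkDescs
	}
	// Case 2: with a known offset, the loaded count must equal it.
	if chunkDescsOffset != unknownOffset && len(loaded) != chunkDescsOffset {
		return errOffsetMismatch
	}
	return nil
}

// firstTime guards against indexing into an empty slice.
func firstTime(cds []chunkDesc) (int64, bool) {
	if len(cds) == 0 {
		return 0, false
	}
	return cds[0].firstTime, true
}

func main() {
	fmt.Println(checkLoaded(nil, unknownOffset, true))
	fmt.Println(checkLoaded([]chunkDesc{{100}}, 2, false))
	fmt.Println(checkLoaded([]chunkDesc{{100}, {200}}, 2, false))

	_, ok := firstTime(nil)
	fmt.Println(ok)
}
```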
Björn Rabenstein
5f0c0e43cf
Merge pull request #2276 from prometheus/beorn7/storage
...
storage: Catch data corruption that leads to division by zero
2016-12-13 23:13:39 +01:00
Björn Rabenstein
a4c8292232
Merge pull request #2278 from prometheus/beorn7/style
...
storage: Fix linter issue
2016-12-13 23:13:05 +01:00
beorn7
837c029b16
storage: Fix linter issue
...
Go style tries to avoid indented `else` blocks.
2016-12-13 19:05:30 +01:00
Brian Brazil
c8de1484d5
Add scrape_samples_post_metric_relabeling
...
This reports the number of samples post any keep/drop
from metric relabelling.
2016-12-13 17:32:11 +00:00
Brian Brazil
06b9df65ec
Refactor and add unittests to scrape result handling.
2016-12-13 16:49:17 +00:00
Björn Rabenstein
568fd8a8cb
Merge pull request #2155 from prometheus/beorn7/vendoring2
...
Update vendoring for Azure
2016-12-13 17:10:59 +01:00
beorn7
4719482f5f
storage: Make tests go-vet and golint clean
2016-12-13 17:07:27 +01:00
beorn7
485ac8dff7
storage: Verify validity of byte length when unmarshalling (double)delta chunks
...
This makes sure a division-by-zero crash cannot happen in the Len()
method.
Fixes #2773
2016-12-13 17:07:27 +01:00
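To illustrate why validating lengths at unmarshal time protects a later Len() call, here is a sketch over a made-up layout. The three-byte header (one byte of sample width, two bytes of payload length) is purely hypothetical and is not the actual (double)delta chunk encoding. The point is only that corrupt header values are rejected before they can reach a division.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerBytes is the size of the hypothetical header:
// 1 byte sample width followed by a 2-byte little-endian payload length.
const headerBytes = 3

// chunk is a hypothetical fixed-width chunk.
type chunk struct {
	width   int
	payload []byte
}

// unmarshal validates every header field that Len later relies on.
func unmarshal(buf []byte) (*chunk, error) {
	if len(buf) < headerBytes {
		return nil, errors.New("chunk shorter than header")
	}
	width := int(buf[0])
	n := int(binary.LittleEndian.Uint16(buf[1:3]))
	if width == 0 {
		// Without this check, Len would divide by zero.
		return nil, errors.New("corrupt chunk: zero sample width")
	}
	if n > len(buf)-headerBytes {
		return nil, errors.New("corrupt chunk: recorded length exceeds buffer")
	}
	if n%width != 0 {
		return nil, errors.New("corrupt chunk: payload is not a multiple of sample width")
	}
	return &chunk{width: width, payload: buf[headerBytes : headerBytes+n]}, nil
}

// Len is safe because unmarshal guarantees width > 0.
func (c *chunk) Len() int {
	return len(c.payload) / c.width
}

func main() {
	good, err := unmarshal([]byte{2, 4, 0, 1, 2, 3, 4})
	fmt.Println(good.Len(), err) // 2 <nil>

	_, err = unmarshal([]byte{0, 4, 0, 1, 2, 3, 4})
	fmt.Println(err) // corrupt chunk: zero sample width
}
```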
Brian Brazil
b5ded43594
Allow buffering of scraped samples before sending them to storage.
2016-12-13 15:01:35 +00:00
beorn7
906c3a2237
Update vendoring for Azure
...
Also, actually record the vendored version in vendor.json.
2016-12-13 14:21:16 +01:00
tattsun
e714079cf2
storage: fix error message (#2270)
...
* storage: add error message
2016-12-09 22:36:27 +00:00
Fabian Reinartz
9ecea36ef9
Merge pull request #2259 from prometheus/federationerr
...
web: don't return federation errors over HTTP
2016-12-06 16:18:03 +01:00
Fabian Reinartz
cef2e04aa3
web: add error counter for federation responses
2016-12-06 16:09:50 +01:00
Fabian Reinartz
0ea0a19848
Merge pull request #2240 from agaoglu/read-timeout
...
Set read-timeout for http.Server
2016-12-06 16:01:45 +01:00
Fabian Reinartz
9d68e81b32
web: don't return federation errors over HTTP
...
Federation responses are written as a stream, so once the first
byte has been written, the status header is fixed. We cannot
return an HTTP error for an intermediate error and should just abort
and log instead.
2016-12-06 15:52:50 +01:00
Erdem Agaoglu
054f8ebbfb
Increase default max-connections
2016-12-06 17:45:19 +03:00
Erdem Agaoglu
2260079c12
Vendor x/net/netutil
2016-12-06 12:52:29 +03:00
Erdem Agaoglu
e487477a17
LimitListener to limit max number of connections
...
This also drops tcp keep-alive in ListenAndServe but it's no longer
necessary since we now close idle connections long before that.
2016-12-06 12:45:59 +03:00
Fabian Reinartz
893390e0c6
Merge pull request #2248 from msiebuhr/cwd-in-status
...
web: Display current working directory on status-page
2016-12-05 21:41:37 +01:00
Morten Siebuhr
c5b17263a6
web: Display current working directory on status-page
2016-12-05 19:46:41 +01:00
Björn Rabenstein
a932c1a4b6
Merge pull request #1794 from cmluciano/cml/persistenceerror
...
Clarify error message when Prometheus data dir finds unexpected files
2016-12-05 18:40:51 +01:00
Christopher M. Luciano
148b006e25
Clarify error message when Prometheus data dir finds unexpected files
2016-12-05 10:51:57 -05:00
Fabian Reinartz
0459dcd2e2
Merge pull request #2234 from brancz/targets-api
...
web/api: add targets endpoint
2016-12-05 14:14:04 +01:00
Frederic Branczyk
33b583d50e
web/api: add targets endpoint
2016-12-05 13:13:21 +01:00
Frederic Branczyk
8f8cea4fbd
retrieval: refactor TargetManager to return flat list of Targets
2016-12-02 13:28:58 +01:00
Erdem Agaoglu
9986b28380
Set read-timeout for http.Server
...
This also specifies a timeout for idle client connections, which may
otherwise cause "too many open files" errors.
See #2238
2016-12-01 16:29:45 +03:00
Fabian Reinartz
63fe65bf2f
Merge pull request #2235 from prometheus/beorn7/doc
...
Kubernetes SD: More fixes to example config
2016-11-30 09:55:09 +01:00
beorn7
5770d9e545
Kubernetes SD: More fixes to example config
...
- Avoid mentioning the `in_cluster` option. (It doesn't exist anymore.)
- Replace `__meta_kubernetes_service_namespace` and
`__meta_kubernetes_pod_namespace` (which don't exist anymore) by
`__meta_kubernetes_namespace`.
2016-11-29 18:42:35 +01:00
Fabian Reinartz
2a89e8733f
Merge pull request #2230 from prometheus/cut-1.4.1
...
*: cut 1.4.1
2016-11-28 09:33:26 +01:00
Fabian Reinartz
6be1e98278
*: cut 1.4.1
2016-11-28 09:29:23 +01:00
Fabian Reinartz
d95e61d418
Merge pull request #2223 from prometheus/consulfix
...
consul: start service watch as goroutine
2016-11-28 08:00:41 +01:00
Fabian Reinartz
35da23fd82
consul: start service watch as goroutine
2016-11-27 11:01:16 +01:00
Fabian Reinartz
56f57a826f
Merge pull request #2219 from prometheus/builderimg
...
circle: update golang-builder image version
2016-11-25 16:05:53 +01:00
Fabian Reinartz
340de6c31c
circle: update golang-builder image version
2016-11-25 14:29:07 +01:00
Fabian Reinartz
ecad074e46
Merge pull request #2218 from prometheus/cut-1.4.0
...
*: cut 1.4.0
2016-11-25 13:35:04 +01:00
Fabian Reinartz
80455950ee
*: cut 1.4.0
2016-11-25 13:28:29 +01:00
Fabian Reinartz
b97f19a85e
travis: update used Go compiler version
2016-11-25 13:28:19 +01:00
Fabian Reinartz
9b7f5c7f29
Merge pull request #2217 from prometheus/alertingsd
...
Extract alertmanager into interface
2016-11-25 11:28:38 +01:00
Fabian Reinartz
2ad56aabd4
notifier: extract alertmanager into interface
2016-11-25 11:19:43 +01:00
Fabian Reinartz
cc35104504
config: fix naming and typo
2016-11-25 11:04:33 +01:00
Fabian Reinartz
fd51ab46e5
Merge pull request #2215 from prometheus/alertingsd2
...
Discover Alertmanagers dynamically
2016-11-25 10:38:00 +01:00
Fabian Reinartz
b1f28b48a3
Fix typo
2016-11-25 08:47:04 +01:00
Fabian Reinartz
3fb4d1191b
config: rename AlertingConfig, resolve file paths
2016-11-24 15:19:37 +01:00