Commit graph

7507 commits

Author SHA1 Message Date
Alec 109252f3aa Update encoding_helpers.go (len of be64 should be 8) (#521) 2019-02-13 11:04:21 +02:00
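
For context on the length fix above: a big-endian 64-bit integer always occupies 8 bytes. The sketch below is illustrative only; `be64Len` and `putBE64` are hypothetical names, not the identifiers used in encoding_helpers.go.

```go
package main

import "encoding/binary"

// be64Len is the fixed encoded size, in bytes, of a big-endian uint64.
// (Hypothetical name, used only for this illustration.)
const be64Len = 8

// putBE64 appends v to dst as 8 big-endian bytes and returns the
// extended slice.
func putBE64(dst []byte, v uint64) []byte {
	var buf [be64Len]byte
	binary.BigEndian.PutUint64(buf[:], v)
	return append(dst, buf[:]...)
}
```

Calling `putBE64(nil, 1)` yields an 8-byte slice whose last byte is 1, since the most significant byte comes first.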
Tom Wilkie e248ffb220 Add alert for WAL remote write falling behind.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-02-12 15:22:58 +00:00
Callum Styan 37e35f9e0c Various improvements to WAL based remote write.
- Use the queue name in WAL watcher logging.
- Don't return from watch if the reader error was EOF.
- Fix sample timestamp check logic regarding what samples we send.
- Refactor so we don't need readToEnd/readSeriesRecords
- Fix wal_watcher tests since readToEnd no longer exists

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-02-12 11:39:13 +00:00
Tom Wilkie b93bafeee1 Various fixes to locking & shutdown for WAL-based remote write.
- Remove datarace in the exported highest scrape timestamp.
- Backoff on enqueue should be per-sample - reset the result for each sample.
- Remove diffKeys, unused ctx and cancelfunc in WALWatcher, 'name' from writeTo interface, and pass it to constructor.
- Reorder functions in WALWatcher depth-first according to call graph.
- Fix vendor/modules.txt.
- Split out the various timer periods into consts at the top of the file.
- Move w.currentSegmentMetric.Set close to where we set the currentSegment.
- Combine r.Next() and isClosed(w.quit) into a single loop.
- Unnest some ifs in WALWatcher.watch, propagate errors in decodeRecord, add some new lines to make it easier to read.
- Reorganise checkpoint handling to reduce nesting and make it easier to follow.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-02-12 11:39:13 +00:00
Callum Styan 6f69e31398 Tail the TSDB WAL for remote_write
This change switches the remote_write API to use the TSDB WAL.  This should reduce memory usage and prevent sample loss when the remote end point is down.

We use the new LiveReader from TSDB to tail WAL segments.  Logic for finding the tracking segment is included in this PR.  The WAL is tailed once for each remote_write endpoint specified. Reading from the segment is based on a ticker rather than relying on fsnotify write events, which were found to be complicated and unreliable in early prototypes.

Enqueuing a sample for sending via remote_write can now block, to provide back pressure.  Queues are still required to achieve parallelism and batching.  We have updated the queue config based on new defaults for queue capacity and pending samples values - much smaller values are now possible.  The remote_write resharding code has been updated to prevent deadlocks, and extra tests have been added for these cases.
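
A minimal sketch of a blocking enqueue, assuming a bounded Go channel as the queue. The `queue` type and its methods are hypothetical and much simpler than the real remote_write queue manager.

```go
package main

// queue is a bounded sample queue. Its capacity caps the number of
// pending samples. (Hypothetical type, for illustration only.)
type queue struct {
	ch chan float64
}

// newQueue returns a queue that holds at most capacity pending samples.
func newQueue(capacity int) *queue {
	return &queue{ch: make(chan float64, capacity)}
}

// enqueue blocks while the queue is full, which slows the producer down
// to the pace of the consumer (back pressure). It returns false if quit
// is closed before space becomes available.
func (q *queue) enqueue(v float64, quit <-chan struct{}) bool {
	select {
	case q.ch <- v:
		return true
	case <-quit:
		return false
	}
}
```

Because the producer waits instead of dropping or buffering without limit, a small capacity is enough to bound memory use.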

As part of this change, we attempt to guarantee that samples are not lost; however this initial version doesn't guarantee this across Prometheus restarts or non-retryable errors from the remote end (e.g. 400s).

This change also includes the following optimisations:
- only marshal the proto request once, not once per retry
- maintain a single copy of the labels for given series to reduce GC pressure
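
The first optimisation can be illustrated with a small sketch. It assumes the marshalling and sending steps are passed in as functions; `sendWithRetry`, `errTemporary`, and `errNoAttempts` are hypothetical names, and a real sender would also back off between attempts.

```go
package main

import "errors"

// Hypothetical sentinel errors for the illustration.
var (
	errTemporary  = errors.New("temporary failure")
	errNoAttempts = errors.New("maxAttempts must be at least 1")
)

// sendWithRetry encodes the request a single time and reuses the same
// bytes for every attempt, instead of re-marshalling on each retry.
func sendWithRetry(marshal func() ([]byte, error), send func([]byte) error, maxAttempts int) error {
	if maxAttempts < 1 {
		return errNoAttempts
	}
	payload, err := marshal() // done once, outside the retry loop
	if err != nil {
		return err
	}
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = send(payload); err == nil {
			return nil
		}
	}
	return err // the error from the last failed attempt
}
```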

Other minor tweaks:
- only reshard if we've also successfully sent recently
- add pending samples, latest sent timestamp, WAL events processed metrics

Co-authored-by: Chris Marchbanks <csmarchbanks.com> (initial prototype)
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> (sharding changes)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-02-12 11:39:13 +00:00
Krasi Georgiev 9f28ffa6f4
Merge pull request #465 from krasi-georgiev/shutdown-during-compaction
use context to cancel compactions
2019-02-12 11:25:40 +02:00
Krasi Georgiev bf79c767f0 new line
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-12 11:08:09 +02:00
Krasi Georgiev beee5c58f3 Merge remote-tracking branch 'upstream/master' into shutdown-during-compaction
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-12 10:56:45 +02:00
Krasi Georgiev 6fce018def
Merge pull request #512 from krasi-georgiev/delete-compact-block-on-reload-error
Delete compact block on reload error
2019-02-12 10:50:21 +02:00
Maria Nemtinova 8e3a39f725 Web UI QoL improvements (#5201)
1. Added an ability to resize text area on mouseclick
2. Remember selected target status button on page reload

Signed-off-by: Maria Nemtinova <nemtinovamasha@gmail.com>
2019-02-12 00:22:05 +01:00
Ganesh Vernekar ce69dcb0e5
Propose @codesome as 2.8 release shepherd
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-02-11 23:44:01 +05:30
Krasi Georgiev bf2239079d refactor multi errors
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-11 12:28:46 +02:00
Krasi Georgiev c3a5c1d891 refactor error handling
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-11 12:22:11 +02:00
Krasi Georgiev 07df4fd383 nits
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-11 11:25:57 +02:00
JoeWrightss 4cb6c202ff Fix fmt.Errorf error message (#5199)
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-10 15:16:20 +05:30
Tariq Ibrahim a2a6e24f9f show list of offending labels in the error message in many-to-many scenarios (#5189)
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-02-09 10:17:52 +01:00
Krasi Georgiev 0f8f5027ef remove nested for if
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-08 18:09:23 +02:00
Krasi Georgiev d48606827c simplify closers
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-08 13:35:32 +02:00
Krasi Georgiev e138c7ed7e Merge remote-tracking branch 'upstream/master' into delete-compact-block-on-reload-error
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-08 13:26:28 +02:00
Krasi Georgiev da9da9fbee fix the sleep logic
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-08 12:39:25 +02:00
Krasi Georgiev 0b72f9af4c
Merge pull request #270 from codesome/master
Head: don't create stones, delete samples directly
2019-02-08 12:35:01 +02:00
Minh-Long Do b26b5c9e96 Add rendering test of template based web endpoints (#5188)
Signed-off-by: Minh-Long  Do <minhlong.langos@gmail.com>
2019-02-08 10:17:47 +00:00
Krasi Georgiev 1e91d619de
Merge pull request #493 from simonpasquier/update-makefile-common
fix static check errors and removed a lot of unused code and vars.
updated Makefile.common
2019-02-08 11:47:06 +02:00
Krasi Georgiev 457534d5c4 simplify nesting.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-08 11:36:30 +02:00
jritchieBAE b8f0a41745 Update to Bootstrap 4.1.3 (#5192)
* web: updated bootstrap3-typeahead file to work with bootstrap 4.0.0

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: Replaced bootstrap-3.3.1 with bootstrap 4.0.0

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: Added bootstrap4-glyphicons as 4.0.0 doesn't include bootstrap3 glyphicons

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated js jquery to 3.3.1

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated _base.html to import new bootstrap 4.0.0, jquery3.3.1 and bootstrap class tags to be 4.0 compatible

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: _base.html missed word out in title tag (Server).

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated alerts.html class names and tags to be bootstrap 4 compatible.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated config.html class names and tags to be bootstrap 4 compatible.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated flags.html class names and tags to be bootstrap 4 compatible.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated service-discovery.html class names and tags to be bootstrap 4 compatible.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated status.html class names and tags to be bootstrap 4 compatible.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated targets.html class names and tags to be bootstrap 4 compatible.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: updated graph_template.handlebar class names and tags to be bootstrap 4 compatible.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: alerts.css fix for button color inheritance on alerts page.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: graph.css fix for color inheritance.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: prometheus.css updated to fix nav bar.

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* web: previous merge conflict not fixed correctly on _base.html

Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>

* menu.lib and prom.lib imports updated

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* bootstrap 4.1.3 imported

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Bootstrap 4.1.3 imported into _base.html

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* bootstrap 4.1.3 imported into prom.lib

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* menu.lib style adjusted to view sidebar

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Alert colour uplifted to bootstrap 4.1.3

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Alerts display code reformatted similarly to config

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Consoles pages adjusted to account for new navbar

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* LHS Menu fixed in console pages

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Minor changes to prom_console to adjust lhs nav

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Prom.lib and some css updated to fix console graph controls

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Bootstrap 4.0.0 files removed

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Consoles configured so that the graph fits with the new side bar, css files also adjusted

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Import popper.min.js for dropdowns

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Popper.min.js imported locally

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Re-added #4764 and fixed css

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Removed .DS_Store

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Rebuilt assets

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* Spaces between buttons and inputs on graph page removed

Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>

* fixed spacing in buttons on /targets

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* Updated vfsdata.go

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* fixed typeahead issue

Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>

* added css for dropdown

Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>

* changed order of css imports

Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>

* tinkered with CSS changes to make keyboard select and mouseover match

Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>
2019-02-07 22:18:09 +01:00
Ganesh Vernekar 5481549324
Merge remote-tracking branch 'upstream/master'
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-02-07 22:15:06 +05:30
Simon Pasquier fc10f6d814
Unset GO111MODULE variable in Makefile.common (#5191)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-07 17:22:04 +01:00
Simon Pasquier 272fd0eabf Update Makefile.common
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-07 15:06:10 +01:00
Simon Pasquier 95334f13c5 Merge branch 'master' into update-makefile-common
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-07 12:10:22 +01:00
Krasi Georgiev 2ae0620205 rename some vars and use Gauge instead of Counter for metrics
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-07 10:09:42 +02:00
Erik Hollensbe 9154e9aca8 Add miekg/dns 1.1.4
Signed-off-by: Erik Hollensbe <github@hollensbe.org>
2019-02-06 21:41:36 +00:00
Goutham Veeramachaneni 9b8bbe3246
Merge pull request #5187 from prometheus/beorn7/release
Merge v2.7 bugfixes into master
2019-02-06 21:32:06 +01:00
Krasi Georgiev 776769377e fix merr logic.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-06 16:59:28 +02:00
beorn7 d26e134bd4 Merge branch 'release-2.7' into beorn7/release
Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-06 15:22:40 +01:00
Björn Rabenstein 3db36f34ec
Merge pull request #5186 from prometheus/beorn7/metrics
Fix prometheus_rule_group_last_evaluation_timestamp_seconds
2019-02-06 15:19:08 +01:00
Krasi Georgiev 08e7bc8ee8 always remove tmp
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-06 16:09:42 +02:00
Krasi Georgiev 45acaadd81 review changes
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-06 14:07:35 +02:00
beorn7 2db1eeb4ec Fix prometheus_rule_group_last_evaluation_timestamp_seconds
It should be a unix timestamp, not the seconds in the minute.

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-06 11:02:49 +01:00
Krasi Georgiev 77cc2ea210
Update index.md(format of series entry) (#515)
* Update index.md
* more clear format representation
* move dots to the bottom
* update #labels count, #chunks count
* use correct variable types.

Signed-off-by: naivewong <867245430@qq.com>
2019-02-06 12:13:38 +03:00
naivewong 9b227d27e7 update labels count, chunks count
Signed-off-by: naivewong <867245430@qq.com>
2019-02-06 10:00:18 +08:00
naivewong 8e7e2041d3 update #labels count, #chunks count
Signed-off-by: naivewong <867245430@qq.com>
2019-02-06 00:48:25 +08:00
naivewong 42ee59689b move dots to the bottom
Signed-off-by: naivewong <867245430@qq.com>
2019-02-06 00:34:15 +08:00
naivewong 0649dfddf0 more clear format representation
Signed-off-by: naivewong <867245430@qq.com>
2019-02-06 00:05:30 +08:00
naivewong 1f723a8eb5 update #labels and #chunks
Signed-off-by: naivewong <867245430@qq.com>
2019-02-05 23:20:43 +08:00
Alec 8e589474c6 Update index.md
Signed-off-by: naivewong <867245430@qq.com>
2019-02-05 10:34:25 +08:00
Alec 2b6bc9fb32 Update index.md
Signed-off-by: naivewong <867245430@qq.com>
2019-02-05 10:34:25 +08:00
zhulongcheng fd964426a7 web: predeclare and reuse errors (#5180)
Predeclare and reuse errors to reduce duplicate code
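
A minimal sketch of the pattern, assuming a package-level sentinel error shared by several call sites. `errNotFound`, `lookup`, and `mustBePresent` are hypothetical and are not taken from the web package.

```go
package main

import "errors"

// errNotFound is declared once at package level and reused wherever the
// same condition occurs, instead of constructing an identical error at
// each call site. (Hypothetical name.)
var errNotFound = errors.New("not found")

// lookup returns the value stored under key, or errNotFound.
func lookup(m map[string]int, key string) (int, error) {
	v, ok := m[key]
	if !ok {
		return 0, errNotFound
	}
	return v, nil
}

// mustBePresent reuses the same predeclared error for a second check.
func mustBePresent(m map[string]int, key string) error {
	if _, ok := m[key]; !ok {
		return errNotFound
	}
	return nil
}
```

A shared error value also lets callers compare against it directly or with `errors.Is`.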

Signed-off-by: zhulongcheng <zhulongcheng.me@gmail.com>
2019-02-04 13:06:26 +01:00
Alec 1bcda9d23f Update tombstones.md (format of tombstone) (#511) 2019-02-04 12:30:47 +03:00
Krasi Georgiev ce4a2083fb nit
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-04 11:15:08 +02:00
Krasi Georgiev 752ab86e4e change the test block series for more stable tests
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2019-02-04 11:14:39 +02:00