* Adding TSDB Head Stats like cardinality to Status Page
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Moving mutex to Head
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Renaming variables
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Renaming variables and html
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Removing unwanted whitespaces
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Adding Tests, Benchmarks and Max Heap for Postings Stats
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Adding more tests for postingstats and web handler
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Adding more tests for postingstats and web handler
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Remove generated asset file that is no longer used
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
* Changing comment and variable name for more readability
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* Using time.Duration in postings status function and removing refresh button from web page
Signed-off-by: Sharad Gaur <sgaur@splunk.com>
* pass the value to the input instead to downshift
Signed-off-by: blalov <boyko.lalov@tick42.com>
* adjust expression input tests
Signed-off-by: blalov <boyko.lalov@tick42.com>
* improve ExpressionInput test coverage
Signed-off-by: blalov <boyko.lalov@tick42.com>
* React UI: Support custom path prefixes
The challenge was that the path prefix can be set dynamically as a flag
on Prometheus, but the React app bundle is statically compiled in to
expect a given path prefix. By adding a placeholder value to the React
app's index.html and replacing it in Prometheus with the right path
prefix during serving, this injects Prometheus's path prefix into the
React app via a global const.
Threading the path prefix into the different React components could have
been done with React's Contexts (https://reactjs.org/docs/context.html),
but I found the consumer side of context values to be a bit cumbersome
(wrapping entire components in context consumers), so I ended up
preferring direct threading of the path prefix values to components that
needed them. Also, using contexts in tests is more verbose than just
passing in path prefix values directly.
Fixes https://github.com/prometheus/prometheus/issues/6163
Signed-off-by: Julius Volz <julius.volz@gmail.com>
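The placeholder replacement described above could look roughly like the
following minimal sketch. The placeholder string and the function name are
assumptions made for illustration; the commit message does not name them.

```go
package main

import (
	"fmt"
	"strings"
)

// pathPrefixPlaceholder is a hypothetical marker baked into the statically
// compiled index.html. The real placeholder value is not given above.
const pathPrefixPlaceholder = "PATH_PREFIX_PLACEHOLDER"

// injectPathPrefix replaces every occurrence of the placeholder with the
// path prefix that was configured on the server at startup.
func injectPathPrefix(indexHTML, prefix string) string {
	return strings.ReplaceAll(indexHTML, pathPrefixPlaceholder, prefix)
}

func main() {
	page := `<script>const PATH_PREFIX = "PATH_PREFIX_PLACEHOLDER";</script>`
	fmt.Println(injectPathPrefix(page, "/prometheus"))
}
```

Doing the substitution at serving time keeps the React bundle itself
unchanged regardless of which prefix is configured.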
* Review feedback
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* React UI: Improve styling of autocomplete sections
I removed the Card-related components and went back to normal <ul>/<li>,
since the style that Cards added just got in the way (like adding extra
borders and rounding, etc.), and from the examples at
https://getbootstrap.com/docs/4.3/components/card/, it doesn't seem like
multiple Cards are meant to be used as part of a larger list
(style-wise).
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Address review feedback
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* React UI: More conversions to Function Components
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Address chat feedback over Riot
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Implement the /flags page in react
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
* Use custom react hook for calling api
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
* local storage selectedTab on targets tab was renamed
Signed-off-by: Michał Szczygieł <1153719+mszczygiel@users.noreply.github.com>
* added filters when displaying alerts
Signed-off-by: Michał Szczygieł <1153719+mszczygiel@users.noreply.github.com>
* function was simplified
Signed-off-by: Michał Szczygieł <1153719+mszczygiel@users.noreply.github.com>
* fixed rebase
Signed-off-by: Michał Szczygieł <1153719+mszczygiel@users.noreply.github.com>
* minor rename
Signed-off-by: Michał Szczygieł <1153719+mszczygiel@users.noreply.github.com>
* Active -> Pending
Signed-off-by: Michał Szczygieł <1153719+mszczygiel@users.noreply.github.com>
This makes React UI URLs look nicer than the previous
/static/graph-new/app.html, but internally still serves all React UI
files from the compiled-in static assets directory.
Also, to allow future usage of the React / Reach router, we need to
serve the main React app's index.html on certain sub-paths that
correspond to current Prometheus's UI pages, instead of trying to serve
actual files that match the provided path name.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
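A minimal sketch of the sub-path routing idea, assuming a hypothetical list
of UI pages and a hypothetical helper name (neither is taken from the actual
change): paths that correspond to UI pages resolve to index.html, everything
else resolves to a real asset file.

```go
package main

import (
	"fmt"
	"strings"
)

// reactRouterPaths lists UI pages that the client-side router handles.
// The exact list is an assumption for illustration.
var reactRouterPaths = []string{"/graph", "/alerts", "/targets", "/flags"}

// assetFor maps a request path to the file to serve from the compiled-in
// static assets directory.
func assetFor(reqPath string) string {
	for _, p := range reactRouterPaths {
		if reqPath == p {
			// UI pages all get the main app; the router picks the view.
			return "index.html"
		}
	}
	// Anything else is treated as a request for an actual asset file.
	return strings.TrimPrefix(reqPath, "/")
}

func main() {
	fmt.Println(assetFor("/graph"))
	fmt.Println(assetFor("/static/js/main.js"))
}
```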
* Use root relative font size rather than px to avoid hidpi issues.
* Darken to 50% saturation of base font color.
Signed-off-by: Ben Kochie <superq@gmail.com>
The metric names only get loaded once initially, so there is no reason
to mix them up with the handling of ongoing query history.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Adds the query stats to UI
Adds the query load time, resolution and total number of time series,
as the current UI has
Signed-off-by: cstdev <pietomb00@hotmail.com>
* Implement unit test for QueryStats
Signed-off-by: cstdev <pietomb00@hotmail.com>
* Tidy Query Stats component
Rename it and expose an interface for the values it displays
Make it a functional component as it has no state or lifecycle
Better null/undefined checks
Only render if needed, decided by the panel
Remove old stats if the next errors
Signed-off-by: cstdev <pietomb00@hotmail.com>
* make expression input controlled
Signed-off-by: blalov <boyko.lalov@tick42.com>
* close menu explicitly when autosuggestion dropdown is hidden
Signed-off-by: blalov <boyko.lalov@tick42.com>
* Add component to sanitize html
Signed-off-by: Ritesh Shrivastav <ritesh.conf@gmail.com>
* Use SanitizeHTML component to allow only supported elements
Signed-off-by: Ritesh Shrivastav <ritesh.conf@gmail.com>
* Add allowedTags props in SanitizeHTML component
Signed-off-by: Ritesh Shrivastav <ritesh.conf@gmail.com>
* Update all React app node modules
I ran "yarn upgrade --latest" and then fixed items that caused errors
with new linter settings in the React UI source.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix more React UI lint errors that fail CI
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Initial commit from Create React App
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Initial Prometheus expression browser code
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Graphing, try out echarts
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Switch to flot
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add metrics fetching and stuff
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Autosuggest and graph improvements
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Start implementing graph controls, add loading spinner
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* So many new features and fixes
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fixed and built more features
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Make datetimepicker clear work
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Don't abort when executing empty expression
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove TabPaneAlert
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Split components into separate files
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add table time input
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Move first files to TypeScript!
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* More TypeScript conversions
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* More TS conversions
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* More TS conversions
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* More TS conversions
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* More TS conversions
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* More TS fixes
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Convert Graph to TS
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Changes
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Resize detector, start building legend, axis font colors
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Make graph legend work
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add URL params support and much more
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Put panel state into panel list, write URL options
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Change order of Graph and Table tabs
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Generalize time input naming more
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Work on history functionality
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* npm updates
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Move loading indicator into "Execute" button
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix typo
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Revert "Move loading indicator into "Execute" button"
This reverts commit ce7daee1f1af35da6c0d8b5517272839285ccfec.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Improve error message when failing to fetch server time
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Move all code to Prometheus repo target dir
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add react-app Makefile step and check in generated assets
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add preliminary npm packages notice to NOTICE file
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Update React app's favicon and metadata
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove RP server refs, cleanups
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Use CircleCI image that includes NodeJS
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add some missing React output assets
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Preserve CRLF in generated React files
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Switch from npm to yarn for React UI
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Save npm licenses and include them in release tarball
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Install npm on Travis
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove npm license tarball from source
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove React graph bundle from source
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Don't check in any compiled web assets
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Update README.md with node/yarn/React UI info
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix asset build step on CircleCI promu crossbuild
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Try to fix multi-arch go generate
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove check_assets from Travis CI build
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Prevent rebuilding of unchanged React app parts
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix npm license tarball path for promu
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Simplify Makefile
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Clarify build instructions in README.md
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Make minimal JS test pass
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Integrate React app tests into Makefile
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Separate react-app-tests target, but run it from CI
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix working directory for React app tests
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove local modifications to Makefile.common
This means that CircleCI will not run the React app tests, but at least
Travis still will...
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Depend on node_modules path for npm_licenses target
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Simplify tarball/docker/build Makefile targets
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Include React tests in "test" target
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Remove reference to removed "check_assets" target
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Do initial resize of expression input field
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add React app proxying to local Prometheus in dev mode
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* web/ui: handle null case
The call to /api/v1/label/__name__/values might sometimes return the
following:
```
{"status":"success","data":null}
```
Then the `index.js` file assumes that `data` is not `null`. However,
that assumption fails and then we get this error in the console:
```
graph.js?v=foo:317 Uncaught TypeError: Cannot read property 'length' of null
at Object.success (graph.js?v=foo:317)
...
```
Then it becomes impossible to, for example, send a simple query like
`time()` and graph the results.
Fix it by using an empty array as the result if it is `null`.
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
* ui: update static assets data
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
* Change the global variable 'name' to a local variable so that it can not populate the global space.
Signed-off-by: 朱正浩,Zhu Zhenghao <zhenghao.zhu@daocloud.io>
* run make assets
Signed-off-by: 朱正浩,Zhu Zhenghao <zhenghao.zhu@daocloud.io>
* Show warnings in UI if a query has returned some warnings
+ improve warning (error) text if query to remote was finished with error
* Add prefixes for remote_read errors
Signed-off-by: Stan Putrya <root.vagner@gmail.com>
* Fix context for the showWarning function
If the difference between the current time on a client and the time on a server is large, Prometheus tries to show a related warning in the UI on the Graph tab. But the code uses an incorrect context to invoke this method. As a result, an error is shown in the web developer console and the whole page stops working. This commit fixes the context.
CC @juliusv
Signed-off-by: Vyacheslav Kulakov <vkulakov@swiftserve.com>
* Fix context for the showWarning function
Fixed assets
CC @juliusv
Signed-off-by: Vyacheslav Kulakov <vkulakov@swiftserve.com>
* Add tests to ensure we can marshal and unmarshal our min/max times
Related to https://github.com/prometheus/client_golang/issues/614
Instead of implementing all the time parsing, we can special-case handle
these 2 times. This means if times in this format show up that
time.Parse can't handle they will still error, but we can marshal/parse
our own min/max time
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
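The special-casing described above can be sketched as follows. The
definitions of `minTime` and `maxTime` and the name `parseTime` are
assumptions for illustration, since the commit message does not spell them
out: the two extreme values are matched by their formatted string before
falling back to `time.Parse`.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

var (
	// Assumed definitions of the extreme timestamps.
	minTime = time.Unix(math.MinInt64/1000+62135596801, 0).UTC()
	maxTime = time.Unix(math.MaxInt64/1000-62135596801, 999999999).UTC()

	minTimeFormatted = minTime.Format(time.RFC3339Nano)
	maxTimeFormatted = maxTime.Format(time.RFC3339Nano)
)

// parseTime handles the two special values by string comparison and defers
// everything else to the standard parser, which may still return an error
// for other out-of-range years.
func parseTime(s string) (time.Time, error) {
	switch s {
	case minTimeFormatted:
		return minTime, nil
	case maxTimeFormatted:
		return maxTime, nil
	}
	return time.Parse(time.RFC3339Nano, s)
}

func main() {
	t, err := parseTime(minTimeFormatted)
	fmt.Println(t.Equal(minTime), err)
}
```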
* web: add prometheus_http_requests_total metrics
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Add unit test for requestCounter metric
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Working group name
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Working categorised by group name
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Changed group sorting in web
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Fixed group sorting and comments
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Fixed group sorting and comments with gofmt
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Added file and group name
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* reverted back to full path to yml file
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
Currently, when `/etc/mime.types` has an unusual mime type, the Prometheus web server uses that type and you may get unexpected results.
With this change, the web server returns a consistent Content-Type header for static js and css files.
To reproduce:
1. Add a type at the end of `/etc/mime.types` like `text/x-js js`
2. Run prometheus
3. Request js file like `http://localhost:9090/static/vendor/js/jquery.min.js`
4. You will see Content-Type of the response is `text/x-js` instead of `application/javascript`
Signed-off-by: mrasu <m.rasu.hitsuji@gmail.com>
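One way to get a consistent header is to choose the type from the file
extension instead of consulting the system mime database. The helper below
is a hypothetical sketch, not the code from this change; a handler would set
the returned value as the Content-Type header before writing the body.

```go
package main

import (
	"fmt"
	"path"
)

// staticContentType returns a fixed Content-Type for js and css files.
// The second return value is false when no fixed type applies.
func staticContentType(name string) (string, bool) {
	switch path.Ext(name) {
	case ".js":
		return "application/javascript", true
	case ".css":
		return "text/css", true
	}
	return "", false
}

func main() {
	ct, ok := staticContentType("/static/vendor/js/jquery.min.js")
	fmt.Println(ct, ok)
}
```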
The goal is to remove almost all references to the
golang.org/x/net/context package.
github.com/gogo/protobuf => v1.2.1
google.golang.org/grpc => v1.19.1
github.com/grpc-ecosystem/grpc-gateway => v1.18.5
It also replaces github.com/cockroachdb/cmux by github.com/soheilhy/cmux
because of [1] which fixes #3909 incidentally.
[1] https://github.com/grpc/grpc-go/issues/2636
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
- Unmarshal external_labels config as labels.Labels, add tests.
- Convert some more uses of model.LabelSet to labels.Labels.
- Remove old relabel pkg (fixes #3647).
- Validate external label names.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
* Display correct values for the retention in the flags web gui.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* adding a log entry
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* added the retention info to the runtime status page
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
* simplify the retention display
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
- input key handler causes 2 layout cycles on each keypress which can
clog up browser rendering when typing quickly
- this change adds a debounce to the key press handler of 500ms
Fixes #5308
Signed-off-by: David Kaltschmidt <david.kaltschmidt@gmail.com>
This change switches the remote_write API to use the TSDB WAL. This should reduce memory usage and prevent sample loss when the remote end point is down.
We use the new LiveReader from TSDB to tail WAL segments. Logic for finding the tracking segment is included in this PR. The WAL is tailed once for each remote_write endpoint specified. Reading from the segment is based on a ticker rather than relying on fsnotify write events, which were found to be complicated and unreliable in early prototypes.
Enqueuing a sample for sending via remote_write can now block, to provide back pressure. Queues are still required to achieve parallelism and batching. We have updated the queue config based on new defaults for queue capacity and pending samples values - much smaller values are now possible. The remote_write resharding code has been updated to prevent deadlocks, and extra tests have been added for these cases.
As part of this change, we attempt to guarantee that samples are not lost; however this initial version doesn't guarantee this across Prometheus restarts or non-retryable errors from the remote end (eg 400s).
This change also includes the following optimisations:
- only marshal the proto request once, not once per retry
- maintain a single copy of the labels for given series to reduce GC pressure
Other minor tweaks:
- only reshard if we've also successfully sent recently
- add pending samples, latest sent timestamp, WAL events processed metrics
Co-authored-by: Chris Marchbanks <csmarchbanks.com> (initial prototype)
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> (sharding changes)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
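The "marshal once, not once per retry" optimisation can be illustrated with
a small sketch. The function and variable names here are hypothetical and the
encoder is a stand-in for the proto marshalling step: the payload is encoded
before the retry loop and the same bytes are reused on every attempt.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable stands in for a retryable error from the remote end.
var errUnavailable = errors.New("remote unavailable")

// sendWithRetry encodes the batch exactly once and then retries the send
// with the already encoded payload.
func sendWithRetry(encode func() ([]byte, error), send func([]byte) error, maxAttempts int) error {
	payload, err := encode()
	if err != nil {
		return err
	}
	lastErr := errors.New("no attempts made")
	for i := 0; i < maxAttempts; i++ {
		if lastErr = send(payload); lastErr == nil {
			return nil
		}
	}
	return lastErr
}

func main() {
	encodes, sends := 0, 0
	err := sendWithRetry(
		func() ([]byte, error) { encodes++; return []byte("batch"), nil },
		func([]byte) error {
			sends++
			if sends < 3 {
				return errUnavailable
			}
			return nil
		},
		5,
	)
	fmt.Println(err, encodes, sends)
}
```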
1. Added an ability to resize text area on mouseclick
2. Remember selected target status button on page reload
Signed-off-by: Maria Nemtinova <nemtinovamasha@gmail.com>
* web: updated bootstrap3-typeahead file to work with bootstrap 4.0.0
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: Replaced bootstrap-3.3.1 with bootstrap 4.0.0
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: Added bootstrap4-glyphicons as 4.0.0 doesnt include bootstrap3 glyphicons
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated js jquery to 3.3.1
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated _base.html to import new bootstrap 4.0.0, jquery3.3.1 and bootstrap class tags to be 4.0 compatible
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: _base.html missed word out in title tag (Server).
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated alerts.html class names and tags to be bootstrap 4 compatible.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated config.html class names and tags to be bootstrap 4 compatible.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated flags.html class names and tags to be bootstrap 4 compatible.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated service-discovery.html class names and tags to be bootstrap 4 compatible.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated status.html class names and tags to be bootstrap 4 compatible.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated targets.html class names and tags to be bootstrap 4 compatible.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: updated graph_template.handlebar class names and tags to be bootstrap 4 compatible.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: alerts.css fix for button color inheritance on alerts page.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: graph.css fix for color inheritance.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: prometheus.css updated to fix nav bar.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* web: previous merge conflict not fixed correctly on _base.html
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* menu.lib and prom.lib imports updated
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* bootstrap 4.1.3 imported
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Bootstrap 4.1.3 imported into _base.html
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* bootstrap 4.1.3 imported into prom.lib
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* menu.lib style adjusted to view sidebar
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Alert colour uplifted to bootstrap 4.1.3
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Alerts display code reformatted similarly to config
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Consoles pages adjusted to account for new navbar
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* LHS Menu fixed in console pages
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Minor changes to prom_console to adjust lhs nav
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Prom.lib and some css updated to fix console graph controls
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Bootstrap 4.0.0 files removed
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Consoles configured so that the graph fits with the new side bar, css files also adjusted
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Import popper.min.js for dropdowns
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Popper.min.js imported locally
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Re-added #4764 and fixed css
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Removed .DS_Store
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Rebuilt assets
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* Spaces between buttons and inputs on graph page removed
Signed-off-by: ksherryBAE <kieran.sherry@baesystems.com>
* fixed spacing in buttons on /targets
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* Updated vfsdata.go
Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
* fixed typeahead issue
Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>
* added css for dropdown
Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>
* changed order of css imports
Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>
* tinkered with CSS changes to make keyboard select and mouseover match
Signed-off-by: James Ritchie <james.g.ritchie@baesystems.com>
* *: bump gRPC dependencies
This change updates the gRPC dependencies to more recent versions:
* github.com/gogo/protobuf => v1.2.0
* github.com/grpc-ecosystem/grpc-gateway => v1.6.3
* google.golang.org/grpc => v1.17.0
In addition scripts/genproto.sh leverages Go modules information instead of
hardcoding SHA1 commits. This ensures that the code is generated from
the exact same sources.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Run 'make proto' in CI
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Revert tabs -> spaces change
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Fix 'make proto' step
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* 'go get' grpc/protobuf dependencies
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Prepopulate cache with go mod download
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* *: use latest release of staticcheck
It also fixes a couple of things in the code flagged by the additional
checks.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Use official release of staticcheck
Also run 'go list' before staticcheck to avoid failures when downloading packages.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* added `Copy to clipboard` button
Signed-off-by: Stafford Williams <stafford.williams@gmail.com>
* generate vfsdata
Signed-off-by: Stafford Williams <stafford.williams@gmail.com>
* new lines
Signed-off-by: Stafford Williams <stafford.williams@gmail.com>
* single newline
Signed-off-by: Stafford Williams <stafford.williams@gmail.com>
When a metric has a null value, number formatters like
`humanizeNoSmallPrefix` will throw "Uncaught TypeError: Cannot read
property 'toPrecision' of null".
This is fixed by explicitly checking for `null` and returning the string
"null".
Note: This is usually not seen as rickshaw doesn't show annotations for
null values, but still calls the formatter.
Signed-off-by: David Coles <coles.david@gmail.com>
* update promlog to latest version
Signed-off-by: Alex Yu <yu.alex96@gmail.com>
* Update api tests, fix main setup
Signed-off-by: Alex Yu <yu.alex96@gmail.com>
* tidy go.sum
Signed-off-by: Alex Yu <yu.alex96@gmail.com>
* revendor prometheus/common
Signed-off-by: Alex Yu <yu.alex96@gmail.com>
* only initialize config; use kingpin for remote_storage_adapter
Signed-off-by: Alex Yu <yu.alex96@gmail.com>
* actually parse the flags
Signed-off-by: Alex Yu <yu.alex96@gmail.com>
* clean up imports
Signed-off-by: Alex Yu <yu.alex96@gmail.com>
* web: added ability to set page title through flag.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* Reformatted variable names and Flag description for readability.
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* assets_vfsdata.go
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* Flag name changed from web.ui-title to web.page-title
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
* make assets
Signed-off-by: Andrew Chiu <andrew.chiu2@baesystems.com>
By default the gRPC client of the REST API gateway relies on the
HTTP_PROXY variable to connect to the local gRPC server which isn't
desired as the server runs in the same process. This change uses a
custom dialer that connects directly to the server's address.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* *: move to go 1.11
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Reduce number of places where we specify the Go version
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Add evaluationTimestamp (Last Evaluation) column to display on /rules
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Add lastScrapeDuration ("Scrape Duration") to display on /targets
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Updates based on Julius' feedback
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Update to set timestamp to when eval started (after eval completes)
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Update /rules to display time since last evaluation
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
* Re-order Last Eval/Eval Time to be consistent with targets page
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
With the addition of the errors in the views list, it is now difficult
to have a view on all the rules in a screen width.
This commit adds wrapping to improve the overall display of the rules
page.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
The scrape manager receiver's channel now just saves the target sets
and another background runner updates the scrape loops every 5 seconds.
This is so that the scrape manager doesn't block the receiving channel
when it does the long background reloading of the scrape loops.
Active and dropped targets are now saved in each scrape pool instead of
the scrape manager. This is mainly to avoid races when getting the
targets via the web api.
Reloading the scrape loops now happens in parallel, which speeds up
reaching the final desired state and also speeds up Prometheus's
shutdown.
Also updated some funcs signatures in the web package for consistency.
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
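The decoupling of receiving target sets from reloading scrape loops can be
sketched as below. All type, field, and method names are hypothetical, and
the reload itself is reduced to a counter: the receiver only stores the
latest sets and raises a non-blocking signal, while a ticker-driven runner
performs the slow reload.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type manager struct {
	mtx           sync.Mutex
	targetSets    map[string][]string
	triggerReload chan struct{}
	reloads       int
}

func newManager() *manager {
	return &manager{
		targetSets:    map[string][]string{},
		triggerReload: make(chan struct{}, 1),
	}
}

// update saves the latest target sets and signals the reloader without
// ever blocking the caller.
func (m *manager) update(tsets map[string][]string) {
	m.mtx.Lock()
	m.targetSets = tsets
	m.mtx.Unlock()
	select {
	case m.triggerReload <- struct{}{}:
	default: // a reload is already pending
	}
}

// reloadIfTriggered performs one reload if an update arrived since the
// last one, and reports whether it did.
func (m *manager) reloadIfTriggered() bool {
	select {
	case <-m.triggerReload:
		m.mtx.Lock()
		m.reloads++ // stand-in for the long-running scrape loop reload
		m.mtx.Unlock()
		return true
	default:
		return false
	}
}

// run checks for pending updates on every tick until stop is closed.
func (m *manager) run(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			m.reloadIfTriggered()
		case <-stop:
			return
		}
	}
}

func main() {
	m := newManager()
	m.update(map[string][]string{"job": {"a:9090"}})
	m.update(map[string][]string{"job": {"a:9090", "b:9090"}})
	fmt.Println(m.reloadIfTriggered(), m.reloadIfTriggered())
}
```

Because the signal channel has capacity one, several quick updates collapse
into a single reload on the next tick.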
* web: fix asset paths for Windows platforms
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* web: add tests
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Limit the number of samples remote read can return.
- Return 413 entity too large.
- Limit can be set by a flag. Allow 0 to mean no limit.
- Include limit in error message.
- Set default limit to 50M (* 16 bytes = 800MB).
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
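The limit check listed above might be sketched like this. The function and
constant names are hypothetical; only the 413 status, the "0 means no limit"
rule, and the 50M default come from the message.

```go
package main

import (
	"fmt"
	"net/http"
)

// defaultSampleLimit is 50M samples; at 16 bytes per sample that is
// 50,000,000 * 16 = 800,000,000 bytes, i.e. 800MB.
const defaultSampleLimit = 50000000

// checkSampleLimit returns the HTTP status to use for a response with
// numSamples samples. A limit of 0 disables the check.
func checkSampleLimit(numSamples, limit int) (int, error) {
	if limit > 0 && numSamples > limit {
		return http.StatusRequestEntityTooLarge,
			fmt.Errorf("exceeded sample limit (%d)", limit)
	}
	return http.StatusOK, nil
}

func main() {
	status, err := checkSampleLimit(defaultSampleLimit+1, defaultSampleLimit)
	fmt.Println(status, err)
}
```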
When prom2 came out, the storage querier interface was consolidated into a
single Select() method. This made it impossible for the implementer of
the querier to know whether it is being called for metadata or actual
data. The workaround has been to check if the SelectParams are nil, and
for the federation call they are always nil. This has 2 negative
consequences: (1) remote implementations interpret this as a metadata
call, which makes the federation endpoint return nothing; (2) the
storage implementations don't get the same information passed down to
them as far as SelectParams goes.
This diff simply adds SelectParams to the Select() call in the
federation handler
Mitigation for #4057
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
Looking at https://tech.townsourced.com/post/embedding-static-files-in-go/ (which was mentioned in the issue), vfsgen has all the needed features.
In particular:
- Reproducible builds (no issue with timestamping).
- Well maintained and relatively popular.
- Integration with go generate.
- Self-contained (no external dependency).
* [WIP] Replace go-bindata by vfsgen
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Add license + remove doc.go
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Generate templates assets
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Use new templates assets
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* split static assets
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Idempotent make assets
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update vendor/
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* vendor vfsgendev
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update README.md
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Simplify assets generation
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Fix README.md
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Use generate helper program instead of vfsgen
This avoids installing vfsgendev in the target environment.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Remove unused vfsgen package
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Fix Makefile
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* vendoring shurcooL/vfsgen
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Fix go generate command
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Sync web/ui/assets_vfsdata.go
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* adding information about the health and errors for Rules
adding Health() and LastError() to the Rule interface. This will allow
us to easily surface information about rules.
Signed-off-by: noqcks <benny@noqcks.io>
* updating rules.html with fields for Rule errors and health state
Signed-off-by: noqcks <benny@noqcks.io>
* fix code comment grammar & access Rule health/error info using a mutex
Signed-off-by: noqcks <benny@noqcks.io>
* s/Errors/Error/ in rules.html to remain consistent with targets.html
Signed-off-by: noqcks <benny@noqcks.io>
* adding periods to code comments in reporting/alerting
Signed-off-by: noqcks <benny@noqcks.io>
* putting health/error below mutex in struct field
Signed-off-by: noqcks <benny@noqcks.io>
It was added 5 years ago by Matt and I'm not sure anyone ever used
it after public release (since we have /debug/pprof/heap as well).
It also lacked error checking and allows people to write to disk over HTTP.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Allow for BufferedSeriesIterator instances to be created without an underlying iterator, to simplify their usage.
Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
* Add Start/End to SelectParams
* Make remote read use the new selectParams for start/end
This commit will continue sending the start/end time of the remote read
query as the overarching promql time and the specific range of data that
the query is intersted in receiving a response to is now part of the
ReadHints (upstream discussion in #4226).
* Remove unused vendored code
The genproto.sh script was updated, but the code wasn't regenerated.
This simply removes the vendored deps that are no longer part of the
codegen output.
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
This adds a per-target cache of scraped metadata. The metadata is only
available for the lifecycle of the attached target. An API endpoint allows
selecting metadata by metric name and a label selection of targets.
Signed-off-by: Fabian Reinartz <freinartz@google.com>
Displaying all the dropped targets in the service-discovery page hurts
the Prometheus server as well as the browser when thousands of dropped
targets exist. This change limits this number to 1,000 and displays the
number of active/total targets per scrape configuration.
Add warning when more than 100 targets are dropped
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Move range logic to 'eval'
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make aggregate range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* PromQL is statically typed, so don't eval to find the type.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Extend rangewrapper to multiple exprs
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Start making function evaluation ranged
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make instant queries a special case of range queries
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Eliminate evalString
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Evaluate range vector functions one series at a time
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make unary operators range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make binops range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Pass time to range-aware functions.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make simple _over_time functions range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Reduce allocs when working with matrix selectors
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Add basic benchmark for range evaluation
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Reuse objects for function arguments
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Do dropmetricname and allocating output vector only once.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Add range-aware support for range vector functions with params
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Optimise holt_winters, cut cpu and allocs by ~25%
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make rate&friends range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make more functions range aware. Document calling convention.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make date functions range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make simple math functions range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Convert more functions to be range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make more functions range aware
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Specialcase timestamp() with vector selector arg for range awareness
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Remove transition code for functions
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Remove the rest of the engine transition code
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Remove more obsolete code
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Remove the last uses of the eval* functions
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Remove engine finalizers to prevent corruption
The finalizers set by matrixSelector were being called
just before the value they were returning to the pool
was then being provided to the caller. Thus a concurrent query
could corrupt the data that the user has just been returned.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Add new benchmark suite for range functions
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Migrate existing benchmarks to new system
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Expand promql benchmarks
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Simplify test by removing unused range code
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* When testing instant queries, check range queries too.
To protect against subsequent steps in a range query being
affected by the previous steps, add a test that evaluates
an instant query that we know works again as a range query
with the timestamp we care about not being the first step.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Reuse ring for matrix iters. Put query results back in pool.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Reuse buffer when iterating over matrix selectors
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Unary minus should remove metric name
Cut down benchmarks for faster runs.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Reduce repetition in benchmark test cases
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Work series by series when doing normal vectorSelectors
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Optimise benchmark setup, cuts time by 60%
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Have rangeWrapper use an evalNodeHelper to cache across steps
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Use evalNodeHelper with functions
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Cache dropMetricName within a node evaluation.
This saves both the calculations and allocs done by dropMetricName
across steps.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Reuse input vectors in rangewrapper
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Reuse the point slices in the matrixes input/output by rangeWrapper
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make benchmark setup faster using AddFast
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Simplify benchmark code.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Add caching in VectorBinop
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Use xor to have one-level resultMetric hash key
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Add more benchmarks
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Call Query.Close in apiv1
This allows point slices allocated for the response data
to be reused by later queries, saving allocations.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Optimise histogram_quantile
It's now 5-10% faster with 97% less garbage generated for 1k steps
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make the input collection in rangeVector linear rather than quadratic
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Optimise label_replace, for 1k steps 15x fewer allocs and 3x faster
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Optimise label_join, 1.8x faster and 11x less memory for 1k steps
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Expand benchmarks, cleanup comments, simplify numSteps logic.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Address Fabian's comments
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Comments from Alin.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Address jrv's comments
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Remove dead code
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Address Simon's comments.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Rename populateIterators, pre-init some sizes
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Handle case where function has non-matrix args first
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Split rangeWrapper out to rangeEval function, improve comments
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Cleanup and make things more consistent
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Make EvalNodeHelper public
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* Fabian's comments.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Fix race by properly locking access to scrape pools. Use separate mutex for information needed by UI so that UI isn't blocked when targets are being updated.
* web: replace deprecated InstrumentHandler()
This change replaces the deprecated InstrumentHandler function by the
equivalent functions from the promhttp package.
The following metrics are removed:
* http_request_duration_microseconds (Summary).
* http_request_size_bytes (Summary).
* http_requests_total (Counter).
And the following metrics are added instead:
* prometheus_http_request_duration_seconds (Histogram).
* prometheus_http_response_size_bytes (Histogram).
* promhttp_metric_handler_requests_in_flight (Gauge).
* promhttp_metric_handler_requests_total (Counter).
* Update github.com/prometheus/common/route package
* web: refactor using the new prometheus/common/route package
After removing the checkbox in #3913 the only remaining element that
looked like it was the new Show Annotations checkbox on the Alerts page.
Which in turn didn't look like the Enable query history checkbox on the
graph page. So:
1. This takes the Enable query history button as canonical.
2. Updates the show annotations button code to match it.
3. Simplifies the JS for the checkbox.
The new Service Discovery page uses the CSS/JS from the Targets page but
used slightly differently. This makes the job header match in the
Service Discovery page for a more consistent look-n-feel.
* Added only healthy to Targets
This adds an "Only healthy" button to supplement the "Only unhealthy"
button. The two are mutually exclusive.
I've also added a red/green text color to the buttons.
Arguably this could be a toggle instead if folks think this is
worthwhile... Happy to modify it.
* Moved functions above init
* Simplified code and made prettier
* Appeased Codacy
* Made buttons square
* Fix JS error: cannot read source of undefined
When the page was refreshed with queries on the page,
the updateTypeaheadMetricsSet function was called before
the typeahead had been initialized.
* Fix: updates URL when query submits
When queries were submitted by pressing enter, the URL did not update
to reflect the change. Not sure why, but this was only the case when
the queries were non-simple, meaning when either labels were specified
or other promql functions were used.
* Rebase master and make assets
This is a very minor UX change. The current "No Alert rules" present
table row has the `alert_header` class attached. This changes the cursor
and some other stuff and makes sense with the populated table but less
sense with the unpopulated table. So removing it in the latter case.
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
When you have no alerting rules defined you get a screen sharing this
information in the WebUI. If no rules are defined then you instead see
an empty white screen. This adds a "No rules defined" `else` clause and
a `Rules` header to the page.
* Do not autoselect the first item in the dropdown
* Historical queries only show in dropdown when toggled on
* Move shared behavior to queryHistory.isEnabled function
* Do not auto submit selected history queries
net.Listener converts 0.0.0.0 to :: which fails for hosts where IPv6 is
disabled. This change uses the original listen address parameter instead
of grpcl.Addr().String().
Federation makes use of dedupedSeriesSet to merge SeriesSets for every
query into one output stream. If many match[] arguments are provided,
many dedupedSeriesSet objects will get chained. This has the downside of
causing a potential O(n*k) running time, where n is the number of series
and k the number of match[] arguments.
In the mean time, the storage package provides a mergeSeriesSet that
accomplishes the same with an O(n*log(k)) running time by making use of
a binary heap. Let's just get rid of dedupedSeriesSet and change all
existing callers to use mergeSeriesSet.
When there is an empty result set, the Prometheus server replies with
{"status":"success","data":{"resultType":"vector","result":null}}
That "null" reply was not handled correctly by the graphing library.
This commit handles that case and shows "no data" in the UI console view
instead of throwing an error in the browser javascript console.
Fixes #3515
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
API consumers should be able to get insight into the query run times.
The UI currently measures total roundtrip times. This PR allows for more
fine grained metrics to be exposed.
* adds new timer for total execution time (queue + eval)
* expose new timer, queue timer, and eval timer in stats field of the
range query response:
```json
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [],
    "stats": {
      "execQueueTimeNs": 4683,
      "execTotalTimeNs": 2086587,
      "totalEvalTimeNs": 2077851
    }
  }
}
```
* stats field is optional, only set when query parameter `stats` is not
empty
Try it via
```sh
curl 'http://localhost:9090/api/v1/query_range?query=up&start=1486480279&end=1486483879&step=14000&stats=true'
```
Review feedback
* moved query stats json generation to query_stats.go
* use seconds for all query timers
* expose all timers available
* Changed ExecTotalTime string representation from Exec queue total time to Exec total time
This PR fixes #3072 by providing POST endpoints for `query` and `query_range`.
POST request must be made with `Content-Type: application/x-www-form-urlencoded` header.
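A minimal client-side sketch of such a POST request, using only the Go standard library. The `/api/v1/query` path is assumed by analogy with the `query_range` URL shown earlier, and `postQuery` and `echoServer` are hypothetical helpers; the test server only echoes what it received and does not evaluate anything.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// postQuery sends a query as a form-encoded POST body. http.PostForm sets
// the Content-Type header to application/x-www-form-urlencoded.
func postQuery(baseURL, query string) (string, error) {
	resp, err := http.PostForm(baseURL+"/api/v1/query", url.Values{"query": {query}})
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// echoServer is a stand-in for Prometheus that reports the method, the
// content type and the decoded "query" form value of each request.
func echoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%s %s query=%s", r.Method, r.Header.Get("Content-Type"), r.FormValue("query"))
	}))
}

func main() {
	srv := echoServer()
	defer srv.Close()

	body, err := postQuery(srv.URL, "up")
	fmt.Println(body, err)
}
```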
* Add UI warning for time drift >30 seconds
* Yellow time drift warning & better warning message
* Set warning threshold to 30 sec
* Include changed assets
* Re-add contexts to storage.Storage.Querier()
These are needed when replacing the storage by a multi-tenant
implementation where the tenant is stored in the context.
The 1.x query interfaces already had contexts, but they got lost in 2.x.
* Convert promql.Engine to use native contexts
No matter how we refactor docs, `/docs/` will stay the prefix, so there's no long-term risk in changing this.
Once we version docs, we should probably try and keep link & version in sync.
Whenever a route prefix is applied, the router prepends the prefix to
the URL path on the request. For most handlers, this is not an issue
because the request's path is only used for routing and is not actually
needed by the handler itself. However, Prometheus delegates the handling
of the /debug/* endpoints to the http.DefaultServeMux which has its own
routing logic that depends on the url.Path. As a result, whenever a
prefix is applied, the prefixed URL is passed to the DefaultServeMux
which has no awareness of the prefix and returns a 404.
This change fixes the issue by creating a new serveDebug handler which
routes /debug/* requests to the appropriate net/http/pprof handler
and removing the net/http/pprof import in cmd/prometheus since it is no
longer necessary.
Fixes #2183.
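The idea of dispatching /debug/* requests directly to net/http/pprof can be sketched like this. It is a standalone illustration, not the actual serveDebug code: `debugHandler`, `statusFor` and the `/prometheus` prefix are hypothetical, and the prefix is matched with plain string handling rather than the router.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
	"strings"
)

// debugHandler serves /debug/pprof/* below a route prefix by calling the
// net/http/pprof handlers directly instead of http.DefaultServeMux.
func debugHandler(prefix string) http.Handler {
	base := prefix + "/debug/pprof/"
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.HasPrefix(r.URL.Path, base) {
			http.NotFound(w, r)
			return
		}
		switch sub := strings.TrimPrefix(r.URL.Path, base); sub {
		case "cmdline":
			pprof.Cmdline(w, r)
		case "profile":
			pprof.Profile(w, r)
		case "symbol":
			pprof.Symbol(w, r)
		case "trace":
			pprof.Trace(w, r)
		default:
			// pprof.Index derives the profile name from an unprefixed
			// /debug/pprof/ path, so strip the route prefix first.
			r.URL.Path = "/debug/pprof/" + sub
			pprof.Index(w, r)
		}
	})
}

// statusFor returns the HTTP status the handler produces for a GET on path.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code
}

func main() {
	h := debugHandler("/prometheus")
	fmt.Println(statusFor(h, "/prometheus/debug/pprof/cmdline")) // 200
	fmt.Println(statusFor(h, "/debug/pprof/cmdline"))            // 404
}
```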
This PR adds the `/status/config` endpoint which exposes the currently
loaded Prometheus config. This is the same config that is displayed on
`/config` in the UI in YAML format. The response payload looks like
such:
```
{
"status": "success",
"data": {
"yaml": <CONFIG>
}
}
```
Issue #3046 is triggered by html/template changes in go1.9.
See https://tip.golang.org/pkg/html/template. Quote:
// To ease migration to Go 1.9 and beyond, "html" and "urlquery" will
// continue to be allowed as the last command in a pipeline. However, if the
// pipeline occurs in an unquoted attribute value context, "html" is
// disallowed. Avoid using "html" and "urlquery" entirely in new templates.
The commit also includes a trivial whitespace fix.
To cover the cases where stale markers may not be available,
we need to infer the interval and mark series stale based on that.
As we're lacking stale markers this is less accurate, however
it should be good enough for these cases.
We need 4 intervals as if say we had data at t=0 and t=10,
coming via federation. The next data point should be at t=20 however it
could take up to t=30 for it actually to be ingested, t=40 for it to be
scraped via federation and t=50 for it to be ingested.
We then add 10% on to that for slack, as we do elsewhere.
* Use request.Context() instead of a global map of contexts.
* Add some basic opentracing instrumentation on the query path.
* Remove tracehandler endpoint.
This is needed for federating non-instance level metrics, so they don't
end up with the instance label of the prometheus target.
Also sort external labels, so label output order is consistent.
* Fixed int64 overflow for timestamp in v1/api parseDuration and parseTime
This led to unexpected results on wrong query with "(...)&start=148966367200.372&end=1489667272.372"
That query is wrong because of `start > end` but actually internal int64 overflow caused start to be something around MinInt64 (huge negative value) and was passing validation.
BTW: Not sure if a negative timestamp even makes sense, but model.Earliest is actually MinInt64; can someone explain to me why?
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
* Added missing trailing periods on comments.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
* Moved to only `<` and `>`. Removed equal.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
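The int64 overflow fix above boils down to a bounds check before converting float seconds to integer nanoseconds. A minimal sketch under that assumption follows; `parseUnixSeconds` is a hypothetical name and not the actual parseTime or parseDuration code.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"time"
)

// parseUnixSeconds parses a Unix timestamp given as float seconds and
// rejects values whose nanosecond representation does not fit into int64,
// instead of letting the conversion silently wrap around.
func parseUnixSeconds(s string) (time.Time, error) {
	t, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return time.Time{}, err
	}
	ns := t * float64(time.Second)
	if math.IsNaN(ns) || ns > float64(math.MaxInt64) || ns < float64(math.MinInt64) {
		return time.Time{}, fmt.Errorf("cannot parse %q: it overflows int64", s)
	}
	sec, frac := math.Modf(t)
	return time.Unix(int64(sec), int64(frac*float64(time.Second))).UTC(), nil
}

func main() {
	// The start value from the report above is rejected.
	_, err := parseUnixSeconds("148966367200.372")
	fmt.Println(err)

	// A timestamp in the normal range is accepted.
	ts, err := parseUnixSeconds("1489667272")
	fmt.Println(ts.Unix(), err)
}
```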
Expose buildQueryUrl, refactor dispatch to use
buildQueryUrl will allow users to execute queries over the range of an
existing graph. This will be helpful to select data series they wish to
annotate the graph with, for example.
The fuzzy library didn't try to find a "best match", but settled on the
first fuzzy match that exists. This patch includes a modified version of
the fuzzy library, which recursively tries on the rest of the search
string to find a better match. If found, returns that one.
Another small modification is that if a pattern fully matches, it
skips the lookup entirely and returns the highest score possible for
that match.
For some of the queries, the fuzzy lookup was not filtering properly.
The problem is due to the "replace" being made on the query itself. It
accidentally removes only the first underscore. This patch changes it so
that it removes all of the whitespaces, letting the fuzzy algorithm do
its magic, also fixing this problem.
Originally, the underscore were replaced by a space for this specific
reason, to let the user type a space and have the lookup treat it as the
word break.
Fixes #2380
retrieval.Target contains a mutex. It was copied in the Targets()
call. This potentially can wreak a lot of havoc.
It might even have caused the issues reported as #2266 and #2262.
Right now the /alerts page of Prometheus sorts alerts by severity
(firing, pending, inactive). Once multiple alerts have the same
severity, their order seems to correlate to how they are placed in the
configuration files, but not always. Looking at the code, we make use of
sort.Sort(), which is documented not to provide a stable sort. The
Less() function also only takes the alert state into account.
This change extends the Less() function to provide a lexicographic order
on both the alert state and the name. This means I can finally find the
alerts I'm looking for without using my browser's search feature.
We are writing federation responses streaming. So after
the first byte we wrote, the status header is fixed. We cannot
return an HTTP error for intermediate error but should just abort
and log instead.
Adds also the moment.js library, which is a dependency of it.
Following conventions in the web/ui directory, I am not including the original
sources or LICENSE files.
If an existing request is aborted due to a new request, ignore the completion of the initial request.
Example:
1. Chrome dev tools: enable 5 second network latency
2. Execute query
3. A second later, execute the query again
4. Currently, the spinner will hide, and the stats will immediately display, as if the request had completed. Instead, the spinner and stats should wait until the 2nd execution finishes.
* Add fuzzy search to /graph textarea
We have a few thousands different metrics and looking up some of them
can be quite annoying with the simple string matching.
This patch adds a fuzzy search to the textarea lookup box on the /graph
page. It uses a small neat library from github.com/mattyork/fuzzy.
* Add fuzzy lib to NOTICE and re-build assets
Previously built assets changed the mode.
This extracts Querier as an instantiateable and closeable object
rather than just defining extending methods of the storage interface.
This improves composability and allows abstracting query transactions,
which can be useful for transaction-level caches, consistent data views,
and encapsulating teardown.
This is based on https://github.com/prometheus/prometheus/pull/1997.
This adds contexts to the relevant Storage methods and already passes
PromQL's new per-query context into the storage's query methods.
The immediate motivation is supporting multi-tenancy in Frankenstein, but
this could also be used by Prometheus's normal local storage to support
cancellations and timeouts at some point.
For Weaveworks' Frankenstein, we need to support multitenancy. In
Frankenstein, we initially solved this without modifying the promql
package at all: we constructed a new promql.Engine for every
query and injected a storage implementation into that engine which would
be primed to only collect data for a given user.
This is problematic to upstream, however. Prometheus assumes that there
is only one engine: the query concurrency gate is part of the engine,
and the engine contains one central cancellable context to shut down all
queries. Also, creating a new engine for every query seems like overkill.
Thus, we want to be able to pass per-query contexts into a single engine.
This change gets rid of the promql.Engine's built-in base context and
allows passing in a per-query context instead. Central cancellation of
all queries is still possible by deriving all passed-in contexts from
one central one, but this is now the responsibility of the caller. The
central query context is now created in main() and passed into the
relevant components (web handler / API, rule manager).
In a next step, the per-query context would have to be passed to the
storage implementation, so that the storage can implement multi-tenancy
or other features based on the contextual information.
This will avoid duplicate MetricFamilies, thereby shrinking the size
of the federation payload and also creating legal text format.
Also, add unit tests for federation. They were also needed for the
previous state of the code, but were missing.
This reverts commit aa43d34a86.
This brings back the /graph changes so that @grandbora can continue to
work on the redirect for backwards compatibility. And other changes
can already take the new /graph parameters into account.
This revert will be reverted once v1.1 is released and has its own
release branch. Since we already had changes on top of this, there was
no cleaner way of cutting those changes out.
This commit reverts the following commits:
Revert "Update backend helpers and templates to new url schema"
This reverts commit fc6cdd0611.
Revert "Refactor graph.js"
This reverts commit 445fac56e0.
Revert "Use query parameters in the url"
This reverts commit 3e18d86d8a.
Revert "Point to correct place for GraphLinkForExpression"
This reverts commit 3da825fc76.
Assets are also updated.
There's no corresponding table column for this table header. The
placeholder link for silences was removed in e8800730.
Accordingly, regenerate `web/ui/bindata.go` by running:
make assets format
See discussion in
https://groups.google.com/forum/#!topic/prometheus-developers/bkuGbVlvQ9g
The main idea is that the user of a storage shouldn't have to deal with
fingerprints anymore, and should not need to do an individual preload
call for each metric. The storage interface needs to be made more
high-level to not expose these details.
This also makes it easier to reuse the same storage interface for remote
storages later, as fewer roundtrips are required and the fingerprint
concept doesn't work well across the network.
NOTE: this deliberately gets rid of a small optimization in the old
query Analyzer, where we dedupe instants and ranges for the same series.
This should have a minor impact, as most queries do not have multiple
selectors loading the same series (and at the same offset).
I got feedback from different sources about rules and targets being
too heavy in the status tab if there are lots of them.
This change also allows for more fine-granular locking.
Prometheus is Apache 2 licensed, and most source files have the
appropriate copyright license header, but some were missing it without
apparent reason. Correct that by adding it.
The chunk encoding was hardcoded there because it mostly doesn't
matter what encoding is chosen in that test. Since type 1 is
battle-hardened enough, I'm switching to type 2 here so that we can
catch unexpected problems as a byproduct. My expectation is that the
chunk encoding doesn't matter anyway, as said, but then "unexpected
problems" contains the word "unexpected".
WIP: This needs more tests.
It now gets a from and through value, which it may opportunistically
use to optimize the retrieval. With possible future range indices,
this could be used in a very efficient way. This change merely applies
some easy checks, which should nevertheless solve the use case of
heavy rule evaluations on servers with a lot of series churn.
Idea is the following:
- Only archive series that are at least as old as the headChunkTimeout
(which was already extremely unlikely to happen).
- Then maintain a high watermark for the last archival, i.e. no
archived series has a sample more recent than that watermark.
- Any query that doesn't reach to a time before that watermark doesn't
have to touch the archive index at all. (A production server at
Soundcloud with the aforementioned series churn and heavy rule
evaluations spends 50% of its CPU time in archive index
lookups. Since rule evaluations usually only touch very recent
values, most of those lookups should disappear with this change.)
- Federation with a very broad label matcher will profit from this,
too.
As a byproduct, the un-needed MetricForFingerprint method was removed
from the Storage interface.
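The high-watermark check from the list above can be sketched as follows. The `archive` type and its methods are hypothetical and leave out locking and the index itself; the sketch only shows when a lookup can be skipped.

```go
package main

import "fmt"

// archive is a hypothetical stand-in for the archived-series index.
type archive struct {
	// highWatermark is the newest sample time of any archived series:
	// no archived series has a sample more recent than this.
	highWatermark int64
}

// archiveSeries records that a series whose last sample is at lastSample
// has been archived, raising the watermark if needed.
func (a *archive) archiveSeries(lastSample int64) {
	if lastSample > a.highWatermark {
		a.highWatermark = lastSample
	}
}

// needsLookup reports whether a query starting at from can possibly match
// an archived series. Queries that start after the watermark can skip the
// archive index entirely.
func (a *archive) needsLookup(from int64) bool {
	return from <= a.highWatermark
}

func main() {
	a := &archive{}
	a.archiveSeries(100)
	a.archiveSeries(80) // older than the watermark, no change

	fmt.Println(a.needsLookup(150)) // false: recent query skips the archive
	fmt.Println(a.needsLookup(50))  // true: query reaches before the watermark
}
```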
This commit simplifies the TargetHealth type and moves the target
status into the target itself. This also removes a race where error
and last scrape time could have been out of sync.
Formalize ZeroSamplePair as return value for non-existing samples.
Change LastSamplePairForFingerprint to return a SamplePair (and not a
pointer to it), which saves allocations in a potentially extremely
frequent call.
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
This enables metric name autocompletion for every word in an expression,
not just the very first one. It would be great to also support all
language keywords during autocompletion in the future.
This adapts some functionality from the Go standard library for string
literal lexing and unquoting/unescaping.
The following string types are now supported:
Double- or single-quoted strings:
These support all escape sequences that Go supports in double-quoted
string literals. The difference is that Prometheus also has
single-quoted strings (instead of single-quoted runes in Go). Raw
newlines are not allowed.
Backtick-quoted raw strings:
Strings quoted in backticks are treated as raw strings just like in Go
and may contain raw newlines and other special characters directly.
Fixes https://github.com/prometheus/prometheus/issues/1122
Fixes https://github.com/prometheus/prometheus/issues/1121
This is with `golint -min_confidence=0.5`.
I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
Let's remove the silencing links until we actually have support for that.
A silencing link shouldn't only redirect to Alertmanager, but also open a
silencing dialog for the respective alert name or active alert element.
This got broken in
78047326b4
since it stopped using the DefaultServeMux.
This approach will defer pprof requests to the DefaultServeMux, which
may or may not have pprof enabled (in Prometheus, it gets it included in
main.go). An alternative approach would be to duplicate the four lines in
https://golang.org/src/net/http/pprof/pprof.go#L62. When choosing that
approach though, we would not automatically gain any new endpoints added
by net/http/pprof or other /debug endpoints in the future.