Commit graph

290 commits

Author SHA1 Message Date
Brian Brazil 68e8b80ffe
Reorder startup and shutdown to prevent panics. (#4321)
Start rule manager only after tsdb and config is loaded.
Stop rule manager before tsdb to avoid writing to closed storage.
Wait for any in-progress reloads to complete before shutting
down rule manager, so that rule manager doesn't get updated after
being shut down.

Remove incorrect comment around shutting down query enginge.
Log when config reload is completed.

Fixes #4133
Fixes #4262

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-07-04 13:41:16 +01:00
Michael Khalil 78e0784d04 return error exit status in prometheus cli (#4296)
Signed-off-by: mikeykhalil <mikeyfkhalil@gmail.com>
2018-06-21 08:32:26 +01:00
Tom Wilkie 8acad5f3cd make it compile
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-24 15:40:24 +01:00
Tom Wilkie e51d6c4b6c Make remote flush deadline a command line param.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:06:01 +01:00
Sneha Inguva c1a851074b promtool: add query instant and query range commands (#4085)
* promtool: add QueryInstant and QueryRange cmds

* promtool: add more query functions

* promtool: finished query Instant

* promtool: add range query

* promtool: add query command and address arguments

* vendor client and api
2018-04-26 20:41:56 +02:00
Mario Trangoni 464e747f1e fix some comments typos (#4059) 2018-04-08 10:51:54 +01:00
Sneha Inguva 7be846754a main: actor functionality comments 2018-04-01 11:19:30 -07:00
Marek Siarkowicz bb86c3f62b Report internal runtime information on status page (#3921)
Add information about tsdb, wal and config reload
2018-03-21 16:08:37 +00:00
James Turnbull ba5273a0ab Minor edits to help text (#3990) 2018-03-20 16:54:36 +00:00
Simon Pasquier e1fd96db25 cmd: fix help text (#3989) 2018-03-20 15:58:19 +00:00
ferhat elmas ffa673f7d8 General simplifications (#3887)
Another try as in #1516
2018-02-26 07:58:10 +00:00
Bartek Plotka 93a63ac5fd api: Added v1/status/flags endpoint. (#3864)
Endpoint URL: /api/v1/status/flags
Example Output:
```json
{
  "status": "success",
  "data": {
    "alertmanager.notification-queue-capacity": "10000",
    "alertmanager.timeout": "10s",
    "completion-bash": "false",
    "completion-script-bash": "false",
    "completion-script-zsh": "false",
    "config.file": "my_cool_prometheus.yaml",
    "help": "false",
    "help-long": "false",
    "help-man": "false",
    "log.level": "info",
    "query.lookback-delta": "5m",
    "query.max-concurrency": "20",
    "query.timeout": "2m",
    "storage.tsdb.max-block-duration": "36h",
    "storage.tsdb.min-block-duration": "2h",
    "storage.tsdb.no-lockfile": "false",
    "storage.tsdb.path": "data/",
    "storage.tsdb.retention": "15d",
    "version": "false",
    "web.console.libraries": "console_libraries",
    "web.console.templates": "consoles",
    "web.enable-admin-api": "false",
    "web.enable-lifecycle": "false",
    "web.external-url": "",
    "web.listen-address": "0.0.0.0:9090",
    "web.max-connections": "512",
    "web.read-timeout": "5m",
    "web.route-prefix": "/",
    "web.user-assets": ""
  }
}
```

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2018-02-21 08:49:02 +00:00
Fabian Reinartz 7ccd4b39b8 *: implement query params
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
2018-02-13 12:17:22 +01:00
Conor Broderick 5169ccf258
Merge pull request #3724 from simonpasquier/fix-bad-data-error
Don't reset FiredAt for inactive alerts
2018-02-01 16:18:09 +00:00
Krasi Georgiev b75428ec19 rename package retrieve to scrape
no fucnctinal changes just renaming retrieval to scrape
2018-02-01 09:55:07 +00:00
Krasi Georgiev 7858745c04 rename structs for consistency 2018-01-30 17:49:05 +00:00
Krasi Georgiev acc4197098 remove dicovery race for the context field 2018-01-29 15:18:07 +00:00
Julien Pivotto 8b20cb1e8d last config success time gauge: use SetToCurrentTime() (#3750)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2018-01-27 07:48:13 +00:00
Simon Pasquier 81c0ab69e0 Don't reset FiredAt for inactive alerts
Otherwise AlertManager receives resolved alerts where StartsAt is zero which
fails the validation.
2018-01-22 17:17:33 +01:00
Krasi Georgiev 719c579f7b refactor main execution reloadReady handling, update some comments 2018-01-17 18:14:24 +00:00
Krasi Georgiev 0eafaf32d3 set the correct config reloading execution for scraper and notifier 2018-01-17 13:06:56 +00:00
Krasi Georgiev 97f0461e29 refactor the config reloading execution 2018-01-17 12:02:13 +00:00
Krasi Georgiev 5260c650ec use the config hash for the map lookup 2018-01-16 11:10:54 +00:00
Krasi Georgiev 8369826808 comment to rethink the map reference for the notifier discovery 2018-01-16 09:47:53 +00:00
Krasi Georgiev d12e6f29fc discovery manager ApplyConfig now takes a direct ServiceDiscoveryConfig so that it can be used for the notify manager
reimplement the service discovery for the notify manager

Signed-off-by: Krasi Georgiev <krasi.root@gmail.com>
2018-01-15 13:39:44 +00:00
Shubheksha Jalan 0471e64ad1 Use shared types from the common repo (#3674)
* refactor: use shared types from common repo, remove util/config

* vendor: add common/config

* fix nit
2018-01-11 16:10:25 +01:00
Goutham Veeramachaneni 35a6ffbaf3
Merge pull request #3587 from krasi-georgiev/web-test-error-check
handle web_test webhandler errors.
2018-01-10 22:03:25 +05:30
Shubheksha Jalan ec94df49d4 Refactor SD configuration to remove config dependency (#3629)
* refactor: move targetGroup struct and CheckOverflow() to their own package

* refactor: move auth and security related structs to a utility package, fix import error in utility package

* refactor: Azure SD, remove SD struct from config

* refactor: DNS SD, remove SD struct from config into dns package

* refactor: ec2 SD, move SD struct from config into the ec2 package

* refactor: file SD, move SD struct from config to file discovery package

* refactor: gce, move SD struct from config to gce discovery package

* refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil

* refactor: consul, move SD struct from config into consul discovery package

* refactor: marathon, move SD struct from config into marathon discovery package

* refactor: triton, move SD struct from config to triton discovery package, fix test

* refactor: zookeeper, move SD structs from config to zookeeper discovery package

* refactor: openstack, remove SD struct from config, move into openstack discovery package

* refactor: kubernetes, move SD struct from config into kubernetes discovery package

* refactor: notifier, use targetgroup package instead of config

* refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup

* refactor: retrieval, use targetgroup package instead of config.TargetGroup

* refactor: storage, use config util package

* refactor: discovery manager, use targetgroup package instead of config.TargetGroup

* refactor: use HTTPClient and TLS config from configUtil instead of config

* refactor: tests, use targetgroup package instead of config.TargetGroup

* refactor: fix tagetgroup.Group pointers that were removed by mistake

* refactor: openstack, kubernetes: drop prefixes

* refactor: remove import aliases forced due to vscode bug

* refactor: move main SD struct out of config into discovery/config

* refactor: rename configUtil to config_util

* refactor: rename yamlUtil to yaml_config

* refactor: kubernetes, remove prefixes

* refactor: move the TargetGroup package to discovery/

* refactor: fix order of imports
2017-12-29 21:01:34 +01:00
Brian Brazil ecc24b554d
Hide block duration flags. (#3618)
Users are starting to use these mistakenly thinking they'll help
with issues, and thus causing some confusion.
Thus hide them and make it clear that they're only there for testing
reasons.
2017-12-24 12:13:48 +00:00
Krasi Georgiev c94fa731aa bypass the proxy for the tests 2017-12-20 18:21:10 +00:00
Krasi Georgiev ad66476c4f fix flaky main.go test and simplify a bit 2017-12-19 15:07:49 +00:00
Fabian Reinartz 2881d73ed8
Merge pull request #3362 from krasi-georgiev/discovery-refactoring
Decouple the discovery and refactor the retrieval package
2017-12-19 12:56:34 +01:00
Goutham Veeramachaneni 9c9f96b2c0
Merge pull request #3529 from krasi-georgiev/main-integration-test
main.go integration test for Startup interrupting.
2017-12-18 22:12:13 -06:00
Krasi Georgiev 587dec9eb9 rebased and resolved conflicts with the new Discovery GUI page
Signed-off-by: Krasi Georgiev <krasi.root@gmail.com>
2017-12-18 20:10:03 +00:00
Krasi Georgiev 1ec76d1950 rearange the contexts variables and logic
split the groupsMerge function to set and get
other small nits
2017-12-18 17:23:47 +00:00
Krasi Georgiev 6ff1d5c51e add the scrape manager config reloader
handle errors with invalid scrape config
2017-12-18 17:23:47 +00:00
Krasi Georgiev b0d4f6ee08 resolved merge confilc in main.go 2017-12-18 17:23:46 +00:00
Krasi Georgiev c5cb0d2910 simplify naming and API. 2017-12-18 17:22:50 +00:00
Krasi Georgiev 9c61f0e8a0 scrape pool doesn't rely on context as Stop() needs to be blocking to prevent Scrape loops trying to write to a closed TSDB storage. 2017-12-18 17:22:49 +00:00
Krasi Georgiev e405e2f1ea refactored discovery 2017-12-18 17:22:49 +00:00
pasquier-s 2440696961 Log file descriptor limits at startup (#3567)
Fixes #3564
2017-12-11 13:01:53 +00:00
Alberto Cortés 29da2fb9cd testutil: update to go1.9 testing.Helper 2017-12-08 19:06:53 +01:00
Alberto Cortés 8f6a9f7833 config: simplify tests by using testutil.NotOk (#3289)
Also include filename in all LoadFile errors

Also add mesage to testuitl.NotOk so we can identify failing tests when
using table driven tests.
2017-12-08 16:52:25 +00:00
Krasi Georgiev 740662644e write to temp dir and remove it at the end.
Signed-off-by: Krasi Georgiev <krasi.root@gmail.com>
2017-12-06 10:45:58 +00:00
Brian Brazil b97f4cf48c Add metrics for rule group interval and last duration. 2017-12-04 11:44:38 +00:00
Krasi Georgiev 2c2a962da3 main.go integration test for Startup interrupting. 2017-12-01 10:58:01 +00:00
Goutham Veeramachaneni 823b7f90b3
Use the files globbed files and not the files in cfg
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-11-30 17:08:34 +05:30
Fabian Reinartz 62461379b7 rules: decouple notifier packages
The dependency on the notifier packages caused a transitive dependency
on discovery and with that all client libraries our service discovery
uses.
2017-11-27 16:38:14 +01:00
Fabian Reinartz 4d964a0a0d rules: make glob expansion a concern of main 2017-11-24 08:22:57 +01:00
Fabian Reinartz bd9f7460eb rules: remove config package dependency 2017-11-24 07:57:54 +01:00
Fabian Reinartz 2d0e3746ac rules: remove dependency on promql.Engine 2017-11-24 07:57:54 +01:00
Krasi Georgiev e2f4850fea Refactor main.go with oklog/pkg/group actors pattern 2017-11-11 12:33:15 +00:00
Thibault Chataigner fc4406201e Tsdb StartTime : Use a simplier way to compute StartTime 2017-10-25 17:41:00 +02:00
Julius Volz 099df0c5f0 Migrate "golang.org/x/net/context" -> "context" (#3333)
In some places, where ctxhttp or gRPC are concerned, we still need to use the
old contexts.
2017-10-24 21:21:42 -07:00
Julius Volz 9d43176ab3 Remove unused printVersion variable (#3335)
Kingpin now automatically does this via --version.
2017-10-23 08:50:13 +01:00
Julius Volz 82c5b98496 Capitalize Prometheus in startup message (#3332)
Hey, branding :)
2017-10-23 08:49:28 +01:00
Thibault Chataigner bf4a279a91 Remote storage reads based on oldest timestamp in primary storage (#3129)
Currently all read queries are simply pushed to remote read clients.
This is fine, except for remote storage for wich it unefficient and
make query slower even if remote read is unnecessary.
So we need instead to compare the oldest timestamp in primary/local
storage with the query range lower boundary. If the oldest timestamp
is older than the mint parameter, then there is no need for remote read.
This is an optionnal behavior per remote read client.

Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>
2017-10-18 12:08:14 +01:00
Julius Volz 5f715f5733 Fix typo in flag description (#3302) 2017-10-16 23:00:05 +01:00
Tobias Schmidt 3589f2f1d4 Merge pull request #3285 from jlevesy/use-testutils-in-cmd-subpackage
Use testutil assertion helpers in cmd package
2017-10-13 00:12:39 +02:00
Julien Levesy d7b4fa8d78 use testutil assertions in the cmd/prometheus package 2017-10-12 13:45:38 +02:00
Mathieu Pasquet 38afa507bb Provide better errors messages in commandline
Instead or only printing the help message, which is not always helpful.
For example, when upgrading from prometheus v1, the retention time value
format has changed and now only accepts one unit (e.g. "15d") where it
previously allowed more complex strings (e.g. "360h0m0s").

This commit provides the error message as an explanation for the parsing
failure.
2017-10-09 16:25:50 +02:00
Marc Sluiter 6a633eece1 Added go-conntrack for monitoring http connections (#3241)
Added metrics for in- and outgoing traffic with go-conntrack.
2017-10-06 11:22:19 +01:00
Fabian Reinartz 2d0b8e8b94 Merge branch 'master' into dev-2.0 2017-10-05 13:09:18 +02:00
Paul Gier 08af129b4d cmd/prometheus: don't allow quotes at beginning or end of url
This prevents accidental copy/paste error where a the web.external-url
or alertmanager.url params could have an extra set of quotes.
See also: https://github.com/prometheus/prometheus/issues/1229
2017-10-04 10:10:02 -05:00
Paul Gier f79b55d057 cmd/prometheus: remove govalidator for url validation
The usage of govalidator is redundant with the call to url.Parse for
url validation. Removing it has the following benefits:

 - The explicit error message is displayed instead of just a generic
   valid/invalid message
 - Slightly smaller code with one fewer external dependency
 - Speed improvement by removing duplicate call to url.Parse (inside
   govalidator.IsURL()
 - Resolves issue #2717

The only potential drawback of removing govalidator is that certain
URLs will be considered valid which were previously invalid. For example:

 - URLs with hostnames that start and/or end with an underscore (http://_example.com_)
 - URLs with hostnames that contain some special characters (http://foo&*bar.org)

These are valid URIs according to RFC 3986 and valid domain names per RFC 2181,
however they are not valid hostnames per RFC 952.
2017-10-04 10:08:34 -05:00
Fabian Reinartz 7b02bfee0a web: start web handler while TSDB is starting up 2017-09-20 15:03:19 +02:00
Goutham Veeramachaneni f5aed810f9 logging: Port to common/promlog
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-15 12:40:50 +05:30
Fabian Reinartz d21f149745 *: migrate to go-kit/log 2017-09-08 22:01:51 +05:30
Fabian Reinartz c70379e1c7 Merge branch 'dev-2.0' of github.com:prometheus/prometheus into dev-2.0 2017-09-04 13:10:50 +02:00
Fabian Reinartz fffe51fb03 Add mutex and block profiling via envvar 2017-09-04 13:10:32 +02:00
Ben Kochie 59aca4138b Fix staticcheck issues. 2017-08-28 17:29:01 +02:00
Matt Bostock 64973f5c65 cmd/prometheus: Fix capitalisation in log line (#3123)
Change 'Ready' to 'ready'.
2017-08-28 11:03:25 +01:00
Mark Adams 77c816b309 Fix pprof endpoints when -web.route-prefix or -web.external-url is used (#3054)
Whenever a route prefix is applied, the router prepends the prefix to
the URL path on the request. For most handlers, this is not an issue
because the request's path is only used for routing and is not actually
needed by the handler itself. However, Prometheus delegates the handling
of the /debug/* endpoints to the http.DefaultServeMux which has it's own
routing logic that depends on the url.Path. As a result, whenever a
prefix is applied, the prefixed URL is passed to the DefaultServeMux
which has no awareness of the prefix and returns a 404.

This change fixes the issue by creating a new serveDebug handler which
routes requests /debug/* requests to appropriate net/http/pprof handler
and removing the net/http/pprof import in cmd/prometheus since it is no
longer necessary.

Fixes #2183.
2017-08-23 00:00:56 +01:00
Callum Styan 8912f81ffe check if file_sd files exist in checkConfig 2017-08-22 15:25:30 -07:00
Fabian Reinartz 25f3e1c424 Merge branch 'master' into mergemaster 2017-08-10 17:04:25 +02:00
KalivarapuReshma 686050d816 Change -config.file to --config.file in Readme and error message 2017-08-08 12:49:35 +05:30
emluque ff54c5c11a 2831 Add Healthy and Ready endpoints 2017-08-07 17:34:04 -03:00
Fabian Reinartz 4d3d8ee229 Merge pull request #2850 from tomwilkie/dev-2.0-remote
Remote APIs for v2
2017-08-03 13:39:09 +02:00
Julius Volz cc50aa2c6b main: Consistently end flag descriptions with periods. (#2977) 2017-07-20 23:48:35 +02:00
Tom Wilkie 2dda5775e3 Initial port of remote storage to v2. 2017-07-12 12:27:57 +01:00
Fabian Reinartz 32226e30f5 Guard reload and quit endpoints by flag 2017-07-11 14:25:07 +02:00
Fabian Reinartz 45ac064669 web: disable Amin APIs by default 2017-07-10 09:29:41 +02:00
Fabian Reinartz ccf9e62972 *: add admin grpc API 2017-07-10 09:14:14 +02:00
Fabian Reinartz be32afd6df cmd/prometheus: add back tsdb.no-lockfile flag 2017-06-22 15:02:10 +02:00
Goutham Veeramachaneni f9202c6511
Move from .yaml to .yml in update rules
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-21 18:38:37 +05:30
Goutham Veeramachaneni e3701077c3
Move promtool to kingpin
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-21 17:42:57 +05:30
Fabian Reinartz 867b8d108f cmd/prometheus: cleanup 2017-06-21 11:38:13 +02:00
Fabian Reinartz 34ab7a885a cmd/prometheus: switch to kingpin 2017-06-20 17:38:01 +02:00
Goutham Veeramachaneni 592cb00c2f
Remove version from RuleGroups
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-19 16:38:46 +05:30
Goutham Veeramachaneni 37e7b69f56
Merge remote-tracking branch 'upstream/dev-2.0' into rulegroups
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-19 16:34:55 +05:30
Goutham Veeramachaneni 67dc73fd59
Flag changes for 2.0
Fixes: prometheus/prometheus#2087

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 20:21:41 +05:30
Goutham Veeramachaneni d407bd150c Consolidate the duration params in CLI
* All CLI params moved to model.Duration

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 20:20:57 +05:30
Goutham Veeramachaneni 6b70a4d850
Incorporate PR feedback
* Move fingerprint to Hash()
* Move away from tsdb.MultiError
* 0777 -> 0666 for files
* checkOverflow of extra fields

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 16:44:33 +05:30
Goutham Veeramachaneni 6c1617fd13
Simplify usage string
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 15:55:13 +05:30
Goutham Veeramachaneni 507790a357
Rework logging to use explicitly passed logger
Mostly cleaned up the global logger use. Still some uses in discovery
package.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 15:52:44 +05:30
Goutham Veeramachaneni dc69645e92
Move back to go-yaml
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-16 10:46:21 +05:30
Goutham Veeramachaneni 8abb91f656
Move CLI commander to cobra
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-15 16:38:08 +05:30
Goutham Veeramachaneni 1c08743721
Update check-rules to new format.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-14 13:32:26 +05:30
Goutham Veeramachaneni cea1e99f78
Add update-rules command to promtool
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-14 11:38:54 +05:30
Fabian Reinartz 669075c6b9 Merge branch 'master' into dev-2.0 2017-06-06 09:36:51 +02:00