The Prometheus monitoring system and time series database.
Matt Bostock 926a5ab3dd rules/manager.go: Fix race between reload and stop
On one relatively large Prometheus instance (1.7M series), I noticed
that upgrades were frequently resulting in Prometheus undergoing crash
recovery on start-up.

On closer examination, I found that Prometheus was panicking on
shutdown.

It seems that our configuration management (or misconfiguration thereof)
is reloading Prometheus then immediately restarting it, which I suspect
is causing this race:

    Sep 21 15:12:42 host systemd[1]: Reloading prometheus monitoring system.
    Sep 21 15:12:42 host prometheus[18734]: time="2016-09-21T15:12:42Z" level=info msg="Loading configuration file /etc/prometheus/config.yaml" source="main.go:221"
    Sep 21 15:12:42 host systemd[1]: Reloaded prometheus monitoring system.
    Sep 21 15:12:44 host systemd[1]: Stopping prometheus monitoring system...
    Sep 21 15:12:44 host prometheus[18734]: time="2016-09-21T15:12:44Z" level=warning msg="Received SIGTERM, exiting gracefully..." source="main.go:203"
    Sep 21 15:12:44 host prometheus[18734]: time="2016-09-21T15:12:44Z" level=info msg="See you next time!" source="main.go:210"
    Sep 21 15:12:44 host prometheus[18734]: time="2016-09-21T15:12:44Z" level=info msg="Stopping target manager..." source="targetmanager.go:90"
    Sep 21 15:12:52 host prometheus[18734]: time="2016-09-21T15:12:52Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:548"
    Sep 21 15:12:56 host prometheus[18734]: time="2016-09-21T15:12:56Z" level=warning msg="Error on ingesting out-of-order samples" numDropped=1 source="scrape.go:467"
    Sep 21 15:12:56 host prometheus[18734]: time="2016-09-21T15:12:56Z" level=error msg="Error adding file watch for \"/etc/prometheus/targets\": no such file or directory" source="file.go:84"
    Sep 21 15:12:56 host prometheus[18734]: time="2016-09-21T15:12:56Z" level=error msg="Error adding file watch for \"/etc/prometheus/targets\": no such file or directory" source="file.go:84"
    Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping rule manager..." source="manager.go:366"
    Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Rule manager stopped." source="manager.go:372"
    Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping notification handler..." source="notifier.go:325"
    Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping local storage..." source="storage.go:381"
    Sep 21 15:13:01 host prometheus[18734]: time="2016-09-21T15:13:01Z" level=info msg="Stopping maintenance loop..." source="storage.go:383"
    Sep 21 15:13:01 host prometheus[18734]: panic: close of closed channel
    Sep 21 15:13:01 host prometheus[18734]: goroutine 7686074 [running]:
    Sep 21 15:13:01 host prometheus[18734]: panic(0xba57a0, 0xc60c42b500)
    Sep 21 15:13:01 host prometheus[18734]: /usr/local/go/src/runtime/panic.go:500 +0x1a1
    Sep 21 15:13:01 host prometheus[18734]: github.com/prometheus/prometheus/rules.(*Manager).ApplyConfig.func1(0xc6645a9901, 0xc420271ef0, 0xc420338ed0, 0xc60c42b4f0, 0xc6645a9900)
    Sep 21 15:13:01 host prometheus[18734]: /home/build/packages/prometheus/tmp/build/gopath/src/github.com/prometheus/prometheus/rules/manager.go:412 +0x3c
    Sep 21 15:13:01 host prometheus[18734]: created by github.com/prometheus/prometheus/rules.(*Manager).ApplyConfig
    Sep 21 15:13:01 host prometheus[18734]: /home/build/packages/prometheus/tmp/build/gopath/src/github.com/prometheus/prometheus/rules/manager.go:423 +0x56b
    Sep 21 15:13:03 host systemd[1]: prometheus.service: main process exited, code=exited, status=2/INVALIDARGUMENT
2016-09-21 22:03:02 +01:00
.github .github: Add issue template 2016-06-06 11:48:14 +02:00
cmd Add HTTP Basic Auth & TLS support to the generic write path. (#1957) 2016-09-19 22:47:51 +02:00
config Add HTTP Basic Auth & TLS support to the generic write path. (#1957) 2016-09-19 22:47:51 +02:00
console_libraries Add blackbox console. 2015-11-01 20:06:52 +00:00
consoles The metrics are no longer ms, we can remove the scaling. 2016-06-29 01:09:24 +01:00
documentation Switch back to protos over HTTP, instead of GRPC. 2016-09-15 23:21:54 +01:00
notifier Simplify struct initialization 2016-09-14 23:13:27 -04:00
promql storage: Contextify storage interfaces. 2016-09-19 16:29:07 +02:00
relabel move relabeling functionality to its own package 2016-08-09 14:19:20 +02:00
retrieval Add HTTP Basic Auth & TLS support to the generic write path. (#1957) 2016-09-19 22:47:51 +02:00
rules rules/manager.go: Fix race between reload and stop 2016-09-21 22:03:02 +01:00
scripts New release process using docker, circleci and a centralized 2016-04-18 22:41:04 +02:00
storage Add HTTP Basic Auth & TLS support to the generic write path. (#1957) 2016-09-19 22:47:51 +02:00
template storage: Contextify storage interfaces. 2016-09-19 16:29:07 +02:00
util Add HTTP Basic Auth & TLS support to the generic write path. (#1957) 2016-09-19 22:47:51 +02:00
vendor Add deps for google cloud support 2016-09-16 08:51:58 +02:00
web storage: Contextify storage interfaces. 2016-09-19 16:29:07 +02:00
.dockerignore New release process using docker, circleci and a centralized 2016-04-18 22:41:04 +02:00
.gitignore gitignore: clean up 2016-07-04 11:34:33 +02:00
.promu.yml Use the default go version for the crossbuilt process 2016-07-30 11:19:56 +02:00
.travis.yml Add go_import_path to travis so it works on a fork. (#1995) 2016-09-15 17:05:56 -04:00
AUTHORS.md Update Fabian's email address 2016-03-24 17:02:57 +01:00
CHANGELOG.md Cut v1.1.3 2016-09-16 13:08:16 +02:00
circle.yml Use golang-builder base image for tests in CircleCI 2016-09-09 13:13:21 +02:00
CONTRIBUTING.md Update CONTRIBUTING.md. 2015-01-22 15:07:20 +01:00
Dockerfile Docker: Move console dirs to /usr/share/prometheus 2016-07-29 14:00:47 +01:00
LICENSE Clean up license issues. 2015-01-21 20:07:45 +01:00
Makefile Add promu installation logging to Makefile 2016-09-16 00:59:56 +02:00
NOTICE Add support for Zookeeper Serversets for SD. 2015-06-16 11:02:08 +01:00
README.md Link to goreport from README 2016-09-14 23:09:26 -04:00
VERSION Cut v1.1.3 2016-09-16 13:08:16 +02:00

Prometheus

Visit prometheus.io for the full documentation, examples and guides.

Prometheus, a Cloud Native Computing Foundation project, is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.

Prometheus' main distinguishing features as compared to other monitoring systems are:

  • a multi-dimensional data model (timeseries defined by metric name and set of key/value dimensions)
  • a flexible query language to leverage this dimensionality
  • no dependency on distributed storage; single server nodes are autonomous
  • timeseries collection happens via a pull model over HTTP
  • pushing timeseries is supported via an intermediary gateway
  • targets are discovered via service discovery or static configuration
  • multiple modes of graphing and dashboarding support
  • support for hierarchical and horizontal federation
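
To make the first feature concrete, here is a minimal sketch of how a time series can be identified by a metric name plus a set of key/value dimensions. The `Series` type and its `Key` method are hypothetical illustrations, not Prometheus's internal representation. Label names are sorted so that the same label set always produces the same identity.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Series is a hypothetical, simplified time series identity:
// a metric name plus a set of key/value dimensions (labels).
type Series struct {
	Name   string
	Labels map[string]string
}

// Key renders the identity as name{k1="v1",k2="v2"} with label names
// sorted, so map iteration order does not affect the result.
func (s Series) Key() string {
	names := make([]string, 0, len(s.Labels))
	for n := range s.Labels {
		names = append(names, n)
	}
	sort.Strings(names)

	parts := make([]string, 0, len(names))
	for _, n := range names {
		parts = append(parts, fmt.Sprintf("%s=%q", n, s.Labels[n]))
	}
	return s.Name + "{" + strings.Join(parts, ",") + "}"
}

func main() {
	get := Series{
		Name:   "http_requests_total",
		Labels: map[string]string{"method": "GET", "code": "200"},
	}
	post := Series{
		Name:   "http_requests_total",
		Labels: map[string]string{"method": "POST", "code": "200"},
	}
	// Same metric name, different label values: two distinct series.
	fmt.Println(get.Key())
	fmt.Println(post.Key())
}
```

Changing any label value yields a different series under the same metric name. This is the dimensionality that the query language selects and aggregates over.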

Architecture overview

Install

There are various ways of installing Prometheus.

Precompiled binaries

Precompiled binaries for released versions are available in the download section on prometheus.io. Using the latest production release binary is the recommended way of installing Prometheus. See the Installing chapter in the documentation for all the details.

Debian packages are available.

Docker images

Docker images are available on Quay.io.

Building from source

To build Prometheus from source, you need a working Go environment with version 1.5 or greater installed.

You can directly use the go tool to download and install the prometheus and promtool binaries into your GOPATH. We use Go 1.5's experimental vendoring feature, so you will also need to set the GO15VENDOREXPERIMENT=1 environment variable in this case:

$ GO15VENDOREXPERIMENT=1 go get github.com/prometheus/prometheus/cmd/...
$ prometheus -config.file=your_config.yml

You can also clone the repository yourself and build using make:

$ mkdir -p $GOPATH/src/github.com/prometheus
$ cd $GOPATH/src/github.com/prometheus
$ git clone https://github.com/prometheus/prometheus.git
$ cd prometheus
$ make build
$ ./prometheus -config.file=your_config.yml

The Makefile provides several targets:

  • build: build the prometheus and promtool binaries
  • test: run the tests
  • format: format the source code
  • vet: check the source code for common errors
  • assets: rebuild the static assets
  • docker: build a docker container for the current HEAD

More information

  • The source code is periodically indexed: Prometheus Core.
  • You will find a Travis CI configuration in .travis.yml.
  • All of the core developers are accessible via the Prometheus Developers mailing list and the #prometheus channel on irc.freenode.net.

Contributing

Refer to CONTRIBUTING.md.

License

Apache License 2.0, see LICENSE.