Commit graph

5879 commits

Author SHA1 Message Date
Fabian Reinartz 437f51a85f Fix cache maintenance on changing metric representations
We were not properly maintaining the scrape cache when the same metric
was exposed with a different string representation.
This overall reduces the scraping cache's complexity, which fixes the
issue and saves about 10% of memory in a scraping-only Prometheus
instance.
2017-09-19 15:03:27 +02:00
Fabian Reinartz a04be0bc1c vendor: update prometheus/tsdb 2017-09-19 14:31:15 +02:00
Takahito Yamatoya b1151bdabc ui changed, limit the number of digits for the larger units to 5 2017-09-19 11:11:39 +09:00
Fabian Reinartz d6fbfb49eb Merge pull request #3137 from krasi-georgiev/3083-api-ignores-prefix-option
fixes #3083 - api ignores prefix option - when binary started with custom web.external-url
2017-09-18 18:10:50 +02:00
Krasi Georgiev b4b0999e7f add prefix to the api when prometheus started with custom web.external-url
Signed-off-by: Krasi Georgiev <krasi.root@gmail.com>
2017-09-18 17:59:27 +03:00
Takahito Yamatoya 1eac566d09 add ; , and change from B to G, and change from K to k, and add all the prefixes 2017-09-18 22:55:22 +09:00
Tom Wilkie fae3bd17b9 Merge pull request #3132 from tomwilkie/fix-debug-handlers
Get pprof handlers working again
2017-09-18 14:24:08 +01:00
Tom Wilkie bbc9671d50 Get profile handlers working again after #3054 and #3146.
Ensures the pprof endpoints deal with path-prefixes correctly; adds a test so we don't break it again.
2017-09-18 13:27:09 +01:00
Goutham Veeramachaneni 6c0070986d Merge pull request #3152 from Gouthamve/go-kit/log
Move logging to go-kit logger
2017-09-18 16:35:44 +05:30
beorn7 e7aab2791a Forward-merge bug fixes from branch 'release-1.7' 2017-09-18 12:14:37 +02:00
beorn7 f6367afca4 Merge branch 'yamatoya-fix_web_ui_utc' into release-1.7 2017-09-18 12:08:14 +02:00
beorn7 7a8e340c1a Merge branch 'fix_web_ui_utc' of git://github.com/yamatoya/prometheus into yamatoya-fix_web_ui_utc 2017-09-18 12:07:52 +02:00
Takahito Yamatoya 5d707d3aa3 #2439 library version update JQuery / JQuery.Selection / JQuery.hotkey (#3183) 2017-09-18 11:45:57 +02:00
Takahito Yamatoya ff038a4a39 bug fix 2017-09-17 00:20:39 +09:00
Takahito Yamatoya 7a3c348f83 fix decimal y-axis 2017-09-17 00:16:40 +09:00
Tom Wilkie 758d64ffd9 s/EncodReadResponse/EncodeReadResponse/ 2017-09-16 11:15:03 +02:00
Tom Wilkie febed48703 Implement remote read server in Prometheus. 2017-09-16 11:13:01 +02:00
Takahito Yamatoya 738a51bea6 #2371 fix to display utc date at datetime picker 2017-09-16 11:38:29 +09:00
Goutham Veeramachaneni 3f0267c548 Merge branch 'dev-2.0' into go-kit/log
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-15 23:15:27 +05:30
beorn7 84211bd2df Forward-merge bug fixes and cherry-picks from 'release-1.7' 2017-09-15 13:44:22 +02:00
Matt Palmer 3369422327 Improve DNS response handling to prevent "stuck" records [Fixes #2799] (#3138)
The problem reported in #2799 was that in the event that all records for a
name were removed, the target group was never updated to be the "empty" set.
Essentially, whatever Prometheus last saw as a non-empty list of targets
would stay that way forever (or at least until Prometheus restarted...).  This
came about because of a fairly naive interpretation of what a valid-looking
DNS response actually looked like -- essentially, the only valid DNS responses
were ones that had a non-empty record list.  That's fine as long as your
config always lists only target names which have non-empty record sets; if
your environment happens to legitimately have empty record sets sometimes,
all hell breaks loose (otherwise-cleanly shutdown systems trigger up==0 alerts,
for instance).

This patch is a refactoring of the DNS lookup behaviour that maintains
existing behaviour with regard to search paths, but correctly handles empty
and non-existent record sets.

RFC1034 s4.3.1 says there's three ways a recursive DNS server can respond:

1.  Here is your answer (possibly an empty answer, because of the way DNS
   considers all records for a name, regardless of type, when deciding
   whether the name exists).

2. There is no spoon (the name you asked for definitely does not exist).

3. I am a teapot (something has gone terribly wrong).

Situations 1 and 2 are fine and dandy; whatever the answer is (empty or
otherwise) is the list of targets.  If something has gone wrong, then we
shouldn't go updating the target list because we don't really *know* what
the target list should be.

Multiple DNS servers to query is a straightforward augmentation; if you get
an error, then try the next server in the list, until you get an answer or
run out of servers to ask.  Only if *all* the servers return errors should you
return an error to the calling code.

Where things get complicated is the search path.  In order to be able to
confidently say, "this name does not exist anywhere, you can remove all the
targets for this name because it's definitely GORN", at least one server for
*all* the possible names needs to return either successful-but-empty
responses, or NXDOMAIN.  If any name errors out, then -- since that one
might have been the one where the records came from -- you need to say
"maintain the status quo until we get a known-good response".
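
The server fallback and search-path rules described above can be sketched
roughly as follows. This is an illustration only, not the code in this patch:
the `outcome` type, `queryFunc`, `askServers` and `lookup` are hypothetical
names, the real DNS query is replaced by a stand-in function, and returning
the first non-empty answer is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// outcome classifies one server's reply, following the three cases above.
type outcome int

const (
	answered outcome = iota // case 1: an answer, possibly empty
	nxdomain                // case 2: the name definitely does not exist
	failed                  // case 3: something went wrong
)

// queryFunc is a hypothetical stand-in for a real DNS query.
type queryFunc func(server, name string) ([]string, outcome)

var errUnknown = errors.New("no known-good response; keep existing targets")

// askServers tries each server in turn and returns the first reply that is
// not a failure. It reports failed only if every server failed.
func askServers(q queryFunc, servers []string, name string) ([]string, outcome) {
	for _, s := range servers {
		if recs, o := q(s, name); o != failed {
			return recs, o
		}
	}
	return nil, failed
}

// lookup walks the candidate names derived from the search path.
// It returns the first non-empty answer. It returns an empty list with a nil
// error only if every name got a successful-but-empty or NXDOMAIN reply.
// If any name failed on all servers, it returns an error so that the caller
// leaves the target list untouched.
func lookup(q queryFunc, servers, names []string) ([]string, error) {
	sawFailure := false
	for _, n := range names {
		recs, o := askServers(q, servers, n)
		switch {
		case o == failed:
			sawFailure = true
		case len(recs) > 0:
			return recs, nil
		}
	}
	if sawFailure {
		return nil, errUnknown
	}
	return nil, nil
}

func main() {
	// The first server is down; the second says the name does not exist.
	fake := func(server, name string) ([]string, outcome) {
		if server == "ns1" {
			return nil, failed
		}
		return nil, nxdomain
	}
	recs, err := lookup(fake, []string{"ns1", "ns2"}, []string{"db.", "db.example.com."})
	fmt.Println(recs, err)
}
```

The caller would update the target group (possibly to the empty set) only
when the error is nil, and keep the previous targets otherwise.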

It is possible, though unlikely, that a poorly-configured DNS setup (say,
one which had a domain in its search path for which all configured recursive
resolvers respond with REFUSED) could result in the same "stuck" records
problem we're solving here, but the DNS configuration should be fixed in
that case, and there's nothing we can do in Prometheus itself to fix the
problem.

I've tested this patch on a local scratch instance in all the various ways I
can think of:

1. Adding records (targets get scraped)

2. Adding records of a different type

3. Remove records of the requested type, leaving other type records intact
   (targets don't get scraped)

4. Remove all records for the name (targets don't get scraped)

5. Shutdown the resolver (targets still get scraped)

There's no automated test suite additions, because there isn't a test suite
for DNS discovery, and I was stretching my Go skills to the limit to make
this happen; mock objects are beyond me.
2017-09-15 12:26:10 +02:00
Goutham Veeramachaneni f5aed810f9 logging: Port to common/promlog
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-15 12:40:50 +05:30
Björn Rabenstein 4b8666b739 Merge pull request #3176 from prometheus/beorn7/release
Backport the templating fix from master
2017-09-14 19:07:52 +02:00
beorn7 7622c2bc5f Move to Go1.9 2017-09-14 18:26:57 +02:00
beorn7 a3fd7dd335 Backport the templating fix from master
The original fix is in commit 5f5d77848e
2017-09-14 18:12:00 +02:00
Julius Volz 8ebeed0b44 remote: Expose ClientConfig type (#3165)
The Client type is already exposed, but can't be used without the config for it
also being exposed. Using the remote.Client from other programs is useful to do
full end-to-end tests of Prometheus's remote protocol against adapter
implementations.
2017-09-14 15:25:09 +02:00
Björn Rabenstein df4bc3e407 Merge pull request #3170 from tomwilkie/1.7-2969-negative-shards
Prevent number of remote write shards from going negative.
2017-09-14 13:29:34 +02:00
Fabian Reinartz 1b80f631a8 Merge pull request #3172 from prometheus/cutbeta4
*: cut 2.0.0-beta.4
2017-09-14 13:20:26 +02:00
Fabian Reinartz a31e6522e4 *: cut 2.0.0-beta.4 2017-09-14 12:46:49 +02:00
Tom Wilkie f66f882d08 Merge pull request #3160 from bboreham/remote-keepalive
Re-enable http keepalive on remote storage
2017-09-14 08:23:43 +01:00
Tom Wilkie 4f8efdbd59 Prevent number of remote write shards from going negative.
This can happen in the situation where the system scales up the number of shards massively (to deal with some backlog), then scales it down again as the number of samples sent during the time period is less than the number received.
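One way to guard against a computed shard count dropping below a sane floor is to clamp it before use. The sketch below is illustrative only and is not the code from this commit; `clampShards`, `minShards` and `maxShards` are hypothetical names.

```go
package main

import "fmt"

// clampShards bounds a computed shard count to [minShards, maxShards].
// The names and bounds are assumptions made for this sketch.
func clampShards(desired, minShards, maxShards int) int {
	if desired < minShards {
		return minShards
	}
	if desired > maxShards {
		return maxShards
	}
	return desired
}

func main() {
	// A negative estimate, as can arise after a large scale-up followed by
	// a scale-down, is pulled back up to the floor.
	fmt.Println(clampShards(-3, 1, 1000))
	fmt.Println(clampShards(5000, 1, 1000))
	fmt.Println(clampShards(12, 1, 1000))
}
```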
2017-09-14 08:07:40 +01:00
Fabian Reinartz 1121b9f7d4 retrieval: cache dropped series, mutate labels in place 2017-09-14 08:36:19 +02:00
Ben Kochie 1ab0bbb2c2 Merge pull request #3125 from prometheus/bjk/staticcheck
Enable staticcheck at build time.
2017-09-13 14:42:29 -07:00
Björn Rabenstein 4d8e7ca185 Merge pull request #3159 from mattbostock/1.7_marathon_sd_cherrypick
Marathon SD: Set port index label
2017-09-12 18:53:40 +02:00
Fabian Reinartz 13f59329ab Merge pull request #3162 from prometheus/cutbeta3
*: cut 2.0.0-beta.3
2017-09-12 12:13:16 +02:00
Fabian Reinartz 7f300f27cb *: cut 2.0.0-beta.3 2017-09-12 12:13:31 +02:00
Fabian Reinartz 2b2e214857 vendor: update prometheus/tsdb 2017-09-12 12:01:54 +02:00
Fabian Reinartz 63c246f924 Merge branch 'dev-2.0' of github.com:prometheus/prometheus into dev-2.0 2017-09-12 12:01:09 +02:00
Matt Bostock e758260986 Marathon SD: Set port index label
The changes [1][] to Marathon service discovery to support multiple
ports mean that Prometheus now attempts to scrape all ports belonging to
a Marathon service.

You can use port definition or port mapping labels to filter out which
ports to scrape but that requires service owners to update their
Marathon configuration.

To allow for a smoother migration path, add a
`__meta_marathon_port_index` label, whose value is set to the port's
sequential index integer. For example, PORT0 has the value `0`, PORT1
has the value `1`, and so on.

This allows you to support scraping both the first available port (the
previous behaviour) in addition to ports with a `metrics` label.

For example, here's the relabel configuration we might use with
this patch:

    - action: keep
      source_labels: ['__meta_marathon_port_definition_label_metrics', '__meta_marathon_port_mapping_label_metrics', '__meta_marathon_port_index']
      # Keep if port mapping or definition has a 'metrics' label with any
      # non-empty value, or if no 'metrics' port label exists but this is the
      # service's first available port
      regex: ([^;]+;;[^;]+|;[^;]+;[^;]+|;;0)

This assumes that the Marathon API returns the ports in sorted order
(matching PORT0, PORT1, etc), which it appears that it does.
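
To see which label combinations the regex above keeps, here is a small
standalone sketch. It assumes the usual relabelling behaviour of joining the
source label values with `;` and matching the regex against the whole joined
string; the `keep` helper and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// keepRE is the regex from the relabel rule, anchored at both ends because
// the match is assumed to cover the whole joined string.
var keepRE = regexp.MustCompile(`^(?:([^;]+;;[^;]+|;[^;]+;[^;]+|;;0))$`)

// keep joins the three source label values with ";" and reports whether the
// rule would keep the target.
func keep(definitionLabel, mappingLabel, portIndex string) bool {
	joined := strings.Join([]string{definitionLabel, mappingLabel, portIndex}, ";")
	return keepRE.MatchString(joined)
}

func main() {
	fmt.Println(keep("yes", "", "1")) // port definition has a 'metrics' label
	fmt.Println(keep("", "yes", "2")) // port mapping has a 'metrics' label
	fmt.Println(keep("", "", "0"))    // no label, first port
	fmt.Println(keep("", "", "1"))    // no label, not the first port
}
```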

[1]: https://github.com/prometheus/prometheus/pull/2506
2017-09-11 13:40:51 +01:00
Bryan Boreham 9d6b945e41 Default HTTP keep-alive ON for remote read/write 2017-09-11 09:48:30 +00:00
Bryan Boreham e0a4d18301 Allow http keep-alive setting to be overridden in config 2017-09-11 09:07:14 +00:00
Fabian Reinartz e746282772 Merge branch 'master' into dev-2.0 2017-09-11 10:55:19 +02:00
Fabian Reinartz e6d819952b Merge pull request #3145 from prometheus/mempool
Use memory pools for scrape buffer
2017-09-11 10:23:19 +02:00
Tobias Schmidt 8bee283f8a Merge pull request #2895 from jamiemoore/ec2_discovery_rolearn
Add the ability to assume a role for ec2 discovery
2017-09-09 19:20:47 +02:00
Jamie Moore 7a135e0a1b Add the ability to assume a role for ec2 discovery 2017-09-10 00:36:43 +10:00
Fabian Reinartz d21f149745 *: migrate to go-kit/log 2017-09-08 22:01:51 +05:30
Fabian Reinartz 9b4c3d4254 Merge pull request #3146 from prometheus/fixprofpath
web: fix profile paths
2017-09-08 14:19:46 +02:00
Fabian Reinartz 64c7c56df8 Merge pull request #3147 from dvrkps/patch-1
travis: add 1.x to go versions
2017-09-08 09:36:14 +02:00
Davor Kapsa bb853abf24 travis: add 1.x to go versions 2017-09-07 17:24:02 +02:00
Fabian Reinartz 27bdddbf51 web: fix profile paths 2017-09-07 16:24:12 +02:00