Commit graph

7154 commits

Author SHA1 Message Date
Fabian Reinartz 7ada9cd805 Simplify series create logic in head 2017-09-18 12:38:39 +02:00
beorn7 e7aab2791a Forward-merge bug fixes frem branch 'release-1.7' 2017-09-18 12:14:37 +02:00
beorn7 f6367afca4 Merge branch 'yamatoya-fix_web_ui_utc' into release-1.7 2017-09-18 12:08:14 +02:00
beorn7 7a8e340c1a Merge branch 'fix_web_ui_utc' of git://github.com/yamatoya/prometheus into yamatoya-fix_web_ui_utc 2017-09-18 12:07:52 +02:00
Takahito Yamatoya 5d707d3aa3 #2439 library version update JQuery / JQuery.Selection / JQuery.hotkey (#3183) 2017-09-18 11:45:57 +02:00
Goutham Veeramachaneni 2187388292 Merge pull request #146 from prometheus/deadlock
Add missing unlock on early return
2017-09-18 15:02:32 +05:30
Fabian Reinartz ab8d9b9706 Add missing unlock on early return 2017-09-18 11:23:22 +02:00
Fabian Reinartz 99d39174f6 Merge branch 'master' of github.com:prometheus/tsdb 2017-09-18 11:20:45 +02:00
Fabian Reinartz 8214dc82a7 Remove infinite block in benchmark 2017-09-18 11:20:25 +02:00
Takahito Yamatoya ff038a4a39 bug fix 2017-09-17 00:20:39 +09:00
Takahito Yamatoya 7a3c348f83 fix decimal y-axis 2017-09-17 00:16:40 +09:00
Tom Wilkie 758d64ffd9 s/EncodReadResponse/EncodeReadResponse/ 2017-09-16 11:15:03 +02:00
Tom Wilkie febed48703 Implement remote read server in Prometheus. 2017-09-16 11:13:01 +02:00
Takahito Yamatoya 738a51bea6 #2371 fix to display utc date at datetime picker 2017-09-16 11:38:29 +09:00
Goutham Veeramachaneni 3f0267c548 Merge branch 'dev-2.0' into go-kit/log
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-15 23:15:27 +05:30
beorn7 84211bd2df Foward-merge bug fixes and cherry-picks from 'release-1.7' 2017-09-15 13:44:22 +02:00
Matt Palmer 3369422327 Improve DNS response handling to prevent "stuck" records [Fixes #2799] (#3138)
The problem reported in #2799 was that in the event that all records for a
name were removed, the target group was never updated to be the "empty" set.
Essentially, whatever Prometheus last saw as a non-empty list of targets
would stay that way forever (or at least until Prometheus restarted...).  This
came about because of a fairly naive interpretation of what a valid-looking
DNS response actually looked like -- essentially, the only valid DNS responses
were ones that had a non-empty record list.  That's fine as long as your
config always lists only target names which have non-empty record sets; if
your environment happens to legitimately have empty record sets sometimes,
all hell breaks loose (otherwise-cleanly shutdown systems trigger up==0 alerts,
for instance).

This patch is a refactoring of the DNS lookup behaviour that maintains
existing behaviour with regard to search paths, but correctly handles empty
and non-existent record sets.

RFC1034 s4.3.1 says there's three ways a recursive DNS server can respond:

1.  Here is your answer (possibly an empty answer, because of the way DNS
   considers all records for a name, regardless of type, when deciding
   whether the name exists).

2. There is no spoon (the name you asked for definitely does not exist).

3. I am a teapot (something has gone terribly wrong).

Situations 1 and 2 are fine and dandy; whatever the answer is (empty or
otherwise) is the list of targets.  If something has gone wrong, then we
shouldn't go updating the target list because we don't really *know* what
the target list should be.

Multiple DNS servers to query is a straightforward augmentation; if you get
an error, then try the next server in the list, until you get an answer or
run out servers to ask.  Only if *all* the servers return errors should you
return an error to the calling code.

Where things get complicated is the search path.  In order to be able to
confidently say, "this name does not exist anywhere, you can remove all the
targets for this name because it's definitely GORN", at least one server for
*all* the possible names need to return either successful-but-empty
responses, or NXDOMAIN.  If any name errors out, then -- since that one
might have been the one where the records came from -- you need to say
"maintain the status quo until we get a known-good response".

It is possible, though unlikely, that a poorly-configured DNS setup (say,
one which had a domain in its search path for which all configured recursive
resolvers respond with REFUSED) could result in the same "stuck" records
problem we're solving here, but the DNS configuration should be fixed in
that case, and there's nothing we can do in Prometheus itself to fix the
problem.

I've tested this patch on a local scratch instance in all the various ways I
can think of:

1. Adding records (targets get scraped)

2. Adding records of a different type

3. Remove records of the requested type, leaving other type records intact
   (targets don't get scraped)

4. Remove all records for the name (targets don't get scraped)

5. Shutdown the resolver (targets still get scraped)

There's no automated test suite additions, because there isn't a test suite
for DNS discovery, and I was stretching my Go skills to the limit to make
this happen; mock objects are beyond me.
2017-09-15 12:26:10 +02:00
Goutham Veeramachaneni f5aed810f9 logging: Port to common/promlog
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-15 12:40:50 +05:30
Björn Rabenstein 4b8666b739 Merge pull request #3176 from prometheus/beorn7/release
Backport the templating fix from master
2017-09-14 19:07:52 +02:00
beorn7 7622c2bc5f Move to Go1.9 2017-09-14 18:26:57 +02:00
beorn7 a3fd7dd335 Backport the templating fix from master
The original fix is in commit 5f5d77848e
2017-09-14 18:12:00 +02:00
Julius Volz 8ebeed0b44 remote: Expose ClientConfig type (#3165)
The Client type is already exposed, but can't be used without the config for it
also being exposed. Using the remote.Client from other programs is useful to do
full end-to-end tests of Prometheus's remote protocol against adapter
implementations.
2017-09-14 15:25:09 +02:00
Björn Rabenstein df4bc3e407 Merge pull request #3170 from tomwilkie/1.7-2969-negative-shards
Prevent number of remote write shards from going negative.
2017-09-14 13:29:34 +02:00
Fabian Reinartz 1b80f631a8 Merge pull request #3172 from prometheus/cutbeta4
*: cut 2.0.0-beta.4
2017-09-14 13:20:26 +02:00
Fabian Reinartz a31e6522e4 *: cut 2.0.0-beta.4 2017-09-14 12:46:49 +02:00
Tom Wilkie f66f882d08 Merge pull request #3160 from bboreham/remote-keepalive
Re-enable http keepalive on remote storage
2017-09-14 08:23:43 +01:00
Tom Wilkie 4f8efdbd59 Prevent number of remote write shards from going negative.
This can happen in the situation where the system scales up the number of shards massively (to deal with some backlog), then scales it down again as the number of samples sent during the time period is less than the number received.
2017-09-14 08:07:40 +01:00
Fabian Reinartz 1121b9f7d4 retrieval: cache dropped series, mutate labels in place 2017-09-14 08:36:19 +02:00
Ben Kochie 1ab0bbb2c2 Merge pull request #3125 from prometheus/bjk/staticcheck
Enable statitcheck at build time.
2017-09-13 14:42:29 -07:00
Fabian Reinartz 643563068b Merge pull request #144 from Gouthamve/misc-
Expose NewIndexReader() and cleanups
2017-09-13 10:23:07 +02:00
Goutham Veeramachaneni 8919baef03
Expose NewIndexReader() and cleanups
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-13 13:47:20 +05:30
Fabian Reinartz e45bb1d328 Merge pull request #142 from krasi-georgiev/107-swap-cobra-with-kingpin
replaced cobra with kingpin for the tsdb cli tool
2017-09-12 19:28:35 +02:00
Björn Rabenstein 4d8e7ca185 Merge pull request #3159 from mattbostock/1.7_marathon_sd_cherrypick
Marathon SD: Set port index label
2017-09-12 18:53:40 +02:00
Fabian Reinartz 13f59329ab Merge pull request #3162 from prometheus/cutbeta3
*: cut 2.0.0-beta.3
2017-09-12 12:13:16 +02:00
Fabian Reinartz 7f300f27cb *: cut 2.0.0-beta.3 2017-09-12 12:13:31 +02:00
Fabian Reinartz 2b2e214857 vendor: update prometheus/tsdb 2017-09-12 12:01:54 +02:00
Fabian Reinartz 63c246f924 Merge branch 'dev-2.0' of github.com:prometheus/prometheus into dev-2.0 2017-09-12 12:01:09 +02:00
Matt Bostock e758260986 Marathon SD: Set port index label
The changes [1][] to Marathon service discovery to support multiple
ports mean that Prometheus now attempts to scrape all ports belonging to
a Marathon service.

You can use port definition or port mapping labels to filter out which
ports to scrape but that requires service owners to update their
Marathon configuration.

To allow for a smoother migration path, add a
`__meta_marathon_port_index` label, whose value is set to the port's
sequential index integer. For example, PORT0 has the value `0`, PORT1
has the value `1`, and so on.

This allows you to support scraping both the first available port (the
previous behaviour) in addition to ports with a `metrics` label.

For example, here's the relabel configuration we might use with
this patch:

    - action: keep
      source_labels: ['__meta_marathon_port_definition_label_metrics', '__meta_marathon_port_mapping_label_metrics', '__meta_marathon_port_index']
      # Keep if port mapping or definition has a 'metrics' label with any
      # non-empty value, or if no 'metrics' port label exists but this is the
      # service's first available port
      regex: ([^;]+;;[^;]+|;[^;]+;[^;]+|;;0)

This assumes that the Marathon API returns the ports in sorted order
(matching PORT0, PORT1, etc), which it appears that it does.

[1]: https://github.com/prometheus/prometheus/pull/2506
2017-09-11 13:40:51 +01:00
Bryan Boreham 9d6b945e41 Default HTTP keep-alive ON for remote read/write 2017-09-11 09:48:30 +00:00
Bryan Boreham e0a4d18301 Allow http keep-alive setting to be overridden in config 2017-09-11 09:07:14 +00:00
Fabian Reinartz e746282772 Merge branch 'master' into dev-2.0 2017-09-11 10:55:19 +02:00
Fabian Reinartz 3870ec285c Merge pull request #140 from prometheus/locks
Fix various races
2017-09-11 10:41:33 +02:00
Fabian Reinartz 3d8be398d6 Merge branch 'locks' of github.com:prometheus/tsdb into locks 2017-09-11 10:39:55 +02:00
Fabian Reinartz 24362567b9 Fix test flakes 2017-09-11 10:33:17 +02:00
Fabian Reinartz e6d819952b Merge pull request #3145 from prometheus/mempool
Use memory pools for scrape buffer
2017-09-11 10:23:19 +02:00
Fabian Reinartz 30d29b889c Merge pull request #141 from prometheus/dectmpstr
Fixup allocation regression during compaction
2017-09-11 10:06:49 +02:00
Fabian Reinartz bfe75cec35 Merge pull request #138 from Gouthamve/chunk-compress
Compress the series chunk details in index.
2017-09-11 10:01:42 +02:00
Krasi Georgiev 92d0414993 replaced cobra with kingpin
Signed-off-by: Krasi Georgiev <krasi.root@gmail.com>
2017-09-10 12:35:02 +03:00
Tobias Schmidt 8bee283f8a Merge pull request #2895 from jamiemoore/ec2_discovery_rolearn
Add the ability to assume a role for ec2 discovery
2017-09-09 19:20:47 +02:00
Jamie Moore 7a135e0a1b Add the ability to assume a role for ec2 discovery 2017-09-10 00:36:43 +10:00