4026 commits

Author SHA1 Message Date
Brian Brazil 37bc607e96 Rework sample limit to work for 2.0
Correctly update reported series.
Increment prometheus_target_scrapes_exceeded_sample_limit_total.
Add back unittests.
Ignore stale markers when calculating sample limit.

Fixes #2770
2017-05-31 15:41:51 +01:00
Brian Brazil 72a276e7ed Pass through storage errors in limitAppender. 2017-05-26 11:28:22 +01:00
Fabian Reinartz ab0ce4a8d9 *: cut v2.0.0-alpha.1 2017-05-24 17:31:48 +02:00
Fabian Reinartz 8fef036078 Merge pull request #2765 from prometheus/memmap
retrieval: Don't allocate map on every scrape
2017-05-24 17:23:59 +02:00
Fabian Reinartz 3d8661b8d5 Add comment 2017-05-24 17:05:42 +02:00
Fabian Reinartz 43ca652217 retrieval: Don't allocate map on every scrape 2017-05-24 16:23:48 +02:00
Fabian Reinartz 4c31061251 Merge branch 'master' into dev-2.0 2017-05-24 15:36:17 +02:00
Fabian Reinartz 025f5531ad Merge pull request #2681 from prometheus/grobie/reduce-noisy-append-errors
Handle errSeriesDropped correctly
2017-05-24 15:30:02 +02:00
Fabian Reinartz d3f662f15e Merge branch 'dev-2.0' into grobie/reduce-noisy-append-errors 2017-05-24 15:29:30 +02:00
Brian Brazil e5f94145b8 Drop series for federation if latest sample is stale. 2017-05-24 14:27:17 +01:00
Brian Brazil 220e78b9c3 Consider a series stale after 4.1 intervals with no data.
To cover the cases where stale markers may not be available,
we need to infer the interval and mark series stale based on that.
As we're lacking stale markers this is less accurate, however
it should be good enough for these cases.

We need 4 intervals because, if we had data at t=0 and t=10
coming via federation, the next data point should be at t=20. However, it
could take up to t=30 for it to actually be ingested, t=40 for it to be
scraped via federation and t=50 for it to be ingested.
We then add 10% on to that for slack, as we do elsewhere.
2017-05-24 14:27:17 +01:00
Brian Brazil c02c25d5ba Allow peeking back further in buffer. 2017-05-24 14:27:17 +01:00
Fabian Reinartz 10d8b6b633 Merge pull request #2764 from prometheus/nullparse
pkg/textparse: allow null bytes in label values
2017-05-24 15:24:38 +02:00
conorbroderick 9c953064c3 check if result is a scalar in order to display correct number of returned time series 2017-05-24 14:07:24 +01:00
Fabian Reinartz bdc763f95f pkg/textparse: allow null bytes in label values 2017-05-24 14:52:46 +02:00
Brian Brazil dcea3e4773 Don't append a 0 when alert is no longer pending/firing
With staleness we no longer need this behaviour.
2017-05-24 13:52:45 +01:00
Brian Brazil cc867dae60 Copy previous series and alert state more intelligently.
Usually rules don't move around, and if they do it's likely
that rules/alerts with the same name stay in the same order.

If rules/alerts with the same name are added/removed this
could cause a blip for one cycle, but this is unavoidable
without requiring rule and alert names to be unique - which we don't
want to do.
2017-05-24 13:52:45 +01:00
Brian Brazil 9bc68db7e6 Track staleness per rule rather than per group. 2017-05-24 13:52:45 +01:00
Brian Brazil 0451d6d31b Add unittest for rule staleness, and rules generally. 2017-05-24 13:52:45 +01:00
Brian Brazil 0400f3cfd2 Very basic staleness handling for rules. 2017-05-24 13:52:45 +01:00
Brian Brazil 9aa8f822c1 Fix typo 2017-05-24 13:52:45 +01:00
Fabian Reinartz 09fcbf78df Merge pull request #2755 from brancz/redirect-prefix
prefix redirect with external url path
2017-05-24 10:09:47 +02:00
Tobias Schmidt 5405a4724f Use tag names consistently (#2743) 2017-05-23 14:14:15 +02:00
Fabian Reinartz fa58fdbd5d Merge pull request #2753 from prometheus/uptsdb2
storage: update TSDB
2017-05-22 16:32:25 +02:00
Frederic Branczyk ad22606a3d web: prefix redirect with ExternalURL path 2017-05-22 14:56:52 +02:00
Frederic Branczyk 45df5c2daf Merge branch 'release-1.6' 2017-05-22 13:44:44 +02:00
Fabian Reinartz d289dc55c3 storage: update TSDB 2017-05-22 11:53:08 +02:00
Fabian Reinartz ea09299ca5 pkg/textparse: handle trailing labels comma (#2752) 2017-05-22 11:15:40 +02:00
Jacky Wu 75b89739de Fix go version hint. (#2750) 2017-05-20 18:33:14 +02:00
Fabian Reinartz 10cccd2e45 Bump version file 2017-05-19 09:56:41 +02:00
Frederic Branczyk 7d17ecbd48 Merge pull request #2735 from brancz/cut-1.6.3
cut 1.6.3
2017-05-18 16:56:54 +02:00
Frederic Branczyk 53a2bd71b9 *: cut 1.6.3 2017-05-18 16:51:46 +02:00
Tobias Schmidt 2ae2b663a9 Create sha256 checksums file during release 2017-05-18 16:50:44 +02:00
Tom Wilkie e9787382b4 Ensure ewma int64s are always aligned. (#2675) 2017-05-18 16:50:44 +02:00
Frederic Branczyk 363554f675 Merge pull request #2739 from Conorbro/stack-graph-fix
Fixed graph ui max/min logic to accommodate for toggling of stacked graph option
2017-05-18 16:49:30 +02:00
conorbroderick 9287a01bbf Fixed fixed yaxis of stacked graph being cut off 2017-05-18 15:18:29 +01:00
Frederic Branczyk b916b3784b Merge pull request #2731 from brancz/lset-non-cloned
notifier: clone and not reuse LabelSet in AM discovery
2017-05-18 10:59:38 +02:00
Frederic Branczyk 94e8b43aae notifier: clone and not reuse LabelSet in AM discovery 2017-05-18 10:12:42 +02:00
Fazal Majid 0e05cccfbd updated logrus so Prometheus can build on Solaris/Illumos (#2733) 2017-05-17 22:50:43 +02:00
Brian Brazil 0920972f79 Initialise scraped sample map, and rename to series map. 2017-05-16 18:33:51 +01:00
Brian Brazil bf38963118 Plumb through logger with target field to scrape loop. 2017-05-16 18:33:51 +01:00
Brian Brazil d657d722dc Log count of duplicates/out of order samples as warnings.
Keep log of each sample as debug log.
2017-05-16 18:33:51 +01:00
Brian Brazil 8b9d3e7547 Put end of run staleness handler in separate function.
Improve log message.
2017-05-16 18:33:51 +01:00
Brian Brazil d532272520 Add stalemarkers to synthetic series too when target stops. 2017-05-16 18:33:51 +01:00
Brian Brazil b87d3ca9ea Create stale markers when a target is stopped.
When a target is no longer returned from SD stop()
is called. However it may be recreated before the
next scrape interval happens. So we wait to set stalemarkers
until the scrape of the new target would have happened
and been ingested, which is 2 scrape intervals.

If we're shutting down the context will be cancelled,
so return immediately rather than holding things up for potentially
minutes waiting to safely set stalemarkers no newer than now.
If the server starts immediately back up again all is well.
If not, we're missing some stale markers.
2017-05-16 18:33:51 +01:00
Brian Brazil 73049ba79d Make the choice of NaN values clearer.
Also switch stale nan value to one more suitable for expansion.
2017-05-16 18:33:51 +01:00
Brian Brazil 95162ebc16 Add log messages for out of order samples 2017-05-16 18:33:51 +01:00
Brian Brazil 3c45400130 Don't fail scrape if one sample violates ordering.
In Prometheus 1.x one sample that is out of order
or that has a duplicate timestamp is discarded, and
the rest of the scrape ingestion continues on.
This will now also be true for 2.0.
2017-05-16 18:33:51 +01:00
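
A minimal Go sketch of the behaviour described above: a sample that is out of order or has a duplicate timestamp is dropped and counted, and the rest of the scrape is still ingested. All names here (`appender`, `ingest`, `errOutOfOrder`, `errDuplicate`) are hypothetical and chosen for this sketch. The real storage interface is not shown in this log.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values standing in for the storage layer's errors.
var (
	errOutOfOrder = errors.New("out of order sample")
	errDuplicate  = errors.New("duplicate timestamp")
)

type sample struct {
	series string
	t      int64
	v      float64
}

// appender is a toy in-memory store that remembers the last timestamp
// accepted for each series.
type appender struct {
	lastT map[string]int64
	kept  []sample
}

func (a *appender) add(s sample) error {
	if last, ok := a.lastT[s.series]; ok {
		if s.t < last {
			return errOutOfOrder
		}
		if s.t == last {
			return errDuplicate
		}
	}
	a.lastT[s.series] = s.t
	a.kept = append(a.kept, s)
	return nil
}

// ingest appends every sample of a scrape. Ordering violations are counted
// and skipped. Any other error aborts the scrape.
func ingest(a *appender, scrape []sample) (int, error) {
	dropped := 0
	for _, s := range scrape {
		err := a.add(s)
		switch {
		case err == nil:
		case errors.Is(err, errOutOfOrder), errors.Is(err, errDuplicate):
			dropped++
		default:
			return dropped, err
		}
	}
	return dropped, nil
}

func main() {
	a := &appender{lastT: map[string]int64{}}
	scrape := []sample{
		{"a", 10, 1}, // kept
		{"a", 5, 2},  // out of order, dropped
		{"b", 10, 3}, // kept
		{"b", 10, 4}, // duplicate timestamp, dropped
	}
	dropped, err := ingest(a, scrape)
	fmt.Println(dropped, len(a.kept), err)
}
```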
Brian Brazil fd5c5a50a3 Add stale markers on parse error.
If we fail to parse the result of a scrape,
we should treat that as a failed scrape and
add stale markers.
2017-05-16 18:33:51 +01:00
Brian Brazil c0c7e32e61 Treat a failed scrape as an empty scrape for staleness.
If a target has died but is still in SD, we want the previously
scraped values to go stale. This would also apply to brief blips.
2017-05-16 18:33:51 +01:00