Commit graph

270 commits

Author SHA1 Message Date
Matthieu MOREL 7eaefcf379
ci(lint): enable errorlint on scrape (#12923)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Signed-off-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
2023-11-01 20:06:46 +01:00
George Krajcsovits e399395b01
Native histograms vs labels (#13005)
* Document le and quantile label transition due to native histograms

Fixes: #12984

For full explanation see the related issue. The le and quantile labels
are formatted as float with trailing .0 for whole number values when
native histograms is enabled, e.g. 10.0. This changes the resulting series
in Prometheus if previously we scraped the whole number itself, e.g. 10
over the text format.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2023-11-01 18:30:34 +01:00
Björn Rabenstein a43669e611
Merge pull request #12928 from alexandear/ci-enable-godot
ci(lint): enable godot; append dot at the end of comments
2023-11-01 17:15:41 +01:00
Julien Pivotto 84aadfc45b scrape: Added trackTimestampsStaleness configuration option
Add the ability to track staleness when an explicit timestamp is set.
Useful for cAdvisor.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-10-31 16:58:42 -04:00
Oleksandr Redko fa90ca46e5 ci(lint): enable godot; append dot at the end of comments
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-10-31 19:53:38 +02:00
Oleksandr Redko 8e5f0387a2
ci(lint): enable nolintlint and remove redundant comments (#12926)
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2023-10-31 12:35:13 +01:00
Paulin Todev 5752050b42
Scrape metrics can now be registered with a non-default registry.
* A registerer is passed to the scrape Manager,
and all scrape metrics register with it.
* For now the registry which we pass to the scrape
Manager is still the global one.

Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
2023-10-11 16:19:00 +01:00
Bartlomiej Plotka 624b973ebf
Added ability to specify scrape protocols to accept during HTTP content type negotiation. (#12738)
* Added ability to specify scrape protocols to accept during HTTP content type negotiation.


This is done via new option in GlobalConfig and ScrapeConfig: "scrape_protocol"

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Fixed readability and log message.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
2023-10-10 11:16:55 +01:00
Bryan Boreham f6d9c84fde
scraping: delay creating buffer, to save memory (#12953)
We don't need the buffer to read the response until the scrape http call
returns; creating it earlier makes the buffer pool larger.

I split `scrape()` into `scrape()` which returns with the http response,
and `readResponse()` which decompresses and copies the data into the
supplied buffer. This design was chosen to minimize impact on the logic.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-10-09 17:23:53 +01:00
Bryan Boreham 7c934ae18c scraping: hoist labels variable to save garbage
`lset` escapes to heap due to being passed through the text-parser
interface, so we can reduce garbage by hoisting it out of the loop so
only one allocation is done for every series in a scrape.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-10-05 11:04:59 +00:00
Goutham Veeramachaneni 86729d4d7b
Update exp package (#12650) 2023-09-21 22:53:51 +02:00
Arve Knudsen 6daee89e5f
Add context argument to Querier.Select (#12660)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-12 12:37:38 +02:00
Bryan Boreham d73b4acb30
Merge pull request #12737 from prometheus/beorn7/histogram
textparse: fix infinite loop during exemplar parsing
2023-08-23 09:36:56 +01:00
György Krajcsovits 983c0c5e9d Add missing buckets
My previous proposal for a fix was wrong and also missed these.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2023-08-21 14:44:53 +02:00
György Krajcsovits 2ae8c2bd3d Set expected values in test
The parsing doesn't seem to be perfect as I don't get all classic buckets
possibly another bug found?

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2023-08-21 13:55:13 +02:00
György Krajcsovits 2a781ec5ac Replicate infinite loop in native-classic histogram scrape
Enable scraping a native histogram with exemplars that leads to
infinite loop.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2023-08-21 13:12:45 +02:00
Bryan Boreham 611f50bb3d scrape: retain all dropped targets when KeepDroppedTargets is zero
This was a bug.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-20 14:32:23 +01:00
Bryan Boreham 627c99424b scrape: extend TestDroppedTargetsList to check counts
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-20 14:32:23 +01:00
Bryan Boreham 1e3fef6ab0
scraping: limit detail on dropped targets, to save memory (#12647)
It's possible (quite common on Kubernetes) to have a service discovery
return thousands of targets then drop most of them in relabel rules.
The main place this data is used is to display in the web UI, where
you don't want thousands of lines of display.

The new limit is `keep_dropped_targets`, which defaults to 0
for backwards-compatibility.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-14 15:39:25 +01:00
beorn7 536a487af4 scrape: Refactor names of float samples
Continue to remove confusion that histogram samples are also samples
and histogram values are also values etc. by renaming float values and
float samples using the same schema as for histograms.

Concretely:
- result → resultFloats (corresponding to resultHistograms)
- pendingResult → pendingFloats (corresponding to pendingHistograms)
- rolledbackResult → rolledbackFloats (corresponding to rolledbackHistograms)
- sample → floatSample (corresponding to histogramSample)

This also order the fields in `collectResultAppender` more
consistently.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-07-13 14:27:51 +02:00
beorn7 0e3f35324b scrape: Enable ingestion of multiple exemplars per sample
This has become a requirement for native histograms, as a single
histogram sample commonly has many buckets, so that providing many
exemplars makes sense.

Since OM text doesn't support native histograms yet, the test had to
be expanded to also support protobuf test cases.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-07-13 14:16:10 +02:00
Bryan Boreham 5255bf06ad Replace sort.Slice with faster slices.SortFunc
The generic version is more efficient.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-07-02 22:17:08 +00:00
Julius Volz ac8abdaacd
Rename remaining jitterSeed -> offsetSeed variables (#12414)
I had changed the naming from "jitter" to "offset" in:

cb045c0e4b

...but I forgot to add this file to the commit to complete the renaming,
doing that now.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2023-06-05 17:36:11 +02:00
Julius Volz cb045c0e4b Fix wording from "jitterSeed" -> "offsetSeed" for server-wide scrape offsets
In digital communication, "jitter" usually refers to how much a signal deviates
from true periodicity, see https://en.wikipedia.org/wiki/Jitter. The way we are
using the "jitterSeed" in Prometheus does not affect the true periodicity at
all, but just introduces a constant phase shift (or offset) within the period.
So it would be more correct and less confusing to call the "jitterSeed" an
"offsetSeed" instead.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2023-05-25 11:54:00 +02:00
beorn7 9e500345f3 textparse/scrape: Add option to scrape both classic and native histograms
So far, if a target exposes a histogram with both classic and native
buckets, a native-histogram enabled Prometheus would ignore the
classic buckets. With the new scrape config option
`scrape_classic_histograms` set, both buckets will be ingested,
creating all the series of a classic histogram in parallel to the
native histogram series. For example, a histogram `foo` would create a
native histogram series `foo` and classic series called `foo_sum`,
`foo_count`, and `foo_bucket`.

This feature can be used in a migration strategy from classic to
native histograms, where it is desired to have a transition period
during which both native and classic histograms are present.

Note that two bugs in classic histogram parsing were found and fixed
as a byproduct of testing the new feature:

1. Series created from classic _gauge_ histograms didn't get the
   _sum/_count/_bucket prefix set.
2. Values of classic _float_ histograms weren't parsed properly.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-05-13 01:32:25 +02:00
Björn Rabenstein bd98fc8c45
Merge pull request #12254 from zenador/histogram-bucket-limit
Implement bucket limit for native histograms
2023-05-10 17:42:29 +02:00
Jeanette Tan 40240c9c1c Update according to code review
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-05-05 02:33:00 +08:00
György Krajcsovits 19a4f314f5 Refactor testutil/protobuf.go into scrape package
Renamed to clientprotobuf.go and added comments to indicate the
intended usage.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2023-05-04 08:36:44 +02:00
Russ Cox 28f5502828 scrape: fix two loop variable scoping bugs in test
Consider code like:

	for i := 0; i < numTargets; i++ {
		stopFuncs = append(stopFuncs, func() {
			time.Sleep(i*20*time.Millisecond)
		})
	}

Because the loop variable i is shared by all closures,
all the stopFuncs sleep for numTargets*20 ms.

If the i were made per-iteration, as we are considering
for a future Go release, the stopFuncs would have sleep
durations ranging from 0 to (numTargets-1)*20 ms.

Two tests had code like this and were checking that the
aggregate sleep was at least numTargets*20 ms
("at least as long as the last target slept"). This is only true
today because i == numTarget during all the sleeps.

To keep the code working even if the semantics of this loop
change, this PR computes

	d := time.Duration((i+1)*20) * time.Millisecond

outside the closure (but inside the loop body), and then each
closure has its own d. Now the sleeps range from 20 ms
to numTargets*20 ms, keeping the test passing
(and probably behaving closer to the intent of the test author).

The failure being fixed can be reproduced by using the current
Go development branch with

	GOEXPERIMENT=loopvar go test

Signed-off-by: Russ Cox <rsc@golang.org>
2023-04-26 10:33:10 -04:00
Jeanette Tan dfabc69303 Add tests according to code review
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-04-25 02:07:36 +08:00
Jeanette Tan 2ad39baa72 Treat bucket limit like sample limit and make it fail the whole scrape and return an error
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-04-22 03:25:07 +08:00
György Krajcsovits 071426f72f Add unit test for bucket limit appender
Refactors textparser test to use a common test utility to create
protobuf representation from MetricFamily

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2023-04-22 03:14:19 +08:00
Jeanette Tan 4d21ac23e6 Implement bucket limit for native histograms
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-04-22 03:14:19 +08:00
Matthieu MOREL bae9a21200
Merge branch 'main' into linter/nilerr
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-19 19:56:39 +02:00
beorn7 5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `if else` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforemention wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
beorn7 c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golint-ci
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
Matthieu MOREL fb3eb21230 enable gocritic, unconvert and unused linters
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-13 19:20:22 +00:00
Bryan Boreham b987afa7ef labels: simplify call to get Labels from Builder
It took a `Labels` where the memory could be re-used, but in practice
this hardly ever benefitted. Especially after converting `relabel.Process`
to `relabel.ProcessBuilder`.

Comparing the parameter to `nil` was a bug; `EmptyLabels` is not `nil`
so the slice was reallocated multiple times by `append`.

Lastly `Builder.Labels()` now estimates that the final size will depend
on labels added and deleted.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-22 17:05:20 +00:00
Bryan Boreham 0c09c3feb0 scrape sync: avoid copy of labels for dropped targets
Since the Target object was just created in this function, nobody else
has a reference to it and there are no concerns about it being modified
concurrently so we don't need to copy the value.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-16 20:35:13 +00:00
Bryan Boreham 0dfa1e73f8 scrape: use LabelsRange instead of Labels, for performance
Includes a rewrite of `resolveConflictingExposedLabels` to use
`labels.Builder.Get`, which simplifies it considerably.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-16 20:35:13 +00:00
Bryan Boreham 2fde2fb37d scrape: add Target.LabelsRange
This allows users of a Target to iterate labels without allocating heap memory.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-16 20:35:13 +00:00
Bryan Boreham b96b89ef8b
Merge pull request #12048 from bboreham/faster-targets
Scraping targets are synced by creating the full set, then adding/removing any which have changed.
This PR speeds up the process of creating the full set.

I added a benchmark for `TargetsFromGroup`; it uses configuration from a typical Kubernetes SD.

The crux of the change is to do relabeling inside labels.Builder instead of converting to labels.Labels and back again for every rule. The change is broken into several commits for easier review.

This is a breaking change to `scrape.PopulateLabels()`, but `relabel.Process` is left as-is, with a new `relabel.ProcessBuilder` option.
2023-03-09 11:10:01 +00:00
Julien Pivotto 1fd59791e1 Update tests
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-08 16:32:39 +01:00
Julien Pivotto 0c56e5d014 Update our own dependencies, support proxy from env
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-08 12:00:17 +01:00
Bryan Boreham f4fd9b0d68 scrape: re-use memory in TargetsFromGroup
Common service discovery mechanisms such as Kubernetes can generate a
lot of target groups, so this function was allocating a lot of memory
which then immediately became garbage. Re-using the structures across
an entire Sync saves effort.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-07 17:21:37 +00:00
Bryan Boreham 5cfe759348 scrape: make TargetsFromGroup work with Builder not []Label
Save work converting to `Labels` then to `Builder`.
`PopulateLabels()` now takes as Builder as input.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-07 17:21:37 +00:00
Bryan Boreham c1dbc7b838 scrape: make PopulateLabels work with Builder not Labels
Save work converting to and fro.

Uses the recently-added relabel.ProcessBuilder variant.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-07 17:21:37 +00:00
Bryan Boreham 95fc032a61 scrape: add benchmark for TargetsFromGroup
`loadConfiguration` is made more general.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-03-07 09:46:19 +00:00
Julien Pivotto 599b70a05d Add include scrape configs
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-06 23:35:39 +01:00
Jimmie Han a13249a98f scrape: fix prometheus_target_scrape_pool_target_limit metric not set on creating scrape pool (#12001)
Signed-off-by: Jimmie Han <hanjinming@outlook.com>
2023-02-21 13:14:04 +08:00