prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-14 01:24:04 -08:00

Author	SHA1	Message	Date
Bryan Boreham	1e3fef6ab0	scraping: limit detail on dropped targets, to save memory (#12647) It's possible (quite common on Kubernetes) to have a service discovery return thousands of targets then drop most of them in relabel rules. The main place this data is used is to display in the web UI, where you don't want thousands of lines of display. The new limit is `keep_dropped_targets`, which defaults to 0 for backwards-compatibility. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-08-14 15:39:25 +01:00
Bryan Boreham	87cbd26f6b	Merge pull request #12598 from bboreham/labels-json Faster streaming of Labels to JSON, via jsoniter.	2023-08-02 09:53:19 +01:00
Goutham Veeramachaneni	ad4f514e66	Add OTLP Ingestion endpoint (#12571) * Add OTLP Ingestion endpoint We copy files from the otel-collector-contrib. See the README in `storage/remote/otlptranslator/README.md`. This supersedes: https://github.com/prometheus/prometheus/pull/11965 Signed-off-by: gouthamve <gouthamve@gmail.com> * Return a 200 OK It is what the OTEL Golang SDK expects :( https://github.com/open-telemetry/opentelemetry-go/issues/4363 Signed-off-by: Goutham <gouthamve@gmail.com> --------- Signed-off-by: gouthamve <gouthamve@gmail.com> Signed-off-by: Goutham <gouthamve@gmail.com>	2023-07-28 12:35:28 +02:00
Bryan Boreham	dcadb32eb1	web/api: use stream encoder for embedded labels This is much more efficient. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-07-24 20:42:36 +01:00
Bryan Boreham	bb528d4a55	Add jsoniter encoder for Labels Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-07-24 20:13:34 +01:00
Bryan Boreham	54e1046616	web/api: extend BenchmarkRespond with more types of data Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-07-24 20:12:44 +01:00
Julien Pivotto	c572d9d6d9	Merge pull request #11905 from charleskorn/api-response-format-extension-point Add extension point for returning different content types from API endpoints	2023-07-15 22:49:29 +02:00
Marco Pracucci	7cc4292328	Export MinTime and MaxTime Signed-off-by: Marco Pracucci <marco@pracucci.com>	2023-07-06 17:48:13 +02:00
Julien Pivotto	0186ec7873	Merge pull request #12516 from vinted/convert_queryopts_to_interface promql: convert QueryOpts to interface	2023-07-04 23:38:31 +02:00
Julien Pivotto	986fde06b2	Merge pull request #11688 from damnever/fix/datamodelvalidation-remotewriteapi Validate the metric names and labels in the remote write handler	2023-07-04 13:52:02 +02:00
Charles Korn	097faf33c6	Merge branch 'main' into api-response-format-extension-point # Conflicts: # web/api/v1/api.go # web/api/v1/api_test.go	2023-07-04 13:26:13 +10:00
Giedrius Statkevičius	3f230fc9f8	promql: convert QueryOpts to interface Convert QueryOpts to an interface so that downstream projects like https://github.com/thanos-community/promql-engine could extend the query options with engine specific options that are not in the original engine. Will be used to enable query analysis per-query. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	2023-07-03 16:20:31 +03:00
Julien Pivotto	e043b273a6	Merge pull request #12439 from prometheus/release-2.45 Merge release 2.45.0 back to main	2023-06-17 10:16:48 +02:00
Arthur Silva Sens	1ea477f4bc	Add feature flag to squash metadata from /api/v1/metadata (#12391) Signed-off-by: ArthurSens <arthursens2005@gmail.com>	2023-06-12 16:17:20 +01:00
Jesus Vazquez	bfa466d00f	Create release candidate 2.45.0-rc.0 (#12435) Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>	2023-06-07 12:29:04 +02:00
Baskar Shanmugam	905a0bd63a	Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336) * Added 'topN' query parameter support to /api/v1/status/tsdb endpoint Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Updated query parameter for tsdb status to 'limit' Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Corrected Stats() parameter name from topN to limit Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Fixed p.Stats CI failure Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> --------- Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>	2023-05-22 14:37:07 +02:00
Vladimir Varankin	d281ebb178	web: display GOMEMLIMIT in runtime info Signed-off-by: Vladimir Varankin <vladimir@varank.in>	2023-04-23 20:24:34 +02:00
Julien Pivotto	8f1dc4a70f	Merge pull request #12248 from yeya24/consistent-response Use same error for instant and range query when 400	2023-04-21 11:44:20 +02:00
Julien Pivotto	e2512078e5	Merge pull request #12241 from mmorel-35/linter/nilerr enable gocritic, unconvert and unused linters	2023-04-20 15:13:31 +02:00
gotjosh	2f22c8b7f8	Merge pull request #12270 from prometheus/gotjosh/allow-filtering-of-rules-by-name-api Rules API: Allow filtering by rule name	2023-04-20 12:03:08 +01:00
gotjosh	e78be38cc0	don't show empty groups Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-04-20 11:20:20 +01:00
Matthieu MOREL	bae9a21200	Merge branch 'main' into linter/nilerr Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-19 19:56:39 +02:00
beorn7	5b53aa1108	style: Replace `else if` cascades with `switch` Wiser coders than myself have come to the conclusion that a `switch` statement is almost always superior to a statement that includes any `else if`. The exceptions that I have found in our codebase are just these two: * The `if else` is followed by an additional statement before the next condition (separated by a `;`). * The whole thing is within a `for` loop and `break` statements are used. In this case, using `switch` would require tagging the `for` loop, which probably tips the balance. Why are `switch` statements more readable? For one, fewer curly braces. But more importantly, the conditions all have the same alignment, so the whole thing follows the natural flow of going down a list of conditions. With `else if`, in contrast, all conditions but the first are "hidden" behind `} else if `, harder to spot and (for no good reason) presented differently from the first condition. I'm sure the aforementioned wise coders can list even more reasons. In any case, I like it so much that I have found myself recommending it in code reviews. I would like to make it a habit in our code base, without making it a hard requirement that we would test on the CI. But for that, there has to be a role model, so this commit eliminates all `if else` occurrences, unless it is autogenerated code or fits one of the exceptions above. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-19 17:22:31 +02:00
beorn7	c3c7d44d84	lint: Adjust to the lint warnings raised by current versions of golint-ci We haven't updated golint-ci in our CI yet, but this commit prepares for that. There are a lot of new warnings, and it is mostly because the "revive" linter got updated. I agree with most of the new warnings, mostly around not naming unused function parameters (although it is justified in some cases for documentation purposes – while things like mocks are a good example where not naming the parameter is clearer). I'm pretty upset about the "empty block" warning to include `for` loops. It's such a common pattern to do something in the head of the `for` loop and then have an empty block. There is still an open issue about this: https://github.com/mgechev/revive/issues/810 I have disabled "revive" altogether in files where empty blocks are used excessively, and I have made the effort to add individual `// nolint:revive` where empty blocks are used just once or twice. It's borderline noisy, though, but let's go with it for now. I should mention that none of the "empty block" warnings for `for` loop bodies were legitimate. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-19 17:10:10 +02:00
gotjosh	96b6463f25	review comments Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-04-18 16:26:32 +01:00
gotjosh	f3394bf7a1	Rules API: Allow filtering by rule name Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names. If all the rules of a group are filtered, we skip the group entirely. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-04-18 10:12:08 +01:00
Ben Ye	fd3630b9a3	add ctx to QueryEngine interface Signed-off-by: Ben Ye <benye@amazon.com>	2023-04-17 21:32:38 -07:00
Matthieu MOREL	fb3eb21230	enable gocritic, unconvert and unused linters Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-13 19:20:22 +00:00
beorn7	c0879d64cf	promql: Separate `Point` into `FPoint` and `HPoint` In other words: Instead of having a “polymorphous” `Point` that can either contain a float value or a histogram value, use an `FPoint` for floats and an `HPoint` for histograms. This seemingly small change has a _lot_ of repercussions throughout the codebase. The idea here is to avoid the increase in size of `Point` arrays that happened after native histograms had been added. The higher-level data structures (`Sample`, `Series`, etc.) are still “polymorphous”. The same idea could be applied to them, but at each step the trade-offs needed to be evaluated. The idea with this change is to do the minimum necessary to get back to pre-histogram performance for functions that do not touch histograms. Here are comparisons for the `changes` function. The test data doesn't include histograms yet. Ideally, there would be no change in the benchmark result at all. First runtime v2.39 compared to directly prior to this commit:
```
name                                                  old time/op  new time/op  delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16         391µs ± 2%   542µs ± 1%   +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16        452µs ± 2%   617µs ± 2%   +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16       1.12ms ± 1%  1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16      7.83ms ± 1%  8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16         2.98ms ± 0%  3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16        3.66ms ± 1%  4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16       10.5ms ± 0%  11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16      77.6ms ± 1%  87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16     30.4ms ± 2%  32.8ms ± 1%  +8.01%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16    37.1ms ± 2%  40.6ms ± 2%  +9.64%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16   105ms ± 1%   117ms ± 1%   +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16  783ms ± 3%   876ms ± 1%   +11.83%  (p=0.000 n=9+10)
```
And then runtime v2.39 compared to after this commit:
```
name                                                  old time/op  new time/op  delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16         391µs ± 2%   547µs ± 1%   +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16        452µs ± 2%   616µs ± 2%   +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16       1.12ms ± 1%  1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16      7.83ms ± 1%  7.95ms ± 1%  +1.59%   (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16         2.98ms ± 0%  3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16        3.66ms ± 1%  4.02ms ± 1%  +9.80%   (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16       10.5ms ± 0%  10.8ms ± 1%  +3.08%   (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16      77.6ms ± 1%  78.1ms ± 1%  +0.58%   (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16     30.4ms ± 2%  33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16    37.1ms ± 2%  40.0ms ± 1%  +7.98%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16   105ms ± 1%   107ms ± 1%   +1.92%   (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16  783ms ± 3%   775ms ± 1%   -1.02%   (p=0.019 n=9+9)
```
In summary, the runtime doesn't really improve with this change for queries with just a few steps. For queries with many steps, this commit essentially reinstates the old performance. This is good because the many-step queries are the ones that matter most (longest absolute runtime). In terms of allocations, though, this commit doesn't make a dent at all (numbers not shown). The reason is that most of the allocations happen in the sampleRingIterator (in the storage package), which has to be addressed in a separate commit. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-13 19:25:16 +02:00
Ben Ye	fb67d368a2	use consistent error for instant and range query 400 Signed-off-by: Ben Ye <benye@amazon.com>	2023-04-11 13:45:34 -07:00
Xiaochao Dong (@damnever)	2b7202c4cc	Validate the metric names and labels in the remote write handler Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2023-04-05 19:09:05 +08:00
pbudner	46683eadf7	fix: advertise correct flag to enable remote write receiver Signed-off-by: pbudner <mail@pascalbudner.de>	2023-03-11 13:50:52 +01:00
Charles Korn	38c1930f48	Merge branch 'main' into api-response-format-extension-point Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-03-09 12:06:26 +11:00
Charles Korn	46a28899a0	Implement fully-featured content negotiation for API requests, and allow overriding the default API codec. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-03-09 12:02:45 +11:00
Julien Pivotto	db2d759b81	Add support for lookbackdelta per query via the API Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2023-03-08 00:30:05 +01:00
Charles Korn	eaad7c0fc8	Merge branch 'main' into api-response-format-extension-point Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com>	2023-02-15 14:18:23 +01:00
Charles Korn	deba5120ea	Address PR feedback: reduce log level. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-11 15:34:25 +01:00
Fish-pro	43d77f7c41	Use http constants instead of string Signed-off-by: Fish-pro <zechun.chen@daocloud.io>	2023-02-10 10:21:05 +08:00
Charles Korn	857b23873f	Expose QueryData so that implementations of Codec.CanEncode() can perform a type assertion against Response.Data. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-02 15:30:56 +11:00
Charles Korn	a0dd1468be	Move custom jsoniter code into json_codec.go. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-02 13:10:20 +11:00
Charles Korn	3e94dd8c8f	Add extension point for returning different content types from API endpoints Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-02 13:10:19 +11:00
Marco Pracucci	3db77b4491	API: change HTTP status code tracked in metrics from 503/422 to 499 if a request is canceled Signed-off-by: Marco Pracucci <marco@pracucci.com>	2023-01-26 13:06:37 +01:00
Julien Pivotto	2c408289f8	Add stabilizing to UI Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2023-01-19 11:33:54 +01:00
Julien Pivotto	ce55e5074d	Add 'keep_firing_for' field to alerting rules This commit adds a new 'keep_firing_for' field to Prometheus alerting rules. The 'keep_firing_for' field specifies the minimum amount of time that an alert should remain firing, even if the expression does not return any results. This feature was discussed at a previous dev summit, and it was determined that a feature like this would be useful in order to allow the expression time to stabilize and prevent confusing resolved messages from being propagated through Alertmanager. This approach is simpler than having two PromQL queries, as was sometimes discussed, and it should be easy to implement. This commit does not include tests for the 'keep_firing_for' field. This is intentional, as the purpose of this commit is to gather comments on the proposed design of the 'keep_firing_for' field before implementing tests. Once the design of the 'keep_firing_for' field has been finalized, a follow-up commit will be submitted with tests. See https://github.com/prometheus/prometheus/issues/11570 Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2023-01-13 12:11:39 +01:00
Łukasz Mierzwa	e1b7082008	Show individual scrape pools on /targets page (#11142) * Add API endpoints for getting scrape pool names This adds api/v1/scrape_pools endpoint that returns the list of names of all the scrape pools configured. Having it makes it possible to find out what scrape pools are defined without having to list and parse all targets. The second change is adding scrapePool query parameter support in api/v1/targets endpoint, which allows filtering the returned targets down to those for the passed scrape pool name. Both changes make it possible to query for a specific scrape pool's data, rather than getting all the targets for all possible scrape pools. The problem with api/v1/targets endpoint is that it returns a huge amount of data if you configure a lot of scrape pools. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com> * Add a scrape pool selector on /targets page Current targets page lists all possible targets. This works great if you only have a few scrape pools configured, but for systems with a lot of scrape pools and targets this slows things down a lot. Not only does the /targets page load very slowly in such cases (waiting for a huge API response) but it also takes a long time to render, due to the huge number of elements. This change adds a dropdown selector so it's possible to select only the interesting scrape pool to view. There's also a scrapePool query param that will open the selected pool automatically. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com> Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-12-23 11:55:08 +01:00
Bryan Boreham	fd57569683	Update package web tests for new labels.Labels type Use `FromStrings` instead of assuming the data structure. And don't sort individual labels, since `labels.Labels` are always sorted. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-12-19 15:22:09 +00:00
Ganesh Vernekar	e3719d670b	Merge remote-tracking branch 'upstream/main' into sparsehistogram Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-25 14:38:56 -04:00
Alan Protasio	5ac12ac351	api: Wrapped promQL based API errors with `returnAPIError` function (#11356) * wrap api error on get series/labels on `returnAPIError` function Signed-off-by: Alan Protasio <approtas@amazon.com> * lint Signed-off-by: Alan Protasio <approtas@amazon.com> * query exemplars Signed-off-by: Alan Protasio <approtas@amazon.com> Signed-off-by: Alan Protasio <approtas@amazon.com>	2022-10-20 11:17:00 +02:00
Jesus Vazquez	e934d0f011	Merge 'main' into sparsehistogram Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>	2022-10-05 22:14:49 +02:00
Ganesh Vernekar	f024d769e7	Add API test for histogram Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-03 18:53:57 +05:30
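
Commit 5b53aa1108 in the list above argues that a `switch` reads better than an `else if` cascade because all conditions line up at the same indentation. A minimal sketch of that rewrite follows. The function names and the status-code example are hypothetical and are not taken from the Prometheus codebase; only the before/after shape is the point.

```go
package main

import "fmt"

// classifyElseIf uses the cascade style the commit moves away from:
// every condition after the first sits behind "} else if".
// (Hypothetical example function, not from the Prometheus codebase.)
func classifyElseIf(code int) string {
	var class string
	if code >= 500 {
		class = "server error"
	} else if code >= 400 {
		class = "client error"
	} else if code >= 300 {
		class = "redirect"
	} else {
		class = "ok"
	}
	return class
}

// classifySwitch is the same logic as a tagless switch:
// cases are evaluated top to bottom and the first match wins.
func classifySwitch(code int) string {
	var class string
	switch {
	case code >= 500:
		class = "server error"
	case code >= 400:
		class = "client error"
	case code >= 300:
		class = "redirect"
	default:
		class = "ok"
	}
	return class
}

func main() {
	for _, code := range []int{200, 301, 404, 503} {
		fmt.Println(code, classifyElseIf(code), classifySwitch(code))
	}
}
```

Both functions return the same result for every input; the tagless `switch` relies on Go evaluating cases in order, so the ordering of the conditions carries over unchanged from the cascade.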

333 commits