prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-12-28 15:09:39 -08:00

Author	SHA1	Message	Date
beorn7	7ee1836ef5	Merge branch 'main' into sparsehistogram	2022-04-05 18:31:19 +02:00
Robert Fratto	44a5e705be	discovery: Expose custom HTTP client options to discoverers (#10462) * discovery: expose HTTP client options to discoverers Signed-off-by: Robert Fratto <robertfratto@gmail.com> * discovery/http: use HTTP client options for created client Signed-off-by: Robert Fratto <robertfratto@gmail.com> * scrape: use a list of HTTP client options instead of just dial context Signed-off-by: Robert Fratto <robertfratto@gmail.com> * discovery: rephrase comment Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2022-03-24 18:16:59 -04:00
Goutham Veeramachaneni	4d8bbfd416	Add target to context (#10473) Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	2022-03-24 16:53:04 +01:00
beorn7	4210aac74a	Merge branch 'main' into sparsehistogram	2022-03-22 14:47:42 +01:00
Alvin Lin	8b5eb562b1	Re-generate test cert to fix test_windows test failures Signed-off-by: Alvin Lin <alvinlin@amazon.com>	2022-03-17 19:37:18 +01:00
Goutham Veeramachaneni	c4f8020dca	Embed MetadaStore in scrape context (#10450) This will allow downstream users to easily access metadata required. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>	2022-03-16 09:45:15 +01:00
Robert Fratto	f0ec619eec	scrape: allow providing a custom Dialer for scraping (#10415) * scrape: allow providing a custom Dialer for scraping This commit extends config.ScrapeConfig with an optional field to override how HTTP connections to targets are created. This field is not set directly in Prometheus, and is only added for the convenience of downstream importers. Closes #9706 Signed-off-by: Robert Fratto <robertfratto@gmail.com> * scrape: move custom dial function to scrape.Options Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2022-03-09 00:48:47 +01:00
Jayapriya Pai	edfe657b54	scrape: Fix label_limits cache usage (#10370) Fixes #10344 Signed-off-by: Jayapriya Pai <janantha@redhat.com>	2022-03-03 18:37:53 +01:00
Julien Pivotto	f695df843f	Improve content-type error handling - Call err everywhere - Change log message to underscore-separated field Followup on #10186 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2022-02-08 11:02:51 +01:00
Matheus Pimenta	8d8ce641a4	error for invalid media type should not be completely swallowed (#10186) * error for invalid media type should not be completely swallowed Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>	2022-02-08 10:57:56 +01:00
Jonatan Ivanov	b6df3b6f67	Prefer 1.0.0 in the accept header for application/openmetrics-text (#9431) related: https://github.com/prometheus/client_java/issues/702 fixes gh-9430 Signed-off-by: Jonatan Ivanov <jonatan.ivanov@gmail.com>	2022-01-28 00:37:51 +01:00
beorn7	86cc83b13c	storage: iterator fixes after merge Signed-off-by: beorn7 <beorn@grafana.com>	2021-12-18 14:12:01 +01:00
beorn7	64c7bd2b08	Merge branch 'main' into sparsehistogram	2021-12-18 14:04:25 +01:00
Julius Volz	fa552b98bb	Merge pull request #9996 from roidelapluie/fixreportlimit Fix reporting metrics when sample limit is reached during the report	2021-12-17 13:17:07 +01:00
Julien Pivotto	67a64ee092	Remove check against cfg so interval/ timeout are always set (#10023) (#10031) Signed-off-by: Nicholas Blott <blottn@tcd.ie> Co-authored-by: Nicholas Blott <blottn@tcd.ie>	2021-12-16 16:46:14 +01:00
Julien Pivotto	e94a0b28e1	Append reporting metrics without limit If reporting metrics fails due to reaching the limit, this makes the target appear as UP in the UI, but the metrics are missing. This commit bypasses that limit for report metrics. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-12-16 13:26:53 +01:00
Björn Rabenstein	7e42acd3b1	tsdb: Rework iterators (#9877) - Pick At... method via return value of Next/Seek. - Do not clobber returned buckets. - Add partial FloatHistogram suppert. Note that the promql package is now _only_ dealing with FloatHistograms, following the idea that PromQL only knows float values. As a byproduct, I have removed the histogramSeries metric. In my understanding, series can have both float and histogram samples, so that metric doesn't make sense anymore. As another byproduct, I have converged the sampleBuf and the histogramSampleBuf in memSeries into one. The sample type stored in the sampleBuf has been extended to also contain histograms even before this commit. Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-29 13:24:23 +05:30
beorn7	5d4db805ac	Merge branch 'main' into sparsehistogram	2021-11-17 19:57:31 +01:00
beorn7	4c28d9fac7	Move to histogram.Histogram pointers This is to avoid copying the many fields of a histogram.Histogram all the time. This also fixes a bunch of formerly broken tests. Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-12 23:17:35 +01:00
beorn7	c954cd9d1d	Move packages out of deprecated pkg directory This creates a new `model` directory and moves all data-model related packages over there: exemplar labels relabel rulefmt textparse timestamp value All the others are more or less utilities and have been moved to `util`: gate logging modetimevfs pool runtime Signed-off-by: beorn7 <beorn@grafana.com>	2021-11-09 08:03:10 +01:00
Dieter Plaetinck	cda025b5b5	TSDB: demistify SeriesRefs and ChunkRefs (#9536) * TSDB: demistify seriesRefs and ChunkRefs The TSDB package contains many types of series and chunk references, all shrouded in uint types. Often the same uint value may actually mean one of different types, in non-obvious ways. This PR aims to clarify the code and help navigating to relevant docs, usage, etc much quicker. Concretely: * Use appropriately named types and document their semantics and relations. * Make multiplexing and demuxing of types explicit (on the boundaries between concrete implementations and generic interfaces). * Casting between different types should be free. None of the changes should have any impact on how the code runs. TODO: Implement BlockSeriesRef where appropriate (for a future PR) Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * feedback Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * agent: demistify seriesRefs and ChunkRefs Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-11-06 15:40:04 +05:30
Mateusz Gozdek	1a6c2283a3	Format Go source files using 'gofumpt -w -s -extra' Part of #9557 Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Darshan Chaudhary	a7e554b158	add check service-discovery command (#8970) Signed-off-by: darshanime <deathbullet@gmail.com>	2021-11-01 14:42:12 +01:00
DrAuYueng	69e309d202	Expose TargetsFromGroup/AlertmanagerFromGroup func and reuse this for (#9343) static/file sd config check in promtool Signed-off-by: DrAuYueng <ouyang1204@gmail.com>	2021-10-28 02:01:28 +02:00
Furkan Türkal	a6e6011d55	Add scrape_body_size_bytes metric (#9569) Fixes #9520 Signed-off-by: Furkan <furkan.turkal@trendyol.com>	2021-10-24 23:45:31 +02:00
Levi Harrison	5d409b0637	Remove `interval` and `timeout` parameters (#9578)	2021-10-24 10:38:21 -04:00
Julien Pivotto	b0c98e01c8	Include scrape labels in the hash (#9551) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-10-20 23:44:45 +02:00
beorn7	a9008f5423	Merge branch 'main' into sparsehistogram	2021-10-19 17:14:23 +02:00
beorn7	b8d953a5a0	scrape: Avoid creating a label map during conflict resolution This also avoids the recursive function call. I think it is quite readable. And much less code. Signed-off-by: beorn7 <beorn@grafana.com>	2021-10-15 21:56:48 +02:00
Shirley Leu	c890ea407f	Resolve conflicts between multiple exported label prefixes (#9479) Resolve conflicts between multiple exported label prefixes Signed-off-by: Shirley Leu <shirley.w.leu@gmail.com>	2021-10-15 20:31:03 +02:00
beorn7	7a8bb8222c	Style cleanup of all the changes in sparsehistogram so far A lot of this code was hacked together, literally during a hackathon. This commit intends not to change the code substantially, but just make the code obey the usual style practices. A (possibly incomplete) list of areas: * Generally address linter warnings. * The `pgk` directory is deprecated as per dev-summit. No new packages should be added to it. I moved the new `pkg/histogram` package to `model` anticipating what's proposed in #9478. * Make the naming of the Sparse Histogram more consistent. Including abbreviations, there were just too many names for it: SparseHistogram, Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in general. Only add "Sparse" if it is needed to avoid confusion with conventional Histograms (which is rare because the TSDB really has no notion of conventional Histograms). Use abbreviations only in local scope, and then really abbreviate (not just removing three out of seven letters like in "Histo"). This is in the spirit of https://github.com/golang/go/wiki/CodeReviewComments#variable-names * Several other minor name changes. * A lot of formatting of doc comments. For one, following https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences , but also layout question, anticipating how things will look like when rendered by `godoc` (even where `godoc` doesn't render them right now because they are for unexported types or not a doc comment at all but just a normal code comment - consistency is queen!). * Re-enabled `TestQueryLog` and `TestEndopints` (they pass now, leaving them disabled was presumably an oversight). * Bucket iterator for histogram.Histogram is now created with a method. * HistogramChunk.iterator now allows iterator recycling. (I think @dieterbe only commented it out because he was confused by the question in the comment.) * HistogramAppender.Append panics now because we decided to treat staleness marker differently. Signed-off-by: beorn7 <beorn@grafana.com>	2021-10-11 13:02:03 +02:00
beorn7	fd5ea4e0b5	Merge branch 'main' into sparsehistogram	2021-10-07 23:16:42 +02:00
Julien Pivotto	63b3e4e5ec	Enable HTTP2 again (#9398) We are re-enabling HTTP 2 again. There has been a few bugfixes upstream in go, and we have also enabled ReadIdleTimeout. Fix #7588 Fix #9068 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-26 23:16:12 +02:00
Robert Fratto	daf2887fd4	expose scrape.userAgentHeader like remote.UserAgent Signed-off-by: Robert Fratto <robertfratto@gmail.com>	2021-09-13 14:10:34 -04:00
Julien Pivotto	48a101be1b	Allow to tune the scrape tolerance (#9283) * Allow to tune the scrape tolerance In most of the classic monitoring use cases, a few milliseconds difference can be omitted. In Prometheus, a few millisecond difference can however make a big difference. Currently, Prometheus will ignore up to 2 ms difference in the alignments. It turns out that for users who can afford a 10ms difference, there is a lot of resources and disk space to win, as shown in this graph, which shows the bytes / samples over a production Prometheus server. You can clearly see the switch from 2ms to 10ms tolerance. This pull request enables the adjustment of the scrape timestamp alignment tolerance. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Fix golint Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-08 17:27:33 +05:30
Bryan Boreham	92a3eeac55	Create less garbage when parsing metrics (#9299) * Refactor: extract function to make scrapeLoop for testing Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Add benchmarks for ScrapeLoopAppend For Prometheus and OpenMetrics Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Create less garbage when parsing metrics Exemplar escapes to heap due to being passed through text-parser interface, but we can reduce the impact by hoisting it out of the loop and resetting it after every use. (Note the cost was paid on every line even when exemplars were disabled) Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Create less garbage when parsing OpenMetrics After calling parseLVals() we always append the return value, so pass in what we want to append it to and save garbage. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2021-09-08 13:39:21 +05:30
Łukasz Mierzwa	f0a26266c0	Add scrape_sample_limit metric This adds a new metric exposing per target scrape sample_limit value. Metrics are only exposed if extra-scrape-metrics feature flag is enabled. scrape_sample_limit will make it easy to monitor and alert on targets getting close to configured sample_limit, which is important given than exceeding sample_limit results in the entire scrape results being rejected. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2021-09-03 15:42:41 +01:00
SuperQ	31f4108758	Add scrape_timeout_seconds metric Add a new built-in metric `scrape_timeout_seconds` to allow monitoring of the ratio of scrape duration to the scrape timeout. Hide behind a feature flag to avoid additional cardinality by default. Signed-off-by: SuperQ <superq@gmail.com>	2021-09-02 12:15:35 +02:00
Levi Harrison	70f597b033	Configure Scrape Interval and Timeout Via Relabeling (#8911) * Configure scrape interval and timeout with labels Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-08-31 17:37:32 +02:00
Ganesh Vernekar	8b70e87ab9	Merge remote-tracking branch 'upstream/main' into sparse-refactor Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-05 12:16:08 +05:30
Arunprasad Rajkumar	5527e26efc	scrape: fix 'target_limit exceeded error' when reloading conf with 0 Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>	2021-07-27 17:34:22 +05:30
austin ce	5bdfba1d20	Extract and export GetFQDN() Signed-off-by: austin ce <austin.cawley@gmail.com>	2021-07-21 12:55:02 -04:00
Naka Masato	a1c1313b3c	fix typo in comment for scrape manager (#9094) Signed-off-by: Masato Naka <masatonaka1989@gmail.com>	2021-07-19 15:55:13 +05:30
beorn7	5de2df752f	Hacky implementation of protobuf parsing This "brings back" protobuf parsing, with the only goal to play with the new sparse histograms. The Prom-2.x style parser is highly adapted to the structure of the Prometheus text format (and later OpenMetrics). Some jumping through hoops is required to feed protobuf into it. This is not meant to be a model for the final implementation. It should just enable sparse histogram ingestion at a reasonable efficiency. Following known shortcomings and flaws: - No tests yet. - Summaries and legacy histograms, i.e. without sparse buckets, are ignored. - Staleness doesn't work (but this could be fixed in the appender, to be discussed). - No tricks have been tried that would be similar to the tricks the text parsers do (like direct pointers into the HTTP response body). That makes things weird here. Tricky optimizations only make sense once the final format is specified, which will almost certainly not be the old protobuf format. (Interestingly, I expect this implementation to be in fact much more efficient than the original protobuf ingestion in Prom-1.x.) - This is using a proto3 version of metrics.proto (mostly to be consistent with the other protobuf uses). However, proto3 sees no difference between an unset field. We depend on that to distinguish between an unset timestamp and the timestamp 0 (1970-01-01, 00:00:00 UTC). In this experimental code, we just assume that timestamp is never specified and therefore a timestamp of 0 always is interpreted as "not set". Signed-off-by: beorn7 <beorn@grafana.com>	2021-07-01 01:35:11 +02:00
Ganesh Vernekar	04ad56d9b8	Append sparse histograms into the Head block (#9013) * Append sparse histograms into the Head block Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add AtHistogram() to Iterator interface. Make HistoChunk conform to Chunk interface. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-29 20:08:46 +05:30
Ganesh Vernekar	64bea6999e	HistogramAppender interface for sparse histograms (#9007) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-28 20:30:55 +05:30
Julius Volz	9d495afd2c	Remove trailing zeros in scrape timeout header See https://twitter.com/AviKivity/status/1405147699557638145 and https://twitter.com/juliusvolz/status/1405790211670515712 Signed-off-by: Julius Volz <julius.volz@gmail.com>	2021-06-18 09:38:12 +02:00
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-11 12:28:36 -04:00
hanjm	1df05bfd49	Add body_size_limit to prevent bad targets response large body cause Prometheus server OOM (#8827) Signed-off-by: hanjm <hanjinming@outlook.com>	2021-05-29 07:05:42 +08:00
Levi Harrison	2826fbeeb7	SD: Add target creation failure counter and change failure handling (#8786) * Added metric and changed failure/drop strategy Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-05-28 23:50:59 +02:00


182 commits