prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-12-28 06:59:40 -08:00

Author	SHA1	Message	Date
Björn Rabenstein	5fab430e73	Merge pull request #1774 from prometheus/beorn7/index storage: improve index lookups	2016-07-20 17:38:04 +02:00
beorn7	fc6737b7fb	storage: improve index lookups tl;dr: This is not a fundamental solution to the indexing problem (like tindex is) but it at least avoids utilizing the intersection problem to the greatest possible amount. In more detail: Imagine the following query: nicely:aggregating:rule{job="foo",env="prod"} While it uses a nicely aggregating recording rule (which might have a very low cardinality), Prometheus still intersects the low number of fingerprints for `{__name__="nicely:aggregating:rule"}` with the many thousands of fingerprints matching `{job="foo"}` and with the millions of fingerprints matching `{env="prod"}`. This totally innocuous query is dead slow if the Prometheus server has a lot of time series with the `{env="prod"}` label. Ironically, if you make the query more complicated, it becomes blazingly fast: nicely:aggregating:rule{job=~"foo",env=~"prod"} Why so? Because Prometheus only intersects with non-Equal matchers if there are no Equal matchers. That's good in this case because it retrieves the few fingerprints for `{__name__="nicely:aggregating:rule"}` and then goes straight on to retrieve the metrics for those FPs and check individually whether they match the other matchers. This change generalizes the idea of when to stop intersecting FPs and go into "retrieve metrics and check them individually against remaining matchers" mode: - First, sort all matchers by "expected cardinality". Matchers matching the empty string are always worst (and never used for intersections). Equal matchers are in general considered best, but by using some crude heuristics, we declare some better than others (instance labels or anything that looks like a recording rule). - Then go through the matchers until we hit a threshold of remaining FPs in the intersection. This threshold is higher if we are already in the non-Equal matcher area as intersection is even more expensive here.
- Once the threshold has been reached (or we have run out of matchers that do not match the empty string), start with "retrieve metrics and check them individually against remaining matchers". A beefy server at SoundCloud was spending 67% of its CPU time in index lookups (fingerprintsForLabelPairs), serving mostly a dashboard that is exclusively built with recording rules. With this change, it spends only 35% in fingerprintsForLabelPairs. The CPU usage dropped from 26 cores to 18 cores. The median latency for query_range dropped from 14s to 50ms(!). As expected, higher percentile latency didn't improve that much because the new approach is _occasionally_ running into the worst case while the old one was _systematically_ doing so. The 99th percentile latency is now about as high as the median before (14s) while it was almost twice as high before (26s).	2016-07-20 17:35:53 +02:00
Brian Brazil	40f8da699e	Merge pull request #1815 from prometheus/stddev Add stddev_over_time and stdvar_over_time.	2016-07-19 15:48:32 +01:00
Brian Brazil	9e58070c04	Merge pull request #1820 from prometheus/console-api Update example console templates to new HTTP API.	2016-07-18 21:59:21 +01:00
Brian Brazil	d458ecd4b9	Update example console templates to new HTTP API. Fixes #1819	2016-07-18 20:36:47 +01:00
Fabian Reinartz	42a3cb6172	Merge branch 'release-1.0'	2016-07-19 00:51:32 +09:00
Fabian Reinartz	e2bb136f4e	Merge pull request #1818 from prometheus/fabxc-1.0.0 *: cut 1.0.0	2016-07-18 23:19:29 +09:00
Fabian Reinartz	e867944172	*: cut 1.0.0	2016-07-18 22:38:51 +09:00
Brian Brazil	6eb1d5e63c	Merge pull request #1816 from prometheus/fabxc-k8sfix config: validate Kubernetes role correctly.	2016-07-18 14:29:10 +01:00
Fabian Reinartz	7a0b3af0b7	config: validate Kubernetes role correctly.	2016-07-18 22:24:41 +09:00
Brian Brazil	1edd6875f5	Add stddev_over_time and stdvar_over_time.	2016-07-16 00:34:44 +01:00
Fabian Reinartz	0938661db9	Merge pull request #1804 from pydima/master web: return status code and error message for config resource	2016-07-15 18:26:19 +09:00
Dmitry Vorobev	273e457da4	web: return status code and error message for config resource	2016-07-15 10:15:24 +02:00
Fabian Reinartz	4d0c697548	circle: add tag v-prefix	2016-07-14 11:46:48 +09:00
Fabian Reinartz	a6c81f32bc	Merge branch 'release-1.0' of github.com:prometheus/prometheus into release-1.0	2016-07-14 10:44:02 +09:00
Fabian Reinartz	675b0184af	Merge pull request #1812 from prometheus/fabxc-1.0.0-rc.0 Release 1.0.0-rc.0	2016-07-14 10:43:41 +09:00
Fabian Reinartz	1c4b3ab0e2	*: update changelog for version 1.0.0-rc.0	2016-07-14 10:04:40 +09:00
Fabian Reinartz	e3f4df75a8	Merge pull request #1807 from prometheus/am-label Expand alert templates at eval time.	2016-07-14 10:04:09 +09:00
Fabian Reinartz	ca7ab62f40	*: bump version to 1.0.0-rc.0	2016-07-14 09:55:00 +09:00
Fabian Reinartz	919558f601	config: remove deprecated `target_groups` configuration	2016-07-14 09:55:00 +09:00
Fabian Reinartz	9c3129746c	Merge pull request #1807 from prometheus/am-label Expand alert templates at eval time.	2016-07-13 17:01:42 +02:00
Björn Rabenstein	0622304244	Merge pull request #1798 from prometheus/beorn7/storage2 Crash recovery: Fix an edge case.	2016-07-13 16:53:18 +02:00
Brian Brazil	0509b0f2db	Expand alert templates at eval time. Fixes #1678 #1677	2016-07-12 17:13:55 +01:00
Fabian Reinartz	e87d604f94	Merge pull request #1791 from prometheus/fabxc-routepref web: add -web.route-prefix flag	2016-07-10 12:05:39 +02:00
Fabian Reinartz	f8bb0ee91f	Merge pull request #1793 from prometheus/count_values Add count_values() aggregator.	2016-07-08 11:50:42 +02:00
Fabian Reinartz	b4660a550c	Merge pull request #1797 from prometheus/beorn7/storage Consistently use the `Seconds()` method for conversion of durations	2016-07-07 17:23:06 +02:00
beorn7	2a75b15328	Crash recovery: Fix an edge case. If the chunks of a series in the checkpoint are all older than the latest chunk on disk, the head chunk is persisted and therefore has to be declared closed. It would be great to have a test for this, but that would require more plumbing, subject of #447.	2016-07-07 16:17:38 +02:00
beorn7	064b57858e	Consistently use the `Seconds()` method for conversion of durations This also fixes one remaining case of recording integral numbers of seconds only for a metric, i.e. this will probably fix #1796.	2016-07-07 15:24:35 +02:00
Fabian Reinartz	59d26e8536	web: add -web.route-prefix flag Fixes #1191	2016-07-07 11:49:16 +02:00
Fabian Reinartz	b16f49bb44	Merge pull request #1795 from prometheus/keeping_extra Clean out old keywords	2016-07-07 09:08:37 +02:00
Brian Brazil	875818d060	Clean out old keywords	2016-07-07 05:30:48 +01:00
Brian Brazil	16690736ab	Add count_values() aggregator. This is useful for counting how many instances of a job are running a particular version/build. Fixes #622	2016-07-05 17:14:01 +01:00
Fabian Reinartz	6f19e418e1	Merge pull request #1781 from prometheus/fabxc-k8s-sd Select Kubernetes SD type in configuration	2016-07-05 14:29:46 +02:00
Fabian Reinartz	4591a2623b	discovery/kubernetes: filter pod/container, service/endpoint This change distinguishes and filters by pod/container and service/endpoint in the respective sub-SDs.	2016-07-05 14:24:17 +02:00
Fabian Reinartz	0ff354341b	discovery/kubernetes: remove unused channel	2016-07-05 14:22:12 +02:00
Fabian Reinartz	7221228843	discovery/kubernetes: select between discovery role This adds `role` field to the Kubernetes SD config, which indicates which type of Kubernetes SD should be run. This no longer allows discovering pods and nodes with the same SD configuration for example.	2016-07-05 14:22:12 +02:00
Fabian Reinartz	abdf3536e4	Merge pull request #1788 from prometheus/topk Make topk/bottomk aggregators.	2016-07-05 11:32:17 +02:00
Fabian Reinartz	e0f8caacd7	discovery/kubernetes: extract service endpoint discovery This extracts discovery of services and their endpoints into its own type.	2016-07-05 10:26:23 +02:00
Brian Brazil	7f23a4a099	Add type check on topk/bottomk parameter.	2016-07-04 18:03:05 +01:00
Brian Brazil	fa9cc15573	Add topk/bottomk tests for multiple buckets.	2016-07-04 13:18:28 +01:00
Brian Brazil	3b0c182eee	Move topk/bottomk unittests over to aggregators.	2016-07-04 13:18:28 +01:00
Brian Brazil	3e5136e36d	Make topk/bottomk aggregators.	2016-07-04 13:18:19 +01:00
Fabian Reinartz	3c1e15087d	Merge pull request #1785 from prometheus/fabxc-vendor Update vendoring	2016-07-04 13:21:50 +02:00
Fabian Reinartz	f26823afa7	Merge pull request #1787 from prometheus/fabxc-gitignore gitignore: clean up	2016-07-04 11:47:44 +02:00
Fabian Reinartz	746d330a23	gitignore: clean up This removes several outdated or unnecessary ignore patterns. Especially those that match random words such as 'local' or 'core', which repeatedly caused weird behavior that's hard to debug, e.g. invisible vendored files.	2016-07-04 11:34:33 +02:00
Fabian Reinartz	7d441abd7b	vendor: update prometheus org dependencies	2016-07-04 11:09:06 +02:00
Fabian Reinartz	7700cff1ff	vendor: update golang.org/x/sys	2016-07-04 11:07:02 +02:00
Fabian Reinartz	e4e8479716	vendor: add missing license/patent notices	2016-07-04 11:06:26 +02:00
Fabian Reinartz	bc506ce959	vendor: update goleveldb dependencies	2016-07-04 10:08:49 +02:00
Fabian Reinartz	f4398d5bdf	Merge pull request #1782 from prometheus/fabxc-testflags cmd/prometheus: use own flag set	2016-07-04 09:27:10 +02:00
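
The lookup strategy described in commit fc6737b7fb (sort matchers by expected cardinality, intersect until few fingerprints remain, then check the remaining matchers per series) can be sketched roughly as follows. This is an illustrative sketch only, not the code from that commit: the types `Matcher` and `Index`, the `cost` ranking values, and the single `threshold` parameter are all assumptions made for the example. In particular, the commit message describes a higher threshold for non-Equal matchers, which this sketch simplifies to one value, and a real index would use inverted label-pair lists rather than a scan.

```go
package main

import (
	"regexp"
	"sort"
	"strings"
)

type Fingerprint uint64
type Metric map[string]string

// Matcher is a simplified, hypothetical stand-in for a label matcher.
type Matcher struct {
	Name, Value string
	Regex       bool // false: name="value"; true: name=~"value"
}

// Matches reports whether label value v satisfies the matcher.
func (m Matcher) Matches(v string) bool {
	if !m.Regex {
		return v == m.Value
	}
	ok, err := regexp.MatchString("^(?:"+m.Value+")$", v)
	return err == nil && ok
}

// cost is a crude guess at expected cardinality; lower is better.
// The concrete numbers are assumptions for this sketch.
func cost(m Matcher) int {
	switch {
	case m.Matches(""):
		return 4 // matches the empty string: never used for intersections
	case m.Regex:
		return 3
	case m.Name == "instance", m.Name == "__name__" && strings.Contains(m.Value, ":"):
		return 1 // instance labels, or names that look like recording rules
	default:
		return 2
	}
}

// Index is a toy in-memory series store (hypothetical).
type Index struct {
	Series map[Fingerprint]Metric
}

// lookup returns all fingerprints whose label m.Name satisfies m.
// It scans every series for brevity.
func (ix Index) lookup(m Matcher) map[Fingerprint]bool {
	out := map[Fingerprint]bool{}
	for fp, met := range ix.Series {
		if m.Matches(met[m.Name]) {
			out[fp] = true
		}
	}
	return out
}

// Select intersects the cheapest matchers until at most threshold
// candidates remain, then checks the remaining matchers per series.
func (ix Index) Select(threshold int, ms ...Matcher) []Fingerprint {
	sorted := append([]Matcher(nil), ms...)
	sort.SliceStable(sorted, func(i, j int) bool { return cost(sorted[i]) < cost(sorted[j]) })

	var cand map[Fingerprint]bool
	i := 0
	for ; i < len(sorted) && cost(sorted[i]) < 4; i++ {
		if cand != nil && len(cand) <= threshold {
			break // few enough candidates: stop intersecting
		}
		hits := ix.lookup(sorted[i])
		if cand == nil {
			cand = hits
			continue
		}
		for fp := range cand {
			if !hits[fp] {
				delete(cand, fp)
			}
		}
	}
	if cand == nil { // no usable matcher: start from every series
		cand = map[Fingerprint]bool{}
		for fp := range ix.Series {
			cand[fp] = true
		}
	}

	var res []Fingerprint
	for fp := range cand {
		ok := true
		for _, m := range sorted[i:] { // matchers not used for intersection
			if !m.Matches(ix.Series[fp][m.Name]) {
				ok = false
				break
			}
		}
		if ok {
			res = append(res, fp)
		}
	}
	sort.Slice(res, func(a, b int) bool { return res[a] < res[b] })
	return res
}
```

With a query like the one in the commit message, the `__name__` matcher sorts first because its value contains a colon, so only the handful of series behind the recording rule are ever checked against `job` and `env`.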

3055 commits