prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-18 19:44:06 -08:00

Author	SHA1	Message	Date
beorn7	836f1db04c	Improve MetricsForLabelMatchers WIP: This needs more tests. It now gets a from and through value, which it may opportunistically use to optimize the retrieval. With possible future range indices, this could be used in a very efficient way. This change merely applies some easy checks, which should nevertheless solve the use case of heavy rule evaluations on servers with a lot of series churn. Idea is the following: - Only archive series that are at least as old as the headChunkTimeout (which was already extremely unlikely to happen). - Then maintain a high watermark for the last archival, i.e. no archived series has a sample more recent than that watermark. - Any query that doesn't reach to a time before that watermark doesn't have to touch the archive index at all. (A production server at Soundcloud with the aforementioned series churn and heavy rule evaluations spends 50% of its CPU time in archive index lookups. Since rule evaluations usually only touch very recent values, most of those lookups should disappear with this change.) - Federation with a very broad label matcher will profit from this, too. As a byproduct, the un-needed MetricForFingerprint method was removed from the Storage interface.	2016-03-09 00:25:59 +01:00
Björn Rabenstein	eebe077f98	Merge pull request #1476 from prometheus/beorn7/makefile Use UTC for build timestamp	2016-03-08 18:18:54 +01:00
beorn7	6ba379e256	Use UTC for build timestamp	2016-03-08 17:47:17 +01:00
beorn7	d77d625ad3	Merge branch 'master' into beorn7/storage6	2016-03-08 17:39:14 +01:00
Brian Brazil	84c421da8e	Merge pull request #1475 from prometheus/fabxc/targetsort Sort exported targets	2016-03-08 16:24:55 +00:00
Fabian Reinartz	f2e359962c	Sort exported targets	2016-03-08 17:12:27 +01:00
Fabian Reinartz	eb915ec40f	Merge pull request #1474 from prometheus/fabxc/spinfix Handle closed target provider channel	2016-03-08 17:02:05 +01:00
Fabian Reinartz	56fc9bdff3	Handle closed target provider channel This fixes the case where a target provider closes the update channel and exits before the context is canceled. This should only be true for the static provider but it's safer to generally handle this case.	2016-03-08 15:49:03 +01:00
Tobias Schmidt	2f151d02eb	Merge pull request #1456 from prometheus/validate-alertmanager-url Validate alertmanager URL	2016-03-07 20:09:46 -05:00
Tobias Schmidt	7763bbd993	Validate alertmanager URL	2016-03-07 20:07:17 -05:00
beorn7	167b83695c	Merge branch 'beorn7/storage5' into beorn7/storage6	2016-03-08 00:20:44 +01:00
beorn7	01795382c9	Merge branch 'beorn7/storage4' into beorn7/storage5	2016-03-08 00:20:13 +01:00
beorn7	c01658e20d	Merge branch 'beorn7/storage3' into beorn7/storage4	2016-03-08 00:18:00 +01:00
beorn7	f138847d31	Merge branch 'beorn7/storage2' into beorn7/storage3	2016-03-08 00:17:33 +01:00
beorn7	f7fc542db6	Merge branch 'master' into beorn7/storage4 Conflicts: storage/local/persistence.go	2016-03-08 00:14:00 +01:00
beorn7	3d86130d8c	Merge branch 'master' into beorn7/storage3	2016-03-07 23:39:12 +01:00
beorn7	1f30c8de8d	Merge branch 'master' into beorn7/storage2	2016-03-07 23:38:42 +01:00
beorn7	c13b1ecfe9	Make chunk iterators more DRY This finally extracts all the common code of the two chunk iterators into one. Any future chunk encodings with fast access by index can use the same iterator by simply providing an indexAccessor. Other future chunk encodings without fast index access (like Gorilla-style) can still implement the chunkIterator interface as usual.	2016-03-07 20:23:14 +01:00
beorn7	32f280a3cd	Slim down the chunkIterator interface For one, remove unneeded methods. Then, instead of using a channel for all values, use a bufio.Scanner-like interface. This removes the need for creating a goroutine and avoids the (unnecessary) locking performed by channel sending and receiving. This will make it much easier to write new chunk implementations (like Gorilla-style encoding).	2016-03-07 19:50:13 +01:00
Björn Rabenstein	1bd4c92e1f	Merge pull request #1457 from prometheus/beorn7/promtool Add a command to promtool that dumps metadata of heads.db	2016-03-07 17:22:48 +01:00
beorn7	b6fdb355d7	Move dump-heads into its own tool	2016-03-07 16:30:19 +01:00
beorn7	f193f2b8ef	Add a command to promtool that dumps metadata of heads.db I needed this today for debugging. It can certainly be improved, but it's already quite helpful. I refactored the reading of heads.db files out of persistence, which is an improvement, too. I made minor changes to the cli package to allow outputting via the io.Writer interface.	2016-03-07 16:21:57 +01:00
Fabian Reinartz	6bbb4af837	Merge pull request #1465 from prometheus/beorn7/fix-test2 Fix flaky file-sd test	2016-03-07 15:46:18 +01:00
beorn7	d44b83690e	Fix flaky file-sd test	2016-03-07 15:39:18 +01:00
Björn Rabenstein	2a2cc52828	Merge pull request #1405 from prometheus/beorn7/storage Streamline series iterator creation	2016-03-07 13:30:56 +01:00
Fabian Reinartz	5b9e85e556	Merge pull request #1404 from prometheus/scraperef2 Retrieval refactoring	2016-03-06 22:17:00 +01:00
Fabian Reinartz	6ceb7e7887	Merge pull request #1463 from mischief/linuxisms scripts: drop -f from hostname, openbsd does not support it	2016-03-05 08:57:18 +01:00
Nick Owens	53777e7bc4	scripts: drop -f from hostname, openbsd does not support it	2016-03-04 19:59:28 -08:00
Fabian Reinartz	8d2a73aff0	Merge pull request #1451 from pdbogen/origin/1446 rewrite operator balancing to be recursive	2016-03-03 19:42:17 +01:00
Patrick Bogen	250344b344	use short variable assignment	2016-03-03 09:46:50 -08:00
beorn7	75a6b460ef	Give TestEvictAndLoadChunkDescs more time to actually evict Obviously, it's really bad to depend on timing here. The proper fix would be to have something like WaitForIndexing for other things to wait for, too. For now, let's see if the wait time increase fixes the issue.	2016-03-03 13:29:39 +01:00
beorn7	fc7de5374a	Quarantine series upon problem writing to the series file This fixes https://github.com/prometheus/prometheus/issues/1059 , but not in the obvious way (simply not updating the persist watermark, because that's actually not that simple - we don't really know what has gone wrong exactly). As any errors relevant here are most likely caused by severe and unrecoverable problems with the series file, using the new quarantine feature is the right step. We don't really have to be worried about any inconsistent state of the series because it will be removed for good ASAP. Another plus is that we don't have to declare the whole storage dirty anymore.	2016-03-03 13:15:02 +01:00
Fabian Reinartz	29e31dc3c6	Merge pull request #1452 from prometheus/fix-style-checker Detect code style violations in deeply nested files	2016-03-03 09:22:56 +01:00
Tobias Schmidt	d7889e61bb	Detect code style violations in deeply nested files So far the style check did not recognize issues in files in deeply nested directories, e.g. retrieval/discovery/kubernetes/discovery.go.	2016-03-03 02:21:16 -05:00
Patrick Bogen	2062fbae0f	rewrite operator balancing to be recursive	2016-03-02 15:56:40 -08:00
beorn7	0ea5801e47	Handle errors caused by data corruption more gracefully This requires all the panic calls upon unexpected data to be converted into errors returned. This pollutes the function signatures quite a lot. Well, this is Go... The ideas behind this are the following: - panic only if it's a programming error. Data corruptions happen, and they are not programming errors. - If we detect a data corruption, we "quarantine" the series, essentially removing it from the database and putting its data into a separate directory for forensics. - Failure during writing to a series file is not considered corruption automatically. It will call setDirty, though, so that a crash recovery upon the next restart will commence and check for that. - Series quarantining and setDirty calls are logged and counted in metrics, but are hidden from the user of the interfaces in interface.go, with the notable exception of Append(). The reasoning is that we treat corruption by removing the corrupted series, i.e. a query for it will return no results on its next call anyway, so return no results right now. In the case of Append(), we want to tell the user that no data has been appended, though. Minor side effects: - Now consistently using filepath.* instead of path.*. - Introduced structured logging where I touched it. This makes things less consistent, but a complete change to structured logging would be out of scope for this PR.	2016-03-02 23:02:34 +01:00
beorn7	8766f99085	Merge branch 'beorn7/storage2' into beorn7/storage3	2016-03-02 23:02:06 +01:00
beorn7	162f6fa6f6	Merge branch 'beorn7/storage' into beorn7/storage2	2016-03-02 23:01:26 +01:00
beorn7	79a2ae2d2e	Add missing test file	2016-03-02 23:00:23 +01:00
Fabian Reinartz	7a0c0c3ca2	Remove noise from CHANGELOG	2016-03-02 17:59:23 +01:00
Fabian Reinartz	1e7ce3ffdb	Bump version to 0.17.0	2016-03-02 17:59:10 +01:00
Fabian Reinartz	2bfb86d77c	Update changelog for 0.17.0 release	2016-03-02 17:58:55 +01:00
beorn7	b6840997a7	Merge branch 'beorn7/storage2' into beorn7/storage3	2016-03-02 16:11:25 +01:00
beorn7	ce58fd357b	Merge branch 'beorn7/storage' into beorn7/storage2 Conflicts: storage/local/chunk.go storage/local/interface.go	2016-03-02 16:09:32 +01:00
beorn7	2581648f70	Separate iterators by offset Add test that exposes the problem.	2016-03-02 16:01:03 +01:00
Fabian Reinartz	6adf77e411	Merge pull request #1447 from prometheus/fabxc/alertfix Make copying alerting state safer.	2016-03-02 12:25:19 +01:00
Fabian Reinartz	d89c254849	Make copying alerting state safer. This considers static labels in the equality of alerts to avoid falsely copying state from a different alert definition with the same name across reloads. To be safe, it also copies the state map rather than just its pointer so that remaining collisions disappear after one evaluation interval.	2016-03-02 12:21:54 +01:00
Fabian Reinartz	95c9706d2d	Fix missing comment period.	2016-03-02 09:16:56 +01:00
Fabian Reinartz	ddc74f712b	Add sortable target list	2016-03-02 09:10:20 +01:00
Julius Volz	9ea2465b99	Fix typo in lexer test.	2016-03-02 01:13:27 +01:00
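
Commit 32f280a3cd in the list above describes replacing a channel of values with a bufio.Scanner-like interface on the chunk iterator. The following is only a minimal sketch of that general pattern, not the code from the commit: the names sample, sliceIterator, newSliceIterator and collect are hypothetical, and the real chunkIterator interface may look quite different.

```go
package main

import "fmt"

// sample is a hypothetical timestamp/value pair used only for illustration.
type sample struct {
	Timestamp int64
	Value     float64
}

// sliceIterator (hypothetical) walks a slice with a bufio.Scanner-like API:
// Scan advances, Value returns the current element, Err reports a failure.
// No goroutine or channel is involved, so no channel locking occurs.
type sliceIterator struct {
	samples []sample
	pos     int // index of the next sample to hand out
	cur     sample
	err     error
}

func newSliceIterator(s []sample) *sliceIterator {
	return &sliceIterator{samples: s}
}

// Scan moves to the next sample and reports whether one was available.
func (it *sliceIterator) Scan() bool {
	if it.err != nil || it.pos >= len(it.samples) {
		return false
	}
	it.cur = it.samples[it.pos]
	it.pos++
	return true
}

// Value returns the sample selected by the last successful Scan.
func (it *sliceIterator) Value() sample { return it.cur }

// Err returns the first error encountered while scanning, if any.
func (it *sliceIterator) Err() error { return it.err }

// collect drains an iterator the way a caller would drain a bufio.Scanner.
func collect(it *sliceIterator) ([]sample, error) {
	var out []sample
	for it.Scan() {
		out = append(out, it.Value())
	}
	return out, it.Err()
}

func main() {
	it := newSliceIterator([]sample{{1, 0.5}, {2, 1.5}, {3, 2.5}})
	got, err := collect(it)
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	for _, s := range got {
		fmt.Println(s.Timestamp, s.Value)
	}
}
```

The caller checks Err once after the loop instead of on every step, which is the same division of labor bufio.Scanner uses.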

12132 commits