prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-12-26 14:09:41 -08:00

Author	SHA1	Message	Date
Brian Brazil	4a2b96f848	Remove backoff on scrape failure. Having metrics with variable timestamps inconsistently spaced when things fail will make it harder to write correct rules. Update status page, requires some refactoring to insert a function. Change-Id: Ie1c586cca53b8f3b318af8c21c418873063738a8	2014-11-25 17:02:00 +01:00
Brian Brazil	f525ca5d9e	Let consoles get graph links from expressions. Rename ConsoleLinkFromExpression, as we now have consoles. Change-Id: I7ed2c9c83863adb390b51121dd9736845f7bcdfc	2014-11-25 17:01:59 +01:00
Brian Brazil	eba205fcac	Expose path used to get to console to console. Change-Id: I72386a2d4e53863da302ecc5c7e44d6c310197e0	2014-11-25 17:01:59 +01:00
Brian Brazil	eb5d928da7	Fix console handler. This was accidentally broken in `2128d9d811`. Change-Id: I50ea1fdb8ae4d28ae4555410bee97e5037692aa5	2014-11-25 17:01:59 +01:00
Julius Volz	21cafe6cd7	Only evict memory series after they are on disk. This fixes the problem where samples become temporarily unavailable for queries while they are being flushed to disk. Although the entire flushing code could use some major refactoring, I'm explicitly trying to do the minimal change to fix the problem since there's a whole new storage implementation in the pipeline. Change-Id: I0f5393a30b88654c73567456aeaea62f8b3756d9	2014-11-25 17:01:59 +01:00
Bjoern Rabenstein	8956faeccb	Migrate to new client_golang. This change will only be submitted when the new client_golang has been moved to the new version. Change-Id: Ifceb59333072a08286a8ac910709a8ba2e3a1581	2014-11-25 17:01:59 +01:00
Brian Brazil	e27447da5c	Remove the broken "User Dashboard" link. Due to the lack of a </a>, this makes the entire header render badly. Accordingly it's safe to assume no one is using it, so remove it. With the new console template support, we'll need to do something a bit more nuanced later. Change-Id: I3424bed6aea18cbd4c63ad48f98808098dadc3ad	2014-11-25 17:01:59 +01:00
Brian Brazil	960ede66dc	Use html/template for console templates and add template library support. Add a function to bypass the new auto-escaping. Add a function to work around Go's templates only allowing passing in one argument. Change-Id: Id7aa3f95e7c227692dc22108388b1d9b1e2eec99	2014-11-25 17:01:59 +01:00
Brian Brazil	0f5874ff97	Make Prometheus in header link to status page. This is consistent with alertmanager, and more intuitive for users. The graphs page just has graphs, so remove mention of consoles. Change-Id: I87780a4ade33697a6095423e1a7de47d341d2838	2014-11-25 17:01:59 +01:00
Brian Brazil	1828b1f55c	Only log every query when debugging. Change-Id: I4f988d81cda6f6deb0ed7f497de4aa75409b158f	2014-11-25 17:01:59 +01:00
Brian Brazil	e041c0cd46	Add console and alert templates with access to all data. Move rulemanager to its own package to break circular dependency. Make NewTestTieredStorage available to tests, remove duplication. Change-Id: I33b321245a44aa727bfc3614a7c9ae5005b34e03	2014-05-30 16:24:56 +01:00
Julius Volz	01f652cb4c	Separate storage implementation from interfaces. This was initially motivated by wanting to distribute the rule checker tool under `tools/rule_checker`. However, this was not possible without also distributing the LevelDB dynamic libraries because the tool transitively depended on Levigo: rule checker -> query layer -> tiered storage layer -> leveldb This change separates external storage interfaces from the implementation (tiered storage, leveldb storage, memory storage) by putting them into separate packages: - storage/metric: public, implementation-agnostic interfaces - storage/metric/tiered: tiered storage implementation, including memory and LevelDB storage. I initially also considered splitting up the implementation into separate packages for tiered storage, memory storage, and LevelDB storage, but these are currently so intertwined that it would be another major project in itself. The query layers and most other parts of Prometheus now have no notion of the storage implementation anymore and just use whatever implementation they get passed in via interfaces. The rule_checker is now a static binary :) Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0	2014-04-16 13:30:19 +02:00
Matt T. Proud	2064f32662	Clean up quitting behavior and add quit trigger. The closing of Prometheus now uses a sync.Once wrapper to prevent any accidental multiple invocations of it, which could trigger corruption or a race condition. The shutdown process is made more verbose through logging. A not-enabled by default web handler has been provided to trigger a remote shutdown if requested for debugging purposes. Change-Id: If4fee75196bbff1fb1e4a4ef7e1cfa53fef88f2e	2014-04-15 21:40:04 +02:00
Julius Volz	cc04238a85	Switch to new "__name__" metric name label. This also fixes the compaction test, which before worked only because the input sample sorting was accidentally equal to the resulting on-disk sample sorting. Change-Id: I2a21c4b46ba562424b27058fc02eba84fa6a6006	2014-03-14 16:52:37 +01:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Conor Hennessy	eba01d1119	Remove usage of gorest. Due to ongoing issues, we've decided to remove gorest. It started with gorest not being thread-safe (it does introspection to create a new handler which is an easy process to mess up with multiple threads of execution): https://code.google.com/p/gorest/issues/detail?id=15 While the issue has been marked fixed, it looks like the patch has introduced more problems than the original issue and simply doesn't work properly. I'm not sure the behaviour was thought through properly. If a new instance is needed every request then a handler-factory is needed or the library needs to set expectations about how the new objects should interact with their constructor state. While it was tempting to try out another routing library, I think for now it's better to use dumb vanilla Go routing. At least until we decide which URL format we intend to standardize on. Change-Id: Ica3da135d05f8ab8fc206f51eeca4f684f8efa0e	2013-10-23 14:19:14 +02:00
Julius Volz	a50ee8df30	Always set CORS headers at beginning of API handler. Change-Id: Icde9a74260c4bb919f09c3e10c6dd5f372ccdaec	2013-10-16 15:59:47 +02:00
Matt T. Proud	4a87c002e8	Update low-level i'faces to reflect wireformats. This commit fixes a critique of the old storage API design, whereby the input parameters were always as raw bytes and never Protocol Buffer messages that encapsulated the data, meaning every place a read or mutation was conducted needed to manually perform said translations on its own. This is taxing. Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f	2013-09-04 17:13:58 +02:00
Julius Volz	788587426b	Make scrape timeouts configurable per job. Change-Id: I77a7514ad9e7969771f873d63d6353ec50082a62	2013-08-19 12:21:47 +02:00
Matt T. Proud	972e856d9b	Kill the curation state channel. The use of the channels for curation state was always unidiomatic. Change-Id: I1cb1d7175ebfb4faf28dff84201066278d6a0d92	2013-08-13 17:20:22 +02:00
Julius Volz	0003027dce	Add needed trailing spaces in logs.	2013-08-12 18:22:48 +02:00
Julius Volz	aa5d251f8d	Use github.com/golang/glog for all logging.	2013-08-12 17:54:36 +02:00
Julius Volz	ecf0ee8f39	Transfer alerting rule and Prometheus URL to alertmanager.	2013-08-09 18:32:13 +02:00
Matt T. Proud	07ac921aec	Code Review: First pass.	2013-08-05 17:31:49 +02:00
Matt T. Proud	d8792cfd86	Extract HighWatermarking. Clean up the rest.	2013-08-05 11:03:03 +02:00
Julius Volz	fcf784c13c	Fix query error notification in tabular view. Instead of "Unsupported value type" when type="error", the delivered error message should be shown.	2013-08-02 09:04:13 +02:00
Julius Volz	35ee2cd3cb	Add alertmanager notification support to Prometheus. Alert definitions now also have mandatory SUMMARY and DESCRIPTION fields that get sent along a firing alert to the alert manager.	2013-07-30 17:23:41 +02:00
Julius Volz	4e941255d8	Add caching to static assets when served from blob handler.	2013-07-24 18:52:57 +02:00
Julius Volz	1b9cbaf842	Bootstrappify remaining status pages.	2013-07-24 16:09:34 +02:00
Julius Volz	481ee4096b	Add no-op silencing links.	2013-07-24 15:09:42 +02:00
Julius Volz	d9f403ab7d	Prettify/Bootstrapify alert tables.	2013-07-24 15:03:13 +02:00
Julius Volz	f665534b61	Make quote and semicolon usage consistent in graph.js	2013-07-24 12:29:03 +02:00
Julius Volz	c91c100102	Fix graph resize bug when no graph exists.	2013-07-24 12:29:03 +02:00
Julius Volz	9f07f8677a	Generate tabular console view from JSON data.	2013-07-24 12:28:59 +02:00
Sabra Melamed	22ab2366c1	Replacing interface components with Bootstrap. This commit includes Bootstrap 2.3.2 and swaps a multitude of graph, status, and other components to Bootstrap-based widgets.	2013-07-23 20:58:55 +02:00
Matt T. Proud	f7704af4f8	Code Review: Formatting comments.	2013-07-15 15:12:01 +02:00
Matt T. Proud	06b4a40661	Represent targets in a tabular interface. This commit represents a target group's endpoints in a tabular fashion for better differentiation of their state in a concise manner.	2013-07-15 15:12:01 +02:00
Julius Volz	f42adc1cc0	Display Y-axis outside of graph.	2013-07-01 14:47:43 +02:00
Julius Volz	1aa8f071b9	Add content compression support to API HTTP responses.	2013-06-28 16:56:44 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
Julius Volz	0226d1ac7a	Implement alerts dashboard and expression console links.	2013-06-13 22:35:40 +02:00
Johannes 'fish' Ziemke	005d65868a	Merge pull request #294 from prometheus/remove-gvm Remove gvm	2013-06-13 05:41:29 -07:00
Johannes 'fish' Ziemke	56249320e3	Remove gvm on travis.	2013-06-13 14:36:00 +02:00
Julius Volz	1fe3d3b06b	Remove obsolete argument from target handling code.	2013-06-11 17:54:58 +02:00
Julius Volz	ba29d07901	Show loaded rules in Status dashboard.	2013-06-11 11:39:31 +02:00
Matt T. Proud	a73f061d3c	Persist solely Protocol Buffers. A design question that was open for me in the beginning was whether to serialize other types to disk, but Protocol Buffers quickly won out, which allows us to drop support for other types. This is a good start to cleaning up a lot of cruft in the storage stack and can let us eventually decouple the various moving parts into separate subsystems for easier reasoning. This commit is not strictly required, but it is a start to making the rest a lot more enjoyable to interact with.	2013-06-08 11:02:35 +02:00
Bernerd Schaefer	f7a2436665	Include link to user dashboard when provided	2013-06-07 11:17:17 +02:00
Bernerd Schaefer	1d794896ac	Support user-provided static asset directory [fix #159]	2013-06-07 10:25:12 +02:00
Julius Volz	51689d965d	Add debug timers to instant and range queries. This adds timers around several query-relevant code blocks. For now, the query timer stats are only logged for queries initiated through the UI. In other cases (rule evaluations), the stats are simply thrown away. My hope is that this helps us understand where queries spend time, especially in cases where they sometimes hang for unusual amounts of time.	2013-06-05 18:32:54 +02:00
Matt T. Proud	0d2d6e9a27	Include uptime in the status console. In order to help corroborate whether a Prometheus instance has flapped until meta-monitoring is in-place, we ought to provide the instance's start time in the console to aid in diagnostics.	2013-05-24 10:44:34 +02:00
Julius Volz	8586c7520c	Support negative graph values. Currently graph Y-Axes were hardcoded to start at 0. Choose the Y-scale automatically based on the graph data instead.	2013-05-21 16:54:33 +02:00
Julius Volz	081191afb8	Remember and display last scrape errors in web UI.	2013-05-21 15:31:27 +02:00
Matt T. Proud	1a95406b81	Include forgotten databases.html.	2013-05-14 14:50:54 +02:00
Matt T. Proud	b224251981	Simplify compaction and expose database sizes. This commit simplifies the way that compactions across a database's keyspace occur due to reading the LevelDB internals. Secondarily it introduces the database size estimation mechanisms. Include database health and help interfaces. Add database statistics; remove status goroutines. This commit kills the use of Go routines to expose status throughout the web components of Prometheus. It also dumps raw LevelDB status on a separate /databases endpoint.	2013-05-14 12:29:53 +02:00
Bernerd Schaefer	cdde766f39	Embed mutex on web status handler	2013-05-07 18:15:17 +02:00
Bernerd Schaefer	7740167654	Add comments about potential race conditions	2013-05-07 18:15:17 +02:00
Bernerd Schaefer	9183302b1f	Web handler returns 404 for favicon requests	2013-05-07 18:15:17 +02:00
Matt Proud	7f0d816574	Schedule the background compactors to run. This commit introduces three background compactors, which compact sparse samples together. 1. Older than five minutes is grouped together into chunks of 50 every 30 minutes. 2. Older than 60 minutes is grouped together into chunks of 250 every 50 minutes. 3. Older than one day is grouped together into chunks of 5000 every 70 minutes.	2013-05-07 17:14:04 +02:00
Julius Volz	af7920126c	Fix build errors and add default build step to "make".	2013-05-07 15:54:41 +02:00
Julius Volz	56324d8ce2	Make AST query storage non-global.	2013-05-07 13:15:10 +02:00
Matt T. Proud	3b9b1c6ab4	Define dependencies for web stack concretely. This commit destroys the use of AppState, which makes passing concrete state along to various serving components onerous.	2013-05-06 11:13:12 +02:00
juliusv	cfc3b1053d	Merge pull request #212 from prometheus/ui/smaller-navigation-links Restyle navigation a bit, align content elements with it	2013-05-03 07:00:09 -07:00
juliusv	2935476818	Merge pull request #211 from prometheus/feature/reorder-hud-elements Move build info to the top of the status HUD.	2013-05-03 06:56:45 -07:00
Julius Volz	04e661c28f	Move build info to the top of the status HUD.	2013-05-03 15:54:32 +02:00
Julius Volz	f3cf8eae7e	Restyle navigation a bit, align content elements with it.	2013-05-03 15:49:08 +02:00
Johannes 'fish' Ziemke	c5e507cd9c	Never submit empty queries.	2013-05-02 16:55:47 +02:00
juliusv	2b9ba56d61	Merge pull request #208 from prometheus/feature/toggle-console Add the console to the main/graph ui.	2013-05-02 04:05:53 -07:00
Johannes 'fish' Ziemke	ba289ef7cd	Add the console to the main/graph ui.	2013-05-02 12:19:34 +02:00
Julius Volz	9cea5d9df8	Convert the Prometheus configuration to protocol buffers.	2013-04-30 22:26:00 +02:00
Matt T. Proud	3362bf36e2	Include curator status in web heads-up-display.	2013-04-29 12:40:33 +02:00
Matt T. Proud	a48ab34dd0	Refresh Prometheus client API usage. The client API has been updated per https://github.com/prometheus/client_golang/pull/9.	2013-04-28 19:40:30 +02:00
juliusv	169a7dc26c	Merge pull request #189 from prometheus/feature/build-info-and-startup-friendliness Build info and startup friendliness	2013-04-26 05:45:34 -07:00
Bernerd Schaefer	19fc094362	Merge pull request #191 from prometheus/update-gitignore-files Ignore web/static/generated and build/root/share	2013-04-25 03:59:04 -07:00
Bernerd Schaefer	169ed9d297	Ignore web/static/generated and build/root/share	2013-04-25 12:33:27 +02:00
Matt T. Proud	961ff26874	Fix positional flags for ``cp`` on Darwin. Unfortunately ``cp`` on Darwin regards some flags as positional and requires them to be in a specific place. The new Protocol Buffer descriptor bundling fails on Mac OS.	2013-04-25 12:16:51 +02:00
Bernerd Schaefer	45243ac2da	Print flags on status page.	2013-04-25 12:12:05 +02:00
Bernerd Schaefer	862054e88b	web.StartServing prints listening address	2013-04-25 11:59:39 +02:00
Bernerd Schaefer	a2a4f94aae	StatusHandler renders build info	2013-04-25 11:57:08 +02:00
Johannes 'fish' Ziemke	1f96d4c822	Move protobuf descriptor and add content-type. - move to static/generated - set content-type based on extension '.description'	2013-04-24 18:51:07 +02:00
Matt T. Proud	9e02c2393a	Include generated Protocol Buffer descriptor. The Protocol Buffer compiler supports generating a machine-readable descriptor file encoded as a provided Protocol Buffer message type, which can be used to decode messages that have been encoded with it after-the-fact. The generated descriptor also bundles in dependent message types. We can use this to perform forensics on old Prometheus clients, if necessary.	2013-04-24 16:59:40 +02:00
Matt T. Proud	e86f4d9dfd	Convert time readers to represent time in UTC. Go's time.Time represents time as UTC in its fundamental data type. That said, when using ``time.Unix(...)``, it sets the zone for the time representation to the local. Unfortunately with diagnosis and our tests, it is a PITA to jump between various zones, even though the serialized version remains the same. To keep things easy, all places where times are generated or read are converted into UTC. These conversions are cheap, for ``Time.In`` merely changes a pointer reference in the struct, nothing more. This enables me to diagnose test failures with fixture data very easily.	2013-04-24 12:19:41 +02:00
Johannes 'fish' Ziemke	955708e8db	Merge pull request #158 from prometheus/feature/auto-refresh Add per graph auto-refresh option to web UI.	2013-04-22 05:10:17 -07:00
Julius Volz	a2623efcdf	Register pprof /debug endpoints with custom HTTP mux.	2013-04-22 13:21:24 +02:00
Johannes 'fish' Ziemke	712bf5e2f9	Add per graph auto-refresh option to web UI. This adds a drop-down menu to select/disable an auto-refresh interval.	2013-04-22 11:42:23 +02:00
Julius Volz	a0d311c9e6	Constantize job name label.	2013-04-15 11:47:54 +02:00
juliusv	f21b5ad12b	Merge pull request #133 from bernerdschaefer/graph-display-tweaks Graph display tweaks	2013-04-15 02:32:45 -07:00
Bernerd Schaefer	72bd585485	Revert style change to legend items	2013-04-15 10:04:09 +02:00
juliusv	f817106d6a	Merge pull request #134 from prometheus/fix/set-job-label-from-targets-api Set job label for targets registered through the API	2013-04-12 07:28:27 -07:00
juliusv	f89d4c2cac	Merge pull request #128 from prometheus/feature/convert-host-relative-links Convert addresses pointing to localhost in status.	2013-04-12 07:27:30 -07:00
Johannes 'fish' Ziemke	14407a076a	Convert addresses pointing to localhost in status. Until now, targets pointing to localhost in the status view are linked to localhost, so you can't follow those links by clicking on them. This change converts the links to point to the hostname of the prometheus server. Before: <a href="http://localhost:9090/metrics.json">http://localhost:9090/metrics.json</a> After: <a href="http://hostname-of-prometheus-server:9090/metrics.json">http://localhost:9090/metrics.json</a>	2013-04-12 15:14:04 +02:00
Bernerd Schaefer	8af0bbb3a0	Set job label for targets registered through the API This is set when jobs are statically registered (see retrieval/targetmanager.go#L92), and should be set here, too.	2013-04-12 14:50:44 +02:00
Bernerd Schaefer	442a6d2b11	Use $ instead of jQuery	2013-04-12 13:43:53 +02:00
Bernerd Schaefer	953334a4f7	Reformat and add semicolons to graph.js	2013-04-12 13:41:53 +02:00
Bernerd Schaefer	43dc377bee	Flip x_label when it would render off-page	2013-04-12 11:59:49 +02:00
Bernerd Schaefer	461e02d2b8	Flip hover detail to prevent going off the screen	2013-04-12 10:39:37 +02:00
Bernerd Schaefer	8c9597cb39	Render legend in a similar style to labels	2013-04-12 10:39:15 +02:00
Bernerd Schaefer	a7ec43189a	Hovering over legend items highlights series in graph	2013-04-12 09:34:12 +02:00
Bernerd Schaefer	564633ecbc	Render graph labels vertically This helps to make the timeseries with many labels fit on the screen.	2013-04-12 09:34:12 +02:00
Bernerd Schaefer	5e9447996b	Set CORS Headers on API requests By setting Access-Control headers, the Prometheus metrics API can be accessed by cross-origin javascript applications (e.g., an external dashboard pulling Prometheus metrics).	2013-04-11 14:51:42 +02:00
Johannes 'fish' Ziemke	8fba639706	Fix path to expression browser js.	2013-04-10 13:09:32 +02:00

199 commits