Commit graph

135 commits

Author SHA1 Message Date
Julius Volz 740d448983 Use custom timestamp type for sample timestamps and related code.
So far we've been using Go's native time.Time for anything related to sample
timestamps. Since the range of time.Time is much bigger than what we need, this
has created two problems:

- there could be time.Time values which were out of the range/precision of the
  time type that we persist to disk, therefore causing incorrectly ordered keys.
  One bug caused by this was:

  https://github.com/prometheus/prometheus/issues/367

  It would be good to use a timestamp type that's more closely aligned with
  what the underlying storage supports.

- sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit
  Unix timestamp (possibly even a 32-bit one). Since we store samples in large
  numbers, this seriously affects memory usage. Furthermore, copying/working
  with the data will be faster if it's smaller.

*MEMORY USAGE RESULTS*
Initial memory usage comparisons for a running Prometheus with 1 timeseries and
100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my
tests, this advantage for some reason decreased a bit the more samples the
timeseries had (to 5-7% for millions of samples). This I can't fully explain,
but perhaps garbage collection issues were involved.

*WHEN TO USE THE NEW TIMESTAMP TYPE*
The new clientmodel.Timestamp type should be used whenever time
calculations are either directly or indirectly related to sample
timestamps.

For example:
- the timestamp of a sample itself
- all kinds of watermarks
- anything that may become or is compared to a sample timestamp (like the timestamp
  passed into Target.Scrape()).

When to still use time.Time:
- for measuring durations/times not related to sample timestamps, like duration
  telemetry exporting, timers that indicate how frequently to execute some
  action, etc.

*NOTE ON OPERATOR OPTIMIZATION TESTS*
We don't use operator optimization code anymore, but it still lives in
the code as dead code. It still has tests, but I couldn't get all of them to
pass with the new timestamp format. I commented out the failing cases for now,
but we should probably remove the dead code soon. I just didn't want to do that
in the same change as this.

Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f
2013-12-03 09:11:28 +01:00
Conor Hennessy eba01d1119 Remove usage of gorest.
Due to on going issues, we've decided to remove gorest. It started with gorest
not being thread-safe (it does introspection to create a new handler which is
an easy process to mess up with multiple threads of execution):
    https://code.google.com/p/gorest/issues/detail?id=15
While the issue has been marked fixed, it looks like the patch has introduced
more problems than the original issue and simply doesn't work properly.
I'm not sure the behaviour was thought through properly. If a new instance is
needed every request then a handler-factory is needed or the library needs to
set expectations about how the new objects should interact with their
constructor state.
While it was tempting to try out another routing library, I think for now
it's better to use dumb vanilla Go routing. At least until we decide which
URL format we intend to standardize on.

Change-Id: Ica3da135d05f8ab8fc206f51eeca4f684f8efa0e
2013-10-23 14:19:14 +02:00
Julius Volz a50ee8df30 Always set CORS headers at beginning of API handler.
Change-Id: Icde9a74260c4bb919f09c3e10c6dd5f372ccdaec
2013-10-16 15:59:47 +02:00
Matt T. Proud 4a87c002e8 Update low-level i'faces to reflect wireformats.
This commit fixes a critique of the old storage API design, whereby
the input parameters were always as raw bytes and never Protocol
Buffer messages that encapsulated the data, meaning every place a
read or mutation was conducted needed to manually perform said
translations on its own.  This is taxing.

Change-Id: I4786938d0d207cefb7782bd2bd96a517eead186f
2013-09-04 17:13:58 +02:00
Julius Volz 788587426b Make scrape timeouts configurable per job.
Change-Id: I77a7514ad9e7969771f873d63d6353ec50082a62
2013-08-19 12:21:47 +02:00
Matt T. Proud 972e856d9b Kill the curation state channel.
The use of the channels for curation state were always unidiomatic.

Change-Id: I1cb1d7175ebfb4faf28dff84201066278d6a0d92
2013-08-13 17:20:22 +02:00
Julius Volz 0003027dce Add needed trailing spaces in logs. 2013-08-12 18:22:48 +02:00
Julius Volz aa5d251f8d Use github.com/golang/glog for all logging. 2013-08-12 17:54:36 +02:00
Julius Volz ecf0ee8f39 Transfer alerting rule and Prometheus URL to alertmanager. 2013-08-09 18:32:13 +02:00
Matt T. Proud 07ac921aec Code Review: First pass. 2013-08-05 17:31:49 +02:00
Matt T. Proud d8792cfd86 Extract HighWatermarking.
Clean up the rest.
2013-08-05 11:03:03 +02:00
Julius Volz fcf784c13c Fix query error notification in tabular view.
Instead of "Unsupported value type" when type="error", the delivered
error message should be shown.
2013-08-02 09:04:13 +02:00
Julius Volz 35ee2cd3cb Add alertmanager notification support to Prometheus.
Alert definitions now also have mandatory SUMMARY and DESCRIPTION fields
that get sent along a firing alert to the alert manager.
2013-07-30 17:23:41 +02:00
Julius Volz 4e941255d8 Add caching to static assets when served from blob handler. 2013-07-24 18:52:57 +02:00
Julius Volz 1b9cbaf842 Bootstrappify remaining status pages. 2013-07-24 16:09:34 +02:00
Julius Volz 481ee4096b Add no-op silencing links. 2013-07-24 15:09:42 +02:00
Julius Volz d9f403ab7d Prettify/Bootstrapify alert tables. 2013-07-24 15:03:13 +02:00
Julius Volz f665534b61 Make quote and semicolon usage consistent in graph.js 2013-07-24 12:29:03 +02:00
Julius Volz c91c100102 Fix graph resize bug when no graph exists. 2013-07-24 12:29:03 +02:00
Julius Volz 9f07f8677a Generate tabular console view from JSON data. 2013-07-24 12:28:59 +02:00
Sabra Melamed 22ab2366c1 Replacing interface components with Bootstrap.
This commit includes Bootstrap 2.3.2 and swaps a multitude of graph,
status, and other components to Bootstrap-based widgets.
2013-07-23 20:58:55 +02:00
Matt T. Proud f7704af4f8 Code Review: Formatting comments. 2013-07-15 15:12:01 +02:00
Matt T. Proud 06b4a40661 Represent targets in a tabular interface.
This commit represents a target group's endpoints in a tabular fashion for better differentiation
of their state in a concise manner.
2013-07-15 15:12:01 +02:00
Julius Volz f42adc1cc0 Display Y-axis outside of graph. 2013-07-01 14:47:43 +02:00
Julius Volz 1aa8f071b9 Add content compression support to API HTTP responses. 2013-06-28 16:56:44 +02:00
Matt T. Proud 30b1cf80b5 WIP - Snapshot of Moving to Client Model. 2013-06-25 15:52:42 +02:00
Julius Volz 0226d1ac7a Implement alerts dashboard and expression console links. 2013-06-13 22:35:40 +02:00
Johannes 'fish' Ziemke 005d65868a Merge pull request #294 from prometheus/remove-gvm
Remove gvm
2013-06-13 05:41:29 -07:00
Johannes 'fish' Ziemke 56249320e3 Remove gvm on travis. 2013-06-13 14:36:00 +02:00
Julius Volz 1fe3d3b06b Remove obsolete argument from target handling code. 2013-06-11 17:54:58 +02:00
Julius Volz ba29d07901 Show loaded rules in Status dashboard. 2013-06-11 11:39:31 +02:00
Matt T. Proud a73f061d3c Persist solely Protocol Buffers.
An design question was open for me in the beginning was whether to
serialize other types to disk, but Protocol Buffers quickly won out,
which allows us to drop support for other types.  This is a good
start to cleaning up a lot of cruft in the storage stack and
can let us eventually decouple the various moving parts into
separate subsystems for easier reasoning.

This commit is not strictly required, but it is a start to making
the rest a lot more enjoyable to interact with.
2013-06-08 11:02:35 +02:00
Bernerd Schaefer f7a2436665 Include link to user dashboard when provided 2013-06-07 11:17:17 +02:00
Bernerd Schaefer 1d794896ac Support user-provided static asset directory
[fix #159]
2013-06-07 10:25:12 +02:00
Julius Volz 51689d965d Add debug timers to instant and range queries.
This adds timers around several query-relevant code blocks. For now, the
query timer stats are only logged for queries initiated through the UI.
In other cases (rule evaluations), the stats are simply thrown away.

My hope is that this helps us understand where queries spend time,
especially in cases where they sometimes hang for unusual amounts of
time.
2013-06-05 18:32:54 +02:00
Matt T. Proud 0d2d6e9a27 Include uptime in the status console.
In order to help corroborate whether a Prometheus instance has
flapped until meta-monitoring is in-place, we ought to provide the
instance's start time in the console to aid in diagnostics.
2013-05-24 10:44:34 +02:00
Julius Volz 8586c7520c Support negative graph values.
Currently graph Y-Axes were hardcoded to start at 0. Choose the Y-scale
automatically based on the graph data instead.
2013-05-21 16:54:33 +02:00
Julius Volz 081191afb8 Remember and display last scrape errors in web UI. 2013-05-21 15:31:27 +02:00
Matt T. Proud 1a95406b81 Include forgotten databases.html. 2013-05-14 14:50:54 +02:00
Matt T. Proud b224251981 Simplify compaction and expose database sizes.
This commit simplifies the way that compactions across a database's
keyspace occur due to reading the LevelDB internals. Secondarily it
introduces the database size estimation mechanisms.

Include database health and help interfaces.

Add database statistics; remove status goroutines.

This commit kills the use of Go routines to expose status throughout
the web components of Prometheus. It also dumps raw LevelDB status
on a separate /databases endpoint.
2013-05-14 12:29:53 +02:00
Bernerd Schaefer cdde766f39 Embed mutex on web status handler 2013-05-07 18:15:17 +02:00
Bernerd Schaefer 7740167654 Add comments about potential race conditions 2013-05-07 18:15:17 +02:00
Bernerd Schaefer 9183302b1f Web handler returns 404 for favicon requests 2013-05-07 18:15:17 +02:00
Matt Proud 7f0d816574 Schedule the background compactors to run.
This commit introduces three background compactors, which compact
sparse samples together.

1. Older than five minutes is grouped together into chunks of 50 every 30
   minutes.

2. Older than 60 minutes is grouped together into chunks of 250 every 50
   minutes.

3. Older than one day is grouped together into chunks of 5000 every 70
   minutes.
2013-05-07 17:14:04 +02:00
Julius Volz af7920126c Fix build errors and add default build step to "make". 2013-05-07 15:54:41 +02:00
Julius Volz 56324d8ce2 Make AST query storage non-global. 2013-05-07 13:15:10 +02:00
Matt T. Proud 3b9b1c6ab4 Define dependencies for web. stack concretely.
This commit destroys the use of AppState, which makes passing
concrete state along to various serving components onerous.
2013-05-06 11:13:12 +02:00
juliusv cfc3b1053d Merge pull request #212 from prometheus/ui/smaller-navigation-links
Restyle navigation a bit, align content elements with it
2013-05-03 07:00:09 -07:00
juliusv 2935476818 Merge pull request #211 from prometheus/feature/reorder-hud-elements
Move build info to the top of the status HUD.
2013-05-03 06:56:45 -07:00
Julius Volz 04e661c28f Move build info to the top of the status HUD. 2013-05-03 15:54:32 +02:00