Commit graph

3957 commits

Author SHA1 Message Date
Fabian Reinartz 70490fe568 Merge pull request #1805 from prometheus/higher-level-storage-interface
Make the storage interface higher-level.
2016-08-05 16:17:14 +02:00
Fabian Reinartz 806571074a Merge pull request #1869 from prometheus/fabxc-patch-1
Clarify comment on rule evaluation
2016-08-03 10:47:28 +02:00
Fabian Reinartz 9a269b5507 Clarify comment on rule evaluation
Fixes #1866
2016-08-03 08:29:51 +02:00
Fabian Reinartz a4ee5b14d5 Merge pull request #1865 from prometheus/alert-name
Remove __name__ from alerts sent to AM.
2016-08-01 23:34:19 -07:00
Brian Brazil 6fc88d4b4d Remove __name__ from alerts sent to AM.
Fixes #1861
2016-08-01 23:32:41 +01:00
Fabian Reinartz f9533754d1 Merge pull request #1862 from alicebob/removegraph
'Remove Graph' links on the /graph page
2016-07-31 12:19:27 -07:00
Harmen 7b4a67f651 make assets build 2016-07-31 16:32:25 +02:00
Harmen 0b883e24ba Add a 'Remove Graph' link to the 'Graph' screen 2016-07-31 16:30:23 +02:00
Steve Durrheimer 6633df1607 Merge pull request #1859 from prometheus/sdurrheimer-promu-go-version
Use the default go version for the crossbuilt process
2016-07-31 13:02:34 +02:00
Steve Durrheimer d41e66e9a3 Use the default go version for the crossbuilt process 2016-07-30 11:19:56 +02:00
Fabian Reinartz 603b3e50b9 Merge pull request #1858 from fabric8io/docker-consoles-dir
Docker: Move console dirs to /usr/share/prometheus
2016-07-30 01:30:34 -07:00
Jimmi Dyson bf6d92c63a Docker: Move console dirs to /usr/share/prometheus 2016-07-29 14:00:47 +01:00
Andrew Hemming 7ebcd678ea Added HTML link for each job name
Useful for quick navigation on the target page when there are many jobs
and targets

Corrected HTML link for each job name

Regenerated bindata
2016-07-28 17:10:34 +01:00
Fabian Reinartz 3a1a5786a8 Merge pull request #1851 from prometheus/grobie/always-format-assets
Always format generated assets
2016-07-27 16:06:22 -07:00
Tobias Schmidt 4042392a2d Always format generated assets
It's easy to forget formatting assets after re-generating them, so let's
do this automatically.
2016-07-27 19:02:18 -04:00
Tobias Schmidt 0b6bec1af7 Merge pull request #1850 from caniszczyk/patch-1
Add CNCF reference in the README
2016-07-27 18:41:42 -04:00
Tobias Schmidt 5416518178 Fix go fmt of ui/bindata.go 2016-07-27 18:35:01 -04:00
Chris Aniszczyk 8066de91ca Add CNCF reference in the README
A simple reference to the CNCF and link to the website.
2016-07-27 17:15:03 -05:00
Fabian Reinartz e7ce06f506 Update Travis Go versions 2016-07-27 12:06:49 -07:00
Fabian Reinartz 3bcc36dd8a Merge pull request #1848 from audunstrand/patch-1
added path to pods scrape job
2016-07-27 09:31:55 -07:00
Fabian Reinartz 823ee3eeee Merge pull request #1847 from alicebob/alertmsg
Message on an empty Alerts page
2016-07-27 09:30:08 -07:00
Audun Fauchald Strand 50e044bb00 added path to pods scrape job 2016-07-27 15:13:53 +02:00
Harmen a1443280b4 make assets build 2016-07-27 12:56:08 +02:00
Harmen 512b3f8d95 Friendlier message when there are no alerting rules 2016-07-27 12:54:46 +02:00
Fabian Reinartz 67b7535471 Merge pull request #1835 from fabric8io/k8s-sd-pod-disco-hostname
Kubernetes SD: Add node name and host IP to pod discovery
2016-07-26 15:18:42 -07:00
Fabian Reinartz 4be73d118e Merge pull request #1831 from mattbostock/remove_silence_column
Alerts template: remove silence table header
2016-07-26 13:41:55 -07:00
Matt Bostock 78715e4182 Alerts template: remove silence table header
There's no corresponding table column for this table header. The
placeholder link for silences was removed in e8800730.

Accordingly, regenerate `web/ui/bindata.go` by running:

    make assets format
2016-07-26 21:33:21 +01:00
Björn Rabenstein 49e317333c Merge pull request #1844 from alicebob/lessnoise
don't store empty values in the URL
2016-07-26 00:59:48 +02:00
Harmen afc5873b0e don't store empty values in the URL 2016-07-25 21:08:05 +02:00
Julius Volz 3bfec97d46 Make the storage interface higher-level.
See discussion in
https://groups.google.com/forum/#!topic/prometheus-developers/bkuGbVlvQ9g

The main idea is that the user of a storage shouldn't have to deal with
fingerprints anymore, and should not need to do an individual preload
call for each metric. The storage interface needs to be made more
high-level to not expose these details.

This also makes it easier to reuse the same storage interface for remote
storages later, as fewer roundtrips are required and the fingerprint
concept doesn't work well across the network.

NOTE: this deliberately gets rid of a small optimization in the old
query Analyzer, where we dedupe instants and ranges for the same series.
This should have a minor impact, as most queries do not have multiple
selectors loading the same series (and at the same offset).
2016-07-25 13:59:22 +02:00
Björn Rabenstein e980913cd6 Merge pull request #1840 from zoidbergwill/patch-1
Fix missing roles in prometheus kubernetes example
2016-07-21 16:13:30 +02:00
William Stewart f97cd29e47 Drop '__meta_kubernetes_role' since we have role in the config 2016-07-21 15:46:14 +02:00
William Stewart 599fafd2aa Add node job 2016-07-21 15:45:42 +02:00
Björn Rabenstein cc86d3fb0c Merge pull request #1842 from prometheus/beorn7/release
Merge release-1.0 into master
2016-07-21 15:01:52 +02:00
beorn7 1bb077b5ef Merge branch 'release-1.0' into beorn7/release 2016-07-21 14:55:05 +02:00
Björn Rabenstein be4019065c Merge pull request #1841 from prometheus/beorn7/release
Cut release 1.0.1
2016-07-21 14:51:05 +02:00
beorn7 4ff4857112 Recreate assets 2016-07-21 14:11:09 +02:00
beorn7 7e75bb2101 Cut v1.0.1 2016-07-21 14:08:31 +02:00
Dave Rawks 40b9666479 Error on non-flag commandline arguments
- Added minor cmdline parsing logic change to bail on
  unconsumed arguments. Fixes #1821
2016-07-21 14:01:19 +02:00
Brian Brazil 56151e57ba Update example console templates to new HTTP API.
Fixes #1819
2016-07-21 14:01:09 +02:00
William Martin Stewart 58a3771e49 Add roles to prometheus kubernetes example
Needed with Prometheus 1.0
2016-07-21 13:16:23 +02:00
Brian Brazil c3a7941da7 Merge pull request #1799 from prometheus/quantile
Implement quantile and quantile_over_time
2016-07-21 10:34:27 +01:00
Brian Brazil 0303ccc6a7 Add quantile aggregator. 2016-07-21 00:09:19 +01:00
Brian Brazil 15f9fe0a45 Factor out quantile function. 2016-07-20 23:56:18 +01:00
Brian Brazil b0342ba9ec Add quantile_over_time function 2016-07-20 23:56:18 +01:00
Julius Volz 08891beb5f Merge pull request #1828 from drawks/iss-1821
Error on non-flag commandline arguments
2016-07-21 00:35:53 +02:00
Björn Rabenstein 12709af249 Merge pull request #1838 from prometheus/release-1.0
Explicitly add logging flags to our custom flag set
2016-07-21 00:33:12 +02:00
Dave Rawks 00ea36cdbe Error on non-flag commandline arguments
- Added minor cmdline parsing logic change to bail on
  unconsumed arguments. Fixes #1821
2016-07-20 10:28:26 -07:00
Björn Rabenstein 5fab430e73 Merge pull request #1774 from prometheus/beorn7/index
storage: improve index lookups
2016-07-20 17:38:04 +02:00
beorn7 fc6737b7fb storage: improve index lookups
tl;dr: This is not a fundamental solution to the indexing problem
(like tindex is) but it at least avoids running into the intersection
problem as far as possible.

In more detail:

Imagine the following query:

    nicely:aggregating:rule{job="foo",env="prod"}

While it uses a nicely aggregating recording rule (which might have a
very low cardinality), Prometheus still intersects the low number of
fingerprints for `{__name__="nicely:aggregating:rule"}` with the many
thousands of fingerprints matching `{job="foo"}` and with the millions
of fingerprints matching `{env="prod"}`. This totally innocuous query
is dead slow if the Prometheus server has a lot of time series with
the `{env="prod"}` label. Ironically, if you make the query more
complicated, it becomes blazingly fast:

    nicely:aggregating:rule{job=~"foo",env=~"prod"}

Why so? Because Prometheus only intersects with non-Equal matchers if
there are no Equal matchers. That's good in this case because it
retrieves the few fingerprints for
`{__name__="nicely:aggregating:rule"}` and then goes straight on to
retrieve the metrics for those FPs and check individually whether they
match the other matchers.

This change generalizes the idea of when to stop intersecting FPs
and go into "retrieve metrics and check them individually against
remaining matchers" mode:

- First, sort all matchers by "expected cardinality". Matchers
  matching the empty string are always worst (and never used for
  intersections). Equal matchers are in general considered best, but by
  using some crude heuristics, we declare some better than others
  (instance labels or anything that looks like a recording rule).

- Then go through the matchers until we hit a threshold of remaining
  FPs in the intersection. This threshold is higher if we are already
  in the non-Equal matcher area as intersection is even more expensive
  here.

- Once the threshold has been reached (or we have run out of matchers
  that do not match the empty string), start with "retrieve metrics
  and check them individually against remaining matchers".
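
The three steps above can be illustrated with a small sketch. This is not
the code from this change (which is Go, inside the storage layer); it is a
toy Python model under stated assumptions: `Matcher`, `rank`, `lookup`,
`select`, and both threshold values are hypothetical names and numbers
chosen for illustration, and the "index" is just a dict from label pairs
to fingerprint sets.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Matcher:
    """Hypothetical label matcher: kind "=" is Equal, "=~" is a regex match."""
    name: str
    value: str
    kind: str = "="

    def matches(self, v: str) -> bool:
        if self.kind == "=":
            return v == self.value
        return re.fullmatch(self.value, v) is not None

    def matches_empty(self) -> bool:
        return self.matches("")


def rank(m: Matcher) -> int:
    """Crude 'expected cardinality' ordering (lower is better). Assumed heuristics."""
    if m.matches_empty():
        return 3  # always worst, never used for intersections
    if m.kind != "=":
        return 2  # non-Equal matchers are more expensive to intersect
    if m.name == "instance" or (m.name == "__name__" and ":" in m.value):
        return 0  # instance labels, or names that look like recording rules
    return 1


def lookup(index, m: Matcher) -> set:
    """Fingerprints whose label m.name has a value accepted by m."""
    if m.kind == "=":
        return set(index.get((m.name, m.value), set()))
    fps = set()
    for (name, value), s in index.items():
        if name == m.name and m.matches(value):
            fps |= s
    return fps


def select(series, matchers, eq_threshold=2, non_eq_threshold=4) -> set:
    """series maps fingerprint -> label dict. Thresholds are made-up numbers."""
    index = {}
    for fp, labels in series.items():
        for pair in labels.items():
            index.setdefault(pair, set()).add(fp)

    ordered = sorted(matchers, key=rank)
    remaining = list(ordered)
    candidates = None
    for m in ordered:
        if m.matches_empty():
            break  # only empty-matching matchers are left; stop intersecting
        threshold = eq_threshold if m.kind == "=" else non_eq_threshold
        if candidates is not None and len(candidates) <= threshold:
            break  # few enough FPs left; checking them one by one is cheaper
        fps = lookup(index, m)
        candidates = fps if candidates is None else candidates & fps
        remaining.remove(m)

    if candidates is None:
        candidates = set(series)
    # "Retrieve metrics and check them individually against remaining matchers".
    return {
        fp for fp in candidates
        if all(m.matches(series[fp].get(m.name, "")) for m in remaining)
    }
```

With a query like the one above, the `__name__` matcher is ranked first, its
few fingerprints already fall under the threshold, and the `job` and `env`
matchers are only checked against those few series instead of being
intersected through the index.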

A beefy server at SoundCloud was spending 67% of its CPU time in index
lookups (fingerprintsForLabelPairs), serving mostly a dashboard that
is exclusively built with recording rules. With this change, it spends
only 35% in fingerprintsForLabelPairs. The CPU usage dropped from 26
cores to 18 cores. The median latency for query_range dropped from 14s
to 50ms(!). As expected, higher percentile latency didn't improve that
much because the new approach is _occasionally_ running into the worst
case while the old one was _systematically_ doing so. The 99th
percentile latency is now about as high as the median before (14s)
while it was almost twice as high before (26s).
2016-07-20 17:35:53 +02:00