prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2024-11-17 11:04:05 -08:00

Author	SHA1	Message	Date
Bjoern Rabenstein	c3b282bd14	Add regression tests for 'loop until op is consumed' bug. - Most of this is the actual regression test in tiered_test.go. - Working on that regression test uncovered problems in tiered_test.go that are fixed in this commit. - The 'op.consumed = false' line added to freelist.go was actually not fixing a bug. Instead, there was no bug at all. So this commit removes that line again, but adds a regression test to make sure that the assumed bug is indeed not there (cf. freelist_test.go). - Removed more code duplication in operation.go (following the same approach as before, i.e. embedding op type A into op type B if everything in A is the same as in B with the exception of String() and ExtractSample()). (This change makes struct literals for ops more clunky, but that only affects tests. No code change whatsoever was necessary in the actual code after this refactoring.) - Fix another op leak in tiered.go. Change-Id: Ia165c52e33290ad4f6aba9c83d92318d4f583517	2014-03-12 18:40:24 +01:00
Björn Rabenstein	b470fb0672	Merge "Convert metric.Values to slice of values."	2014-03-11 18:44:04 +01:00
Björn Rabenstein	b7ba349ca8	Merge "Introduce semantic versioning."	2014-03-11 18:34:21 +01:00
Julius Volz	86fc13a52e	Convert metric.Values to slice of values. The initial impetus for this was that it made unmarshalling sample values much faster. Other relevant benchmark changes in ns/op (benchmark: old, new, speedup): BenchmarkMarshal: 179170, 127996, 1.4x; BenchmarkUnmarshal: 404984, 132186, 3.1x; BenchmarkMemoryGetValueAtTime: 57801, 50050, 1.2x; BenchmarkMemoryGetBoundaryValues: 64496, 53194, 1.2x; BenchmarkMemoryGetRangeValues: 66585, 54065, 1.2x; BenchmarkStreamAdd: 45.0, 75.3, 0.6x; BenchmarkAppendSample1: 1157, 1587, 0.7x; BenchmarkAppendSample10: 4090, 4284, 0.95x; BenchmarkAppendSample100: 45660, 44066, 1.0x; BenchmarkAppendSample1000: 579084, 582380, 1.0x; BenchmarkMemoryAppendRepeatingValues: 22796594, 22005502, 1.0x. Overall, this gives us good speedups in the areas where they matter most: decoding values from disk and accessing the memory storage (which is also used for views). Some of the smaller append examples take minimally longer, but the cost seems to get amortized over larger appends, so I'm not worried about these. Also, we're currently not bottlenecked on the write path and have plenty of other optimizations available in that area if it becomes necessary. Memory allocations during appends don't change measurably at all. Change-Id: I7dc7394edea09506976765551f35b138518db9e8	2014-03-11 18:23:37 +01:00
Julius Volz	44390d831d	Introduce semantic versioning. This introduces semantic versioning (http://semver.org/) in Prometheus: - A new VERSION file contains the semantic version string. - The "tarball" target now includes versioning and build information in the tarball name, like: "prometheus-0.1.0.linux-amd64.tar.gz". - A new "release" target allows scp-ing the versioned tarball to a remote machine (file server). - A new "tag" target allows git-tagging the current revision with the version specified in VERSION. Change-Id: I1f19f38b9b317bfa9eb513754750df5a9c602d94	2014-03-11 15:39:22 +01:00
Julius Volz	a7d0973fe3	Add version field to LevelDB sample format. This doesn't add complex discriminator logic yet, but adds a single version byte to the beginning of each samples chunk. If we ever need to change the disk format again, this will make it easy to do so without having to wipe the entire database. Change-Id: I60c39274256f790bc2da83167a1effaa174588fe	2014-03-11 14:08:40 +01:00
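The version byte described in a7d0973fe3 can be illustrated with a minimal sketch. The names formatVersion, encodeChunk, and decodeChunk are hypothetical and are not taken from the Prometheus source; the sketch only shows the general idea of prefixing each chunk with a single byte so that a reader can reject (or later dispatch on) unknown formats.

```go
package main

import (
	"errors"
	"fmt"
)

// formatVersion is a hypothetical version number for the on-disk layout.
const formatVersion byte = 1

// encodeChunk prepends the version byte to an already-encoded payload.
func encodeChunk(payload []byte) []byte {
	buf := make([]byte, 0, len(payload)+1)
	buf = append(buf, formatVersion)
	return append(buf, payload...)
}

// decodeChunk checks the leading version byte and returns the payload.
// A future format change could switch on buf[0] here instead of failing.
func decodeChunk(buf []byte) ([]byte, error) {
	if len(buf) == 0 {
		return nil, errors.New("empty chunk")
	}
	if buf[0] != formatVersion {
		return nil, fmt.Errorf("unsupported sample format version %d", buf[0])
	}
	return buf[1:], nil
}

func main() {
	chunk := encodeChunk([]byte("samples"))
	payload, err := decodeChunk(chunk)
	fmt.Println(string(payload), err)

	_, err = decodeChunk([]byte{2, 'x'})
	fmt.Println(err)
}
```

With this layout, a later format only needs a new version number and a new branch in the decoder, which is the property the commit message is after.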
Julius Volz	cb9fa1ba93	Merge "Store samples in custom binary encoding."	2014-03-11 13:19:46 +01:00
Julius Volz	84df022025	Cleanup server address handling, support IPv6. This fixes https://github.com/prometheus/prometheus/issues/377, as IPv6 server addresses are now handled correctly. Change-Id: Iebde7cfdadb0a52041472517e6fdcff4303a25ab	2014-03-09 23:31:30 +01:00
Julius Volz	1eee448bc1	Store samples in custom binary encoding. This has been shown to provide immense decoding speed benefits. See also: https://groups.google.com/forum/#!topic/prometheus-developers/FeGl_qzGrYs Change-Id: I7d45b4650e44ddecaa91dad9d7fdb3cd0b9f15fe	2014-03-09 22:31:38 +01:00
Julius Volz	c2a2a20f36	Remove obsolete scanjobs timer. Change-Id: Ifb29b4d93c9c1c6cacb8b098d5237866925c9fac	2014-03-07 17:10:28 +01:00
Julius Volz	dd4892dcad	Ensure no ops are leaked in renderView(). Change-Id: I6970a9098be305fcd010d46443b040d864d9740a	2014-03-07 14:33:13 +01:00
Julius Volz	5745ce0a60	Fixups for single-op-per-fingerprint view rendering. Change-Id: Ie496d4529b65a3819c6042f43d7cf99e0e1ac60b	2014-03-07 00:54:28 +01:00
Björn Rabenstein	8b43497002	Merge "Fix memory series indexing bug."	2014-03-06 11:53:10 +01:00
Björn Rabenstein	0bb33b6525	Merge "Remove unused labelname -> fingerprints index."	2014-03-06 11:40:09 +01:00
Julius Volz	d6827b6898	Fix memory series indexing bug. This fixes https://github.com/prometheus/prometheus/issues/381. For any stale series we dropped from memory, this bug caused us to also drop any other series from the labelpair->fingerprints memory index if they had any label/value-pairs in common with the intentionally dropped series. To fix this issue more easily, I converted the labelpair->fingerprints index map values to a utility.Set of clientmodel.Fingerprints. This makes handling this index much easier in general. Change-Id: If5e81e202e8c542261bbd9797aa1257376c5c074	2014-03-06 01:23:22 +01:00
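The fix in d6827b6898 can be sketched as follows. This is an illustration under assumptions, not the actual code: labelPair, fingerprintSet, and index are hypothetical stand-ins (the commit itself uses utility.Set of clientmodel.Fingerprints). The point is that dropping one series removes only its own fingerprint from each set, so other series sharing a label/value pair stay indexed.

```go
package main

import "fmt"

// labelPair and fingerprintSet are hypothetical stand-ins for the real types.
type labelPair struct{ name, value string }

type fingerprintSet map[uint64]struct{}

// index maps each label/value pair to the set of series that carry it.
type index map[labelPair]fingerprintSet

// add registers a fingerprint under every one of its label pairs.
func (ix index) add(fp uint64, pairs ...labelPair) {
	for _, p := range pairs {
		if ix[p] == nil {
			ix[p] = fingerprintSet{}
		}
		ix[p][fp] = struct{}{}
	}
}

// drop removes only the given fingerprint; a pair's entry is deleted
// only once no other series references it.
func (ix index) drop(fp uint64, pairs ...labelPair) {
	for _, p := range pairs {
		set, ok := ix[p]
		if !ok {
			continue
		}
		delete(set, fp)
		if len(set) == 0 {
			delete(ix, p)
		}
	}
}

func main() {
	job := labelPair{"job", "api"}
	a := labelPair{"instance", "a"}
	b := labelPair{"instance", "b"}

	ix := index{}
	ix.add(1, job, a)
	ix.add(2, job, b)

	ix.drop(1, job, a) // series 1 went stale

	fmt.Println(len(ix[job])) // series 2 is still indexed under job=api
}
```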
Julius Volz	c6013ff309	Remove unused labelname -> fingerprints index. Change-Id: Ie4ccea3a230532e670030ca64ede9435b1b3e506	2014-03-05 23:49:33 +01:00
Bjoern Rabenstein	9ea9189dd1	Remove the multi-op-per-fingerprint capability. Currently, rendering a view is capable of handling multiple ops for the same fingerprint efficiently. However, this capability requires a lot of complexity in the code, which we are not using at all because the way we assemble a viewRequest will never have more than one operation per fingerprint. This commit weeds out the said capability, along with all the code needed for it. It is still possible to have more than one operation for the same fingerprint, it will just be handled in a less efficient way (as proven by the unit tests). As a result, scanjob.go could be removed entirely. This commit also contains a few related refactorings and removals of dead code in operation.go, view.go, and freelist.go. Also, the docstrings received some love. Change-Id: I032b976e0880151c3f3fdb3234fb65e484f0e2e5	2014-03-04 16:29:56 +01:00
Julius Volz	817d9b0e97	"go fmt" fixup. Change-Id: I262bb462281bc2610819c822fc7a0768c6ce3d8d	2014-02-27 19:48:55 +01:00
Julius Volz	e8d963d630	Remove some dead code. Change-Id: Ie148b506eb6037a0f44d976a2b53e4bf7e64e212	2014-02-27 19:48:55 +01:00
Bjoern Rabenstein	7f078efe5e	Merge "Unify LevelDB.*Options."	2014-02-27 18:21:02 +01:00
Julius Volz	24a0ccdd54	Merge "Major code cleanup in storage."	2014-02-27 18:04:05 +01:00
Bjoern Rabenstein	e11e8c7a23	Unify LevelDB.*Options. We have seven different types all called like LevelDB.*Options. One of them is the plain LevelDBOptions. All others are just wrapping that type without adding anything except clunkier handling. If there ever was a plan to add more specific options to the various LevelDB.*Options types, history has proven that nothing like that is going to happen anytime soon. To keep the code a bit shorter and more focused on the real (quite significant) complexities we have to deal with here, this commit reduces all uses of LevelDB.*Options to the actual LevelDBOptions type. 1576 fewer characters to read... Change-Id: I3d7a2b7ffed78b337aa37f812c53c058329ecaa6	2014-02-27 16:03:58 +01:00
Bjoern Rabenstein	6bc083f38b	Major code cleanup in storage. - Mostly docstring fixes/additions. (Please review these carefully, since most of them were missing, I had to guess them from an outsider's perspective. (Which on the other hand proves how desperately required many of these docstrings are.)) - Removed all uses of new(...) to meet our own style guide (draft). - Fixed all other 'go vet' and 'golint' issues (except those that are not fixable (i.e. caused by bugs in or by design of 'go vet' and 'golint')). - Some trivial refactorings, like reorder functions, minor renames, ... - Some slightly less trivial refactoring, mostly to reduce code duplication by embedding types instead of writing many explicit forwarders. - Cleaned up the interface structure a bit. (Most significant probably the removal of the View-like methods from MetricPersistence. Now they are only in View and not duplicated anymore.) - Removed dead code. (Probably not all of it, but it's a first step...) - Fixed a leftover in storage/metric/end_to_end_test.go (that made some parts of the code never execute (incidentally, those parts were broken (and I fixed them, too))). Change-Id: Ibcac069940d118a88f783314f5b4595dce6641d5	2014-02-27 15:22:37 +01:00
Julius Volz	c35db9f080	Add Java implementation links to metric model docs. Change-Id: If2a36aa305a0806ffdf490f78e85afb71da8c202	2014-02-27 15:08:44 +01:00
Julius Volz	688f4f43c3	Minor metric type documentation fixups. Change-Id: Ib01ab728e9f0a6b15c23f1cde84161efe9f89e33	2014-02-25 15:29:04 +01:00
Julius Volz	91aebda74d	Merge "Interim commit of metric model."	2014-02-25 15:24:17 +01:00
Julius Volz	bc6ee6611e	Rename persistence_adapter.go -> view_adapter.go Change-Id: Ib45081393b734531d2f85a02f46e87930aab3273	2014-02-22 22:43:11 +01:00
Julius Volz	3f226c9724	Rename {Scalar,Vector}Literal to {Scalar,Vector}Selector. Change-Id: Ie92301f47f5f49f30b3a62c365e377108982b080	2014-02-22 22:33:42 +01:00
Julius Volz	a8d4a7ce48	Merge "Compact everything to the same sample group size."	2014-02-19 17:28:26 +01:00
Julius Volz	2279fcbac4	Compact everything to the same sample group size. Change-Id: Ibb4f3a5d76173d64de916ef1eb41ab5d7900c97b	2014-02-19 16:22:20 +01:00
Julius Volz	92ea823e0c	Fix alertmanager API path. Change-Id: Iea6059decb121c7e75c1828406c4e0b3f2fc1c5d	2014-02-19 16:05:54 +01:00
Bjoern Rabenstein	682cf6fc51	Simplify QueryAnalizer.Visit(). Change-Id: I628582a1903b7273e78921e22a475f1dae5ebaae	2014-02-14 15:15:57 +01:00
Bjoern Rabenstein	fd63500ed3	Make rules/ast golint clean. Mostly, that means adding compliant doc strings to exported items. Also, remove 'go vet' warnings where possible. (Some are unfortunately unavoidable, arguably bugs in 'go vet'.) Change-Id: I2827b6dd317492864c1383c3de1ea9eac5a219bb	2014-02-14 15:01:39 +01:00
Johannes 'fish' Ziemke	5e8026779f	Make Dockerfile build prometheus in container. This way the binary will be built in a clean environment and prometheus can be added to the docker index. Change-Id: I417fb90adf2503c990a96f4bad370b09b102e0b9	2014-02-14 11:47:47 +01:00
Björn Rabenstein	59febe771a	Merge "Minor code cleanups."	2014-02-13 15:29:16 +01:00
Julius Volz	c4adfc4f25	Minor code cleanups. Change-Id: Ib3729cf38b107b7f2186ccf410a745e0472e3630	2014-02-13 15:24:43 +01:00
Julius Volz	67ccf7b8e7	Merge "Add -O3 to all C/C++ compiles."	2014-02-13 12:46:36 +01:00
Bjoern Rabenstein	1f90abdc1f	Add -O3 to all C/C++ compiles. So far, we are compiling C/C++ code without any optimization. In non-representative, but practically relevant tests, the -O3 improved the total query time for a demanding graph by ~20%. Change-Id: I5e8123650e53a4933ed4fbe63d0b1ca67217b865	2014-02-13 12:37:43 +01:00
Julius Volz	8cadae6102	Merge "Fix LevelDB closing order."	2014-02-03 23:22:30 +01:00
Julius Volz	94666e20b7	Minor test error reporting cleanup. Change-Id: Ie11c16b4e60de7c179c6d2a86e063f4432e2000f	2014-02-03 12:27:01 +01:00
Julius Volz	fd2158e746	Store copy of metric during fingerprint caching. Problem description: If a rule evaluation referencing a metric/timeseries M happens at a time when M doesn't have a memory timeseries yet, looking up the fingerprint for M (via TieredStorage.GetMetricForFingerprint()) will create a new Metric object for M which gets both: a) attached to a new empty memory timeseries (so we don't have to ask disk for the Metric's fingerprint next time), and b) returned to the rule evaluation layer. However, the rule evaluation layer replaces the name label (and possibly other labels) of the metric with the name of the recorded rule. Since both the rule evaluator and the memory storage share a reference to the same Metric object, the original memory timeseries will now also be incorrectly renamed. Fix: Instead of storing a reference to a shared metric object, take a copy of the object when creating an empty memory timeseries for caching purposes. Change-Id: I9f2172696c16c10b377e6708553a46ef29390f1e	2014-02-02 17:11:08 +01:00
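The aliasing problem in fd2158e746 is easy to reproduce in a few lines of Go, because maps are reference types. The sketch below is an assumption-laden illustration: Metric, cache, storeShared, and storeCopy are hypothetical names, and the label names are made up; only the shared-reference versus copy behavior reflects the commit message.

```go
package main

import "fmt"

// Metric is a hypothetical label set. Assigning a map copies the
// reference, not the contents.
type Metric map[string]string

// clone returns an independent copy of the label set.
func (m Metric) clone() Metric {
	c := make(Metric, len(m))
	for k, v := range m {
		c[k] = v
	}
	return c
}

// cache stands in for the in-memory fingerprint -> metric cache.
type cache map[uint64]Metric

// storeShared keeps a reference to the caller's map (the buggy behavior).
func (c cache) storeShared(fp uint64, m Metric) { c[fp] = m }

// storeCopy keeps an independent copy (the fixed behavior).
func (c cache) storeCopy(fp uint64, m Metric) { c[fp] = m.clone() }

func main() {
	shared, copied := cache{}, cache{}
	m := Metric{"name": "requests_total"}

	shared.storeShared(1, m)
	copied.storeCopy(1, m)

	// The rule evaluator renames the metric it was handed.
	m["name"] = "job:requests:rate"

	fmt.Println(shared[1]["name"]) // job:requests:rate (cache was renamed too)
	fmt.Println(copied[1]["name"]) // requests_total
}
```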
Julius Volz	7e9ecaac3a	Add count_scalar() function. Change-Id: I63f09dd0479d0a6b016f5f857dd39dcbda56c7f9	2014-01-30 13:07:26 +01:00
Julius Volz	718ad2224b	Fix LevelDB closing order. The storage itself should be closed before any of the objects passed into it are closed (otherwise closing the storage can randomly freeze). Defers are executed in reverse order, so closing the storage should be the last of the defer statements. Change-Id: Id920318b876f5b94767ed48c81221b3456770620	2014-01-28 15:16:06 +01:00
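The defer ordering that 718ad2224b relies on can be shown with a small sketch. The closer type and the names "storage" and "dependency" are hypothetical; the only fact used is that Go runs deferred calls in last-in, first-out order, so the defer registered last runs first.

```go
package main

import "fmt"

// closer is a hypothetical stand-in for the storage and the objects
// passed into it; Close records the order in which it was called.
type closer struct {
	name string
	log  *[]string
}

func (c closer) Close() {
	*c.log = append(*c.log, c.name)
}

// run registers the defers as the commit message recommends and
// returns the order in which Close was actually invoked.
func run() []string {
	var order []string
	func() {
		dep := closer{"dependency", &order}
		defer dep.Close() // registered first, runs last

		storage := closer{"storage", &order}
		defer storage.Close() // registered last, runs first
	}()
	return order
}

func main() {
	fmt.Println(run()) // [storage dependency]
}
```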
Julius Volz	18d9d00100	Upgrade to Go 1.2. Change-Id: If8451257487edc4b76f4248f6e6b47c073dea183	2014-01-24 16:13:36 +01:00
Julius Volz	b382e8b7bd	Remove overly verbose DNS-SD logging line. Change-Id: Ie4534437ab88b9a6b99f5cb6c2f32c9588c1fff6	2014-01-24 16:09:41 +01:00
Julius Volz	0378c2ca1f	Nonexistent labels in BY-clauses shouldn't propagate to result. This fixes bug 2. of https://github.com/prometheus/prometheus/issues/374 Change-Id: Ia4a13153616bafce5bf10597966b071434422d09	2014-01-24 16:05:30 +01:00
Bjoern Rabenstein	c342ad33a0	Fix OperatorError. This used to work with Go 1.1, but only because of a compiler bug. The bug is fixed in Go 1.2, so we have to fix our code now. Change-Id: I5a9f3a15878afd750e848be33e90b05f3aa055e1	2014-01-21 16:49:51 +01:00
Julius Volz	d5ef0c64dc	Merge "Add optional sample replication to OpenTSDB."	2014-01-08 17:45:08 +01:00
Julius Volz	61d26e8445	Add optional sample replication to OpenTSDB. Prometheus needs long-term storage. Since we don't have enough resources to build our own timeseries storage from scratch on top of Riak, Cassandra or a similar distributed datastore at the moment, we're planning on using OpenTSDB as long-term storage for Prometheus. Its data model is roughly compatible with that of Prometheus, with some caveats. As a first step, this adds write-only replication from Prometheus to OpenTSDB, with the following things worth noting: 1) I tried to keep the integration lightweight, meaning that anything related to OpenTSDB is isolated to its own package and only main knows about it (essentially it tees all samples to both the existing storage and TSDB). It's not touching the existing TieredStorage at all to avoid more complexity in that area. This might change in the future, especially if we decide to implement a read path for OpenTSDB through Prometheus as well. 2) Backpressure while sending to OpenTSDB is handled by simply dropping samples on the floor when the in-memory queue of samples destined for OpenTSDB runs full. Prometheus also only attempts to send samples once, rather than implementing a complex retry algorithm. Thus, replication to OpenTSDB is best-effort for now. If needed, this may be extended in the future. 3) Samples are sent in batches of limited size to OpenTSDB. The optimal batch size, timeout parameters, etc. may need to be adjusted in the future. 4) OpenTSDB has different rules for legal characters in tag (label) values. While Prometheus allows any characters in label values, OpenTSDB limits them to a to z, A to Z, 0 to 9, -, _, . and /. Currently any illegal characters in Prometheus label values are simply replaced by an underscore. Especially when integrating OpenTSDB with the read path in Prometheus, we'll need to reconsider this: either we'll need to introduce the same limitations for Prometheus labels or escape/encode illegal characters in OpenTSDB in such a way that they are fully decodable again when reading through Prometheus, so that corresponding timeseries in both systems match in their labelsets. Change-Id: I8394c9c55dbac3946a0fa497f566d5e6e2d600b5	2014-01-02 18:21:38 +01:00
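Two of the points in 61d26e8445 lend themselves to a short sketch: dropping samples when the queue is full (point 2) and replacing illegal tag characters with an underscore (point 4). The function names enqueue and sanitizeTagValue are hypothetical and not taken from the Prometheus source; the character set is the one listed in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// enqueue tries a non-blocking send. When the buffered queue is full,
// the sample is dropped and false is returned (best-effort delivery).
func enqueue(queue chan<- string, sample string) bool {
	select {
	case queue <- sample:
		return true
	default:
		return false
	}
}

// sanitizeTagValue replaces every character outside a-z, A-Z, 0-9,
// '-', '_', '.' and '/' with an underscore.
func sanitizeTagValue(v string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
			return r
		case r == '-', r == '_', r == '.', r == '/':
			return r
		}
		return '_'
	}, v)
}

func main() {
	queue := make(chan string, 1)
	fmt.Println(enqueue(queue, "s1")) // true
	fmt.Println(enqueue(queue, "s2")) // false: queue is full, sample dropped

	fmt.Println(sanitizeTagValue("localhost:9090")) // localhost_9090

	// Distinct label values can collapse to the same tag value.
	fmt.Println(sanitizeTagValue("a b") == sanitizeTagValue("a:b")) // true
}
```

The last line shows why the commit message flags the replacement as something to reconsider for a read path: the mapping is lossy, so the original label value cannot be recovered from the tag value.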
Julius Volz	7b013e6491	Merge "Replace some uses of obsolete `/metrics.json` with `/metrics` (haven't touched test files yet)."	2013-12-18 16:56:30 +01:00

12515 commits