prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2025-03-05 20:59:13 -08:00

History

Lukasz Mierzwa 565c6fa704 Reduce stringlabels memory usage for common labels stringlabels stores all time series labels as a single string using this format: <length><name><length><value>[<length><name><length><value> ...] So a label set for my_metric{job=foo, instance="bar", env="prod", blank=""} would be encoded as: [8]__name__[9]my_metric[3]job[3]foo[8]instance[3]bar[3]env[4]prod[5]blank[0] This is a huge improvement over 'classic' labels implementation that stores all label names & values as seperate strings. There is some room for improvement though since some string are present more often than others. For example __name__ will be present for all label sets of every time series we store in HEAD, eating 1+8=9 bytes. Since __name__ is well known string we can try to use a single byte to store it in our encoded string, rather than repeat it in full each time. To be able to store strings that are short cut into a single byte we need to somehow signal that to the reader of the encoded string, for that we use the fact that zero length strings are rare and generaly not stored on time series. If we have an encoded string with zero length then this will now signal that it represents a mapped value - to learn the true value of this string we need to read the next byte which gives us index in a static mapping. That mapping must include empty string, so that we can still encode empty strings using this scheme. Example of our mapping (minimal version): 0: "" 1: "__name__" 2: "instance" 3: "job" With that mapping our example label set would be encoded as: [0]1[9]mymetric[0]3[3]foo[0]2[3]bar[3]env[4]prod[5]blank[0]0 The tricky bit is how to populate this mapping with useful strings that will result in measurable memory savings. This is further complicated by the fact that the mapping must remain static and cannot be modified during Prometheus lifetime. We can use all the 255 slots we have inside our mapping byte with well known generic strings and that will provide some measurable savings for all Prometheus users, and is essentially a slightly more compact stringlabels variant. We could also allow users to pass in a list of well know strings via flags, which will allow Prometheus operators to reduce memory usage for any labels if they know those are popular. Third option is to discover most popular strings from TSDB or WAL on startup, but that's more complicated and we might pick a list that would be the best set of mapped strings on startup, but after some time is no longer the best set. Benchmark results: goos: linux goarch: amd64 pkg: github.com/prometheus/prometheus/model/labels cpu: 13th Gen Intel(R) Core(TM) i7-13800H │ main.txt │ new1.txt │ │ sec/op │ sec/op vs base │ String-20 863.8n ± 4% 873.0n ± 4% ~ (p=0.353 n=10) Labels_Get/with_5_labels/first_label/get-20 4.763n ± 1% 5.035n ± 0% +5.72% (p=0.000 n=10) Labels_Get/with_5_labels/first_label/has-20 3.439n ± 0% 3.967n ± 0% +15.37% (p=0.000 n=10) Labels_Get/with_5_labels/middle_label/get-20 7.077n ± 1% 9.588n ± 1% +35.47% (p=0.000 n=10) Labels_Get/with_5_labels/middle_label/has-20 5.166n ± 0% 6.990n ± 1% +35.30% (p=0.000 n=10) Labels_Get/with_5_labels/last_label/get-20 9.181n ± 1% 12.970n ± 1% +41.26% (p=0.000 n=10) Labels_Get/with_5_labels/last_label/has-20 8.101n ± 1% 11.640n ± 1% +43.69% (p=0.000 n=10) Labels_Get/with_5_labels/not-found_label/get-20 3.974n ± 0% 4.768n ± 0% +19.98% (p=0.000 n=10) Labels_Get/with_5_labels/not-found_label/has-20 3.974n ± 0% 5.033n ± 0% +26.65% (p=0.000 n=10) Labels_Get/with_10_labels/first_label/get-20 4.761n ± 0% 5.042n ± 0% +5.90% (p=0.000 n=10) Labels_Get/with_10_labels/first_label/has-20 3.442n ± 0% 3.972n ± 0% +15.40% (p=0.000 n=10) Labels_Get/with_10_labels/middle_label/get-20 10.62n ± 1% 14.85n ± 1% +39.83% (p=0.000 n=10) Labels_Get/with_10_labels/middle_label/has-20 9.360n ± 1% 13.375n ± 0% +42.90% (p=0.000 n=10) Labels_Get/with_10_labels/last_label/get-20 18.19n ± 1% 22.00n ± 0% +20.97% (p=0.000 n=10) Labels_Get/with_10_labels/last_label/has-20 16.51n ± 0% 20.50n ± 1% +24.14% (p=0.000 n=10) Labels_Get/with_10_labels/not-found_label/get-20 3.985n ± 0% 4.768n ± 0% +19.62% (p=0.000 n=10) Labels_Get/with_10_labels/not-found_label/has-20 3.973n ± 0% 5.045n ± 0% +26.97% (p=0.000 n=10) Labels_Get/with_30_labels/first_label/get-20 4.773n ± 0% 5.050n ± 1% +5.80% (p=0.000 n=10) Labels_Get/with_30_labels/first_label/has-20 3.443n ± 1% 3.976n ± 2% +15.50% (p=0.000 n=10) Labels_Get/with_30_labels/middle_label/get-20 31.93n ± 0% 43.50n ± 1% +36.21% (p=0.000 n=10) Labels_Get/with_30_labels/middle_label/has-20 30.53n ± 0% 41.75n ± 1% +36.75% (p=0.000 n=10) Labels_Get/with_30_labels/last_label/get-20 106.55n ± 0% 71.17n ± 0% -33.21% (p=0.000 n=10) Labels_Get/with_30_labels/last_label/has-20 104.70n ± 0% 69.21n ± 1% -33.90% (p=0.000 n=10) Labels_Get/with_30_labels/not-found_label/get-20 3.976n ± 1% 4.772n ± 0% +20.03% (p=0.000 n=10) Labels_Get/with_30_labels/not-found_label/has-20 3.974n ± 0% 5.032n ± 0% +26.64% (p=0.000 n=10) Labels_Equals/equal-20 2.382n ± 0% 2.446n ± 0% +2.67% (p=0.000 n=10) Labels_Equals/not_equal-20 0.2741n ± 2% 0.2662n ± 2% -2.88% (p=0.001 n=10) Labels_Equals/different_sizes-20 0.2762n ± 3% 0.2652n ± 0% -3.95% (p=0.000 n=10) Labels_Equals/lots-20 2.381n ± 0% 2.386n ± 1% +0.23% (p=0.011 n=10) Labels_Equals/real_long_equal-20 6.087n ± 1% 5.558n ± 1% -8.70% (p=0.000 n=10) Labels_Equals/real_long_different_end-20 5.030n ± 0% 4.699n ± 0% -6.57% (p=0.000 n=10) Labels_Compare/equal-20 4.814n ± 1% 4.777n ± 0% -0.77% (p=0.000 n=10) Labels_Compare/not_equal-20 17.55n ± 8% 20.92n ± 1% +19.24% (p=0.000 n=10) Labels_Compare/different_sizes-20 3.711n ± 1% 3.707n ± 0% ~ (p=0.224 n=10) Labels_Compare/lots-20 27.09n ± 3% 28.73n ± 2% +6.05% (p=0.000 n=10) Labels_Compare/real_long_equal-20 27.91n ± 3% 15.67n ± 1% -43.86% (p=0.000 n=10) Labels_Compare/real_long_different_end-20 33.92n ± 1% 35.35n ± 1% +4.22% (p=0.000 n=10) Labels_Hash/typical_labels_under_1KB-20 59.63n ± 0% 59.67n ± 0% ~ (p=0.897 n=10) Labels_Hash/bigger_labels_over_1KB-20 73.42n ± 1% 73.81n ± 1% ~ (p=0.342 n=10) Labels_Hash/extremely_large_label_value_10MB-20 720.3µ ± 2% 715.2µ ± 3% ~ (p=0.971 n=10) Builder-20 371.6n ± 4% 1191.0n ± 3% +220.46% (p=0.000 n=10) Labels_Copy-20 85.52n ± 4% 53.90n ± 48% -36.97% (p=0.000 n=10) geomean 13.26n 14.68n +10.71% │ main.txt │ new1.txt │ │ B/op │ B/op vs base │ String-20 240.0 ± 0% 240.0 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/first_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/first_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/middle_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/middle_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/last_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/last_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/not-found_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/not-found_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/first_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/first_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/middle_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/middle_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/last_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/last_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/not-found_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/not-found_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/first_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/first_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/middle_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/middle_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/last_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/last_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/not-found_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/not-found_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/not_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/different_sizes-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/lots-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/real_long_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/real_long_different_end-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/not_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/different_sizes-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/lots-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/real_long_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/real_long_different_end-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Hash/typical_labels_under_1KB-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Hash/bigger_labels_over_1KB-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Hash/extremely_large_label_value_10MB-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Builder-20 224.0 ± 0% 192.0 ± 0% -14.29% (p=0.000 n=10) Labels_Copy-20 224.0 ± 0% 192.0 ± 0% -14.29% (p=0.000 n=10) geomean ² -0.73% ² ¹ all samples are equal ² summaries must be >0 to compute geomean │ main.txt │ new1.txt │ │ allocs/op │ allocs/op vs base │ String-20 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/first_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/first_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/middle_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/middle_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/last_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/last_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/not-found_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_5_labels/not-found_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/first_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/first_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/middle_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/middle_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/last_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/last_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/not-found_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_10_labels/not-found_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/first_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/first_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/middle_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/middle_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/last_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/last_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/not-found_label/get-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Get/with_30_labels/not-found_label/has-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/not_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/different_sizes-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/lots-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/real_long_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Equals/real_long_different_end-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/not_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/different_sizes-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/lots-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/real_long_equal-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Compare/real_long_different_end-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Hash/typical_labels_under_1KB-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Hash/bigger_labels_over_1KB-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Hash/extremely_large_label_value_10MB-20 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Builder-20 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=10) ¹ Labels_Copy-20 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=10) ¹ geomean ² +0.00% ² ¹ all samples are equal ² summaries must be >0 to compute geomean Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>		2025-02-20 12:06:33 +00:00
..
exemplar	update links to openmetrics to reference the v1.0.0 release	2024-12-13 21:32:27 +00:00
histogram	fix TestCuttingNewHeadChunks/really_large_histograms on 32-bit	2024-12-16 10:45:01 -05:00
labels	Reduce stringlabels memory usage for common labels	2025-02-20 12:06:33 +00:00
metadata	Fix: metadata API using wrong field names (#13633 )	2024-02-26 09:53:39 +00:00
relabel	Addressed comments.	2025-01-27 09:54:13 +00:00
rulefmt	rulefmt: support YAML aliases for Alert/Record/Expr (#14957 )	2025-02-13 20:48:33 +11:00
textparse	textparse: Optimized protobuf parser with custom streaming unmarshal. (#15731 )	2025-02-13 10:38:35 +00:00
timestamp	Move packages out of deprecated pkg directory	2021-11-09 08:03:10 +01:00
value	Move packages out of deprecated pkg directory	2021-11-09 08:03:10 +01:00