On a 32 bit architecture the size of int is 32 bits. Thus converting from
int64, uint64 can overflow it and flip the sign.
Try for yourself in playground:
package main
import "fmt"
func main() {
x := int64(0x1F0000001)
y := int64(1)
z := int32(x - y) // numerically this is 0x1F0000000
fmt.Printf("%v\n", z)
}
Prints -268435456 as if x was smaller.
Followup to #12650
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* Additionally wrap WBL replay error
Although WBL replay is already wrapped with errLoadWbl,
there are other errors that can happen during a WBL replay.
We should not try to repair WAL in those cases.
This commit additionally wraps the final error in Head.Init again
with errLoadWbl so that WBL replay errors can be identified properly.
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
Otherwise we have a highly unusual situation of over 100 chunks
in the headChunks list of each series, which heavily skews
performance.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Simlar to cleanup of WAL files on startup, cleanup temporary
chunk_snapshot dirs. This prevents storage space leaks due to terminated
snapshots on shutdown.
Signed-off-by: SuperQ <superq@gmail.com>
Currently memSeries holds a single head chunk in-memory and a slice of mmapped chunks.
When append() is called on memSeries it might decide that a new headChunk is needed to use for given append() call.
If that happens it will first mmap existing head chunk and only after that happens it will create a new empty headChunk and continue appending
our sample to it.
Since appending samples uses write lock on memSeries no other read or write can happen until any append is completed.
When we have an append() that must create a new head chunk the whole memSeries is blocked until mmapping of existing head chunk finishes.
Mmapping itself uses a lock as it needs to be serialised, which means that the more chunks to mmap we have the longer each chunk might wait
for it to be mmapped.
If there's enough chunks that require mmapping some memSeries will be locked for long enough that it will start affecting
queries and scrapes.
Queries might timeout, since by default they have a 2 minute timeout set.
Scrapes will be blocked inside append() call, which means there will be a gap between samples. This will first affect range queries
or calls using rate() and such, since the time range requested in the query might have too few samples to calculate anything.
To avoid this we need to remove mmapping from append path, since mmapping is blocking.
But this means that when we cut a new head chunk we need to keep the old one around, so we can mmap it later.
This change makes memSeries.headChunk a linked list, memSeries.headChunk still points to the 'open' head chunk that receives new samples,
while older, yet to be mmapped, chunks are linked to it.
Mmapping is done on a schedule by iterating all memSeries one by one. Thanks to this we control when mmapping is done, since we trigger
it manually, which reduces the risk that it will have to compete for mmap locks with other chunks.
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Snappy remains as the default compression but there is now a flag to switch
the compression algorithm.
Signed-off-by: Justin Lei <justin.lei@grafana.com>
* WIP implement WAL watcher reading via notifications over a channel from
the TSDB code
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Notify via head appenders Commit (finished all WAL logging) rather than
on each WAL Log call
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix misspelled Notify plus add a metric for dropped Write notifications
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Update tests to handle new notification pattern
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* this test maybe needs more time on windows?
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* does this test need more time on windows as well?
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* read timeout is already a time.Duration
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* remove mistakenly commited benchmark data files
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* address some review feedback
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* fix missed changes from previous commit
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Fix issues from wrapper function
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* try fixing race condition in test by allowing tests to overwrite the
read ticker timeout instead of calling the Notify function
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* fix linting
Signed-off-by: Callum Styan <callumstyan@gmail.com>
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.
The exceptions that I have found in our codebase are just these two:
* The `if else` is followed by an additional statement before the next
condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
used. In this case, using `switch` would require tagging the `for`
loop, which probably tips the balance.
Why are `switch` statements more readable?
For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.
I'm sure the aforemention wise coders can list even more reasons.
In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.
Signed-off-by: beorn7 <beorn@grafana.com>
* Adds an affirmative log message for successful WAL repair
Signed-off-by: Vernon Miller <vernon.miller@grafana.com>
Signed-off-by: Vernon Miller <96601789+aldernero@users.noreply.github.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Adds `WALReplayConcurrency` as an option on tsdb `Options` and `HeadOptions`.
If it is not set or set <=0, then `GOMAXPROCS` is used, which matches the previous behaviour.
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
Instead of relying on `labels.Hash()`, which may change, have the
caller pass in a shard function if required.
For most purposes `tsdb.Options.ShardFunc` is used, but the compactor
may be created independently so `NewLeveledCompactorWithChunkSize` also
takes a shard function parameter.
Regular Prometheus, which does not use block sharding, will have this
parameter as nil.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
When out-of-order is enabled, queries go through both Head and OOOHead,
and they both execute the same PostingsForMatchers call, as memSeries
are shared for both.
In some cases these calls can be heavy, and also frequent. We can
deduplicate those calls by using the PostingsForMatchers cache that we
already use for query sharding.
The usage of this cache can skip a newly appended series in the results
for the duration of the ttl.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>