- Parallelize AppendSamples as much as possible without breaking the
contract about temporal order.
- Allocate more fingerprint locker slots.
- Do not run early checkpoints if we are behind on chunk persistence.
- Increase fpMinWaitDuration to give the disk more time for more
important things.
Also, switch math.MaxInt64 and math.MinInt64 to the new constants.
- Increase samplesQueueCapacity.
- Improve docstring for the above.
- Accept a short waiting period for the ingest channel to become
ready. This should depend on the http timeout, but 100ms is probably
good enough to cushion bursts bigger than samplesQueueCapacity,
while it is unlikely that anybody ever will set an HTTP timeout
similarly short.
This is now not even trying to throttle in a benign way, but creates a
fully-fledged error. Advantage: It shows up very visible on the status
page. Disadvantage: The server does not really adjusts to a lower
scraping rate. However, if your ingestion backs up, you are in a very
irregulare state, I'd say it _should_ be considered an error and not
dealt with in a more graceful way.
In different news: I'll work on optimizing ingestion so that we will
not as easily run into that situation in the first place.
- original series data is saved so it can be re-transformed after
Rickshaw's stacking modified the series data
- always reconstruct graphs from scratch instead of updating the
settings of an existing one (simplification)
- always wipe and recreate all graph-related DOM elements completely so
that no left-over event handlers cause background event handlers
The simple algorithm applied here will increase the actual interval
incrementally, whenever and as long as the scrape itself takes longer
than the configured interval. Once it takes shorter again, the actual
interval will iteratively decrease again.
Also, set a much higher default value.
Chunk persist requests can be quite spiky. If you collect a large
number of time series that are very similar, they will tend to finish
up a chunk at about the same time. There is no reason we need to back
up scraping just because of that. The rationale of the new default
value is "1/8 of the chunks in memory".
This is related to #454. Queries now timeout after a duration set by
the -query.timeout flag. The TotalEvalTimer is now started/stopped
inside any of the ast.Eval* functions.
When Rickshaw was updated to 1.5.1 in
fd43daf82e,
the Rickshaw upstream package now contained 3 different D3 files:
d3.min.js
d3.v2.js
d3.v3.js
For details on why that is, see
https://groups.google.com/forum/#!topic/d3-js/lXQgKA7mtEw
For the 1.5.1 Rickshaw to work properly (being able to format dates with
D3 without causing a JS error), it needs d3.v2.js or d3.v3.js, not the
d3.min.js one. I chose to update us to d3.v3.js now, since that is the
most recent and minified version, and I didn't see any problems with it
(also, the current Rickshaw examples are using that D3 version).
Currently, displaying graphs with a range >14d is broken. This fixes
that.
While the recent commit 7e5745f solved the issue of having an
independent blob-stamp file, which was possible to become out of
sync with the necessary web/blob/files.go file, this change further
simplifies the setup by merging the two Makefile.
The only purpose of web/Makefile was to call targets in
web/blob/Makefile. As all dependencies for blob/files.go are
outside of the blob/ directory, the separation isn't logically
necessary.
persistence.go is way too long anyway, and a lot of code is just crash
recovery, which is not important to understand the normal operation.
Also, remove unused `exists` function.
Previously, it would return an error instead. Now we can distinguish
the cases 'error while deleting known key' vs. 'key not in index'
without testing for leveldb-internal kinds of errors.