v1.96.0
Released at 2023-12-13
vmalert’s metrics vmalert_alerting_rules_error and vmalert_recording_rules_error were replaced with vmalert_alerting_rules_errors_total and vmalert_recording_rules_errors_total. See this issue for details.
SECURITY: upgrade base docker image (Alpine) from 3.18.4 to 3.19.0. See alpine 3.19.0 release notes.
SECURITY: upgrade Go builder from Go1.21.4 to Go1.21.5. See the list of issues addressed in Go1.21.5.
FEATURE: vmauth: add ability to send requests to the first available backend and fall back to other hot standby backends when the first backend is unavailable. This allows building highly available setups as shown in these docs. See this issue.
FEATURE: vmselect: allow specifying multiple groups of vmstorage nodes with independent -replicationFactor per each group. See these docs and this feature request for details.
FEATURE: vmselect: allow opening vmui and investigating Top queries and Active queries when the vmselect is overloaded with concurrent queries (e.g. when more than -search.maxConcurrentRequests concurrent queries are executed). Previously an attempt to open Top queries or Active queries at vmui could result in the "couldn't start executing the request in ... seconds, since -search.maxConcurrentRequests=... concurrent requests are executed" error, which could complicate debugging of overloaded vmselect or single-node VictoriaMetrics.
FEATURE: vmagent: add -enableMultitenantHandlers command-line flag, which allows receiving data via VictoriaMetrics cluster urls at vmagent and converting tenant ids to (vm_account_id, vm_project_id) labels before sending the data to the configured -remoteWrite.url. See these docs for details.
FEATURE: vmagent: add -remoteWrite.disableOnDiskQueue command-line flag, which can be used for disabling data queueing to disk when the remote storage cannot keep up with the data ingestion rate. See these docs and this feature request.
FEATURE: vmagent: add support for reading and writing samples via Google PubSub. See these docs.
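The tenant-to-label conversion mentioned for -enableMultitenantHandlers above can be pictured with a minimal Python sketch. It assumes cluster-style paths of the form /insert/<accountID>[:<projectID>]/<suffix> and treats a missing projectID as 0. The function name tenant_labels and the exact path shape are assumptions made for illustration, not vmagent's actual implementation.

```python
def tenant_labels(path: str) -> dict:
    """Extract tenant ids from an assumed '/insert/<accountID>[:<projectID>]/<suffix>' path."""
    parts = path.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "insert":
        raise ValueError(f"not a multitenant insert path: {path!r}")
    account, _, project = parts[1].partition(":")
    # Both ids must be decimal numbers; a missing projectID is assumed to mean 0.
    if not account.isdecimal() or (project and not project.isdecimal()):
        raise ValueError(f"invalid tenant id in path: {path!r}")
    return {"vm_account_id": account, "vm_project_id": project or "0"}


print(tenant_labels("/insert/42:7/prometheus/api/v1/write"))
```

In this sketch the returned labels would be attached to every sample received on that path before it is forwarded.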
FEATURE: vmagent: show all the dropped targets together with the reason why they are dropped at http://vmagent:8429/service-discovery page. Previously targets, which were dropped because of target sharding weren’t displayed on this page. This could complicate service discovery debugging. See this issue and this feature request.
FEATURE: reduce the default value for -import.maxLineLen command-line flag from 100MB to 10MB in order to prevent excessive memory usage during data import via /api/v1/import.
FEATURE: vmagent: add keep_if_contains and drop_if_contains relabeling actions. See these docs for details.
FEATURE: vmagent: export vm_promscrape_scrape_pool_targets metric to track the number of targets each scrape job discovers. See this feature request.
FEATURE: vmalert: provide /vmalert/api/v1/rule and /api/v1/rule API endpoints to get the rule object in JSON format. See these docs for details.
FEATURE: vmalert: deprecate process gauge metrics vmalert_alerting_rules_error and vmalert_recording_rules_error in favour of vmalert_alerting_rules_errors_total and vmalert_recording_rules_errors_total counter metrics. Counter metric type is more suitable for error counting as it preserves the state change between the scrapes. See this issue for details.
FEATURE: MetricsQL: add day_of_year() function, which returns the day of the year for each of the given unix timestamps. See this issue for details. Thanks to @luckyxiaoqiang for the pull request.
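As a rough illustration of what day_of_year() computes, the following Python sketch maps a unix timestamp to its 1-based day of the year. It assumes timestamps are in seconds and interpreted in UTC; it is not the MetricsQL implementation.

```python
from datetime import datetime, timezone


def day_of_year(unix_ts: float) -> int:
    """Return the 1-based day of the year (UTC) for a unix timestamp in seconds."""
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc).timetuple().tm_yday


# 1702425600 is 2023-12-13 00:00:00 UTC, the release date of v1.96.0.
print(day_of_year(1702425600))  # -> 347
```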
FEATURE: all VictoriaMetrics binaries: expose additional metrics at /metrics page, which may simplify debugging of VictoriaMetrics components (see this feature request):
- go_sched_latencies_seconds - the histogram, which shows the time goroutines have spent in runnable state before actually running. Big values point to the lack of CPU time for the current workload.
- go_mutex_wait_seconds_total - the counter, which shows the total time spent by goroutines waiting for locked mutex. Big values point to mutex contention issues.
- go_gc_cpu_seconds_total - the counter, which shows the total CPU time spent by Go garbage collector.
- go_gc_mark_assist_cpu_seconds_total - the counter, which shows the total CPU time spent by goroutines in GC mark assist state.
- go_gc_pauses_seconds - the histogram, which shows the duration of GC pauses.
- go_scavenge_cpu_seconds_total - the counter, which shows the total CPU time spent by Go runtime for returning memory to the Operating System.
- go_memlimit_bytes - the value of GOMEMLIMIT environment variable.
FEATURE: vmui: enhance autocomplete functionality with caching. See this issue.
FEATURE: add field version to the response for /api/v1/status/buildinfo API for using more efficient API in Grafana for receiving label values. Add additional info about setting up the Grafana datasource. See this issue and these docs for details.
FEATURE: add -search.maxResponseSeries command-line flag for limiting the number of time series a single query to /api/v1/query or /api/v1/query_range can return. This limit can protect Grafana from high memory usage when the query returns too many series. See this feature request.
FEATURE: Alerting rules for VictoriaMetrics: ease aggregation for certain alerting rules to keep more useful labels for the context. Before, all extra labels except job and instance were ignored. See this pull request and this follow-up commit. Thanks to @7840vz.
FEATURE: vmctl: allow reversing the migrating order from the newest to the oldest data for vm-native and remote-read modes via --vm-native-filter-time-reverse and --remote-read-filter-time-reverse command-line flags respectively. See: https://docs.victoriametrics.com/vmctl/#using-time-based-chunking-of-migration and this feature request.
BUGFIX: MetricsQL: properly calculate values for the first point on the graph for queries, which do not use rollup functions. For example, previously count(up) could return lower than expected values for the first point on the graph. This also could result in lower than expected values in the middle of the graph like in this issue when the response caching isn’t disabled. The issue has been introduced in v1.95.0.
BUGFIX: vmagent: prevent the "FATAL: cannot flush metainfo" panic when -remoteWrite.multitenantURL command-line flag is set. See this issue.
BUGFIX: vmagent: properly decode zstd-encoded data blocks received via VictoriaMetrics remote_write protocol. See this issue comment.
BUGFIX: vmagent: properly add new labels at output_relabel_configs during stream aggregation. Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing a detailed report for this bug.
BUGFIX: vmalert-tool: allow using arbitrary eval_time in alert_rule_test case. Previously, test cases with eval_time not being a multiple of evaluation_interval would fail.
BUGFIX: vmalert: sanitize label names before sending the alert notification to Alertmanager. Before, vmalert would send notifications with labels containing characters not supported by Alertmanager validator, resulting in validation errors like msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar".
BUGFIX: vmbackupmanager: fix vmbackupmanager not deleting previous object versions from S3 when applying retention policy with -deleteAllObjectVersions command-line flag.
BUGFIX: vminsert: fix panic when ingesting data via NewRelic protocol into VictoriaMetrics cluster. See this issue.
BUGFIX: properly escape < character in responses returned via /federate endpoint. See this issue.
BUGFIX: vmctl: check for Error field in response from influx client during migration. Before, only network errors were checked. Thanks to @wozz for the pull request.
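The label-name sanitization described in the vmalert bugfix above (the "foo.bar" case) can be sketched as follows. The sketch assumes the usual Prometheus-style label name pattern [a-zA-Z_][a-zA-Z0-9_]* and replaces every unsupported character with an underscore. The exact replacement rules used by vmalert are not stated in this changelog, so the function below is a hypothetical illustration.

```python
import re

# Characters outside this set are assumed to be rejected by the Alertmanager validator.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def sanitize_label_name(name: str) -> str:
    """Replace unsupported characters with '_' and avoid a leading digit."""
    sanitized = _INVALID_CHARS.sub("_", name)
    if sanitized and sanitized[0].isdigit():
        # Prefixing with '_' for a leading digit is an assumption of this sketch.
        sanitized = "_" + sanitized
    return sanitized


print(sanitize_label_name("foo.bar"))  # -> foo_bar
```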
v1.95.1
Released at 2023-11-16
FEATURE: dashboards: use version instead of short_version in version change annotation for single/cluster dashboards. The update should reflect version changes even if different flavours of the same release were applied (custom builds).
BUGFIX: fix a bug, which could result in improper results and/or in the "cannot merge series: duplicate series found" error during range query execution. The issue has been introduced in v1.95.0. See this bugreport for details.
BUGFIX: improve deadline detection when using buffered connection for communication between cluster components. Before, due to the nature of a buffered connection, the deadline could have been exceeded while reading or writing buffered data to the connection. See this pull request.
v1.95.0
Released at 2023-11-15
It is recommended to upgrade to v1.95.1 because v1.95.0 contains a bug, which can lead to incorrect query results and to the "cannot merge series: duplicate series found" error. See this issue for details.
vmalert’s cmd-line flag -datasource.lookback will be deprecated soon. Please use -rule.evalDelay command-line flag instead and see more details on how to use it here. The flag -datasource.lookback will have no effect in the next release and will be removed in the future releases. See this issue.
vmalert’s cmd-line flag -datasource.queryTimeAlignment was deprecated and will have no effect anymore. It will be completely removed in next releases. See this issue and more detailed changes related to vmalert below.
SECURITY: upgrade Go builder from Go1.21.1 to Go1.21.4. See the list of issues addressed in Go1.21.2, the list of issues addressed in Go1.21.3 and the list of issues addressed in Go1.21.4.
FEATURE: vmselect: improve performance for repeated instant queries if they contain one of the following rollup functions:
- avg_over_time
- sum_over_time
- count_eq_over_time
- count_gt_over_time
- count_le_over_time
- count_ne_over_time
- count_over_time
- increase
- max_over_time
- min_over_time
- rate
The optimization is enabled when these functions contain lookbehind window in square brackets bigger or equal to 6h (the threshold can be changed via -search.minWindowForInstantRollupOptimization command-line flag). The optimization improves performance for SLO/SLI-like queries such as avg_over_time(up[30d]) or sum(rate(http_request_errors_total[3d])) / sum(rate(http_requests_total[3d])), which can be generated by sloth or similar projects.
FEATURE: vmselect: improve query performance on systems with big number of CPU cores (>=32). Add -search.maxWorkersPerQuery command-line flag, which can be used for fine-tuning query performance on systems with big number of CPU cores. See this pull request.
FEATURE: vmselect: expose vm_memory_intensive_queries_total counter metric which gets increased each time -search.logQueryMemoryUsage memory limit is exceeded by a query. This metric should help to identify expensive and heavy queries without inspecting the logs.
FEATURE: MetricsQL: add drop_empty_series() function, which can be used for filtering out empty series before performing additional calculations as shown in this issue.
FEATURE: MetricsQL: add labels_equal() function, which can be used for searching series with identical values for the given labels. See this feature request.
FEATURE: MetricsQL: add outlier_iqr_over_time(m[d]) and outliers_iqr(q) functions, which allow detecting anomalies in samples and series using Interquartile range method.
FEATURE: vmalert: add eval_alignment attribute for Groups. It aligns the timestamps of group query requests with the interval, like datasource.queryTimeAlignment did. This also means that datasource.queryTimeAlignment command-line flag becomes deprecated now and will have no effect if configured. If datasource.queryTimeAlignment was set to false before, then eval_alignment has to be set to false explicitly under group. See this issue.
FEATURE: vmalert: add -rule.evalDelay flag and eval_delay attribute for Groups. The new flag and param can be used to adjust the time parameter for rule evaluation requests to match intentional query delay from the datasource. See this issue.
FEATURE: vmalert: allow specifying full url in notifier static_configs target address, like http://alertmanager:9093/test/api/v2/alerts. See this issue.
FEATURE: vmalert: reduce the number of queries for restoring alerts state on start-up. The change should speed up the restore process and reduce pressure on remoteRead.url. See this pull request.
FEATURE: vmalert: add label file pointing to the group’s filename to metrics vmalert_recording_.* and vmalert_alerts_.*. The filename should help identifying alerting rules belonging to specific groups with identical names but different filenames. See this issue.
FEATURE: vmalert: automatically retry remote-write requests on closed connections. The change should reduce the amount of logs produced in environments with short-living connections or environments without support of keep-alive on network balancers.
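The Interquartile range method behind outlier_iqr_over_time(m[d]), mentioned in the MetricsQL entry above, can be sketched in Python. The sketch assumes the function reports the last sample of the window when it falls below q25 - 1.5*iqr or above q75 + 1.5*iqr. The 1.5 multiplier and the percentile interpolation used here are conventional choices assumed for illustration, and outlier_iqr_last is a hypothetical name.

```python
import statistics
from typing import Optional, Sequence


def outlier_iqr_last(samples: Sequence[float]) -> Optional[float]:
    """Return the last sample if it lies outside the IQR fences, else None."""
    if len(samples) < 2:
        return None  # not enough data to estimate quartiles
    q25, _, q75 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q75 - q25
    last = samples[-1]
    if last < q25 - 1.5 * iqr or last > q75 + 1.5 * iqr:
        return last
    return None


print(outlier_iqr_last([1, 2, 3, 4, 100]))  # -> 100
print(outlier_iqr_last([1, 2, 3, 4, 5]))    # -> None
```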
FEATURE: vmagent: support data ingestion from NewRelic infrastructure agent. See these docs, this feature request and this pull request.
FEATURE: vmagent: add -remoteWrite.shardByURL.labels command-line flag, which can be used for specifying a list of labels for sharding outgoing samples among the configured -remoteWrite.url destinations if -remoteWrite.shardByURL command-line flag is set. See these docs and this feature request for details.
FEATURE: vmagent: do not exit on startup when scrape_configs refer to non-existing or invalid files with auth configs, since these files may appear or be updated later. See this feature request and this pull request.
FEATURE: vmagent: allow loading TLS certificates from HTTP and HTTPS urls by specifying these urls at cert_file and key_file options inside tls_config and proxy_tls_config sections at http client settings.
FEATURE: vmagent: reduce CPU load when big number of targets are scraped over HTTPS with the same custom TLS certificate configured via tls_config->cert_file and tls_config->key_file at scrape_config.
FEATURE: vmbackup: add -filestream.disableFadvise command-line flag, which can be used for disabling fadvise syscall during backup upload to the remote storage. By default vmbackup uses fadvise syscall in order to prevent eviction of recently accessed data from the OS page cache when backing up large files. Sometimes the fadvise syscall may take significant amounts of CPU when the backup is performed with large value of -concurrency command-line flag on systems with big number of CPU cores. In this case it is better to manually disable fadvise syscall by passing -filestream.disableFadvise command-line flag to vmbackup. See this pull request for details.
FEATURE: vmbackup: add -deleteAllObjectVersions command-line flag, which can be used for forcing removal of all object versions in remote object storage. See this issue and these docs for the details.
FEATURE: Alerting rules for VictoriaMetrics: account for vmauth component for alerts ServiceDown and TooManyRestarts.
FEATURE: Alerting rules for VictoriaMetrics: make TooHighMemoryUsage more tolerant to spikes or near-the-threshold states. The change should reduce number of false positives.
FEATURE: Alerting rules for VictoriaMetrics: add TooManyMissedIterations alerting rule for vmalert to detect groups that miss their evaluations due to slow queries.
FEATURE: vmui: add support for functions, labels, values in autocomplete. See this issue.
FEATURE: vmui: retain specified time interval when executing a query from Top Queries. See this issue.
FEATURE: vmui: improve repeated VMUI page load times by enabling caching of static js and css at web browser side according to these recommendations.
FEATURE: vmui: sort legend under the graph in descending order of median values. This should simplify graph analysis, since usually the most important lines have bigger values.
FEATURE: vmui: reduce vertical space usage, so more information is visible on the screen without scrolling.
FEATURE: vmui: show query execution duration in the header of query input field. This should help optimizing query performance.
FEATURE: support Strict-Transport-Security, Content-Security-Policy and X-Frame-Options HTTP response headers in all VictoriaMetrics components. The values for headers can be specified via the following command-line flags: -http.header.hsts, -http.header.csp and -http.header.frameOptions.
FEATURE: vmalert-tool: add unittest command to run unittest for alerting and recording rules. See this pull request for details.
FEATURE: dashboards/vmalert: add new panel Missed evaluations for indicating alerting groups that miss their evaluations.
FEATURE: all: track requests with wrong auth key and wrong basic auth at vm_http_request_errors_total metric with reason="wrong_auth_key" and reason="wrong_basic_auth". See this issue. Thanks to @venkatbvc for the pull request.
FEATURE: vmauth: add ability to drop the specified number of /-delimited prefix parts from the request path before proxying the request to the matching backend. See these docs.
FEATURE: vmauth: add ability to skip TLS verification and to specify TLS Root CA when connecting to backends. See these docs and this issue.
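The vmauth path rewriting described above (dropping a number of /-delimited prefix parts before proxying) can be illustrated with a small Python sketch. The function name drop_prefix_parts is hypothetical, and the handling of edge cases, such as dropping more parts than the path contains, is a guess rather than documented vmauth behavior.

```python
def drop_prefix_parts(path: str, parts: int) -> str:
    """Drop the first `parts` '/'-delimited segments from a request path."""
    if parts <= 0:
        return path
    segments = path.lstrip("/").split("/")
    # Slicing past the end yields an empty list, so the result falls back to "/".
    return "/" + "/".join(segments[parts:])


print(drop_prefix_parts("/foo/bar/baz", 1))  # -> /bar/baz
print(drop_prefix_parts("/foo/bar/baz", 2))  # -> /baz
```

A setup like this would let a gateway route on the leading path segment while the backend still sees the path it expects.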
FEATURE: vmstorage: gradually close vminsert connections during 25 seconds at graceful shutdown. This should reduce data ingestion slowdown during rolling restarts. The duration for gradual closing of vminsert connections can be configured via -storage.vminsertConnsShutdownDuration command-line flag. See this issue and these docs for details.
FEATURE: vmstorage: add -blockcache.missesBeforeCaching command-line flag, which can be used for fine-tuning RAM usage for indexdb/dataBlocks cache when queries touching big number of time series are executed.
FEATURE: add -loggerMaxArgLen command-line flag for fine-tuning the maximum lengths of logged args.
BUGFIX: vmalert: strip sensitive information such as auth headers or passwords from datasource, remote-read, remote-write or notifier URLs in log messages or UI. This behavior is enabled by default and is controlled via -datasource.showURL, -remoteRead.showURL, -remoteWrite.showURL or -notifier.showURL cmd-line flags. See this issue.
BUGFIX: vmalert: fix vmalert web UI when running on machines with 32-bit architectures.
BUGFIX: vmalert: do not send requests to configured remote systems when -datasource.*, -remoteWrite.*, -remoteRead.* or -notifier.* command-line flags refer to files with invalid auth configs. Previously such requests were sent without properly set auth headers. Now the requests are sent only after the files are updated with valid auth configs. See this pull request.
BUGFIX: vmalert: properly maintain alerts state in replay mode if alert’s for param was bigger than replay request range (usually a couple of hours). See this issue for details.
BUGFIX: vmalert: increment vmalert_remotewrite_errors_total metric if all retries to send remote-write request failed. Before, this metric was incremented only if remote-write client’s buffer was overloaded.
BUGFIX: vmalert: increment vmalert_remotewrite_dropped_rows_total metric if remote-write client’s buffer is overloaded. Before, this metric was incremented only after unsuccessful HTTP calls.
BUGFIX: vmselect: improve performance and memory usage during query processing on machines with big number of CPU cores. See this issue.
BUGFIX: dashboards: fix vminsert/vmstorage/vmselect metrics filtering when dashboard is used to display data from many sub-clusters with unique job names. Before, only one specific job could have been accounted for component-specific panels, instead of all available jobs for the component.
BUGFIX: dashboards: respect job and instance filters for alerts annotation in cluster and single-node dashboards.
BUGFIX: dashboards: update description for RSS and anonymous memory panels to be consistent for single-node, cluster and vmagent dashboards.
BUGFIX: dashboards/vmalert: apply desc sorting in tooltips for vmalert dashboard in order to improve visibility of the outliers on graph.
BUGFIX: dashboards/vmalert: properly apply time series filter for panel No data errors. Before, the panel didn’t respect job or instance filters.
BUGFIX: dashboards/vmalert: fix panel Errors rate to Alertmanager not showing any data due to wrong label filters.
BUGFIX: dashboards/cluster: fix description about max threshold for Concurrent selects panel. Before, it was mistakenly implying that max is equal to the double of available CPUs.
BUGFIX: VictoriaMetrics cluster: bump hard-coded limit for search query size at vmstorage from 1MB to 5MB. The change should be more suitable for real-world scenarios and protect vmstorage from excessive memory usage. See this issue for details.
BUGFIX: vmbackup: fix error when creating an incremental backup with the -origin command-line flag. See this issue for details.
BUGFIX: vmagent: properly apply relabeling with regex, which start and end with .+ or .* and which contain alternate sub-regexps. For example, .+;|;.+ or .*foo|bar|baz.*. Previously such regexps were improperly parsed, which could result in unexpected relabeling results. See this issue.
BUGFIX: vmagent: properly discover Kubernetes targets via kubernetes_sd_configs. Previously some targets and some labels could be skipped during service discovery because of the bug introduced in v1.93.5 when implementing this feature. See this issue for more details.
BUGFIX: vmagent: fix vmagent ignoring configuration reload for streaming aggregation if it was started with empty streaming aggregation config. Thanks to @aluode99 for the pull request.
BUGFIX: vmagent: do not scrape targets if the corresponding scrape_configs refer to files with invalid auth configs. Previously the targets were scraped without properly set auth headers in this case. Now targets are scraped only after the files are updated with valid auth configs. See this pull request.
BUGFIX: vmagent: properly parse ca, cert and key options at tls_config section inside http client settings. Previously string values couldn’t be parsed for these options, since the parser was mistakenly expecting a list of uint8 values instead.
BUGFIX: vmagent: properly drop samples if -streamAggr.dropInput command-line flag is set and -remoteWrite.streamAggr.config contains an empty file. See this issue.
BUGFIX: vmagent: do not print redundant error logs when failed to scrape consul or nomad target. See this pull request.
BUGFIX: vmagent: generate proper link to the main page and to favicon.ico at http pages served by vmagent such as /targets or /service-discovery when vmagent sits behind an http proxy with custom http path prefixes. See this issue.
BUGFIX: vmagent: properly decode Snappy-encoded data blocks received via VictoriaMetrics remote_write protocol. See this issue.
BUGFIX: vmstorage: prevent deleted series from being searchable via /api/v1/series API if they were re-ingested with staleness markers. This situation could happen if user deletes the series from the target and from VM, and then vmagent sends stale markers for absent series. Thanks to @ilyatrefilov for the issue and pull request.
BUGFIX: vmstorage: log warning about switching to ReadOnly mode only on state change. Before, vmstorage would log this warning every 1s. See this issue for details.
BUGFIX: vmauth: show browser authorization window for unauthorized requests to unsupported paths if the unauthorized_user section is specified. This allows properly authorizing the user. See this issue for details.
BUGFIX: vmauth: properly proxy requests to HTTP/2.0 backends and properly pass Host header to backends.
BUGFIX: vmui: fix the Disable cache toggle at JSON and Table views. Previously response caching was always enabled and couldn’t be disabled at these views.
BUGFIX: vmui: correctly display query errors on Explore Prometheus Metrics page. See this issue for details.
BUGFIX: vmui: properly handle trailing slash in the server URL. See this issue.
BUGFIX: vmbackupmanager: correctly print error in logs when copying backup fails. Previously, error was displayed in metrics but was missing in logs.
BUGFIX: fix panic, which could occur when query tracing is enabled. See this issue.
v1.94.0
Released at 2023-10-02
FEATURE: MetricsQL: add support for numbers with underscore delimiters such as 1_234_567_890 and 1.234_567_890. These numbers are easier to read than 1234567890 and 1.234567890.
FEATURE: vmbackup: add support for server-side copy of existing backups. See these docs for details.
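The underscore-delimited number syntax described above can be read as "ignore the underscores, then parse as usual". The Python sketch below shows that reading. It does not attempt to reproduce MetricsQL's validation rules (for example, where underscores are allowed), which this changelog does not specify, and parse_number is a hypothetical helper name.

```python
def parse_number(text: str) -> float:
    """Parse a number that may contain '_' as a visual digit separator."""
    return float(text.replace("_", ""))


print(parse_number("1_234_567_890"))  # -> 1234567890.0
print(parse_number("1.234_567_890"))  # -> 1.23456789
```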
FEATURE: vmui: add the option to see the latest 25 queries. See this issue.
FEATURE: vmagent: add ability to set member num label for all the metrics scraped by a particular vmagent instance in a cluster of vmagents via -promscrape.cluster.memberLabel command-line flag. See these docs and this issue.
FEATURE: vmagent: do not log unexpected EOF when reading incoming metrics, since this error is expected and is handled during metrics’ parsing. This reduces the amounts of noisy logs. See this issue.
FEATURE: vmagent: retry failed write request on the closed connection immediately, without waiting for backoff. This should improve data delivery speed and reduce amount of error logs emitted by vmagent when using idle connections. See related issue.
FEATURE: vmagent: reduce load on Kubernetes control plane during initial service discovery. See this issue for details.
FEATURE: VictoriaMetrics cluster: reduce the maximum recovery time at vmselect and vminsert when some of vmstorage nodes become unavailable because of networking issues from 60 seconds to 3 seconds by default. The recovery time can be tuned at vmselect and vminsert nodes with -vmstorageUserTimeout command-line flag if needed. Thanks to @wjordan for the pull request.
FEATURE: vmui: add Prometheus data support to the “Explore cardinality” page. See this issue for details.
FEATURE: vmui: make the warning message more noticeable for text fields. See this issue.
FEATURE: vmui: add button for auto-formatting PromQL/MetricsQL queries. See this issue. Thanks to @aramattamara for the pull request.
FEATURE: vmui: improve accessibility score to 100 according to Google’s Lighthouse tests.
FEATURE: vmui: organize min, max, median values on the chart legend and tooltips for better visibility.
FEATURE: vmui: add explanation about cardinality explorer statistic inaccuracy in VictoriaMetrics cluster. See this issue.
FEATURE: vmui: add storage of query history in localStorage. See the pull request.
FEATURE: dashboards: provide copies of Grafana dashboards alternated with VictoriaMetrics datasource at dashboards/vm.
FEATURE: vmauth: added ability to set, override and clear request and response headers on a per-user and per-path basis. See this issue and these docs for details.
FEATURE: vmauth: add ability to retry requests to the remaining backends if they return response status codes specified in the retry_status_codes list. See this feature request.
FEATURE: vmauth: expose metrics vmauth_config_last_reload_* for tracking the state of config reloads, similarly to vmagent/vmalert components.
FEATURE: vmauth: do not print logs like SIGHUP received... once per configured -configCheckInterval cmd-line flag. This log will be printed only if config reload was invoked manually.
FEATURE: vmalert: add eval_offset attribute for Groups. If specified, Group will be evaluated at the exact time offset on the range of [0…evaluationInterval]. The setting might be useful for cron-like rules which must be evaluated at specific moments of time. See this issue for details.
FEATURE: vmalert: validate MetricsQL function names in alerting and recording rules when vmalert runs with -dryRun command-line flag. Previously it was allowed to use unknown (aka invalid) MetricsQL function names there. For example, foo() was counted as a valid query. See this feature request.
FEATURE: limit the length of string params in log messages to 500 chars. Longer string params are replaced with the first_250_chars..last_250_chars. This prevents overly long log lines from being emitted by VictoriaMetrics components.
FEATURE: docker compose environment: add vmauth component to cluster’s docker-compose example for balancing load among multiple vmselect components.
FEATURE: MetricsQL: make sure that q2 series are returned after q1 series in the results of q1 or q2 query, in the same way as Prometheus does. See this issue.
FEATURE: MetricsQL: return empty result from bitmap_and(a, b), bitmap_or(a, b) and bitmap_xor(a, b) if a or b have no value at the particular timestamp. Previously 0 was returned in this case. See this issue.
FEATURE: stop exposing vm_merge_need_free_disk_space metric, since it turned out that it confuses users while bringing no useful information. See this comment.
BUGFIX: Official Grafana dashboards for VictoriaMetrics: fix display of ingested rows rate for Samples ingested/s and Samples rate panels for vmagent’s dashboard. Previously, not all ingested protocols were accounted in these panels. An extra panel Rows rate was added to Ingestion section to display the split for rows ingested rate by protocol.
BUGFIX: vmui: fix the bug causing render looping when switching to heatmap.
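The eval_offset behavior described in the vmalert entry above (evaluate a group at a fixed offset within each interval) can be sketched as a small scheduling helper. The sketch assumes unix timestamps in seconds and evaluations at every timestamp t where t modulo the interval equals the offset. The helper name next_eval_time is hypothetical and this is not vmalert's scheduler.

```python
def next_eval_time(now: int, interval: int, offset: int) -> int:
    """Return the first timestamp >= now that sits `offset` seconds into an interval."""
    if interval <= 0 or not 0 <= offset <= interval:
        raise ValueError("offset must be within [0, interval]")
    offset %= interval  # an offset equal to the interval behaves like 0
    candidate = now - (now % interval) + offset
    if candidate < now:
        candidate += interval
    return candidate


# interval of 1h with a 5m offset: evaluations land on hh:05:00
print(next_eval_time(7200, 3600, 300))  # -> 7500
print(next_eval_time(7501, 3600, 300))  # -> 11100
```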
BUGFIX: VictoriaMetrics enterprise: validate that -dedup.minScrapeInterval value and -downsampling.period intervals are multiples of each other. See these docs.
BUGFIX: vmbackup: properly copy appliedRetention.txt files inside <-storageDataPath>/{data} folders during incremental backups. Previously the new appliedRetention.txt could be skipped during incremental backups, which could lead to increased load on storage after restoring from backup. See this issue.
BUGFIX: vmagent: suppress context canceled error messages in logs when vmagent is reloading service discovery config. This error could appear starting from v1.93.5. See this PR.
BUGFIX: vmagent: remove concurrency limit during parsing of scraped metrics, which was mistakenly applied to it. With this change cmd-line flag -maxConcurrentInserts won’t have effect on scraping anymore.
BUGFIX: MetricsQL: allow passing median_over_time to aggr_over_time. See this issue.
BUGFIX: vminsert: fix ingestion via multitenant url for opentsdbhttp. See this issue. The bug has been introduced in v1.93.2.
BUGFIX: vmagent: fix support of legacy DataDog agent, which adds trailing slashes to urls. See this issue. Thanks to @maxb for spotting the issue.
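The log-parameter truncation introduced in this release (string params limited to 500 chars, with longer values shown as first_250_chars..last_250_chars) can be sketched in a few lines of Python. The function name truncate_param and the max_len parameter are assumptions for illustration; only the 500/250 figures and the ".." separator come from the entry above.

```python
def truncate_param(value: str, max_len: int = 500) -> str:
    """Keep short strings intact; otherwise keep the first and last max_len/2 chars."""
    if len(value) <= max_len:
        return value
    half = max_len // 2
    return value[:half] + ".." + value[-half:]


long_query = "a" * 300 + "b" * 300
print(len(truncate_param(long_query)))  # -> 502
```

Keeping both ends of the string preserves the start of a query and its trailing part, which are usually the most useful pieces when reading logs.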
v1.93.9
Released at 2023-12-10
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
SECURITY: upgrade base docker image (Alpine) from 3.18.4 to 3.19.0. See alpine 3.19.0 release notes.
SECURITY: upgrade Go builder from Go1.21.4 to Go1.21.5. See the list of issues addressed in Go1.21.5.
BUGFIX: vmagent: prevent the "FATAL: cannot flush metainfo" panic when -remoteWrite.multitenantURL command-line flag is set. See this issue.
BUGFIX: vmagent: properly decode zstd-encoded data blocks received via VictoriaMetrics remote_write protocol. See this issue comment.
BUGFIX: vmagent: properly add new labels at output_relabel_configs during stream aggregation. Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing a detailed report for this bug.
BUGFIX: vmalert: sanitize label names before sending the alert notification to Alertmanager. Before, vmalert would send notifications with labels containing characters not supported by Alertmanager validator, resulting in validation errors like msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar".
BUGFIX: properly escape < character in responses returned via /federate endpoint. See this issue.
v1.93.8
Released at 2023-11-15
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
SECURITY: upgrade Go builder from Go1.21.3 to Go1.21.4. See the list of issues addressed in Go1.21.4.
BUGFIX: vmagent: properly apply relabeling with regex, which start and end with .+ or .* and which contain alternate sub-regexps. For example, .+;|;.+ or .*foo|bar|baz.*. Previously such regexps were improperly parsed, which could result in unexpected relabeling results. See this issue.
BUGFIX: vmagent: properly decode Snappy-encoded data blocks received via VictoriaMetrics remote_write protocol. See this issue.
BUGFIX: fix panic, which could occur when query tracing is enabled. See this issue.
v1.93.7 #
Released at 2023-11-02
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
- BUGFIX: vmagent: properly discover Kubernetes targets via kubernetes_sd_configs. Previously some targets and some labels could be skipped during service discovery because of the bug introduced in v1.93.5 when implementing this feature. See this issue for more details.
- BUGFIX: vmagent: properly parse ca, cert and key options at tls_config section inside http client settings. Previously string values couldn’t be parsed for these options, since the parser was mistakenly expecting a list of uint8 values instead.
- BUGFIX: vmagent: properly drop samples if -streamAggr.dropInput command-line flag is set and -remoteWrite.streamAggr.config contains an empty file. See this issue.
- BUGFIX: vmagent: do not print redundant error logs when failed to scrape consul or nomad target. See this pull request.
- BUGFIX: vmstorage: prevent deleted series from being searchable via /api/v1/series API if they were re-ingested with staleness markers. This situation could happen if user deletes the series from the target and from VM, and then vmagent sends stale markers for absent series. Thanks to @ilyatrefilov for the issue and pull request.
- BUGFIX: vmstorage: log warning about switching to ReadOnly mode only on state change. Before, vmstorage would log this warning every 1s. See this issue for details.
- BUGFIX: vmauth: show browser authorization window for unauthorized requests to unsupported paths if the unauthorized_user section is specified. This allows properly authorizing the user. See this issue for details.
v1.93.6 #
Released at 2023-10-16
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
SECURITY: upgrade Go builder from Go1.21.1 to Go1.21.3. See the list of issues addressed in Go1.21.2 and the list of issues addressed in Go1.21.3.
BUGFIX: vmalert: strip sensitive information such as auth headers or passwords from datasource, remote-read, remote-write or notifier URLs in log messages or UI. This behavior is enabled by default and is controlled via -datasource.showURL, -remoteRead.showURL, -remoteWrite.showURL or -notifier.showURL cmd-line flags. See this issue.
BUGFIX: vmselect: improve performance and memory usage during query processing on machines with big number of CPU cores. See this issue for details.
BUGFIX: VictoriaMetrics cluster: bump hard-coded limit for search query size at vmstorage from 1MB to 5MB. The change should be more suitable for real-world scenarios and protect vmstorage from excessive memory usage. See this issue for details.
BUGFIX: vmagent: fix vmagent ignoring configuration reload for streaming aggregation if it was started with empty streaming aggregation config. Thanks to @aluode99 for the pull request.
BUGFIX: vmbackup: properly copy appliedRetention.txt files inside <-storageDataPath>/{data} folders during incremental backups. Previously the new appliedRetention.txt could be skipped during incremental backups, which could lead to increased load on storage after restoring from backup. See this issue.
BUGFIX: vmagent: suppress context canceled error messages in logs when vmagent is reloading service discovery config. This error could appear starting from v1.93.5. See this PR.
BUGFIX: MetricsQL: allow passing median_over_time to aggr_over_time. See this issue.
BUGFIX: vminsert: fix ingestion via multitenant url for opentsdbhttp. See this issue. The bug has been introduced in v1.93.2.
BUGFIX: vmagent: fix support of legacy DataDog agent, which adds trailing slashes to urls. See this issue. Thanks to @maxb for spotting the issue.
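As an illustration of the kind of URL redaction described in the vmalert entry above, the following sketch hides credentials and query args before a URL is logged. This is not vmalert's implementation: the function name and the "xxxxx" placeholder are assumptions, and vmalert may hide the URL differently.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(raw_url: str) -> str:
    """Hide userinfo and query args of a URL before logging it (hypothetical helper)."""
    parts = urlsplit(raw_url)
    # netloc looks like "user:pass@host:port"; keep only the host:port part.
    _userinfo, sep, hostport = parts.netloc.rpartition("@")
    netloc = "xxxxx@" + hostport if sep else hostport
    # Query args may carry tokens, so they are replaced as a whole.
    query = "xxxxx" if parts.query else ""
    return urlunsplit((parts.scheme, netloc, parts.path, query, parts.fragment))


print(redact_url("http://user:pass@vm:8428/api/v1/write?token=abc"))
# http://xxxxx@vm:8428/api/v1/write?xxxxx
```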
v1.93.5 #
Released at 2023-09-19
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
- BUGFIX: storage: prevent from livelock when forced merge is called under high data ingestion. See this issue.
- BUGFIX: Graphite Render API: correctly return null instead of Inf in JSON query responses. See this issue.
- BUGFIX: vmbackup: properly copy parts.json files inside <-storageDataPath>/{data,indexdb} folders during incremental backups. Previously the new parts.json could be skipped during incremental backups, which could lead to inability to restore from the backup. See this issue. This issue has been introduced in v1.90.0.
- BUGFIX: vmagent: properly close connections to Kubernetes API server after the change in selectors or namespaces sections of kubernetes_sd_configs. Previously vmagent could continue polling Kubernetes API server with the old selectors or namespaces configs additionally to polling new configs. See this issue.
- BUGFIX: vmauth: prevent configuration reloading if there were no changes in config. This improves memory usage when -configCheckInterval cmd-line flag is configured and config has extensive list of regexp expressions requiring additional memory on parsing.
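The Graphite Render API fix above matters because Inf is not a valid JSON number. A minimal sketch of the idea, assuming datapoints in the usual Graphite [value, timestamp] form: non-finite values are replaced with null before encoding. The helper name is hypothetical, and treating NaN the same way as Inf is an assumption of this sketch.

```python
import json
import math


def encode_datapoints(points) -> str:
    """Encode (value, timestamp) pairs as JSON, mapping non-finite values to null."""
    cleaned = [
        [value if math.isfinite(value) else None, ts]
        for value, ts in points
    ]
    # allow_nan=False makes json.dumps fail loudly if a non-finite value slips through.
    return json.dumps(cleaned, allow_nan=False)


print(encode_datapoints([(1.5, 1694000000), (math.inf, 1694000060)]))
# [[1.5, 1694000000], [null, 1694000060]]
```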
v1.93.4 #
Released at 2023-09-10
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
SECURITY: upgrade Go builder from Go1.21.0 to Go1.21.1. See the list of issues addressed in Go1.21.1.
BUGFIX: vminsert enterprise: properly parse /insert/multitenant/* urls, which have been broken since v1.93.2. See this issue.
BUGFIX: properly build production armv5 binaries for GOARCH=arm. This has been broken after the upgrading of Go builder to Go1.21.0. See this issue.
BUGFIX: vmselect: return 503 Service Unavailable status code when partial responses are denied and some of vmstorage nodes are temporarily unavailable. Previously 422 Unprocessable Entity status code was mistakenly returned in this case, which could prevent from automatic recovery by re-sending the request to healthy cluster replica in another availability zone.
BUGFIX: vmalert: fix the bug when Group’s params fields with multiple values were overriding each other instead of adding up. The bug was introduced in this commit starting from v1.91.1. See this issue.
BUGFIX: vmagent: fix possible corruption of labels in the collected samples if -remoteWrite.label is set together with multiple -remoteWrite.url options. The bug has been introduced in v1.93.1. See this issue.
v1.93.3 #
Released at 2023-09-02
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
- BUGFIX: vminsert: properly close broken vmstorage connection during read-only state checks at vmstorage. Previously it wasn’t properly closed, which prevented restoring vmstorage node from read-only mode. See this issue.
- BUGFIX: vmstorage: prevent from breaking vmselect -> vmstorage RPC communication when vmstorage returns an empty label name at /api/v1/labels request. See this issue.
v1.93.2 #
Released at 2023-09-01
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
- BUGFIX: build: fix Docker builds for old Docker releases. See this issue.
- BUGFIX: vmagent: consistently set User-Agent header to vm_promscrape during scraping with enabled or disabled stream parsing mode. See this issue.
- BUGFIX: vmagent: consistently set timeout for scraping with enabled or disabled stream parsing mode. See this issue.
- BUGFIX: vmalert: correctly re-use HTTP request object on EOF retries when querying the configured datasource. Previously, there was a small chance that query retry wouldn’t succeed.
- BUGFIX: vmselect: correctly handle requests with /select/multitenant prefix. Such requests must be rejected since vmselect does not support multitenancy endpoint. Previously, such requests were causing panic. See this issue.
- BUGFIX: vminsert: properly check for read-only state at vmstorage. Previously it wasn’t properly checked, which could lead to increased resource usage and data ingestion slowdown when some of vmstorage nodes are in read-only mode. See this issue.
v1.93.1 #
Released at 2023-08-23
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
- BUGFIX: prevent from possible data loss during indexdb rotation. See this issue for details.
- BUGFIX: do not allow starting VictoriaMetrics components with improperly set boolean command-line flags in the form -boolFlagName value, since this leads to silent incomplete flags’ parsing. This form should be replaced with -boolFlagName=value. See this issue.
- BUGFIX: vmagent: properly set labels from -remoteWrite.label command-line flag just before sending samples to the configured -remoteWrite.url according to these docs. Previously these labels were incorrectly set before the relabeling configured via -remoteWrite.urlRelabelConfigs and the stream aggregation configured via -remoteWrite.streamAggr.config, so these labels could be lost or incorrectly transformed before sending the samples to remote storage. The fix allows using -remoteWrite.label for identifying vmagent instances in cluster mode. See this issue and these docs for more details.
- BUGFIX: remove DEBUG logging when parsing if filters inside relabeling rules and when parsing match filters inside stream aggregation rules.
- BUGFIX: properly replace : chars in label names with _ when -usePromCompatibleNaming command-line flag is passed to vmagent, vminsert or single-node VictoriaMetrics. This addresses this comment.
- BUGFIX: vmbackup: correctly check if specified -dst belongs to specified -storageDataPath. See this issue.
- BUGFIX: vmctl: don’t interrupt the migration process if no metrics were found for a specific tenant. See this issue.
v1.93.0 #
Released at 2023-08-12
It is recommended to upgrade to VictoriaMetrics v1.93.1 because v1.93.0 contains a bug, which can lead to data loss because of incorrect indexdb rotation. See this issue for details.
v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release
Update note: starting from this release, vmagent ignores timestamps provided by scrape targets by default - it associates scraped metrics with local timestamps instead. Set honor_timestamps: true in scrape configs if timestamps provided by scrape targets must be used instead. This change helps remove gaps for metrics collected from cadvisor such as container_memory_usage_bytes. This also improves data compression and query performance over metrics collected from cadvisor. See more details here.
SECURITY: upgrade Go builder from Go1.20.6 to Go1.21.0 in order to fix this issue.
SECURITY: upgrade base docker image (Alpine) from 3.18.2 to 3.18.3. See alpine 3.18.3 release notes.
FEATURE: MetricsQL: add share_eq_over_time(m[d], eq) function for calculating the share (in the range [0...1]) of raw samples on the given lookbehind window d, which are equal to eq. See this feature request. Thanks to @Damon07 for the pull request.
FEATURE: vmauth: allow configuring deadline for a backend to be excluded from the rotation on errors via -failTimeout cmd-line flag. This feature could be useful when it is expected for backends to be not available for significant periods of time. See this issue for details. Thanks to @SunKyu for the pull request.
FEATURE: vmalert: remove the -rule.configCheckInterval command-line flag, which was deprecated in v1.61.0. Use -configCheckInterval command-line flag instead.
FEATURE: vmalert: remove support of deprecated web links of /api/v1/<groupID>/<alertID>/status form in favour of /api/v1/alerts?group_id=<>&alert_id=<> links. Links of /api/v1/<groupID>/<alertID>/status form were deprecated in v1.79.0. See this issue for details.
FEATURE: vmctl: allow disabling binary export API protocol via -vm-native-disable-binary-protocol cmd-line flag when migrating data from VictoriaMetrics. Disabling binary protocol can be useful for deduplication of the exported data before ingestion. For this, deduplication needs to be configured at -vm-native-src-addr side and -vm-native-disable-binary-protocol should be set on vmctl side.
FEATURE: vmctl: add support of week step for time-based chunking migration. See this issue.
FEATURE: vmctl: allow specifying custom full url at --remote-read-src-addr command-line flag if --remote-read-disable-path-append command-line flag is set. This allows importing data from urls, which do not end with /api/v1/read. For example, from Promscale. See this issue.
FEATURE: vmui: add warning in query field of vmui for partial data responses. See this issue.
FEATURE: vmui: allow displaying the full error message on click for trimmed error messages in vmui. See this issue.
FEATURE: Official Grafana dashboards for VictoriaMetrics: add Concurrent inserts panel to vmagent’s dashboard. The new panel is supposed to show whether the number of concurrent inserts processed by vmagent isn’t reaching the limit.
FEATURE: Official Grafana dashboards for VictoriaMetrics: add panels for absolute Mem and CPU usage by vmalert. See related issue here.
FEATURE: Official Grafana dashboards for VictoriaMetrics: correctly calculate Bytes per point value for single-server and cluster VM dashboards. Before, the calculation mistakenly accounted for the number of entries in indexdb in denominator, which could have shown lower values than expected.
FEATURE: Alerting rules for VictoriaMetrics: ConcurrentFlushesHitTheLimit alerting rule was moved from single-server and cluster alerts to the list of “health” alerts as it could be related to many VictoriaMetrics components.
BUGFIX: storage: properly set next retention time for indexDB. Previously it could enter an endless retention loop. See this issue for details.
BUGFIX: vmagent: return human readable error if opentelemetry has json encoding. Follow-up after PR.
BUGFIX: vmagent: properly validate scheme for proxy_url field at the scrape config. See this issue for details.
BUGFIX: vmagent: properly apply if filters during relabeling. Previously the if filter could work improperly. See this issue and this pull request.
BUGFIX: vmagent: use local scrape timestamps for the scraped metrics unless honor_timestamps: true option is explicitly set at scrape_config. This fixes gaps for metrics collected from cadvisor or similar exporters, which export metrics with invalid timestamps. See this issue and this comment for details. The issue has been introduced in v1.68.0.
BUGFIX: vmagent: fix runtime panic at OpenTelemetry parser. OpenTelemetry format allows histograms without sum fields. Such histograms are converted to counters with _count suffix. See this issue.
BUGFIX: vmagent: keep unmatched series at stream aggregation when -remoteWrite.streamAggr.dropInput is set to false to match intended behaviour introduced at v1.92.0. See this issue.
BUGFIX: vmalert: properly set vmalert_config_last_reload_successful value on configuration updates or rollbacks. The bug was introduced in v1.92.0 in this PR.
BUGFIX: vmalert: fix vmalert_remotewrite_send_duration_seconds_total value. Before, it didn’t account for the real time spent on remote write requests. See this pr for details.
BUGFIX: vmbackupmanager: fix panic when creating a backup to a local filesystem on Windows. See this issue.
BUGFIX: vmui: properly handle client address with X-Forwarded-For part at the Active queries page. See this comment.
BUGFIX: MetricsQL: prevent from panic when the lookbehind window in square brackets of rollup function is parsed into negative value. See this issue.
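The share_eq_over_time(m[d], eq) function added in this release can be pictured with a short sketch. It assumes the raw sample values on the lookbehind window are already collected into a list; the Python function below is only an illustration of the calculation described in the entry, not the MetricsQL implementation, and returning None for an empty window is an assumption.

```python
def share_eq_over_time(samples, eq):
    """Return the share (in the range 0..1) of samples equal to eq, or None if the window is empty."""
    if not samples:
        return None
    equal_count = sum(1 for value in samples if value == eq)
    return equal_count / len(samples)


# Three of the four raw samples on the window are equal to 1.
print(share_eq_over_time([1, 0, 1, 1], 1))  # 0.75
```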
v1.92.1 #
Released at 2023-07-28
- BUGFIX: vmalert: revert unit test feature for alerting and recording rules introduced in this pull request. See the following change.
v1.92.0 #
Released at 2023-07-27
Update note: this release contains backwards-incompatible change to indexdb, so rolling back to the previous versions of VictoriaMetrics may result in partial data loss of entries in indexdb.
Update note: starting from this release, stream aggregation writes the following samples to the configured remote storage by default:
- aggregated samples;
- the original input samples, which match zero match options from the provided config.
Previously only aggregated samples were written to the storage by default. The previous behavior can be restored in the following ways:
- by passing -streamAggr.dropInput command-line flag to single-node VictoriaMetrics;
- by passing -remoteWrite.streamAggr.dropInput command-line flag per each configured -remoteWrite.streamAggr.config at vmagent.
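The update note above can be summarized with a simplified sketch. It assumes samples are (name, value) pairs, match options are plain predicates over the metric name, and the only aggregation is a sum written under a hypothetical "<name>:sum" output name. Real stream aggregation configs are richer; this only shows which samples reach the output with and without the dropInput behavior.

```python
from collections import defaultdict


def stream_aggregate(samples, matchers, drop_input=False):
    """Return aggregated samples plus, unless drop_input is set, unmatched input samples."""
    sums = defaultdict(float)
    passthrough = []
    for name, value in samples:
        if any(match(name) for match in matchers):
            # Matched samples are aggregated (hypothetical output naming).
            sums[name + ":sum"] += value
        elif not drop_input:
            # Samples matching zero match options are written as-is by default.
            passthrough.append((name, value))
    return sorted(sums.items()) + passthrough


samples = [("http_requests", 1.0), ("http_requests", 2.0), ("cpu", 5.0)]
matchers = [lambda name: name.startswith("http_")]
print(stream_aggregate(samples, matchers))
# [('http_requests:sum', 3.0), ('cpu', 5.0)]
print(stream_aggregate(samples, matchers, drop_input=True))
# [('http_requests:sum', 3.0)]
```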
SECURITY: upgrade base docker image (alpine) from 3.18.0 to 3.18.2. See alpine 3.18.2 release notes.
SECURITY: upgrade Go builder from Go1.20.5 to Go1.20.6. See the list of issues addressed in Go1.20.6.
FEATURE: reduce memory usage by up to 5x for setups with high churn rate and long retention. See the description for this change and this issue for details.
FEATURE: reduce spikes in CPU and disk IO usage during indexdb rotation (aka inverted index), which is performed once per -retentionPeriod. The new algorithm gradually pre-populates newly created indexdb during the last hour before the rotation. The number of pre-populated series in the newly created indexdb can be monitored via vm_timeseries_precreated_total metric. This should resolve this issue.
FEATURE: MetricsQL: allow selecting time series matching at least one of multiple or filters. For example, {env="prod",job="a" or env="dev",job="b"} selects series with either {env="prod",job="a"} or {env="dev",job="b"} labels. This functionality allows passing the selected series to rollup functions without the need to use subqueries. See these docs.
FEATURE: MetricsQL: add ability to preserve metric names for binary operation results via keep_metric_names modifier. For example, ({__name__=~"foo|bar"} / 10) keep_metric_names leaves foo and bar metric names in division results. See these docs. This helps to address issues like this one.
FEATURE: MetricsQL: add ability to copy all the labels from one side of many-to-one operations by specifying * inside group_left() or group_right(). Also allow adding a prefix for copied label names via group_left(*) prefix "..." syntax. For example, the following query copies Kubernetes namespace labels to kube_pod_info series and adds ns_ prefix for the copied label names: kube_pod_info * on(namespace) group_left(*) prefix "ns_" kube_namespace_labels. The labels from on() list aren’t prefixed. This feature resolves this and that questions at StackOverflow.
FEATURE: MetricsQL: add ability to specify durations via WITH templates. Examples:
- WITH (w = 5m) m[w] is automatically transformed to m[5m]
- WITH (f(window, step, off) = m[window:step] offset off) f(5m, 10s, 1h) is automatically transformed to m[5m:10s] offset 1h
Thanks to @lujiajing1126 for the initial idea and implementation. See this feature request.
FEATURE: vmui: added a new page with the list of currently running queries. See this issue and these docs.
FEATURE: vmagent: add support for data ingestion via OpenTelemetry protocol. See these docs, this feature request and this pull request.
FEATURE: vmagent: allow sharding outgoing time series among the configured remote storage systems. This can be useful for building horizontally scalable stream aggregation, when samples for the same time series must be aggregated by the same vmagent instance at the second level. See these docs and this feature request for details.
FEATURE: vmagent: allow configuring staleness interval in stream aggregation config. See this issue for details.
FEATURE: vmagent: allow specifying a list of series selectors inside if option of relabeling rules. The corresponding relabeling rule is executed when at least a single series selector matches. See these docs.
FEATURE: stream aggregation: allow specifying a list of series selectors inside match option of stream aggregation configs. The input sample is aggregated when at least a single series selector matches. See this feature request.
FEATURE: stream aggregation: preserve input samples, which match zero match options from the configured aggregations. Previously all the input samples were dropped by default, so only the aggregated samples were written to the output storage. The previous behavior can be restored by passing -streamAggr.dropInput command-line flag to single-node VictoriaMetrics or by passing -remoteWrite.streamAggr.dropInput command-line flag to vmagent.
FEATURE: vmctl: add verbose output for docker installations or when TTY isn’t available. See this issue.
FEATURE: vmctl: interrupt backoff retries when import process is cancelled. The change makes vmctl more responsive in case of errors during the import. See this pull request.
FEATURE: vmctl: update backoff policy on retries to reduce probability of overloading for source or destination databases. See this issue.
FEATURE: vmstorage: suppress “broken pipe” and “connection reset by peer” errors for search queries on vmstorage side. See this and this commits.
FEATURE: Official Grafana dashboards for VictoriaMetrics: add panel for tracking rate of syscalls while writing or reading from disk via process_io_(read|write)_syscalls_total metrics.
FEATURE: accept timestamps in milliseconds at start, end and time query args in Prometheus querying API. See these docs and this feature request.
FEATURE: vmalert: update retry policy for pushing data to -remoteWrite.url. By default, vmalert will make multiple retry attempts with exponential delay. The total time spent during retry attempts shouldn’t exceed -remoteWrite.retryMaxTime (default is 30s). When retry time is exceeded vmalert drops the data dedicated for -remoteWrite.url. Before, vmalert dropped data after 5 retry attempts with 1s delay between attempts (not configurable). See -remoteWrite.retryMinInterval and -remoteWrite.retryMaxTime cmd-line flags.
FEATURE: vmalert: expose vmalert_remotewrite_send_duration_seconds_total counter, which can be used for determining high saturation of every connection to remote storage with an alerting query sum(rate(vmalert_remotewrite_send_duration_seconds_total[5m])) by(job, instance) > 0.9 * max(vmalert_remotewrite_concurrency) by(job, instance). This query triggers when a connection is saturated by more than 90%. This usually means that -remoteWrite.concurrency command-line flag must be increased in order to increase the number of concurrent writings into remote endpoint. See this feature request.
FEATURE: vmalert: display the error message received during unsuccessful config reload in vmalert’s UI. See this issue for details.
FEATURE: vmalert: allow disabling of step param attached to instant queries. This might be useful for using vmalert with datasources that do not support this param, unlike VictoriaMetrics. See this issue for details.
FEATURE: vmalert: support option for “blackholing” alerting notifications if -notifier.blackhole cmd-line flag is set. Enable this flag if you want vmalert to evaluate alerting rules without sending any notifications to external receivers (e.g. alertmanager). See this issue for details. Thanks to @venkatbvc for the pull request.
FEATURE: vmalert: add unit test for alerting and recording rules, see more details here. Thanks to @Haleygo for the pull request.
FEATURE: vmalert: allow overriding default GET params for rules with graphite datasource type, in the same way as it happens for prometheus type. See this issue.
FEATURE: vmalert: support keep_firing_for field for alerting rules. See docs updated here and this issue. Thanks to @Haleygo for the pull request.
FEATURE: vmauth: expose vmauth_user_request_duration_seconds and vmauth_unauthorized_user_request_duration_seconds summary metrics for measuring requests latency per user.
FEATURE: vmbackup: show backup progress percentage in log during backup uploading. See this issue.
FEATURE: vmrestore: show restoring progress percentage in log during backup downloading. See this issue.
FEATURE: add ability to fine-tune Graphite API limits via the following command-line flags:
- -search.maxGraphiteTagKeys for limiting the number of tag keys returned from Graphite API for tags
- -search.maxGraphiteTagValues for limiting the number of tag values returned from Graphite API for tag values
- -search.maxGraphiteSeries for limiting the number of series (aka paths) returned from Graphite API for series
See this issue.
BUGFIX: properly return series from /api/v1/series if it finds more than the limit series (limit is an optional query arg passed to this API). Previously the limit exceeded error was returned in this case. See this issue.
BUGFIX: vmui: fix application routing issues and problems with manual URL changes. See this pull request and this issue.
BUGFIX: add validation for invalid partial RFC3339 timestamp formats in query and export APIs.
BUGFIX: vmctl: interrupt explore procedure in influx mode if vmctl found no numeric fields.
BUGFIX: vmctl: fix panic in case --remote-read-filter-time-start flag is not set for remote-read mode. This flag is now required to use remote-read mode. See this issue.
BUGFIX: vmctl: fix formatting issue, which could add superfluous s characters at the end of samples/s output during data migration. For example, it could write samples/ssssss. See this issue.
BUGFIX: vmalert: use RFC3339 time format in query args instead of unix timestamp for all issued queries to Prometheus-like datasources.
BUGFIX: vmalert: correctly calculate evaluation time for rules. Before, there was a low probability for discrepancy between actual time and rules evaluation time if evaluation interval was lower than the execution time for rules within the group.
BUGFIX: vmalert: reset evaluation timestamp after modifying group interval. Before, rule evaluation could be delayed after such a change.
BUGFIX: vmselect: fix timestamp alignment for Prometheus querying API if time argument is less than 10m from the beginning of Unix epoch.
BUGFIX: vmagent: close HTTP connections to service discovery servers when they are no longer needed. This should prevent from possible connection exhaustion in some cases. See this issue.
BUGFIX: vmagent: do not show relabel debug links at the /targets page when vmagent runs with -promscrape.dropOriginalLabels command-line flag, since it doesn’t have the original labels needed for relabel debug. See this issue.
BUGFIX: vminsert: fixed decoding of label values with slash when accepting data via pushgateway protocol. This fixes Prometheus golang client compatibility. See this issue.
BUGFIX: MetricsQL: properly parse binary operations with reserved words on the right side such as foo + (on{bar="baz"}). Previously such queries could lead to panic. See this issue.
BUGFIX: Official Grafana dashboards for VictoriaMetrics: display cache usage for all components on panel Cache usage % by type for cluster dashboard. Before, only vmstorage caches were shown.
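The vmalert retry policy introduced in this release (exponential delay between attempts, bounded by -remoteWrite.retryMaxTime) can be sketched as a delay schedule. The doubling factor and the rule for cutting off the last attempt are assumptions made for illustration; only the existence of a minimum interval and a maximum total retry time comes from the entry above.

```python
def retry_delays(min_interval: float, max_time: float) -> list:
    """Return the delays before each retry so that their sum doesn't exceed max_time."""
    delays = []
    total = 0.0
    delay = min_interval
    while total + delay <= max_time:
        delays.append(delay)
        total += delay
        delay *= 2  # assumption: the delay doubles after every failed attempt
    return delays


# With a 1s starting interval and the default 30s limit mentioned above,
# the next delay (16s) would push the total past 30s, so retries stop after 15s.
print(retry_delays(1.0, 30.0))  # [1.0, 2.0, 4.0, 8.0]
```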
v1.91.3 #
Released at 2023-06-30
SECURITY: upgrade Go builder from Go1.20.4 to Go1.20.5. See the list of issues addressed in Go1.20.5.
BUGFIX: vmagent: fix possible panic at shutdown when stream aggregation is enabled. See this pull request for details.
BUGFIX: vmagent: fixed service name detection for consulagent service discovery in case of a difference in service name and service id. See this issue for details.
BUGFIX: vmalert: retry all errors except 4XX status codes while pushing via remote-write to the remote storage. Previously, errors like broken connection could prevent vmalert from retrying the request.
BUGFIX: vmalert: properly interrupt retry attempts on vmalert shutdown. Before, vmalert could have waited for all retries to finish for shutdown.
BUGFIX: vmbackupmanager: fix an issue with vmbackupmanager not being able to restore data from a backup stored in GCS. See this issue for details.
BUGFIX: VictoriaMetrics cluster: properly return error from /api/v1/query and /api/v1/query_range at vmselect when the -search.maxSamplesPerQuery or -search.maxSamplesPerSeries limit is exceeded. Previously incomplete response could be returned without the error if vmselect runs with -replicationFactor greater than 1. See this pull request.
BUGFIX: storage: prevent from possible crashloop after the migration from versions below v1.90.0 to newer versions. See this issue for details.
BUGFIX: vmui: fix a memory leak issue associated with chart updates. See this pull request.
BUGFIX: vmbackupmanager: fix removing storage data dir before restoring from backup.
BUGFIX: vmselect: wait for all vmstorage nodes to respond when the -replicationFactor flag is set to a value greater than 1. Before, vmselect could skip waiting for the slowest replicas to respond. This could have resulted in issues illustrated here. Now, this optimization is disabled by default and could be re-enabled by passing -search.skipSlowReplicas cmd-line flag to vmselect. See more details here.
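The vmalert remote-write retry rule from this release ("retry all errors except 4XX status codes") can be expressed as a small predicate. This is a sketch of the rule as stated in the entry, not vmalert's code; the function name is hypothetical, and None is used here to represent a transport-level failure such as a broken connection.

```python
from typing import Optional


def should_retry(status_code: Optional[int]) -> bool:
    """Decide whether a failed remote-write push is worth retrying."""
    if status_code is None:
        # No HTTP response at all (e.g. broken connection): retry.
        return True
    if 200 <= status_code < 300:
        # Successful push: nothing to retry.
        return False
    # 4XX means the request itself is rejected, so retrying won't help.
    return not (400 <= status_code < 500)


print(should_retry(None))  # True
print(should_retry(400))   # False
print(should_retry(503))   # True
```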
v1.91.2 #
Released at 2023-06-02
v1.91.1 #
Released at 2023-06-01
FEATURE: vmagent: Adds follow_redirects at service discovery level of scrape configuration. See this issue. Thanks to @Haleygo for the pull request.
FEATURE: vmselect: Decreases startup time for vmselect with a big number of vmstorage nodes. See this issue. Thanks to @Haleygo for the pull request.
BUGFIX: vmalert: Properly form path to static assets in WEB UI if http.pathPrefix is set. See this issue.
BUGFIX: vmalert: Properly set datasource query params. See this issue. Thanks to @gsakun for the pull request.
BUGFIX: vmalert: properly return empty slices instead of nil for /api/v1/rules for groups with present name but absent rules. See this issue.
BUGFIX: vmauth: Properly handle LOCAL command for proxy protocol. See this issue.
BUGFIX: vmbackupmanager: Fixes crash on startup. See this issue.
BUGFIX: vmui: fix bug with custom URL in global settings not respecting tenantID change. See this issue.
v1.91.0 #
Released at 2023-05-18
SECURITY: upgrade Go builder from Go1.20.3 to Go1.20.4. See the list of issues addressed in Go1.20.4.
SECURITY: serve /robots.txt content to disallow indexing of the exposed instances by search engines. See this issue for details.
FEATURE: update docker compose environment to V2 in respect to V1 deprecation notice from June 2023. See Migrate to Compose V2.
FEATURE: deprecate -bigMergeConcurrency command-line flag, since improper configuration for this flag frequently led to uncontrolled growth of unmerged parts, which, in turn, could lead to queries slowdown and increased CPU usage. The concurrency for background merges can be controlled via -smallMergeConcurrency command-line flag, though it isn’t recommended to change this flag in general case.
FEATURE: do not execute the incoming request if it has been canceled by the client before the execution start. See this pull request.
FEATURE: support time formats with timezones. For example, 2024-01-02+02:00 means January 2, 2024 at +02:00 time zone. See these docs.
FEATURE: expose process_* metrics at /metrics page of all the VictoriaMetrics components under Windows OS. See this pull request.
FEATURE: reduce the amounts of unimportant INFO logging during VictoriaMetrics startup / shutdown. This should improve visibility for potentially important logs.
FEATURE: upgrade base docker image (alpine) from 3.17.3 to 3.18.0. See alpine 3.18.0 release notes.
FEATURE: VictoriaMetrics cluster: do not pollute logs with
cannot read hello: cannot read message with size 11: EOF
messages atvmstorage
during TCP health checks performed by Consul or other services. See this issue.FEATURE: vmagent: support the ability to filter consul_sd_configs targets in more optimal way via new
filter
option. See this feature request.FEATURE: vmagent: add support for consulagent_sd_configs. See this feature request.
FEATURE: vmagent: emit a warning if too small value is passed to
-remoteWrite.maxDiskUsagePerURL
command-line flag. See this issue.FEATURE: vmalert: add support of recursive globs for
-rule
and-rule.templates
command-line flags by using**
in the glob pattern. See this issue.FEATURE: vmalert: add ability to specify custom per-group HTTP headers sent to the configured notifiers. See this issue. Thanks to @Haleygo for the pull request.
FEATURE: vmalert: detect alerting rules which don’t match any series. See these docs and this feature request.
FEATURE: vmalert: support loading rules via HTTP URL. See this issue. Thanks to @Haleygo for the pull request.
FEATURE: vmalert: add buttons for filtering groups/rules with errors or with no-match warning in web UI for page /groups. See this issue.
FEATURE: vmalert: do not retry remote-write requests for responses with 4XX status codes. This aligns with Prometheus remote write specification. Thanks to @MichaHoffmann for the pull request.
FEATURE: vmauth: add ability to filter incoming requests by IP. See these docs and this feature request.
FEATURE: vmauth: add ability to proxy requests to the specified backends for unauthorized users. See this feature request.
FEATURE: vmauth: add ability to specify default route for unmatched requests. See this feature request.
FEATURE: vmauth: retry POST requests on the remaining backends if the currently selected backend isn’t reachable. See this issue.
FEATURE: vmui: add ability to compare the data for the previous day with the data for the current day at Cardinality Explorer. See this feature request.
FEATURE: vmui: display histograms as heatmaps in Metrics explorer. See this feature request.
FEATURE: vmui: add WITH template playground. See this feature request.
FEATURE: vmui: add ability to debug relabeling. See this feature request.
FEATURE: vmui: add an ability to copy and execute queries listed at top queries page. Also make the query duration column more human-readable. See this feature request and this pull request.
FEATURE: vmui: increase default font size for better readability.
FEATURE: vmui: cardinality explorer: restore the table with labels containing the highest number of unique label values. See issue.
FEATURE: vmui: add notification icon for queries that do not match any time series. A warning icon appears next to the query field when the executed query does not match any time series. See this feature request.
FEATURE: vmbackup: add -s3StorageClass command-line flag for setting the storage class for AWS S3 backups. See this issue. Thanks to @justcompile for the pull request.
FEATURE: vmbackup: store backup creation and completion time in backup_complete.ignore file of backup contents. This allows determining the exact timestamp when the backup was created and completed.
FEATURE: vmbackupmanager: add created_at field to the output of /api/v1/backups API and vmbackupmanager backup list command. See this doc for data format details.
FEATURE: vmbackupmanager: add commands for locking/unlocking backups against deletion by retention policy. See this doc for data format details.
FEATURE: vmctl: add support for different time formats for --vm-native-filter-time-start and --vm-native-filter-time-end command-line flags. See this issue.
FEATURE: vmctl: set default value for --vm-native-step-interval command-line flag to month. This enables time-based chunking of data based on monthly step value when using native migration mode. See this issue.
BUGFIX: reduce the probability of sudden increase in the number of small parts on systems with small number of CPU cores.
BUGFIX: reduce the possibility of increased CPU usage when data with timestamps older than one hour is ingested into VictoriaMetrics. This reduces spikes for the graph sum(rate(vm_slow_per_day_index_inserts_total)). See this pull request.
BUGFIX: fix possible infinite loop during indexdb rotation when -retentionTimezoneOffset command-line flag is set and the local timezone is not UTC. See this issue. Thanks to @faceair for the fix.
BUGFIX: do not panic at Windows during snapshot deletion. Instead, delete the snapshot on the next restart. See this comment for details.
BUGFIX: change the max allowed value for -memory.allowedPercent from 100 to 200. See this issue.
BUGFIX: properly limit the number of OpenTSDB HTTP concurrent requests specified via -maxConcurrentInserts command-line flag. See this issue. Thanks to @zouxiang1993 for the fix.
BUGFIX: do not ignore trailing empty field in CSV lines when importing data in CSV format. See this issue.
BUGFIX: disallow " chars when parsing Prometheus label names, since they aren’t allowed by Prometheus text exposition format. Previously this could result in silent incorrect parsing of incorrect Prometheus labels such as foo{"bar"="baz"} or {foo:"bar",baz="aaa"}. See this issue.
BUGFIX: VictoriaMetrics cluster: prevent from possible panic when the number of vmstorage nodes increases when automatic vmstorage discovery is enabled.
BUGFIX: MetricsQL: fix a panic when the duration in the query contains uppercase M suffix. Such a suffix isn’t allowed to use in durations, since it clashes with a million suffix, e.g. it isn’t clear whether rate(metric[5M]) means rate over 5 minutes, 5 months or 5 million seconds. See this and this issues.
BUGFIX: vmagent: properly handle the vm_promscrape_config_last_reload_successful metric after config reload. See this issue.
BUGFIX: vmagent: add __meta_kubernetes_endpoints_name label for all ports discovered from endpoint. Previously, ports not matched by Service did not have this label. See this issue for details. Thanks to @thunderbird86 for discovering and fixing the issue.
BUGFIX: vmalert: retry failed read request on the closed connection one more time. This improves rules execution reliability when connection between vmalert and datasource closes unexpectedly.
BUGFIX: vmalert: properly display an error when using query function for templating value of -external.alert.source flag. See this issue.
BUGFIX: vmalert: properly return empty slices instead of nil for /api/v1/rules and /api/v1/alerts API handlers. See this issue.
BUGFIX: vmauth: do not return invalid auth credentials in http response by default, since it may be logged by client. See this issue.
BUGFIX: vmui: fix the display of the tenant selector. See this issue.
BUGFIX: vmui: fix UI freeze when the query returns non-histogram series alongside histogram series.
BUGFIX: vmui: fix the text display on buttons in Safari 16.4.
BUGFIX: alerts-health: update threshold for TooHighMemoryUsage alert from 90% to 80%, since 90% is too high for production environments.
BUGFIX: vmbackup: fix compatibility with Windows OS. See this issue.
BUGFIX: vmctl: fix performance issue when migrating data from VictoriaMetrics according to these docs. Add the ability to speed up the data migration via --vm-native-disable-retries command-line flag. See this issue.
BUGFIX: stream aggregation: fix bug with duplicated labels during stream aggregation via single-node VictoriaMetrics. See this issue.
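To illustrate the time format with timezone listed among the features above (2024-01-02+02:00), the following Python sketch shows which instant such a value denotes. It relies on Python's own datetime parsing, not on VictoriaMetrics code, and the function name to_unix_seconds is hypothetical.

```python
from datetime import datetime


def to_unix_seconds(value: str) -> int:
    """Interpret a YYYY-MM-DD+HH:MM value as midnight at the given UTC offset."""
    # %z accepts offsets written with a colon (e.g. +02:00) in Python 3.7 and newer.
    parsed = datetime.strptime(value, "%Y-%m-%d%z")
    # The parsed datetime is timezone-aware, so timestamp() is offset-correct.
    return int(parsed.timestamp())


print(to_unix_seconds("2024-01-02+02:00"))
```

Midnight at +02:00 is two hours earlier than midnight UTC, so the value above maps to 22:00 UTC on January 1, 2024.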
v1.90.0 #
Released at 2023-04-06
Update note: this release contains backwards-incompatible change in storage data format, so the previous versions of VictoriaMetrics will exit with the "unexpected number of substrings in the part name" error when trying to run them on the data created by v1.90.0 or newer versions. The solution is to upgrade to v1.90.0 or newer releases.
SECURITY: upgrade base docker image (alpine) from 3.17.2 to 3.17.3. See alpine 3.17.3 release notes.
SECURITY: upgrade Go builder from Go1.20.2 to Go1.20.3. See the list of issues addressed in Go1.20.3.
FEATURE: open source Graphite Render API. This API allows using VictoriaMetrics as a drop-in replacement for Graphite at both data ingestion and querying sides and reducing infrastructure costs by up to 10x compared to Graphite. See this case study as an example.
FEATURE: release Windows binaries for single-node VictoriaMetrics, VictoriaMetrics cluster, vmbackup and vmrestore. See this, this and this issues. This release of VictoriaMetrics for Windows cannot delete snapshots due to Windows constraints. See this comment for details. This issue should be resolved in future releases.
FEATURE: log metrics with truncated labels if the length of label value in the ingested metric exceeds -maxLabelValueLen. This should simplify debugging for this case.
FEATURE: vmagent: show target URL when debugging target relabeling. This should simplify target relabel debugging a bit. See this pull request.
FEATURE: vmagent: add support for VictoriaMetrics remote write protocol when sending / receiving data to / from Kafka. This protocol allows saving egress network bandwidth costs when sending data from vmagent to Kafka located in another datacenter or availability zone. See this feature request.
FEATURE: vmagent: add -kafka.consumer.topic.concurrency command-line flag. It controls the number of Kafka consumer workers to use by vmagent. It should eliminate the need to start multiple vmagent instances to improve data transfer rate. See this feature request.
FEATURE: vmagent: add support for Kafka producer and consumer on arm64 machines. See this issue.
FEATURE: vmagent: delete unused buffered data at -remoteWrite.tmpDataPath directory when there is no matching -remoteWrite.url to send this data to. See this feature request.
FEATURE: vmagent: add the ability for hot reloading of stream aggregation configs. See these docs and this feature request.
FEATURE: check the contents of -relabelConfig and -streamAggr.config files additionally to -promscrape.config when single-node VictoriaMetrics runs with -dryRun command-line flag. This aligns the behaviour of single-node VictoriaMetrics with vmagent behaviour for -dryRun command-line flag.
FEATURE: vmui: automatically draw a heatmap graph when the query selects a single histogram. This simplifies analyzing histograms. See this feature request.
FEATURE: vmui: add support for drag’n’drop and paste from clipboard in the “Trace analyzer” page. See this pull request.
FEATURE: vmui: hide messages longer than 3 lines in the trace. You can view the full message by clicking on the show more button. See this pull request.
FEATURE: vmui: add the ability to manually input date and time when selecting a time range. See this pull request.
FEATURE: vmui: improve usability and the search process in cardinality explorer, making it more straightforward for the user. See this pull request.
FEATURE: vmui: add the ability to collapse/expand the legend. See this pull request.
FEATURE: vmui: add tips for working with the graph and legend. See this pull request.
FEATURE: vmui: add apply and cancel buttons to settings popup. See this issue.
FEATURE: vmctl: automatically disable progress bar when TTY isn’t available. See this issue.
FEATURE: vmauth: add -configCheckInterval command-line flag, which can be used for automatic re-reading the -auth.config file. See this feature request.
BUGFIX: prevent slow snapshot creation under high data ingestion rate. See this issue.
BUGFIX: vmauth: suppress proxy protocol parsing errors in case of EOF. Usually, the error is caused by health checks and is not a sign of an actual error.
BUGFIX: vmui: fix displaying errors for each query. See this issue.
BUGFIX: vmbackup: fix snapshot not being deleted in case of error during backup. See this issue.
BUGFIX: stream aggregation: suppress "series after dedup" error message in logs when -remoteWrite.streamAggr.dedupInterval command-line flag is set at vmagent or when -streamAggr.dedupInterval command-line flag is set at single-node VictoriaMetrics.
BUGFIX: allow using dashes and dots in environment variables names referred in config files via %{ENV-VAR.SYNTAX}. See these docs and this issue.
BUGFIX: restore query performance scalability on hosts with big number of CPU cores. The scalability has been reduced in v1.86.0. See this issue.
BUGFIX: MetricsQL: properly convert VictoriaMetrics histogram buckets to Prometheus histogram buckets when VictoriaMetrics histogram contains zero buckets. Previously these buckets were ignored, and this could lead to missing Prometheus histogram buckets after the conversion. Thanks to @zklapow for the fix.
BUGFIX: vmagent: fix CPU and memory usage spikes when files pointed by file_sd_config cannot be re-read. See this issue.
BUGFIX: prevent unexpected merges on start-up when -storage.minFreeDiskSpaceBytes is set. See the issue.
BUGFIX: properly support comma-separated filters inside retention filters. See this issue.
BUGFIX: verify response code when fetching configuration files via HTTP. See this issue.
BUGFIX: vmalert: replace empty labels with "" instead of "<no value>" during templating, as Prometheus does. See this issue.
BUGFIX: vmctl: properly pass multiple filters from --vm-native-filter-match command-line flag to the data source. Previously filters from --vm-native-filter-match were only used to discover the metric names, and only the metric names like __name__="metric_name" were taken into account, while the remaining filters were ignored. For example, --vm-native-filter-match={foo="bar",baz="abc"} could find metric_name{foo="bar",baz="abc"}, and the filter was then treated as --vm-native-filter-match={__name__="metric_name"}, i.e. the foo="bar",baz="abc" filter was ignored. See this issue.
v1.89.1 #
Released at 2023-03-12
- BUGFIX: prevent from possible "cannot unmarshal timeseries from rollupResultCache" panic after the upgrade to v1.89.0.
v1.89.0 #
Released at 2023-03-12
Update note: this release can crash with "cannot unmarshal timeseries from rollupResultCache" panic after the upgrade from the previous releases.
This issue can be fixed by removing caches stored on disk according to these docs.
Another option is to upgrade to v1.89.1.
SECURITY: upgrade Go builder from Go1.20.1 to Go1.20.2. See the list of issues addressed in Go1.20.2.
FEATURE: vmctl: increase the default value for --remote-read-http-timeout command-line option from 30s (30 seconds) to 5m (5 minutes). This reduces the probability of timeout errors when migrating big number of time series. See this pull request.
FEATURE: vmctl: migrate series one-by-one in vm-native mode. This allows better tracking the migration progress and resuming the migration process from the last migrated time series. See this pull request and this feature request.
FEATURE: vmctl: add --vm-native-src-headers and --vm-native-dst-headers command-line flags, which can be used for setting custom HTTP headers during vm-native migration mode. Thanks to @baconmania for the pull request.
FEATURE: vmctl: add --vm-native-src-bearer-token and --vm-native-dst-bearer-token command-line flags, which can be used for setting Bearer token headers for the source and the destination storage during vm-native migration mode. See this feature request.
FEATURE: vmctl: add --vm-native-disable-http-keep-alive command-line flag to allow vmctl to use non-persistent HTTP connections in vm-native migration mode. Thanks to @baconmania for the pull request.
FEATURE: vmalert: log number of configuration files found for each specified -rule command-line flag.
FEATURE: vmalert enterprise: concurrently read config files from S3, GCS or S3-compatible object storage. This significantly improves config load speed for cases when there are thousands of files to read from the object storage.
BUGFIX: vmstorage: fix a bug, which could lead to incomplete or empty results for heavy queries selecting tens of thousands of time series. See this pull request.
BUGFIX: vmselect: reduce memory usage and CPU usage when performing heavy queries. See this issue.
BUGFIX: prevent from possible "invalid memory address or nil pointer dereference" panic during background merge. The issue has been introduced at v1.85.0. See this issue.
BUGFIX: prevent from possible SIGBUS crash on ARM architectures (Raspberry Pi), which deny unaligned access to 8-byte words. Thanks to @oliverpool for narrowing down the issue and for the initial attempt to fix it.
BUGFIX: VictoriaMetrics cluster: always return is_partial: true in partial responses. Previously partial responses could be returned as non-partial in some cases.
BUGFIX: VictoriaMetrics cluster: properly take into account -rpc.disableCompression command-line flag at vmstorage. It was ignored since v1.78.0. See this pull request.
BUGFIX: vmagent: fix panic when writing data to Kafka. The panic has been introduced in v1.88.0.
BUGFIX: vmui: stop showing "Please enter a valid Query and execute it" error message on the first load of vmui.
BUGFIX: vmui: properly process Run in VMUI button click in VictoriaMetrics datasource plugin for Grafana.
BUGFIX: vmui: fix the display of the selected value for dropdowns on Explore page.
BUGFIX: vmui: do not send step param for instant queries. See this issue.
BUGFIX: vmauth: fix "cannot serve http" panic when plain HTTP request is sent to vmauth configured to accept requests over proxy protocol-encoded request (e.g. when vmauth runs with -httpListenAddr.useProxyProtocol command-line flag). The issue has been introduced at v1.87.0 when implementing this feature.
BUGFIX: vmgateway: properly parse RSA public key discovered via JWK endpoint.
v1.88.1 #
Released at 2023-02-27
FEATURE: add -snapshotCreateTimeout flag to allow configuring timeout for snapshot process. See this issue.
FEATURE: expose vm_http_requests_total and vm_http_request_errors_total metrics for snapshot/* paths at VictoriaMetrics cluster vmstorage and VictoriaMetrics Single. See this issue.
FEATURE: vmgateway: add the ability to discover keys for JWT verification via OpenID discovery endpoint. See these docs.
FEATURE: add -internStringDisableCache command-line flag for disabling the cache for interned strings. This flag may be useful in some cases for reducing memory usage at the cost of higher CPU usage.
FEATURE: add -internStringCacheExpireDuration command-line flag for controlling the lifetime of cached interned strings.
BUGFIX: MetricsQL: fix panic when executing the query aggr_func(rollup*(some_value)). The panic has been introduced in v1.88.0.
BUGFIX: vmagent: use the provided -remoteWrite.* auth options when determining whether the remote storage supports VictoriaMetrics remote write protocol. Previously the auth options were ignored. This was preventing from automatic switch to VictoriaMetrics remote write protocol.
BUGFIX: vmagent: do not register vm_promscrape_config_* metrics if -promscrape.config flag is not used. Previously those metrics were registered and never updated, which was confusing and could trigger false-positive alerts.
BUGFIX: vmctl: skip measurements with no fields when migrating data from influxdb. See this issue.
BUGFIX: delete failed snapshot contents from disk on failed attempt to create snapshot. Previously failed snapshot contents could remain on disk in incomplete state. See this issue.
v1.88.0 #
Released at 2023-02-24
SECURITY: upgrade base docker image (alpine) from 3.17.1 to 3.17.2. See alpine 3.17.2 release notes.
SECURITY: upgrade Go builder from Go1.20.0 to Go1.20.1. See the list of issues addressed in Go1.20.1.
FEATURE: vmagent: add support for VictoriaMetrics remote write protocol. This protocol allows saving egress network bandwidth costs when sending data from vmagent to VictoriaMetrics located in another datacenter or availability zone. This also allows reducing disk IO under high load when vmagent starts queuing the collected data to disk when the remote storage is temporarily unavailable or cannot keep up with the data ingestion rate. See this feature request.
FEATURE: vmagent: add support for Kuma Control Plane targets discovery aka kuma_sd_configs. See this issue.
FEATURE: vmgateway: add the ability to verify JWT signature via JWKS endpoint. See these docs.
FEATURE: vmauth: add the ability to limit the number of concurrent requests on a per-user basis via -maxConcurrentPerUserRequests command-line flag and via max_concurrent_requests config option. See this feature request and these docs.
FEATURE: vmauth: automatically retry failing GET requests on all the configured backends. Previously the backend error has been immediately returned to the client without retrying the request on the remaining backends.
FEATURE: vmauth: choose the backend with the minimum number of concurrently executed requests among the configured backends in a round-robin manner for serving the incoming requests. This allows spreading the load among backends more evenly, while improving the response time.
FEATURE: vmalert enterprise: add ability to read alerting and recording rules from S3, GCS or S3-compatible object storage. See these docs.
FEATURE: vmctl: automatically retry requests to remote storage if up to 5 errors occur during the data migration process. This should help continuing the data migration process on temporary errors. Previously vmctl was stopping after the first error. See this feature request.
FEATURE: MetricsQL: support optional 2nd argument min, max or avg for rollup, rollup_delta, rollup_deriv, rollup_increase, rollup_rate and rollup_scrape_interval function. If the second argument is passed, then the function returns only the selected aggregation type. This change can be useful for situations where only one type of rollup calculation is needed. For example, rollup_rate(requests_total[1i], "max") would return only the max increase rates for requests_total metric per each interval between adjacent points on the graph. See this article for details.
FEATURE: MetricsQL: support optional 2nd argument open, low, high, close for rollup_candlestick function. If the second argument is passed, then the function returns only the selected aggregation type.
FEATURE: MetricsQL: add mad_over_time(m[d]) function for calculating the median absolute deviation over raw samples on the lookbehind window d. See this feature request.
FEATURE: MetricsQL: add range_mad(q) function for calculating the median absolute deviation over points per each time series returned by q.
FEATURE: MetricsQL: add range_zscore(q) function for calculating z-score over points per each time series returned from q.
FEATURE: MetricsQL: add range_trim_outliers(k, q) function for dropping outliers located farther than k*range_mad(q) from the range_median(q). This should help removing outliers during query time at this issue.
FEATURE: MetricsQL: add range_trim_zscore(z, q) function for dropping outliers located farther than z*range_stddev(q) from range_avg(q). This should help removing outliers during query time at this issue.
FEATURE: vmui: show median instead of avg in graph tooltip and line legend, since median is more tolerant against spikes. See this issue.
FEATURE: add -search.maxSeriesPerAggrFunc command-line flag, which can be used for limiting the number of time series MetricsQL aggregate functions can return in a single query. This flag can be useful for preventing OOMs when count_values function is improperly used.
FEATURE: vmui: small UX improvements for mobile view. See this feature request and this pull request.
FEATURE: add -search.logQueryMemoryUsage command-line flag for logging queries, which need more memory than specified by this command-line flag. See this feature request. Thanks to @michal-kralik for the idea and the initial implementation.
FEATURE: allow setting zero value for -search.latencyOffset command-line flag. This may be needed in some cases. Previously the minimum supported value for -search.latencyOffset command-line flag was 1s.
BUGFIX: vmagent: immediately cancel in-flight scrape requests during configuration reload when stream parsing mode is disabled. Previously vmagent could wait for long time until all the in-flight requests are completed before reloading the configuration. This could significantly slow down configuration reload. See this issue.
BUGFIX: vmagent: do not wait for 2 seconds after the first unsuccessful attempt to scrape the target before performing the next attempt. This should improve scrape speed when the target closes http keep-alive connection between scrapes. See this and this issues.
BUGFIX: vmagent: fix Azure service discovery inside Azure Container App. See this issue. Thanks to @MattiasAng for the fix!
BUGFIX: do not put auxiliary directories scheduled for removal into snapshots. This should prevent from "cannot create hard links from ...must-remove..." errors when making snapshots / backups. See this issue.
BUGFIX: prevent from possible data ingestion slowdown and query performance slowdown during background merges of big parts on systems with small number of CPU cores (1 or 2 CPU cores). The issue has been introduced in v1.85.0 when implementing this feature. See also this issue.
BUGFIX: properly parse timestamps in milliseconds when ingesting data via OpenTSDB telnet put protocol. Previously timestamps in milliseconds were mistakenly multiplied by 1000. Thanks to @Droxenator for the pull request.
BUGFIX: MetricsQL: do not add extrapolated points outside the real points when using interpolate() function. See this issue.
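The outlier-related functions added in this release (range_mad, range_trim_outliers and range_trim_zscore) follow the definitions given in the feature list above. The Python sketch below mirrors those definitions on a plain list of points. It is not the MetricsQL implementation: the use of population standard deviation is an assumption, and dropped points are simply removed from the list, whereas a query engine would more likely leave gaps in the series.

```python
import statistics


def range_mad(points):
    """Median absolute deviation of the points around their median."""
    median = statistics.median(points)
    return statistics.median(abs(p - median) for p in points)


def range_trim_outliers(k, points):
    """Drop points located farther than k*MAD from the median."""
    median = statistics.median(points)
    limit = k * range_mad(points)
    return [p for p in points if abs(p - median) <= limit]


def range_trim_zscore(z, points):
    """Drop points located farther than z*stddev from the average."""
    avg = statistics.mean(points)
    # Assumption: population standard deviation over the visible points.
    limit = z * statistics.pstdev(points)
    return [p for p in points if abs(p - avg) <= limit]


points = [1, 2, 3, 4, 100]
print(range_mad(points))
print(range_trim_outliers(3, points))
print(range_trim_zscore(1.5, points))
```

The median-based variant tends to be the more robust of the two, since a single large spike shifts the average and the standard deviation but barely moves the median.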
v1.87.12 #
Released at 2023-12-10
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade base docker image (Alpine) from 3.18.4 to 3.19.0. See alpine 3.19.0 release notes.
SECURITY: upgrade Go builder from Go1.21.4 to Go1.21.5. See the list of issues addressed in Go1.21.5.
BUGFIX: vmalert: sanitize label names before sending the alert notification to Alertmanager. Before, vmalert would send notifications with labels containing characters not supported by Alertmanager validator, resulting in validation errors like msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar".
BUGFIX: properly escape < character in responses returned via /federate endpoint. See this issue.
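The label name sanitization mentioned in the vmalert bugfix above can be sketched as follows. This is a hypothetical illustration, not vmalert's code: it assumes that every character outside [a-zA-Z0-9_] is replaced with an underscore, and it ignores corner cases such as names starting with a digit or two names colliding after sanitization.

```python
import re

# Characters outside this set are not accepted in Prometheus-style label names.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def sanitize_label_name(name: str) -> str:
    """Replace every unsupported character in a label name with an underscore."""
    return _INVALID_CHARS.sub("_", name)


def sanitize_labels(labels: dict) -> dict:
    """Return a copy of labels with sanitized names and unchanged values."""
    return {sanitize_label_name(name): value for name, value in labels.items()}


print(sanitize_labels({"foo.bar": "baz", "job": "node"}))
```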
v1.87.11 #
Released at 2023-11-14
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade Go builder from Go1.21.3 to Go1.21.4. See the list of issues addressed in Go1.21.4.
BUGFIX: vmagent: properly apply relabeling with regex, which start and end with .+ or .* and which contain alternate sub-regexps. For example, .+;|;.+ or .*foo|bar|baz.*. Previously such regexps were improperly parsed, which could result in unexpected relabeling results. See this issue.
BUGFIX: fix panic, which could occur when query tracing is enabled. See this issue.
BUGFIX: vmstorage: log warning about switching to ReadOnly mode only on state change. Before, vmstorage would log this warning every 1s. See this issue for details.
v1.87.10 #
Released at 2023-10-16
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade Go builder from Go1.21.1 to Go1.21.3. See the list of issues addressed in Go1.21.2 and the list of issues addressed in Go1.21.3.
BUGFIX: storage: prevent from livelock when forced merge is called under high data ingestion. See this issue.
BUGFIX: Graphite Render API: correctly return null instead of Inf in JSON query responses. See this issue.
BUGFIX: vminsert: fix ingestion via multitenant url for opentsdbhttp. See this issue. The bug has been introduced in v1.87.8.
BUGFIX: vmagent: fix support of legacy DataDog agent, which adds trailing slashes to urls. See this issue. Thanks to @maxb for spotting the issue.
v1.87.9 #
Released at 2023-09-10
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade Go builder from Go1.21.0 to Go1.21.1. See the list of issues addressed in Go1.20.6.
BUGFIX: vminsert enterprise: properly parse /insert/multitenant/* urls, which have been broken since v1.93.2. See this issue.
BUGFIX: properly build production armv5 binaries for GOARCH=arm. This has been broken after the upgrading of Go builder to Go1.21.0. See this issue.
BUGFIX: vmselect: return 503 Service Unavailable status code when partial responses are denied and some of vmstorage nodes are temporarily unavailable. Previously 422 Unprocessable Entity status code was mistakenly returned in this case, which could prevent from automatic recovery by re-sending the request to healthy cluster replica in another availability zone.
BUGFIX: vmalert: fix the bug when Group’s params fields with multiple values were overriding each other instead of adding up. The bug was introduced in this commit starting from v1.87.7. See this issue.
v1.87.8 #
Released at 2023-09-01
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
- BUGFIX: build: fix Docker builds for old Docker releases. See this issue.
- BUGFIX: vmselect: correctly handle requests with /select/multitenant prefix. Such requests must be rejected since vmselect does not support multitenancy endpoint. Previously, such requests were causing panic. See this issue.
- BUGFIX: vminsert: properly check for read-only state at vmstorage. Previously it wasn’t properly checked, which could lead to increased resource usage and data ingestion slowdown when some of vmstorage nodes are in read-only mode. See this issue.
- BUGFIX: vminsert: properly close broken vmstorage connection during read-only state checks at vmstorage. Previously it wasn’t properly closed, which prevents restoring vmstorage node from read-only mode. See this issue.
- BUGFIX: vmstorage: prevent from breaking vmselect -> vmstorage RPC communication when vmstorage returns an empty label name at /api/v1/labels request. See this issue.
- BUGFIX: do not allow starting VictoriaMetrics components with improperly set boolean command-line flags in the form -boolFlagName value, since this leads to silent incomplete flags’ parsing. This form should be replaced with -boolFlagName=value. See this issue.
- BUGFIX: properly replace : chars in label names with _ when -usePromCompatibleNaming command-line flag is passed to vmagent, vminsert or single-node VictoriaMetrics. This addresses this comment.
- BUGFIX: vmbackup: correctly check if specified -dst belongs to specified -storageDataPath. See this issue.
v1.87.7 #
Released at 2023-08-12
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade Go builder from Go1.20.4 to Go1.21.0.
SECURITY: upgrade base docker image (Alpine) from 3.18.2 to 3.18.3. See alpine 3.18.3 release notes.
BUGFIX: vmselect: fix timestamp alignment for Prometheus querying API if time argument is less than 10m from the beginning of Unix epoch.
BUGFIX: vminsert: fixed decoding of label values with slash when accepting data via pushgateway protocol. This fixes Prometheus golang client compatibility. See this issue.
BUGFIX: vmagent: properly validate scheme for proxy_url field at the scrape config. See this issue for details.
BUGFIX: vmagent: close HTTP connections to service discovery servers when they are no longer needed. This should prevent from possible connection exhaustion in some cases. See this issue.
BUGFIX: vmagent: properly apply if filters during relabeling. Previously the if filter could improperly work. See this issue and this pull request.
BUGFIX: vmagent: fix possible panic at shutdown when stream aggregation is enabled. See this pull request for details.
BUGFIX: vmagent: use local scrape timestamps for the scraped metrics unless honor_timestamps: true option is explicitly set at scrape_config. This fixes gaps for metrics collected from cadvisor or similar exporters, which export metrics with invalid timestamps. See this issue and this comment for details.
BUGFIX: vmauth: Properly handle LOCAL command for proxy protocol. See this issue.
BUGFIX: VictoriaMetrics cluster: properly return error from /api/v1/query and /api/v1/query_range at vmselect when the -search.maxSamplesPerQuery or -search.maxSamplesPerSeries limit is exceeded. Previously incomplete response could be returned without the error if vmselect runs with -replicationFactor greater than 1. See this pull request.
BUGFIX: vmalert: correctly calculate evaluation time for rules. Before, there was a low probability for discrepancy between actual time and rules evaluation time if evaluation interval was lower than the execution time for rules within the group.
BUGFIX: vmalert: reset evaluation timestamp after modifying group interval. Previously the stale timestamp could delay rule evaluation.
BUGFIX: vmalert: Properly set datasource query params. See this issue. Thanks to @gsakun for the pull request.
BUGFIX: vmalert: Properly form path to static assets in WEB UI if `http.pathPrefix` is set. See this issue.
BUGFIX: vmalert: properly return empty slices instead of nil for `/api/v1/rules` for groups with present name but absent `rules`. See this issue.
BUGFIX: vmctl: interrupt explore procedure in influx mode if vmctl found no numeric fields.
BUGFIX: vmctl: fix panic in case `--remote-read-filter-time-start` flag is not set for remote-read mode. This flag is now required to use remote-read mode. See this issue.
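The vminsert Pushgateway fix above concerns label values that contain a slash. As a rough illustration only (not the VictoriaMetrics implementation), the sketch below follows the Pushgateway URL convention in which a label name carrying an `@base64` suffix has its value encoded with URL-safe base64, with optional padding. The function name `parse_pushgateway_labels` is hypothetical, and URL percent-decoding is left out for brevity.

```python
import base64


def parse_pushgateway_labels(path: str) -> dict:
    """Parse grouping labels from a Pushgateway-style URL path.

    Assumed shape: /metrics/job/<job>{/<label>/<value>}. A label name ending
    in "@base64" marks a value encoded with URL-safe base64.
    """
    prefix = "/metrics/"
    if not path.startswith(prefix):
        raise ValueError(f"unexpected path: {path!r}")
    parts = path[len(prefix):].split("/")
    if len(parts) % 2 != 0:
        raise ValueError("label names and values must come in pairs")
    labels = {}
    for name, value in zip(parts[0::2], parts[1::2]):
        if name.endswith("@base64"):
            name = name[: -len("@base64")]
            # Padding may be omitted by clients, so restore it before decoding.
            padded = value + "=" * (-len(value) % 4)
            value = base64.urlsafe_b64decode(padded).decode("utf-8")
        labels[name] = value
    return labels
```

For example, `parse_pushgateway_labels("/metrics/job/demo/path@base64/L3Zhci90bXA")` yields a `path` label with the value `/var/tmp`, which could not be expressed as a plain path segment.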
v1.87.6 #
Released at 2023-05-18
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade Go builder from Go1.20.3 to Go1.20.4. See the list of issues addressed in Go1.20.4.
SECURITY: upgrade base docker image (alpine) from 3.17.3 to 3.18.0. See alpine 3.18.0 release notes.
SECURITY: serve `/robots.txt` content to disallow indexing of the exposed instances by search engines. See this issue for details.
BUGFIX: reduce the probability of sudden increase in the number of small parts on systems with small number of CPU cores.
BUGFIX: reduce the possibility of increased CPU usage when data with timestamps older than one hour is ingested into VictoriaMetrics. This reduces spikes for the graph `sum(rate(vm_slow_per_day_index_inserts_total))`. See this pull request.
BUGFIX: do not ignore trailing empty field in CSV lines when importing data in CSV format. See this issue.
BUGFIX: disallow `"` chars when parsing Prometheus label names, since they aren’t allowed by Prometheus text exposition format. Previously this could result in silent incorrect parsing of incorrect Prometheus labels such as `foo{"bar"="baz"}` or `{foo:"bar",baz="aaa"}`. See this issue.
BUGFIX: MetricsQL: fix a panic when the duration in the query contains uppercase `M` suffix. Such a suffix isn’t allowed to use in durations, since it clashes with `a million` suffix, e.g. it isn’t clear whether `rate(metric[5M])` means rate over 5 minutes, 5 months or 5 million seconds. See this and this issues.
BUGFIX: VictoriaMetrics cluster: prevent from possible panic when the number of vmstorage nodes increases when automatic vmstorage discovery is enabled.
BUGFIX: properly limit the number of OpenTSDB HTTP concurrent requests specified via `-maxConcurrentInserts` command-line flag. See this issue. Thanks to @zouxiang1993 for the fix.
BUGFIX: vmalert: properly return empty slices instead of nil for `/api/v1/rules` and `/api/v1/alerts` API handlers. See this issue.
BUGFIX: vmagent: add `__meta_kubernetes_endpoints_name` label for all ports discovered from endpoint. Previously, ports not matched by `Service` did not have this label. See this issue for details. Thanks to @thunderbird86 for discovering and fixing the issue.
BUGFIX: fix possible infinite loop during `indexdb` rotation when `-retentionTimezoneOffset` command-line flag is set and the local timezone is not UTC. See this issue. Thanks to @faceair for the fix.
BUGFIX: vmauth: do not return invalid auth credentials in http response by default, since it may be logged by client. See this issue.
BUGFIX: alerts-health: update threshold for `TooHighMemoryUsage` alert from 90% to 80%, since 90% is too high for production environments.
BUGFIX: vmagent: properly handle the `vm_promscrape_config_last_reload_successful` metric after config reload. See this issue.
BUGFIX: stream aggregation: fix bug with duplicated labels during stream aggregation via single-node VictoriaMetrics. See this issue.
BUGFIX: stream aggregation: suppress `series after dedup` error message in logs when `-remoteWrite.streamAggr.dedupInterval` command-line flag is set at vmagent or when `-streamAggr.dedupInterval` command-line flag is set at single-node VictoriaMetrics.
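The label-name fix in this release can be pictured with a small validity check. The sketch below is not the VictoriaMetrics parser. It only applies the classic Prometheus label-name grammar `[a-zA-Z_][a-zA-Z0-9_]*`, under which names containing `"` or `:` are rejected instead of being parsed silently. The helper name `is_valid_label_name` is hypothetical.

```python
import re

# Classic Prometheus label-name grammar: a letter or underscore followed by
# letters, digits or underscores. Quotes and colons are therefore invalid.
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_label_name(name: str) -> bool:
    """Return True only if the whole string matches the label-name grammar."""
    return LABEL_NAME_RE.fullmatch(name) is not None
```

With this check, the label names taken from `foo{"bar"="baz"}` (`"bar"`) and from `{foo:"bar",baz="aaa"}` (`foo:`) are both rejected, while `baz` is accepted.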
v1.87.5 #
Released at 2023-04-06
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade base docker image (alpine) from 3.17.2 to 3.17.3. See alpine 3.17.3 release notes.
SECURITY: upgrade Go builder from Go1.20.2 to Go1.20.3. See the list of issues addressed in Go1.20.3.
BUGFIX: MetricsQL: properly convert VictoriaMetrics histogram buckets to Prometheus histogram buckets when the VictoriaMetrics histogram contains zero buckets. Previously these buckets were ignored, and this could lead to missing Prometheus histogram buckets after the conversion. Thanks to @zklapow for the fix.
BUGFIX: vmagent: fix CPU and memory usage spikes when files pointed by file_sd_config cannot be re-read. See this issue.
BUGFIX: prevent unexpected merges on start-up when `-storage.minFreeDiskSpaceBytes` is set. See the issue.
BUGFIX: properly support comma-separated filters inside retention filters. See this issue.
BUGFIX: verify response code when fetching configuration files via HTTP. See this issue.
v1.87.4 #
Released at 2023-03-25
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
- BUGFIX: prevent slow snapshot creation under high data ingestion rate. See this issue.
- BUGFIX: vmauth: suppress proxy protocol parsing errors in case of `EOF`. Usually, the error is caused by health checks and is not a sign of an actual error.
- BUGFIX: vmbackup: fix snapshot not being deleted in case of error during backup. See this issue.
- BUGFIX: allow using dashes and dots in environment variables names referred in config files via `%{ENV-VAR.SYNTAX}`. See these docs and this issue.
- BUGFIX: restore query performance scalability on hosts with big number of CPU cores. The scalability has been reduced in v1.86.0. See this issue.
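To illustrate the `%{ENV-VAR.SYNTAX}` entry above, here is a minimal sketch of placeholder expansion that accepts dashes and dots in variable names. It is an assumption-laden illustration rather than the actual VictoriaMetrics code: the function name `expand_env`, the exact set of allowed characters, and the choice to raise an error on a missing variable are all hypothetical.

```python
import re

# Assumed grammar: names may contain letters, digits, underscores, dashes and dots.
PLACEHOLDER_RE = re.compile(r"%\{([A-Za-z0-9_.\-]+)\}")


def expand_env(text: str, env: dict) -> str:
    """Replace every %{NAME} placeholder in text with env[NAME]."""

    def replace(match):
        name = match.group(1)
        if name not in env:
            # Assumption: fail loudly instead of leaving the placeholder as-is.
            raise KeyError(f"missing environment variable {name!r}")
        return env[name]

    return PLACEHOLDER_RE.sub(replace, text)
```

In real use `env` would typically be `os.environ`. Passing it as a parameter keeps the sketch easy to try with a plain dictionary.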
v1.87.3 #
Released at 2023-03-12
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade Go builder from Go1.20.1 to Go1.20.2. See the list of issues addressed in Go1.20.2.
BUGFIX: vmstorage: fix a bug, which could lead to incomplete or empty results for heavy queries selecting tens of thousands of time series. See this pull request.
BUGFIX: vmselect: reduce memory usage and CPU usage when performing heavy queries. See this issue.
BUGFIX: prevent from possible `invalid memory address or nil pointer dereference` panic during background merge. The issue has been introduced at v1.85.0. See this issue.
BUGFIX: prevent from possible `SIGBUS` crash on ARM architectures (Raspberry Pi), which deny unaligned access to 8-byte words. Thanks to @oliverpool for narrowing down the issue and for the initial attempt to fix it.
BUGFIX: VictoriaMetrics cluster: always return `is_partial: true` in partial responses. Previously partial responses could be returned as non-partial in some cases.
BUGFIX: VictoriaMetrics cluster: properly take into account `-rpc.disableCompression` command-line flag at `vmstorage`. It was ignored since v1.78.0. See this pull request.
BUGFIX: vmagent: do not register `vm_promscrape_config_*` metrics if `-promscrape.config` flag is not used. Previously those metrics were registered and never updated, which was confusing and could trigger false-positive alerts.
BUGFIX: vmctl: skip measurements with no fields when migrating data from influxdb. See this issue.
BUGFIX: vmauth: fix `cannot serve http` panic when a plain HTTP request is sent to `vmauth` configured to accept proxy protocol-encoded requests (e.g. when `vmauth` runs with `-httpListenAddr.useProxyProtocol` command-line flag). The issue has been introduced at v1.87.0 when implementing this feature.
v1.87.2 #
Released at 2023-02-24
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
SECURITY: upgrade base docker image (alpine) from 3.17.1 to 3.17.2. See alpine 3.17.2 release notes.
SECURITY: upgrade Go builder from Go1.20.0 to Go1.20.1. See the list of issues addressed in Go1.20.1.
BUGFIX: vmagent: immediately cancel in-flight scrape requests during configuration reload when stream parsing mode is disabled. Previously `vmagent` could wait for long time until all the in-flight requests are completed before reloading the configuration. This could significantly slow down configuration reload. See this issue.
BUGFIX: vmagent: do not wait for 2 seconds after the first unsuccessful attempt to scrape the target before performing the next attempt. This should improve scrape speed when the target closes http keep-alive connection between scrapes. See this and this issues.
BUGFIX: vmagent: fix Azure service discovery inside Azure Container App. See this issue. Thanks to @MattiasAng for the fix!
BUGFIX: do not put auxiliary directories scheduled for removal into snapshots. This should prevent from `cannot create hard links from ...must-remove...` errors when making snapshots / backups. See this issue.
BUGFIX: prevent from possible data ingestion slowdown and query performance slowdown during background merges of big parts on systems with small number of CPU cores (1 or 2 CPU cores). The issue has been introduced in v1.85.0 when implementing this feature. See also this issue.
BUGFIX: properly parse timestamps in milliseconds when ingesting data via OpenTSDB telnet put protocol. Previously timestamps in milliseconds were mistakenly multiplied by 1000. Thanks to @Droxenator for the pull request.
BUGFIX: MetricsQL: do not add extrapolated points outside the real points when using interpolate() function. See this issue.
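The OpenTSDB telnet put fix in this release is about telling second-precision timestamps apart from millisecond-precision ones, so that only the former are multiplied by 1000. The sketch below shows one possible heuristic. The threshold of 2^32 and the name `to_milliseconds` are assumptions made for illustration and may differ from what VictoriaMetrics actually does.

```python
# Assumption: any value below 2**32 is a Unix timestamp in seconds
# (2**32 seconds is reached in the year 2106), anything larger is
# already expressed in milliseconds.
SECONDS_LIMIT = 1 << 32


def to_milliseconds(timestamp: int) -> int:
    """Normalize an OpenTSDB-style timestamp to milliseconds."""
    if timestamp < SECONDS_LIMIT:
        return timestamp * 1000
    return timestamp
```

Under this heuristic both `1676000000` (seconds) and `1676000000000` (milliseconds) normalize to the same millisecond value, whereas multiplying unconditionally would shift the second one far into the future.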
v1.87.1 #
Released at 2023-02-09
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
FEATURE: vmalert: alerts state restore procedure was changed to become asynchronous. It doesn’t block groups start anymore which significantly improves vmalert’s startup time. This also means that `-remoteRead.ignoreRestoreErrors` command-line flag becomes deprecated now and will have no effect if configured. While previously state restore attempt was made for all the loaded alerting rules, now it is called only for alerts which became active after the first evaluation. See this issue.
FEATURE: vmui: optimize VMUI for use from smartphones and tablets. See this feature request.
FEATURE: vmui: add ability to search tenants in the drop-down list for the tenant selector. See this feature request.
FEATURE: vmui: add avg/min/max/last values to line legends and tooltips for graphs. See this feature request.
FEATURE: vmui: hide the default `per-job resource usage` dashboard if a custom dashboard exists at the directory specified via `-vmui.customDashboardsPath` command-line flag. See this feature request.
BUGFIX: vmagent: fix panic in HashiCorp Nomad service discovery. Thanks to @mr-karan for the pull request.
BUGFIX: vmalert: fix display of rules number per-group for groups with identical names in UI.
BUGFIX: vmalert: prevent disabling state updates tracking per rule via setting values < 1. The minimum number of update states to track is now set to 1.
BUGFIX: vmalert: properly update `debug` and `update_entries_limit` rule’s params on config’s hot-reload.
BUGFIX: properly initialize the `vm_concurrent_insert_current` metric before exposing it. Previously this metric could be left uninitialized in some cases, e.g. its value was zero. This could lead to false alerts for the query `avg_over_time(vm_concurrent_insert_current[1m]) >= vm_concurrent_insert_capacity`. See this issue.
BUGFIX: vmagent: immediately cancel in-flight scrape requests during configuration reload when using stream parsing mode. Previously `vmagent` could wait for long time until all the in-flight requests are completed before reloading the configuration. This could significantly slow down configuration reload. See this issue.
BUGFIX: vmgateway: do not validate JWT signature if no public keys are provided. Previously this could result in the `error setting up jwt verification` error.
v1.87.0 #
Released at 2023-02-01
v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release
FEATURE: stream aggregation: add the ability to de-duplicate input samples before aggregation via `-streamAggr.dedupInterval` and `-remoteWrite.streamAggr.dedupInterval` command-line options.
FEATURE: vmui: add dark mode - it can be selected via `settings` menu in the top right corner. See this pull request.
FEATURE: vmui: improve visual appearance of the top menu. See this feature request.
FEATURE: vmui: embed fonts into binary instead of loading them from external sources. This allows using `vmui` in full from isolated networks without access to Internet. Thanks to @ScottKevill for the pull request.
FEATURE: vmui: add ability to switch between tenants by selecting the needed tenant in the drop-down list at the top right corner of the UI. See this pull request.
FEATURE: vmagent: reduce memory usage when sending stale markers for targets, which expose big number of metrics. See this and this issues.
FEATURE: vmagent: add `__meta_kubernetes_pod_container_id` meta-label to the targets discovered via kubernetes_sd_configs. This label has been added in Prometheus starting from `v2.42.0`. See this feature request.
FEATURE: vmagent: add `__meta_azure_machine_size` meta-label to the targets discovered via azure_sd_configs. This label has been added in Prometheus starting from `v2.42.0`. See this pull request.
FEATURE: vmauth: allow limiting the number of concurrent requests sent to `vmauth` via `-maxConcurrentRequests` command-line flag. This allows controlling memory usage of `vmauth` and the resource usage of backends behind `vmauth`. See this feature request. Thanks to @dmitryk-dk for the initial implementation.
FEATURE: allow using VictoriaMetrics components behind proxies, which communicate with the backend via proxy protocol. See this feature request. For example, vmauth accepts proxy protocol connections when it starts with `-httpListenAddr.useProxyProtocol` command-line flag.
FEATURE: add `-internStringMaxLen` command-line flag, which can be used for fine-tuning RAM vs CPU usage in certain workloads. For example, if the stored time series contain long labels, then it may be useful reducing the `-internStringMaxLen` in order to reduce memory usage at the cost of increased CPU usage. See this issue.
FEATURE: provide GOARCH=386 binaries for single-node VictoriaMetrics, vmagent, vmalert, vmauth, vmbackup and vmrestore components at releases page. See this feature request. Thanks to @denisgolius for the pull request.
BUGFIX: fix a bug, which could prevent background merges for the previous partitions until restart if the storage didn’t have enough disk space for final deduplication and down-sampling.
BUGFIX: fix a bug, which could lead to increased CPU usage and disk IO usage when adding data to previous months and when the deduplication or downsampling is enabled. See this pull request.
BUGFIX: VictoriaMetrics cluster: propagate all the timeout-related errors from `vmstorage` to `vmselect`. Previously some timeout errors weren’t returned from `vmstorage` to `vmselect`. Instead, `vmstorage` could log the error and close the connection to `vmselect`, so `vmselect` was logging cryptic errors such as `cannot execute funcName="..." on vmstorage "...": EOF`.
BUGFIX: vmui: add support for time zone selection for older versions of browsers. See this pull request.
BUGFIX: vmagent: update API version for ec2_sd_configs to fix the issue with missing `__meta_ec2_availability_zone_id` attribute.
BUGFIX: vmagent: properly return `200 OK` HTTP status code when importing data via Pushgateway protocol. See this issue.
BUGFIX: vmagent: do not add `exported_` prefix to scraped metric names, which clash with the automatically generated metric names if `honor_labels: true` option is set in the scrape_config. See this and this issues.
BUGFIX: vmauth: allow re-entering authorization info in the web browser if the entered info was incorrect. Previously it was non-trivial to do via the web browser, since `vmauth` was returning `400 Bad Request` instead of `401 Unauthorized` http response code.
BUGFIX: vmauth: always log the client address and the requested URL on proxying errors. Previously some errors could miss this information.
BUGFIX: vmbackup: fix snapshot not being deleted after backup completion. This issue could result in unnecessary snapshots being stored. Such snapshots must be deleted manually. See this issue.
BUGFIX: VictoriaMetrics cluster: fix panic on top-level vmselect nodes of multi-level setup when the `-replicationFactor` flag is set and request contains `trace` query parameter. See this issue.
v1.86.2 #
Released at 2023-01-18
SECURITY: vmbackup: do not expose basic auth passwords from `-snapshot.createURL` and `-snapshot.deleteURL` command-line flags in logs. Thanks to @toanju for the pull request.
FEATURE: vmui: add ability to show custom dashboards at vmui by specifying a path to a directory with dashboard config files via `-vmui.customDashboardsPath` command-line flag. See this feature request and these docs.
FEATURE: vmui: apply the `step` globally to all the displayed graphs. See this feature request.
FEATURE: vmui: improve the appearance of graph lines by using more visually distinct colors. See this feature request.
BUGFIX: do not slow down concurrently executed queries during assisted merges, since assisted merges already prioritize data ingestion over queries. The probability of assisted merges has been increased starting from v1.85.0 because of internal refactoring. This could result in slowed down queries even when there are plenty of free CPU resources. See this and this issues.
BUGFIX: reduce the increased CPU usage at `vmselect` to v1.85.3 level when processing heavy queries. See this issue.
BUGFIX: retention filters: fix `FATAL: cannot locate metric name for metricID=...: EOF` panic, which could occur when retention filters are enabled.
BUGFIX: vmagent: properly cancel in-flight service discovery requests for consul_sd_configs and nomad_sd_configs when the service list changes. See this issue.
BUGFIX: vmagent: dockerswarm_sd_configs: apply `filters` only to objects of the specified `role`. Previously filters were applied to all the objects, which could cause errors when different types of objects were used with filters that were not compatible with them. See this issue.
BUGFIX: vmagent: suppress all the scrape errors when `-promscrape.suppressScrapeErrors` is enabled. Previously some scrape errors were logged even if `-promscrape.suppressScrapeErrors` flag was set.
BUGFIX: vmagent: consistently put the scrape url with scrape target labels to all error logs for failed scrapes. Previously some failed scrapes were logged without this information.
BUGFIX: vmagent: do not send stale markers to remote storage for series exceeding the configured series limit. See this issue.
BUGFIX: vmagent: properly apply series limit when staleness tracking is disabled.
BUGFIX: vmagent: reduce memory usage spikes when big number of scrape targets disappear at once. See this issue. Thanks to @lzfhust for the initial fix.
BUGFIX: Pushgateway import: properly return `200 OK` HTTP response code. See this issue.
BUGFIX: MetricsQL: properly parse `M` and `Mi` suffixes as `1e6` multipliers in `1M` and `1Mi` numeric constants. See this issue. The issue has been introduced in v1.86.0.
BUGFIX: vmui: properly display range query results at `Table` view. For example, `up[5m]` query now shows all the raw samples for the last 5 minutes for the `up` metric at the `Table` view. See this issue.
v1.86.1 #
Released at 2023-01-10
- BUGFIX: return correct query results over time series with gaps. The issue has been introduced in v1.86.0.
- BUGFIX: properly take into account the timeout passed by `vmselect` to `vmstorage` during query execution. This issue could result in the following error logs at `vmstorage` under load: `cannot process vmselect request: cannot execute "search_v7": couldn't start executing the request in 0.000 seconds, since -search.maxConcurrentRequests=... concurrent requests are already executed`. The issue has been introduced in v1.86.0.
v1.86.0 #
Released at 2023-01-10
It is recommended to upgrade to VictoriaMetrics v1.86.1, because v1.86.0 contains a bug which could lead to incorrect query results over time series with gaps.
Update note 1: This release changes the logic behind `-maxConcurrentInserts` command-line flag. Previously this flag was limiting the number of concurrent connections established from clients, which send data to VictoriaMetrics. Some of these connections could be temporarily idle. Such connections do not take significant CPU and memory resources, so there is no need in limiting their count. The new logic takes into account only those connections, which actively ingest new data to VictoriaMetrics and to vmagent. This means that the default `-maxConcurrentInserts` value should handle cases, which could require increasing the value in the previous releases. So it is recommended trying to remove the explicitly set `-maxConcurrentInserts` command-line flag after upgrading to this release and verifying whether this reduces CPU and memory usage.
Update note 2: The `vm_concurrent_addrows_current` and `vm_concurrent_addrows_capacity` metrics exported by `vmstorage` are replaced with `vm_concurrent_insert_current` and `vm_concurrent_insert_capacity` metrics in order to be consistent with the corresponding metrics exported by `vminsert`. Please update queries in dashboards and alerting rules with new metric names if old metric names are used there.
FEATURE: vmagent: add support for aggregation of incoming samples by time and by labels. See these docs and this feature request.
FEATURE: vmagent: reduce memory usage when scraping big number of targets without the need to enable stream parsing mode.
FEATURE: vmagent: add support for Prometheus-compatible target discovery for HashiCorp Nomad services via nomad_sd_configs. See this feature request. Thanks to @mr-karan for the implementation.
FEATURE: vmagent: automatically pre-fetch `metric_relabel_configs` and the target labels when clicking on the `debug metrics relabeling` link at the `http://vmagent:8429/targets` page at the particular target. See these docs.
FEATURE: vmui: add ability to explore metrics exported by a particular `job` / `instance`. See these docs and this feature request.
FEATURE: allow passing partial `RFC3339` date/time to `time`, `start` and `end` query args at querying APIs and export APIs. For example, `2022` is equivalent to `2022-01-01T00:00:00Z`, while `2022-01-30T14` is equivalent to `2022-01-30T14:00:00Z`. See these docs.
FEATURE: MetricsQL: allow using unicode letters in identifiers. For example, `температура{город="Київ"}` is a valid MetricsQL expression now. Previously every non-ASCII letter had to be escaped with `\` char when used inside MetricsQL expression: `\т\е\м\п\е\р\а\т\у\р\а{\г\о\р\о\д="Киев"}`. Now both expressions are equivalent. Thanks to @hzwwww for the pull request.
FEATURE: relabeling: add support for `keepequal` and `dropequal` relabeling actions, which are supported by Prometheus starting from v2.41.0. These relabeling actions are almost identical to `keep_if_equal` and `drop_if_equal` relabeling actions supported by VictoriaMetrics since `v1.38.0` - see these docs - so it is recommended sticking to `keep_if_equal` and `drop_if_equal` actions instead of switching to `keepequal` and `dropequal`.
FEATURE: csvimport: support empty values for imported metrics. See this issue.
FEATURE: vmalert: allow configuring the default number of stored rule’s update states in memory via global `-rule.updateEntriesLimit` command-line flag or per-rule via rule’s `update_entries_limit` configuration param. See these docs and this pull request.
FEATURE: improve the logic behind `-maxConcurrentInserts` command-line flag. Previously this flag was limiting the number of concurrent connections from clients, which write data to VictoriaMetrics or vmagent. Some of these connections could be idle for some time. These connections do not need significant amounts of CPU and memory, so there is no sense in limiting their count. The updated logic behind `-maxConcurrentInserts` limits the number of active insert requests, not counting idle connections.
FEATURE: protect all the http endpoints with `-httpAuth.*` command-line flag. Previously endpoints protected by `-*AuthKey` command-line flags weren’t protected by `-httpAuth.*`. This could complicate the proper security setup. See this issue.
FEATURE: VictoriaMetrics cluster: add `-maxConcurrentInserts` and `-insert.maxQueueDuration` command-line flags to `vmstorage`, so they could be tuned if needed in the same way as at `vminsert` nodes.
FEATURE: VictoriaMetrics cluster: limit the number of concurrently executed requests at `vmstorage` proportionally to the number of available CPU cores, since every request can saturate a single CPU core at `vmstorage`. Previously a single `vmstorage` could accept and start processing arbitrary number of concurrent requests received from big number of `vmselect` nodes. This could result in increased RAM, CPU and disk IO usage or even in an out of memory crash at `vmstorage` side under high load. The limit can be fine-tuned if needed via `-search.maxConcurrentRequests` command-line flag at `vmstorage` according to these docs. `vmstorage` now exposes the following additional metrics at `http://vmstorage:8482/metrics` page:
- `vm_vmselect_concurrent_requests_capacity` - the maximum number of requests allowed to execute concurrently
- `vm_vmselect_concurrent_requests_current` - the current number of concurrently executed requests
- `vm_vmselect_concurrent_requests_limit_reached_total` - the total number of requests, which were put in the wait queue when `-search.maxConcurrentRequests` concurrent requests are being executed
- `vm_vmselect_concurrent_requests_limit_timeout_total` - the total number of canceled requests because they were sitting in the wait queue for more than `-search.maxQueueDuration`
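The partial RFC3339 feature listed above for the `time`, `start` and `end` query args can be sketched as filling in the missing trailing components with their smallest values. This is an illustration under stated assumptions, not the VictoriaMetrics parser: the function name `expand_partial_rfc3339` is hypothetical, only UTC input without a numeric timezone offset is handled, and fractional seconds are not supported.

```python
from datetime import datetime, timezone

# Defaults for the components that a partial timestamp may omit.
_TEMPLATE = "0000-01-01T00:00:00"
# Lengths at which a timestamp can be cut: YYYY, YYYY-MM, YYYY-MM-DD, ...
_VALID_LENGTHS = {4, 7, 10, 13, 16, 19}


def expand_partial_rfc3339(value: str) -> datetime:
    """Expand a partial UTC timestamp such as "2022" or "2022-01-30T14"."""
    if value.endswith("Z"):
        value = value[:-1]
    if len(value) not in _VALID_LENGTHS:
        raise ValueError(f"unsupported partial timestamp: {value!r}")
    # Append the tail of the template to supply the omitted components.
    full = value + _TEMPLATE[len(value):]
    parsed = datetime.strptime(full, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)
```

With this sketch, `expand_partial_rfc3339("2022")` returns midnight UTC on 2022-01-01 and `expand_partial_rfc3339("2022-01-30T14")` returns 14:00:00 UTC on 2022-01-30, matching the two equivalences given in the release note.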
BUGFIX: vmui: properly update the `step` value in url after the `step` input field has been manually changed. This allows preserving the proper `step` when copy-n-pasting the url to another instance of web browser. See this issue.
BUGFIX: vmui: properly update tooltip when quickly hovering multiple lines on the graph. See this issue.
BUGFIX: properly parse floating-point numbers without integer or fractional parts such as `.123` and `20.` during data import. See this issue.
BUGFIX: MetricsQL: properly parse durations with uppercase suffixes such as `10S`, `5MS`, `1W`, etc. See this issue.
BUGFIX: vmagent: fix a panic during target discovery when `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag. See this issue. The bug has been introduced in v1.85.0.
BUGFIX: vmagent: dockerswarm_sd_configs: properly encode `filters` field. See this issue.
BUGFIX: vmagent: fix possible resource leak after hot reload of the updated consul_sd_configs. See this issue.
BUGFIX: vmagent: fix a panic in gce_sd_configs when the discovered instance has zero labels. See this issue. The issue has been introduced in v1.85.0.
BUGFIX: properly return label names starting from uppercase such as `CamelCaseLabel` from /api/v1/labels. See this issue.
BUGFIX: fix `opentsdb` HTTP endpoint not respecting `-httpAuth.*` flags. See this issue.
BUGFIX: consistently select the sample with the biggest value out of samples with identical timestamps during querying when the deduplication is enabled according to this feature request. Previously random samples could be selected during querying.
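The tie-breaking rule from the last v1.86.0 entry, keeping the biggest value among samples that share a timestamp, can be sketched as follows. This only illustrates the selection rule. It does not model the deduplication interval or any other part of the real implementation, and the function name `dedup_identical_timestamps` is hypothetical.

```python
def dedup_identical_timestamps(samples):
    """Keep one sample per timestamp, preferring the biggest value.

    samples is an iterable of (timestamp, value) pairs in any order.
    The result is a list of pairs sorted by timestamp.
    """
    best = {}
    for timestamp, value in samples:
        # Replace the stored value only when the new one is bigger, so the
        # outcome does not depend on the order in which samples arrive.
        if timestamp not in best or value > best[timestamp]:
            best[timestamp] = value
    return sorted(best.items())
```

Because the choice depends only on the values, repeated queries over the same data return the same sample, unlike the previous behavior where a random sample could be selected.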
Previous releases #
See changes for older releases here.