CHANGELOG for the year 2023

v1.96.0

Released at 2023-12-13

vmalert’s metrics vmalert_alerting_rules_error and vmalert_recording_rules_error were replaced with vmalert_alerting_rules_errors_total and vmalert_recording_rules_errors_total. See this issue for details.

  • SECURITY: upgrade base docker image (Alpine) from 3.18.4 to 3.19.0. See alpine 3.19.0 release notes.

  • SECURITY: upgrade Go builder from Go1.21.4 to Go1.21.5. See the list of issues addressed in Go1.21.5.

  • FEATURE: vmauth: add ability to send requests to the first available backend and fall back to other hot standby backends when the first backend is unavailable. This allows building highly available setups as shown in these docs. See this issue.

  • FEATURE: vmselect: allow specifying multiple groups of vmstorage nodes with independent -replicationFactor per each group. See these docs and this feature request for details.

  • FEATURE: vmselect: allow opening vmui and investigating Top queries and Active queries when the vmselect is overloaded with concurrent queries (e.g. when more than -search.maxConcurrentRequests concurrent queries are executed). Previously an attempt to open Top queries or Active queries at vmui could result in the "couldn't start executing the request in ... seconds, since -search.maxConcurrentRequests=... concurrent requests are executed" error, which could complicate debugging of overloaded vmselect or single-node VictoriaMetrics.

  • FEATURE: vmagent: add -enableMultitenantHandlers command-line flag, which allows receiving data via VictoriaMetrics cluster urls at vmagent and converting tenant ids to (vm_account_id, vm_project_id) labels before sending the data to the configured -remoteWrite.url. See these docs for details.

  • FEATURE: vmagent: add -remoteWrite.disableOnDiskQueue command-line flag, which can be used for disabling data queueing to disk when the remote storage cannot keep up with the data ingestion rate. See these docs and this feature request.

  • FEATURE: vmagent: add support for reading and writing samples via Google PubSub. See these docs.

  • FEATURE: vmagent: show all the dropped targets together with the reason why they are dropped at http://vmagent:8429/service-discovery page. Previously targets that were dropped because of target sharding weren’t displayed on this page, which could complicate service discovery debugging. See this issue and this feature request.

  • FEATURE: reduce the default value for -import.maxLineLen command-line flag from 100MB to 10MB in order to prevent excessive memory usage during data import via /api/v1/import.

  • FEATURE: vmagent: add keep_if_contains and drop_if_contains relabeling actions. See these docs for details.

  • FEATURE: vmagent: export vm_promscrape_scrape_pool_targets metric to track the number of targets each scrape job discovers. See this feature request.

  • FEATURE: vmalert: provide /vmalert/api/v1/rule and /api/v1/rule API endpoints to get the rule object in JSON format. See these docs for details.

  • FEATURE: vmalert: deprecate process gauge metrics vmalert_alerting_rules_error and vmalert_recording_rules_error in favour of vmalert_alerting_rules_errors_total and vmalert_recording_rules_errors_total counter metrics. Counter metric type is more suitable for error counting as it preserves the state change between the scrapes. See this issue for details.

  • FEATURE: MetricsQL: add day_of_year() function, which returns the day of the year for each of the given unix timestamps. See this issue for details. Thanks to @luckyxiaoqiang for the pull request.

  • FEATURE: all VictoriaMetrics binaries: expose additional metrics at /metrics page, which may simplify debugging of VictoriaMetrics components (see this feature request):

    • go_sched_latencies_seconds - the histogram, which shows the time goroutines have spent in runnable state before actually running. Big values point to the lack of CPU time for the current workload.
    • go_mutex_wait_seconds_total - the counter, which shows the total time spent by goroutines waiting for locked mutex. Big values point to mutex contention issues.
    • go_gc_cpu_seconds_total - the counter, which shows the total CPU time spent by Go garbage collector.
    • go_gc_mark_assist_cpu_seconds_total - the counter, which shows the total CPU time spent by goroutines in GC mark assist state.
    • go_gc_pauses_seconds - the histogram, which shows the duration of GC pauses.
    • go_scavenge_cpu_seconds_total - the counter, which shows the total CPU time spent by Go runtime for returning memory to the Operating System.
    • go_memlimit_bytes - the value of GOMEMLIMIT environment variable.
  • FEATURE: vmui: enhance autocomplete functionality with caching. See this issue.

  • FEATURE: add field version to the response for /api/v1/status/buildinfo API for using more efficient API in Grafana for receiving label values. Add additional info about setting up the Grafana datasource. See this issue and these docs for details.

  • FEATURE: add -search.maxResponseSeries command-line flag for limiting the number of time series a single query to /api/v1/query or /api/v1/query_range can return. This limit can protect Grafana from high memory usage when the query returns too many series. See this feature request.

  • FEATURE: Alerting rules for VictoriaMetrics: ease aggregation for certain alerting rules to keep more useful labels for the context. Before, all extra labels except job and instance were ignored. See this pull request and this follow-up commit. Thanks to @7840vz.

  • FEATURE: vmctl: allow reversing the migrating order from the newest to the oldest data for vm-native and remote-read modes via --vm-native-filter-time-reverse and --remote-read-filter-time-reverse command-line flags respectively. See: https://docs.victoriametrics.com/vmctl.html#using-time-based-chunking-of-migration and this feature request.

  • BUGFIX: MetricsQL: properly calculate values for the first point on the graph for queries, which do not use rollup functions. For example, previously count(up) could return lower than expected values for the first point on the graph. This also could result in lower than expected values in the middle of the graph like in this issue when the response caching isn’t disabled. The issue has been introduced in v1.95.0.

  • BUGFIX: vmagent: prevent the "FATAL: cannot flush metainfo" panic when -remoteWrite.multitenantURL command-line flag is set. See this issue.

  • BUGFIX: vmagent: properly decode zstd-encoded data blocks received via VictoriaMetrics remote_write protocol. See this issue comment.

  • BUGFIX: vmagent: properly add new labels at output_relabel_configs during stream aggregation. Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing a detailed report for this bug.

  • BUGFIX: vmalert-tool: allow using arbitrary eval_time in alert_rule_test case. Previously, test cases with eval_time not being a multiple of evaluation_interval would fail.

  • BUGFIX: vmalert: sanitize label names before sending the alert notification to Alertmanager. Before, vmalert would send notifications with labels containing characters not supported by Alertmanager validator, resulting in validation errors like msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar".

  • BUGFIX: vmbackupmanager: fix vmbackupmanager not deleting previous object versions from S3 when applying retention policy with -deleteAllObjectVersions command-line flag.

  • BUGFIX: vminsert: fix panic when ingesting data via NewRelic protocol into VictoriaMetrics cluster. See this issue.

  • BUGFIX: properly escape < character in responses returned via /federate endpoint. See this issue.

  • BUGFIX: vmctl: check for Error field in response from influx client during migration. Before, only network errors were checked. Thanks to @wozz for the pull request.
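
The day_of_year() function added to MetricsQL in this release can be illustrated with a small Python sketch. It is not the MetricsQL implementation. It assumes timestamps are in seconds, that days are counted in UTC, and that counting starts at 1. The helper name and the sample timestamp are chosen for illustration only.

```python
from datetime import datetime, timezone


def day_of_year(unix_ts: float) -> int:
    """Return the 1-based day of the year for a unix timestamp in seconds.

    Assumption: the day is calculated in UTC.
    """
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc).timetuple().tm_yday


# 2023-12-13 00:00:00 UTC, the release date of v1.96.0
release_ts = datetime(2023, 12, 13, tzinfo=timezone.utc).timestamp()

print(day_of_year(0))           # 1970-01-01 -> 1
print(day_of_year(release_ts))  # 2023-12-13 -> 347
```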
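
The vmalert entry above notes that a counter is more suitable for error counting than a gauge, because it preserves the state change between the scrapes. The following Python sketch simulates that difference under simplified assumptions: the gauge reports only the outcome of the most recent rule evaluation, and the counter reports the total number of failed evaluations. The function name and the sample data are hypothetical.

```python
def simulate(evaluations, scrape_times):
    """Return (gauge_samples, counter_samples) as seen at each scrape time.

    evaluations is a list of (timestamp, failed) pairs sorted by timestamp.
    """
    gauge_samples, counter_samples = [], []
    for scrape_ts in scrape_times:
        seen = [failed for ts, failed in evaluations if ts <= scrape_ts]
        # Gauge: reflects only the outcome of the most recent evaluation.
        gauge_samples.append(int(seen[-1]) if seen else 0)
        # Counter: total number of failed evaluations so far.
        counter_samples.append(sum(seen))
    return gauge_samples, counter_samples


# Two evaluations fail, but each failure is followed by a success before the next scrape.
evaluations = [(10, False), (20, True), (30, False), (40, False), (50, True), (55, False)]
gauge, counter = simulate(evaluations, scrape_times=[30, 60])

print(gauge)    # [0, 0] - the failures are invisible at scrape time
print(counter)  # [1, 2] - every failure is reflected in the counter
```

With the gauge, both failures are lost because they were overwritten before the scrape. With the counter, the increase between scrapes shows that errors happened.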

v1.95.1

Released at 2023-11-16

  • FEATURE: dashboards: use version instead of short_version in version change annotation for single/cluster dashboards. The update should reflect version changes even if different flavours of the same release were applied (custom builds).

  • BUGFIX: fix a bug, which could result in improper results and/or in the "cannot merge series: duplicate series found" error during range query execution. The issue has been introduced in v1.95.0. See this bugreport for details.

  • BUGFIX: improve deadline detection when using buffered connection for communication between cluster components. Before, due to the nature of a buffered connection, the deadline could be exceeded while reading or writing buffered data to the connection. See this pull request.

v1.95.0

Released at 2023-11-15

It is recommended to upgrade to v1.95.1, because v1.95.0 contains a bug, which can lead to incorrect query results and to the "cannot merge series: duplicate series found" error. See this issue for details.

vmalert’s cmd-line flag -datasource.lookback will be deprecated soon. Please use -rule.evalDelay command-line flag instead and see more details on how to use it here. The -datasource.lookback flag will have no effect in the next release and will be removed in future releases. See this issue.

vmalert’s cmd-line flag -datasource.queryTimeAlignment is deprecated and no longer has any effect. It will be completely removed in future releases. See this issue and more detailed changes related to vmalert below.

  • SECURITY: upgrade Go builder from Go1.21.1 to Go1.21.4. See the list of issues addressed in Go1.21.2, the list of issues addressed in Go1.21.3 and the list of issues addressed in Go1.21.4.

  • FEATURE: vmselect: improve performance for repeated instant queries if they contain one of the following rollup functions:

    The optimization is enabled when these functions contain lookbehind window in square brackets bigger or equal to 6h (the threshold can be changed via -search.minWindowForInstantRollupOptimization command-line flag). The optimization improves performance for SLO/SLI-like queries such as avg_over_time(up[30d]) or sum(rate(http_request_errors_total[3d])) / sum(rate(http_requests_total[3d])), which can be generated by sloth or similar projects.

  • FEATURE: vmselect: improve query performance on systems with big number of CPU cores (>=32). Add -search.maxWorkersPerQuery command-line flag, which can be used for fine-tuning query performance on systems with big number of CPU cores. See this pull request.

  • FEATURE: vmselect: expose vm_memory_intensive_queries_total counter metric which gets increased each time -search.logQueryMemoryUsage memory limit is exceeded by a query. This metric should help to identify expensive and heavy queries without inspecting the logs.

  • FEATURE: MetricsQL: add drop_empty_series() function, which can be used for filtering out empty series before performing additional calculations as shown in this issue.

  • FEATURE: MetricsQL: add labels_equal() function, which can be used for searching series with identical values for the given labels. See this feature request.

  • FEATURE: MetricsQL: add outlier_iqr_over_time(m[d]) and outliers_iqr(q) functions, which allow detecting anomalies in samples and series using Interquartile range method.

  • FEATURE: vmalert: add eval_alignment attribute for Groups. It aligns the timestamps of group query requests with the interval, like -datasource.queryTimeAlignment did. This also means that the -datasource.queryTimeAlignment command-line flag is now deprecated and has no effect if configured. If -datasource.queryTimeAlignment was set to false before, then eval_alignment has to be set to false explicitly under group. See this issue.

  • FEATURE: vmalert: add -rule.evalDelay flag and eval_delay attribute for Groups. The new flag and param can be used to adjust the time parameter for rule evaluation requests to match intentional query delay from the datasource. See this issue.

  • FEATURE: vmalert: allow specifying full url in notifier static_configs target address, like http://alertmanager:9093/test/api/v2/alerts. See this issue.

  • FEATURE: vmalert: reduce the number of queries for restoring alerts state on start-up. The change should speed up the restore process and reduce pressure on remoteRead.url. See this pull request.

  • FEATURE: vmalert: add label file pointing to the group’s filename to metrics vmalert_recording_.* and vmalert_alerts_.*. The filename should help identifying alerting rules belonging to specific groups with identical names but different filenames. See this issue.

  • FEATURE: vmalert: automatically retry remote-write requests on closed connections. The change should reduce the amount of logs produced in environments with short-living connections or environments without support of keep-alive on network balancers.

  • FEATURE: vmagent: support data ingestion from NewRelic infrastructure agent. See these docs, this feature request and this pull request.

  • FEATURE: vmagent: add -remoteWrite.shardByURL.labels command-line flag, which can be used for specifying a list of labels for sharding outgoing samples among the configured -remoteWrite.url destinations if -remoteWrite.shardByURL command-line flag is set. See these docs and this feature request for details.

  • FEATURE: vmagent: do not exit on startup when scrape_configs refer to non-existing or invalid files with auth configs, since these files may appear or be updated later. See this feature request and this pull request.

  • FEATURE: vmagent: allow loading TLS certificates from HTTP and HTTPS urls by specifying these urls at cert_file and key_file options inside tls_config and proxy_tls_config sections at http client settings.

  • FEATURE: vmagent: reduce CPU load when big number of targets are scraped over HTTPS with the same custom TLS certificate configured via tls_config->cert_file and tls_config->key_file at scrape_config.

  • FEATURE: vmbackup: add -filestream.disableFadvise command-line flag, which can be used for disabling fadvise syscall during backup upload to the remote storage. By default vmbackup uses fadvise syscall in order to prevent eviction of recently accessed data from the OS page cache when backing up large files. Sometimes the fadvise syscall may take significant amounts of CPU when the backup is performed with a large value of -concurrency command-line flag on systems with big number of CPU cores. In this case it is better to manually disable fadvise syscall by passing -filestream.disableFadvise command-line flag to vmbackup. See this pull request for details.

  • FEATURE: vmbackup: add -deleteAllObjectVersions command-line flag, which can be used for forcing removal of all object versions in remote object storage. See this issue and these docs for the details.

  • FEATURE: Alerting rules for VictoriaMetrics: account for vmauth component for alerts ServiceDown and TooManyRestarts.

  • FEATURE: Alerting rules for VictoriaMetrics: make TooHighMemoryUsage more tolerable to spikes or near-the-threshold states. The change should reduce number of false positives.

  • FEATURE: Alerting rules for VictoriaMetrics: add TooManyMissedIterations alerting rule for vmalert to detect groups that miss their evaluations due to slow queries.

  • FEATURE: vmui: add support for functions, labels, values in autocomplete. See this issue.

  • FEATURE: vmui: retain specified time interval when executing a query from Top Queries. See this issue.

  • FEATURE: vmui: improve repeated VMUI page load times by enabling caching of static js and css at web browser side according to these recommendations.

  • FEATURE: vmui: sort legend under the graph in descending order of median values. This should simplify graph analysis, since usually the most important lines have bigger values.

  • FEATURE: vmui: reduce vertical space usage, so more information is visible on the screen without scrolling.

  • FEATURE: vmui: show query execution duration in the header of query input field. This should help optimizing query performance.

  • FEATURE: support Strict-Transport-Security, Content-Security-Policy and X-Frame-Options HTTP response headers in all VictoriaMetrics components. The values for headers can be specified via the following command-line flags: -http.header.hsts, -http.header.csp and -http.header.frameOptions.

  • FEATURE: vmalert-tool: add unittest command to run unittest for alerting and recording rules. See this pull request for details.

  • FEATURE: dashboards/vmalert: add new panel Missed evaluations for indicating alerting groups that miss their evaluations.

  • FEATURE: all: track requests with wrong auth key and wrong basic auth at vm_http_request_errors_total metric with reason="wrong_auth_key" and reason="wrong_basic_auth". See this issue. Thanks to @venkatbvc for the pull request.

  • FEATURE: vmauth: add ability to drop the specified number of /-delimited prefix parts from the request path before proxying the request to the matching backend. See these docs.

  • FEATURE: vmauth: add ability to skip TLS verification and to specify TLS Root CA when connecting to backends. See these docs and this issue.

  • FEATURE: vmstorage: gradually close vminsert connections during 25 seconds at graceful shutdown. This should reduce data ingestion slowdown during rolling restarts. The duration for gradual closing of vminsert connections can be configured via -storage.vminsertConnsShutdownDuration command-line flag. See this issue and these docs for details.

  • FEATURE: vmstorage: add -blockcache.missesBeforeCaching command-line flag, which can be used for fine-tuning RAM usage for indexdb/dataBlocks cache when queries touching big number of time series are executed.

  • FEATURE: add -loggerMaxArgLen command-line flag for fine-tuning the maximum lengths of logged args.

  • BUGFIX: vmalert: strip sensitive information such as auth headers or passwords from datasource, remote-read, remote-write or notifier URLs in log messages or UI. This behavior is enabled by default and is controlled via -datasource.showURL, -remoteRead.showURL, -remoteWrite.showURL or -notifier.showURL cmd-line flags. See this issue.

  • BUGFIX: vmalert: fix vmalert web UI when running on machines with 32-bit architectures.

  • BUGFIX: vmalert: do not send requests to configured remote systems when -datasource.*, -remoteWrite.*, -remoteRead.* or -notifier.* command-line flags refer to files with invalid auth configs. Previously such requests were sent without properly set auth headers. Now the requests are sent only after the files are updated with valid auth configs. See this pull request.

  • BUGFIX: vmalert: properly maintain alerts state in replay mode if alert’s for param was bigger than replay request range (usually a couple of hours). See this issue for details.

  • BUGFIX: vmalert: increment vmalert_remotewrite_errors_total metric if all retries to send remote-write request failed. Before, this metric was incremented only if remote-write client’s buffer is overloaded.

  • BUGFIX: vmalert: increment vmalert_remotewrite_dropped_rows_total metric if remote-write client’s buffer is overloaded. Before, this metric was incremented only after unsuccessful HTTP calls.

  • BUGFIX: vmselect: improve performance and memory usage during query processing on machines with big number of CPU cores. See this issue.

  • BUGFIX: dashboards: fix vminsert/vmstorage/vmselect metrics filtering when dashboard is used to display data from many sub-clusters with unique job names. Before, only one specific job could have been accounted for component-specific panels, instead of all available jobs for the component.

  • BUGFIX: dashboards: respect job and instance filters for alerts annotation in cluster and single-node dashboards.

  • BUGFIX: dashboards: update description for RSS and anonymous memory panels to be consistent for single-node, cluster and vmagent dashboards.

  • BUGFIX: dashboards/vmalert: apply desc sorting in tooltips for vmalert dashboard in order to improve visibility of the outliers on graph.

  • BUGFIX: dashboards/vmalert: properly apply time series filter for panel No data errors. Before, the panel didn’t respect job or instance filters.

  • BUGFIX: dashboards/vmalert: fix panel Errors rate to Alertmanager not showing any data due to wrong label filters.

  • BUGFIX: dashboards/cluster: fix description about max threshold for Concurrent selects panel. Before, it was mistakenly implying that max is equal to the double of available CPUs.

  • BUGFIX: VictoriaMetrics cluster: bump hard-coded limit for search query size at vmstorage from 1MB to 5MB. The change should be more suitable for real-world scenarios and protect vmstorage from excessive memory usage. See this issue for details.

  • BUGFIX: vmbackup: fix error when creating an incremental backup with the -origin command-line flag. See this issue for details.

  • BUGFIX: vmagent: properly apply relabeling with regexps, which start and end with .+ or .* and which contain alternate sub-regexps. For example, .+;|;.+ or .*foo|bar|baz.*. Previously such regexps were improperly parsed, which could result in unexpected relabeling results. See this issue.

  • BUGFIX: vmagent: properly discover Kubernetes targets via kubernetes_sd_configs. Previously some targets and some labels could be skipped during service discovery because of the bug introduced in v1.93.5 when implementing this feature. See this issue for more details.

  • BUGFIX: vmagent: fix vmagent ignoring configuration reload for streaming aggregation if it was started with empty streaming aggregation config. Thanks to @aluode99 for the pull request.

  • BUGFIX: vmagent: do not scrape targets if the corresponding scrape_configs refer to files with invalid auth configs. Previously the targets were scraped without properly set auth headers in this case. Now targets are scraped only after the files are updated with valid auth configs. See this pull request.

  • BUGFIX: vmagent: properly parse ca, cert and key options at tls_config section inside http client settings. Previously string values couldn’t be parsed for these options, since the parser was mistakenly expecting a list of uint8 values instead.

  • BUGFIX: vmagent: properly drop samples if -streamAggr.dropInput command-line flag is set and -remoteWrite.streamAggr.config contains an empty file. See this issue.

  • BUGFIX: vmagent: do not print redundant error logs when failed to scrape consul or nomad target. See this pull request.

  • BUGFIX: vmagent: generate proper link to the main page and to favicon.ico at http pages served by vmagent such as /targets or /service-discovery when vmagent sits behind an http proxy with custom http path prefixes. See this issue.

  • BUGFIX: vmagent: properly decode Snappy-encoded data blocks received via VictoriaMetrics remote_write protocol. See this issue.

  • BUGFIX: vmstorage: prevent deleted series from being searchable via /api/v1/series API if they were re-ingested with staleness markers. This situation could happen if the user deletes the series from the target and from VM, and then vmagent sends stale markers for absent series. Thanks to @ilyatrefilov for the issue and pull request.

  • BUGFIX: vmstorage: log warning about switching to ReadOnly mode only on state change. Before, vmstorage would log this warning every 1s. See this issue for details.

  • BUGFIX: vmauth: show browser authorization window for unauthorized requests to unsupported paths if the unauthorized_user section is specified. This allows properly authorizing the user. See this issue for details.

  • BUGFIX: vmauth: properly proxy requests to HTTP/2.0 backends and properly pass Host header to backends.

  • BUGFIX: vmui: fix the Disable cache toggle at JSON and Table views. Previously response caching was always enabled and couldn’t be disabled at these views.

  • BUGFIX: vmui: correctly display query errors on Explore Prometheus Metrics page. See this issue for details.

  • BUGFIX: vmui: properly handle trailing slash in the server URL. See this issue.

  • BUGFIX: vmbackupmanager: correctly print error in logs when copying backup fails. Previously, error was displayed in metrics but was missing in logs.

  • BUGFIX: fix panic, which could occur when query tracing is enabled. See this issue.
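
The outlier_iqr_over_time(m[d]) and outliers_iqr(q) functions added in this release are based on the Interquartile range method. The Python sketch below shows the general method only, not the MetricsQL implementation. It assumes the common 1.5*IQR fences and the "inclusive" quantile method from the Python standard library. The function name and the sample values are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [q1 - k*iqr, q3 + k*iqr].

    Assumption: k=1.5 is the conventional fence multiplier.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


samples = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(iqr_outliers(samples))  # [100]
```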
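
The vmauth entry above describes dropping the specified number of /-delimited prefix parts from the request path before proxying. A minimal Python sketch of that path transformation follows. The function name is hypothetical, and the handling of edge cases (for example, dropping more parts than the path contains) is an assumption rather than documented vmauth behavior.

```python
def drop_prefix_parts(path: str, parts: int) -> str:
    """Drop the given number of leading /-delimited parts from the path."""
    if parts <= 0:
        return path
    segments = path.lstrip("/").split("/")
    # Assumption: dropping all the parts leaves the root path "/".
    return "/" + "/".join(segments[parts:])


print(drop_prefix_parts("/foo/bar/baz", 1))  # /bar/baz
print(drop_prefix_parts("/foo/bar/baz", 2))  # /baz
```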

v1.94.0

Released at 2023-10-02

  • FEATURE: MetricsQL: add support for numbers with underscore delimiters such as 1_234_567_890 and 1.234_567_890. These numbers are easier to read than 1234567890 and 1.234567890.

  • FEATURE: vmbackup: add support for server-side copy of existing backups. See these docs for details.

  • FEATURE: vmui: add the option to see the latest 25 queries. See this issue.

  • FEATURE: vmagent: add ability to set member num label for all the metrics scraped by a particular vmagent instance in a cluster of vmagents via -promscrape.cluster.memberLabel command-line flag. See these docs and this issue.

  • FEATURE: vmagent: do not log unexpected EOF when reading incoming metrics, since this error is expected and is handled during metrics’ parsing. This reduces the amounts of noisy logs. See this issue.

  • FEATURE: vmagent: retry failed write request on the closed connection immediately, without waiting for backoff. This should improve data delivery speed and reduce amount of error logs emitted by vmagent when using idle connections. See related issue.

  • FEATURE: vmagent: reduce load on Kubernetes control plane during initial service discovery. See this issue for details.

  • FEATURE: VictoriaMetrics cluster: reduce the maximum recovery time at vmselect and vminsert when some of vmstorage nodes become unavailable because of networking issues from 60 seconds to 3 seconds by default. The recovery time can be tuned at vmselect and vminsert nodes with -vmstorageUserTimeout command-line flag if needed. Thanks to @wjordan for the pull request.

  • FEATURE: vmui: add Prometheus data support to the “Explore cardinality” page. See this issue for details.

  • FEATURE: vmui: make the warning message more noticeable for text fields. See this issue.

  • FEATURE: vmui: add button for auto-formatting PromQL/MetricsQL queries. See this issue. Thanks to @aramattamara for the pull request.

  • FEATURE: vmui: improve accessibility score to 100 according to Google’s Lighthouse tests.

  • FEATURE: vmui: organize min, max, median values on the chart legend and tooltips for better visibility.

  • FEATURE: vmui: add explanation about cardinality explorer statistic inaccuracy in VictoriaMetrics cluster. See this issue.

  • FEATURE: vmui: add storage of query history in localStorage. See the pull request.

  • FEATURE: dashboards: provide copies of Grafana dashboards alternated with VictoriaMetrics datasource at dashboards/vm.

  • FEATURE: vmauth: add ability to set, override and clear request and response headers on a per-user and per-path basis. See this issue and these docs for details.

  • FEATURE: vmauth: add ability to retry requests to the remaining backends if they return response status codes specified in the retry_status_codes list. See this feature request.

  • FEATURE: vmauth: expose metrics vmauth_config_last_reload_* for tracking the state of config reloads, similarly to vmagent/vmalert components.

  • FEATURE: vmauth: do not print logs like SIGHUP received... once per configured -configCheckInterval cmd-line flag. This log will be printed only if config reload was invoked manually.

  • FEATURE: vmalert: add eval_offset attribute for Groups. If specified, Group will be evaluated at the exact time offset on the range of [0…evaluationInterval]. The setting might be useful for cron-like rules which must be evaluated at specific moments of time. See this issue for details.

  • FEATURE: vmalert: validate MetricsQL function names in alerting and recording rules when vmalert runs with -dryRun command-line flag. Previously it was allowed to use unknown (aka invalid) MetricsQL function names there. For example, foo() was counted as a valid query. See this feature request.

  • FEATURE: limit the length of string params in log messages to 500 chars. Longer string params are replaced with the first_250_chars..last_250_chars. This prevents overly long log lines from being emitted by VictoriaMetrics components.

  • FEATURE: docker compose environment: add vmauth component to cluster’s docker-compose example for balancing load among multiple vmselect components.

  • FEATURE: MetricsQL: make sure that q2 series are returned after q1 series in the results of q1 or q2 query, in the same way as Prometheus does. See this issue.

  • FEATURE: MetricsQL: return empty result from bitmap_and(a, b), bitmap_or(a, b) and bitmap_xor(a, b) if a or b have no value at the particular timestamp. Previously 0 was returned in this case. See this issue.

  • FEATURE: stop exposing vm_merge_need_free_disk_space metric, since it turned out to confuse users without providing any useful information. See this comment.

  • BUGFIX: Official Grafana dashboards for VictoriaMetrics: fix display of ingested rows rate for Samples ingested/s and Samples rate panels for vmagent’s dashboard. Previously, not all ingested protocols were accounted for in these panels. An extra panel Rows rate was added to Ingestion section to display the split for rows ingested rate by protocol.

  • BUGFIX: vmui: fix the bug causing render looping when switching to heatmap.

  • BUGFIX: VictoriaMetrics enterprise: validate that -dedup.minScrapeInterval value and -downsampling.period intervals are multiples of each other. See these docs.

  • BUGFIX: vmbackup: properly copy appliedRetention.txt files inside <-storageDataPath>/{data} folders during incremental backups. Previously the new appliedRetention.txt could be skipped during incremental backups, which could lead to increased load on storage after restoring from backup. See this issue.

  • BUGFIX: vmagent: suppress context canceled error messages in logs when vmagent is reloading service discovery config. This error could appear starting from v1.93.5. See this PR.

  • BUGFIX: vmagent: remove concurrency limit during parsing of scraped metrics, which was mistakenly applied to it. With this change cmd-line flag -maxConcurrentInserts won’t have effect on scraping anymore.

  • BUGFIX: MetricsQL: allow passing median_over_time to aggr_over_time. See this issue.

  • BUGFIX: vminsert: fix ingestion via multitenant url for opentsdbhttp. See this issue. The bug has been introduced in v1.93.2.

  • BUGFIX: vmagent: fix support of legacy DataDog agent, which adds trailing slashes to urls. See this issue. Thanks to @maxb for spotting the issue.
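
The MetricsQL entry above adds support for numbers with underscore delimiters such as 1_234_567_890 and 1.234_567_890. The Python sketch below shows that such literals carry the same value as their plain forms. It simply strips the underscores before parsing, which is an assumption made for illustration; a real parser would also validate where the delimiters are placed. The function name is hypothetical.

```python
def parse_number(text: str) -> float:
    """Parse a number that may contain underscore delimiters.

    Assumption: underscores are purely visual and can be removed before parsing.
    """
    return float(text.replace("_", ""))


print(parse_number("1_234_567_890") == 1234567890)   # True
print(parse_number("1.234_567_890") == 1.234567890)  # True
```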
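
The log truncation rule from this release (string params longer than 500 chars are replaced with the first_250_chars..last_250_chars) can be sketched in Python as follows. The function name is hypothetical, and the sketch only mirrors the rule as described in the entry above, not the actual logger code.

```python
def limit_string_len(s: str, max_len: int = 500) -> str:
    """Shorten long strings to the first and last max_len/2 chars joined by '..'."""
    if len(s) <= max_len:
        return s
    half = max_len // 2
    return s[:half] + ".." + s[-half:]


long_param = "a" * 300 + "b" * 300
short_param = "up == 0"

print(limit_string_len(short_param))      # unchanged
print(len(limit_string_len(long_param)))  # 250 + 2 + 250 = 502
```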
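
The eval_offset attribute added to vmalert Groups in this release evaluates a group at an exact time offset within the evaluation interval. The Python sketch below shows one way such a schedule could be computed. It is an illustration under stated assumptions, not vmalert's scheduling code: timestamps and durations are whole seconds, intervals are aligned to the unix epoch, and the function name is hypothetical.

```python
def next_eval_time(now: int, interval: int, offset: int) -> int:
    """Return the first timestamp >= now that falls at `offset` within an interval."""
    if not 0 <= offset <= interval:
        raise ValueError("offset must be in the range [0, interval]")
    offset %= interval  # an offset equal to the interval is the same as 0
    candidate = now - now % interval + offset
    if candidate < now:
        candidate += interval
    return candidate


# Hourly group that must run at 10 minutes past each hour.
print(next_eval_time(now=7300, interval=3600, offset=600))  # 7800
print(next_eval_time(now=7900, interval=3600, offset=600))  # 11400
```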

v1.93.9

Released at 2023-12-10

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months after the v1.93.0 release.

v1.93.8

Released at 2023-11-15

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months after the v1.93.0 release.

v1.93.7

Released at 2023-11-02

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months after the v1.93.0 release.

  • BUGFIX: vmagent: properly discover Kubernetes targets via kubernetes_sd_configs. Previously some targets and some labels could be skipped during service discovery because of the bug introduced in v1.93.5 when implementing this feature. See this issue for more details.
  • BUGFIX: vmagent: properly parse ca, cert and key options at tls_config section inside http client settings. Previously string values couldn’t be parsed for these options, since the parser was mistakenly expecting a list of uint8 values instead.
  • BUGFIX: vmagent: properly drop samples if -streamAggr.dropInput command-line flag is set and -remoteWrite.streamAggr.config contains an empty file. See this issue.
  • BUGFIX: vmagent: do not print redundant error logs when failed to scrape consul or nomad target. See this pull request.
  • BUGFIX: vmstorage: prevent deleted series from being searchable via /api/v1/series API if they were re-ingested with staleness markers. This situation could happen if a user deletes the series from the target and from VM, and then vmagent sends stale markers for the absent series. Thanks to @ilyatrefilov for the issue and pull request.
  • BUGFIX: vmstorage: log warning about switching to ReadOnly mode only on state change. Before, vmstorage would log this warning every 1s. See this issue for details.
  • BUGFIX: vmauth: show browser authorization window for unauthorized requests to unsupported paths if the unauthorized_user section is specified. This allows properly authorizing the user. See this issue for details.
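The vmstorage ReadOnly entry above describes logging on state transitions rather than on every periodic check. A minimal Python sketch of that pattern follows. The ReadOnlyWatcher class and its message texts are hypothetical and are not taken from the vmstorage source.

```python
class ReadOnlyWatcher:
    """Records a message only when the read-only state changes.

    Hypothetical illustration; not vmstorage's actual code.
    """

    def __init__(self):
        self.is_read_only = False
        self.messages = []

    def check(self, is_read_only):
        # Intended to be called periodically (for example once per second).
        # Nothing is recorded unless the state differs from the last check.
        if is_read_only == self.is_read_only:
            return
        self.is_read_only = is_read_only
        if is_read_only:
            self.messages.append("warning: switching to read-only mode")
        else:
            self.messages.append("info: leaving read-only mode")
```

With this shape, a node that stays read-only for an hour produces one warning instead of thousands of identical lines.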

v1.93.6#

Released at 2023-10-16

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release

  • SECURITY: upgrade Go builder from Go1.21.1 to Go1.21.3. See the list of issues addressed in Go1.21.2 and the list of issues addressed in Go1.21.3.

  • BUGFIX: vmalert: strip sensitive information such as auth headers or passwords from datasource, remote-read, remote-write or notifier URLs in log messages or UI. This behavior is enabled by default and is controlled via -datasource.showURL, -remoteRead.showURL, -remoteWrite.showURL or -notifier.showURL cmd-line flags. See this issue.

  • BUGFIX: vmselect: improve performance and memory usage during query processing on machines with big number of CPU cores. See this issue for details.

  • BUGFIX: VictoriaMetrics cluster: bump hard-coded limit for search query size at vmstorage from 1MB to 5MB. The change should be more suitable for real-world scenarios and protect vmstorage from excessive memory usage. See this issue for details.

  • BUGFIX: vmagent: fix vmagent ignoring configuration reload for streaming aggregation if it was started with empty streaming aggregation config. Thanks to @aluode99 for the pull request.

  • BUGFIX: vmbackup: properly copy appliedRetention.txt files inside <-storageDataPath>/{data} folders during incremental backups. Previously the new appliedRetention.txt could be skipped during incremental backups, which could lead to increased load on storage after restoring from backup. See this issue.

  • BUGFIX: vmagent: suppress context canceled error messages in logs when vmagent is reloading service discovery config. This error could appear starting from v1.93.5. See this PR.

  • BUGFIX: MetricsQL: allow passing median_over_time to aggr_over_time. See this issue.

  • BUGFIX: vminsert: fix ingestion via multitenant url for opentsdbhttp. See this issue. The bug has been introduced in v1.93.2.

  • BUGFIX: vmagent: fix support of legacy DataDog agent, which adds trailing slashes to urls. See this issue. Thanks to @maxb for spotting the issue.
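The vmalert entry in this release about stripping sensitive information from URLs can be illustrated with a small Python sketch. It masks the password and the query string before a URL is logged. The redact_url helper and the xxxxx placeholder are assumptions made for illustration, and vmalert's actual redaction logic may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(raw_url):
    """Return raw_url with the password and query string masked.

    Hypothetical helper; it does not handle bracketed IPv6 hosts.
    """
    parts = urlsplit(raw_url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    if parts.username or parts.password:
        # Keep the user name for debugging, hide the password.
        host = f"{parts.username or ''}:xxxxx@{host}"
    # Query args may carry tokens, so they are masked as a whole.
    query = "xxxxx" if parts.query else ""
    return urlunsplit((parts.scheme, host, parts.path, query, ""))
```

For example, redact_url("http://user:secret@vm.example:8428/api/v1/write?token=abc") yields a string that contains neither "secret" nor "abc".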

v1.93.5#

Released at 2023-09-19

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release

  • BUGFIX: storage: prevent from livelock when forced merge is called under high data ingestion. See this issue.
  • BUGFIX: Graphite Render API: correctly return null instead of Inf in JSON query responses. See this issue.
  • BUGFIX: vmbackup: properly copy parts.json files inside <-storageDataPath>/{data,indexdb} folders during incremental backups. Previously the new parts.json could be skipped during incremental backups, which could lead to inability to restore from the backup. See this issue. This issue has been introduced in v1.90.0.
  • BUGFIX: vmagent: properly close connections to Kubernetes API server after the change in selectors or namespaces sections of kubernetes_sd_configs. Previously vmagent could continue polling Kubernetes API server with the old selectors or namespaces configs additionally to polling new configs. See this issue.
  • BUGFIX: vmauth: prevent configuration reloading if there were no changes in config. This improves memory usage when -configCheckInterval cmd-line flag is configured and config has extensive list of regexp expressions requiring additional memory on parsing.
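The vmauth entry above avoids re-parsing a config that has not changed. One common way to do this is to compare a digest of the raw config bytes before parsing. The sketch below assumes a hypothetical ConfigReloader class and a caller-supplied parse function; it is not vmauth's implementation.

```python
import hashlib


class ConfigReloader:
    """Re-parses a config only when its raw bytes have changed.

    Hypothetical illustration; not vmauth's actual code.
    """

    def __init__(self, parse):
        self._parse = parse        # potentially expensive, e.g. compiles regexps
        self._last_digest = None
        self.config = None

    def maybe_reload(self, raw):
        digest = hashlib.sha256(raw).hexdigest()
        if digest == self._last_digest:
            return False           # unchanged: skip the costly parse
        self.config = self._parse(raw)
        self._last_digest = digest
        return True
```

A periodic check (such as the one driven by -configCheckInterval) would call maybe_reload with the file contents and allocate new parsed structures only when it returns True.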

v1.93.4#

Released at 2023-09-10

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release

  • SECURITY: upgrade Go builder from Go1.21.0 to Go1.21.1. See the list of issues addressed in Go1.21.1.

  • BUGFIX: vminsert enterprise: properly parse /insert/multitenant/* urls, which have been broken since v1.93.2. See this issue.

  • BUGFIX: properly build production armv5 binaries for GOARCH=arm. This has been broken after the upgrading of Go builder to Go1.21.0. See this issue.

  • BUGFIX: vmselect: return 503 Service Unavailable status code when partial responses are denied and some of vmstorage nodes are temporarily unavailable. Previously 422 Unprocessable Entity status code was mistakenly returned in this case, which could prevent automatic recovery by re-sending the request to a healthy cluster replica in another availability zone.

  • BUGFIX: vmalert: fix the bug when Group’s params fields with multiple values were overriding each other instead of adding up. The bug was introduced in this commit starting from v1.91.1. See this issue.

  • BUGFIX: vmagent: fix possible corruption of labels in the collected samples if -remoteWrite.label is set together with multiple -remoteWrite.url options. The bug has been introduced in v1.93.1. See this issue.
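The vmalert entry in this release about Group params concerns multi-valued query parameters that should accumulate rather than overwrite each other. A minimal Python sketch of the intended "adding up" behavior follows. The merge_params helper is hypothetical and the parameter names in the usage note are only examples.

```python
def merge_params(group_params, extra_params):
    """Merge two multi-valued query-param maps without losing values.

    Both arguments map a param name to a list of values.
    Hypothetical helper; not vmalert's actual code.
    """
    merged = {key: list(values) for key, values in group_params.items()}
    for key, values in extra_params.items():
        # Extend rather than assign, so repeated keys accumulate.
        merged.setdefault(key, []).extend(values)
    return merged
```

Merging {"extra_label": ["env=prod"]} with {"extra_label": ["team=a"]} keeps both values under extra_label, whereas a plain assignment would keep only the last one.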

v1.93.3#

Released at 2023-09-02

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release

  • BUGFIX: vminsert: properly close broken vmstorage connection during read-only state checks at vmstorage. Previously it wasn’t properly closed, which prevented restoring vmstorage node from read-only mode. See this issue.
  • BUGFIX: vmstorage: prevent from breaking vmselect -> vmstorage RPC communication when vmstorage returns an empty label name at /api/v1/labels request. See this issue.

v1.93.2#

Released at 2023-09-01

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release

  • BUGFIX: build: fix Docker builds for old Docker releases. See this issue.
  • BUGFIX: vmagent: consistently set User-Agent header to vm_promscrape during scraping with enabled or disabled stream parsing mode. See this issue.
  • BUGFIX: vmagent: consistently set timeout for scraping with enabled or disabled stream parsing mode. See this issue.
  • BUGFIX: vmalert: correctly re-use HTTP request object on EOF retries when querying the configured datasource. Previously, there was a small chance that query retry wouldn’t succeed.
  • BUGFIX: vmselect: correctly handle requests with /select/multitenant prefix. Such requests must be rejected since vmselect does not support multitenancy endpoint. Previously, such requests were causing panic. See this issue.
  • BUGFIX: vminsert: properly check for read-only state at vmstorage. Previously it wasn’t properly checked, which could lead to increased resource usage and data ingestion slowdown when some of vmstorage nodes are in read-only mode. See this issue.

v1.93.1#

Released at 2023-08-23

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release

  • BUGFIX: prevent from possible data loss during indexdb rotation. See this issue for details.
  • BUGFIX: do not allow starting VictoriaMetrics components with improperly set boolean command-line flags in the form -boolFlagName value, since this leads to silent incomplete flags’ parsing. This form should be replaced with -boolFlagName=value. See this issue.
  • BUGFIX: vmagent: properly set labels from -remoteWrite.label command-line flag just before sending samples to the configured -remoteWrite.url according to these docs. Previously these labels were incorrectly set before the relabeling configured via -remoteWrite.urlRelabelConfigs and the stream aggregation configured via -remoteWrite.streamAggr.config, so these labels could be lost or incorrectly transformed before sending the samples to remote storage. The fix allows using -remoteWrite.label for identifying vmagent instances in cluster mode. See this issue and these docs for more details.
  • BUGFIX: remove DEBUG logging when parsing if filters inside relabeling rules and when parsing match filters inside stream aggregation rules.
  • BUGFIX: properly replace : chars in label names with _ when -usePromCompatibleNaming command-line flag is passed to vmagent, vminsert or single-node VictoriaMetrics. This addresses this comment.
  • BUGFIX: vmbackup: correctly check if specified -dst belongs to specified -storageDataPath. See this issue.
  • BUGFIX: vmctl: don’t interrupt the migration process if no metrics were found for a specific tenant. See this issue.
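The -usePromCompatibleNaming entry in this release replaces characters such as : in label names with _. Prometheus label names are limited to letters, digits and underscores, so the idea can be sketched as below. The sanitize_label_name helper is hypothetical and does not handle names that start with a digit.

```python
import re

# Any character outside [a-zA-Z0-9_] is not allowed in a Prometheus label name.
_INVALID_LABEL_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def sanitize_label_name(name):
    """Replace unsupported characters in a label name with underscores.

    Hypothetical helper; not the VictoriaMetrics implementation.
    """
    return _INVALID_LABEL_CHARS.sub("_", name)
```

For instance, a label named kubernetes:pod.name becomes kubernetes_pod_name, while job is returned unchanged.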

v1.93.0#

Released at 2023-08-12

It is recommended to upgrade to VictoriaMetrics v1.93.1, because v1.93.0 contains a bug that can lead to data loss because of incorrect indexdb rotation. See this issue for details.

v1.93.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.93.x line will be supported for at least 12 months since v1.93.0 release

Update note: starting from this release, vmagent ignores timestamps provided by scrape targets by default - it associates scraped metrics with local timestamps instead. Set honor_timestamps: true in scrape configs if timestamps provided by scrape targets must be used instead. This change helps removing gaps for metrics collected from cadvisor such as container_memory_usage_bytes. This also improves data compression and query performance over metrics collected from cadvisor. See more details here.
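The timestamp selection described in this update note can be sketched as follows. The pick_timestamp function and its argument names are hypothetical; only the honor_timestamps option name comes from the note above.

```python
from typing import Optional


def pick_timestamp(scrape_time_ms: int,
                   target_time_ms: Optional[int],
                   honor_timestamps: bool = False) -> int:
    """Choose the timestamp to store for a scraped sample.

    scrape_time_ms is the local scrape time; target_time_ms is the
    timestamp exposed by the target, or None when the target sent none.
    Hypothetical illustration; not vmagent's actual code.
    """
    if honor_timestamps and target_time_ms is not None:
        return target_time_ms
    # Default since v1.93.0: ignore the target's timestamp.
    return scrape_time_ms
```

With the default honor_timestamps=False, a sample exposed with a stale timestamp is still stored at the local scrape time, which is what removes the gaps mentioned above.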

  • SECURITY: upgrade Go builder from Go1.20.6 to Go1.21.0 in order to fix this issue.

  • SECURITY: upgrade base docker image (Alpine) from 3.18.2 to 3.18.3. See alpine 3.18.3 release notes.

  • FEATURE: MetricsQL: add share_eq_over_time(m[d], eq) function for calculating the share (in the range [0...1]) of raw samples on the given lookbehind window d, which are equal to eq. See this feature request. Thanks to @Damon07 for the pull request.

  • FEATURE: vmauth: allow configuring deadline for a backend to be excluded from the rotation on errors via -failTimeout cmd-line flag. This feature could be useful when it is expected for backends to be not available for significant periods of time. See this issue for details. Thanks to @SunKyu for the pull request.

  • FEATURE: vmalert: remove deprecated in v1.61.0 -rule.configCheckInterval command-line flag. Use -configCheckInterval command-line flag instead.

  • FEATURE: vmalert: remove support of deprecated web links of /api/v1/<groupID>/<alertID>/status form in favour of /api/v1/alerts?group_id=<>&alert_id=<> links. Links of /api/v1/<groupID>/<alertID>/status form were deprecated in v1.79.0. See this issue for details.

  • FEATURE: vmctl: allow disabling binary export API protocol via -vm-native-disable-binary-protocol cmd-line flag when migrating data from VictoriaMetrics. Disabling binary protocol can be useful for deduplication of the exported data before ingestion. For this, deduplication needs to be configured at -vm-native-src-addr side and -vm-native-disable-binary-protocol should be set on vmctl side.

  • FEATURE: vmctl: add support of week step for time-based chunking migration. See this issue.

  • FEATURE: vmctl: allow specifying custom full url at --remote-read-src-addr command-line flag if --remote-read-disable-path-append command-line flag is set. This allows importing data from urls, which do not end with /api/v1/read. For example, from Promscale. See this issue.

  • FEATURE: vmui: add warning in query field of vmui for partial data responses. See this issue.

  • FEATURE: vmui: allow displaying the full error message on click for trimmed error messages in vmui. See this issue.

  • FEATURE: Official Grafana dashboards for VictoriaMetrics: add Concurrent inserts panel to vmagent’s dashboard. The new panel is supposed to show whether the number of concurrent inserts processed by vmagent is reaching the limit.

  • FEATURE: Official Grafana dashboards for VictoriaMetrics: add panels for absolute Mem and CPU usage by vmalert. See related issue here.

  • FEATURE: Official Grafana dashboards for VictoriaMetrics: correctly calculate Bytes per point value for single-server and cluster VM dashboards. Before, the calculation mistakenly accounted for the number of entries in indexdb in denominator, which could have shown lower values than expected.

  • FEATURE: Alerting rules for VictoriaMetrics: ConcurrentFlushesHitTheLimit alerting rule was moved from single-server and cluster alerts to the list of “health” alerts as it could be related to many VictoriaMetrics components.

  • BUGFIX: storage: properly set next retention time for indexDB. Previously it may enter into endless retention loop. See this issue for details.

  • BUGFIX: vmagent: return human readable error if opentelemetry has json encoding. Follow-up after PR.

  • BUGFIX: vmagent: properly validate scheme for proxy_url field at the scrape config. See this issue for details.

  • BUGFIX: vmagent: properly apply if filters during relabeling. Previously the if filter could improperly work. See this issue and this pull request.

  • BUGFIX: vmagent: use local scrape timestamps for the scraped metrics unless honor_timestamps: true option is explicitly set at scrape_config. This fixes gaps for metrics collected from cadvisor or similar exporters, which export metrics with invalid timestamps. See this issue and this comment for details. The issue has been introduced in v1.68.0.

  • BUGFIX: vmagent: fix runtime panic at OpenTelemetry parser. The OpenTelemetry format allows histograms without sum fields. Such histograms are converted to counters with the _count suffix. See this issue.

  • BUGFIX: vmagent: keep unmatched series at stream aggregation when -remoteWrite.streamAggr.dropInput is set to false to match intended behaviour introduced at v1.92.0. See this issue.

  • BUGFIX: vmalert: properly set vmalert_config_last_reload_successful value on configuration updates or rollbacks. The bug was introduced in v1.92.0 in this PR.

  • BUGFIX: vmalert: fix vmalert_remotewrite_send_duration_seconds_total value. Previously it didn’t account for the real time spent on remote write requests. See this pr for details.

  • BUGFIX: vmbackupmanager: fix panic when creating a backup to a local filesystem on Windows. See this issue.

  • BUGFIX: vmui: properly handle client address with X-Forwarded-For part at the Active queries page. See this comment.

  • BUGFIX: MetricsQL: prevent from panic when the lookbehind window in square brackets of rollup function is parsed into negative value. See this issue.
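The share_eq_over_time(m[d], eq) function added in this release returns the share, in the range [0...1], of raw samples on the lookbehind window that are equal to eq. The arithmetic can be sketched in Python as below. Returning None for an empty window is an assumption made for this sketch.

```python
def share_eq_over_time(samples, eq):
    """Share of raw sample values in the window that equal eq.

    samples is the list of raw values found on the lookbehind window.
    Returns None for an empty window (an assumption of this sketch).
    """
    if not samples:
        return None
    equal = sum(1 for value in samples if value == eq)
    return equal / len(samples)
```

For a window with values [1, 0, 1, 1] and eq=1, three of four samples match, so the result is 0.75.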

v1.92.1#

Released at 2023-07-28

v1.92.0#

Released at 2023-07-27

Update note: this release contains backwards-incompatible change to indexdb, so rolling back to the previous versions of VictoriaMetrics may result in partial data loss of entries in indexdb.

Update note: starting from this release, stream aggregation writes the following samples to the configured remote storage by default:

  • aggregated samples;
  • the original input samples, which match zero match options from the provided config.

Previously only aggregated samples were written to the storage by default. The previous behavior can be restored in the following ways:

  • by passing -streamAggr.dropInput command-line flag to single-node VictoriaMetrics;
  • by passing -remoteWrite.streamAggr.dropInput command-line flag per each configured -remoteWrite.streamAggr.config at vmagent.
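The routing described in this update note can be sketched as follows: samples matching at least one match option go to aggregation, and samples matching none are passed through unless input dropping is enabled. The route_samples function and the predicate-based matchers are hypothetical simplifications.

```python
def route_samples(samples, matchers, drop_input=False):
    """Split input samples into (to_aggregate, passthrough).

    matchers is a list of predicates standing in for `match` options.
    drop_input mirrors the effect of the dropInput flags described above.
    Hypothetical illustration; not the VictoriaMetrics implementation.
    """
    to_aggregate, passthrough = [], []
    for sample in samples:
        if any(matches(sample) for matches in matchers):
            to_aggregate.append(sample)
        elif not drop_input:
            # Matches zero `match` options: written as-is by default.
            passthrough.append(sample)
    return to_aggregate, passthrough
```

With drop_input=True the passthrough list stays empty, which corresponds to the pre-v1.92.0 behavior where only aggregated samples reached the storage.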

  • SECURITY: upgrade base docker image (alpine) from 3.18.0 to 3.18.2. See alpine 3.18.2 release notes.

  • SECURITY: upgrade Go builder from Go1.20.5 to Go1.20.6. See the list of issues addressed in Go1.20.6.

  • FEATURE: reduce memory usage by up to 5x for setups with high churn rate and long retention. See the description for this change and this issue for details.

  • FEATURE: reduce spikes in CPU and disk IO usage during indexdb rotation (aka inverted index), which is performed once per -retentionPeriod. The new algorithm gradually pre-populates newly created indexdb during the last hour before the rotation. The number of pre-populated series in the newly created indexdb can be monitored via vm_timeseries_precreated_total metric. This should resolve this issue.

  • FEATURE: MetricsQL: allow selecting time series matching at least one of multiple or filters. For example, {env="prod",job="a" or env="dev",job="b"} selects series with either {env="prod",job="a"} or {env="dev",job="b"} labels. This functionality allows passing the selected series to rollup functions without the need to use subqueries. See these docs.

  • FEATURE: MetricsQL: add ability to preserve metric names for binary operation results via keep_metric_names modifier. For example, ({__name__=~"foo|bar"} / 10) keep_metric_names leaves foo and bar metric names in division results. See these docs. This helps to address issues like this one.

  • FEATURE: MetricsQL: add ability to copy all the labels from one side of many-to-one operations by specifying * inside group_left() or group_right(). Also allow adding a prefix for copied label names via group_left(*) prefix "..." syntax. For example, the following query copies Kubernetes namespace labels to kube_pod_info series and adds ns_ prefix for the copied label names: kube_pod_info * on(namespace) group_left(*) prefix "ns_" kube_namespace_labels. The labels from on() list aren’t prefixed. This feature resolves this and that questions at StackOverflow.

  • FEATURE: MetricsQL: add ability to specify durations via WITH templates. Examples:

    • WITH (w = 5m) m[w] is automatically transformed to m[5m]
    • WITH (f(window, step, off) = m[window:step] offset off) f(5m, 10s, 1h) is automatically transformed to m[5m:10s] offset 1h

    Thanks to @lujiajing1126 for the initial idea and implementation. See this feature request.
  • FEATURE: vmui: added a new page with the list of currently running queries. See this issue and these docs.

  • FEATURE: vmagent: add support for data ingestion via OpenTelemetry protocol. See these docs, this feature request and this pull request.

  • FEATURE: vmagent: allow sharding outgoing time series among the configured remote storage systems. This can be useful for building horizontally scalable stream aggregation, when samples for the same time series must be aggregated by the same vmagent instance at the second level. See these docs and this feature request for details.

  • FEATURE: vmagent: allow configuring staleness interval in stream aggregation config. See this issue for details.

  • FEATURE: vmagent: allow specifying a list of series selectors inside if option of relabeling rules. The corresponding relabeling rule is executed when at least a single series selector matches. See these docs.

  • FEATURE: stream aggregation: allow specifying a list of series selectors inside match option of stream aggregation configs. The input sample is aggregated when at least a single series selector matches. See this feature request.

  • FEATURE: stream aggregation: preserve input samples, which match zero match options from the configured aggregations. Previously all the input samples were dropped by default, so only the aggregated samples are written to the output storage. The previous behavior can be restored by passing -streamAggr.dropInput command-line flag to single-node VictoriaMetrics or by passing -remoteWrite.streamAggr.dropInput command-line flag to vmagent.

  • FEATURE: vmctl: add verbose output for docker installations or when TTY isn’t available. See this issue.

  • FEATURE: vmctl: interrupt backoff retries when import process is cancelled. The change makes vmctl more responsive in case of errors during the import. See this pull request.

  • FEATURE: vmctl: update backoff policy on retries to reduce probability of overloading for source or destination databases. See this issue.

  • FEATURE: vmstorage: suppress “broken pipe” and “connection reset by peer” errors for search queries on vmstorage side. See this and this commits.

  • FEATURE: Official Grafana dashboards for VictoriaMetrics: add panel for tracking rate of syscalls while writing or reading from disk via process_io_(read|write)_syscalls_total metrics.

  • FEATURE: accept timestamps in milliseconds at start, end and time query args in Prometheus querying API. See these docs and this feature request.

  • FEATURE: vmalert: update retry policy for pushing data to -remoteWrite.url. By default, vmalert will make multiple retry attempts with exponential delay. The total time spent during retry attempts shouldn’t exceed -remoteWrite.retryMaxTime (default is 30s). When retry time is exceeded vmalert drops the data dedicated for -remoteWrite.url. Before, vmalert dropped data after 5 retry attempts with 1s delay between attempts (not configurable). See -remoteWrite.retryMinInterval and -remoteWrite.retryMaxTime cmd-line flags.

  • FEATURE: vmalert: expose vmalert_remotewrite_send_duration_seconds_total counter, which can be used for determining high saturation of every connection to remote storage with an alerting query sum(rate(vmalert_remotewrite_send_duration_seconds_total[5m])) by(job, instance) > 0.9 * max(vmalert_remotewrite_concurrency) by(job, instance). This query triggers when a connection is saturated by more than 90%. This usually means that -remoteWrite.concurrency command-line flag must be increased in order to increase the number of concurrent writings into remote endpoint. See this feature request.

  • FEATURE: vmalert: display the error message received during unsuccessful config reload in vmalert’s UI. See this issue for details.

  • FEATURE: vmalert: allow disabling of step param attached to instant queries. This might be useful for using vmalert with datasources that do not support this param, unlike VictoriaMetrics. See this issue for details.

  • FEATURE: vmalert: support option for “blackholing” alerting notifications if -notifier.blackhole cmd-line flag is set. Enable this flag if you want vmalert to evaluate alerting rules without sending any notifications to external receivers (e.g. alertmanager). See this issue for details. Thanks to @venkatbvc for the pull request.

  • FEATURE: vmalert: add unit test for alerting and recording rules, see more details here. Thanks to @Haleygo for the pull request.

  • FEATURE: vmalert: allow overriding default GET params for rules with graphite datasource type, in the same way as it happens for prometheus type. See this issue.

  • FEATURE: vmalert: support keep_firing_for field for alerting rules. See docs updated here and this issue. Thanks to @Haleygo for the pull request.

  • FEATURE: vmauth: expose vmauth_user_request_duration_seconds and vmauth_unauthorized_user_request_duration_seconds summary metrics for measuring requests latency per user.

  • FEATURE: vmbackup: show backup progress percentage in log during backup uploading. See this issue.

  • FEATURE: vmrestore: show restoring progress percentage in log during backup downloading. See this issue.

  • FEATURE: add ability to fine-tune Graphite API limits via the following command-line flags:

    • -search.maxGraphiteTagKeys for limiting the number of tag keys returned from Graphite API for tags
    • -search.maxGraphiteTagValues for limiting the number of tag values returned from Graphite API for tag values
    • -search.maxGraphiteSeries for limiting the number of series (aka paths) returned from Graphite API for series

    See this issue.

  • BUGFIX: properly return series from /api/v1/series if it finds more than the limit series (limit is an optional query arg passed to this API). Previously the limit exceeded error error was returned in this case. See this issue.

  • BUGFIX: vmui: fix application routing issues and problems with manual URL changes. See this pull request and this issue.

  • BUGFIX: add validation for invalid partial RFC3339 timestamp formats in query and export APIs.

  • BUGFIX: vmctl: interrupt explore procedure in influx mode if vmctl found no numeric fields.

  • BUGFIX: vmctl: fix panic in case --remote-read-filter-time-start flag is not set for remote-read mode. This flag is now required to use remote-read mode. See this issue.

  • BUGFIX: vmctl: fix formatting issue, which could add superfluous s characters at the end of samples/s output during data migration. For example, it could write samples/ssssss. See this issue.

  • BUGFIX: vmalert: use RFC3339 time format in query args instead of unix timestamp for all issued queries to Prometheus-like datasources.

  • BUGFIX: vmalert: correctly calculate evaluation time for rules. Before, there was a low probability for discrepancy between actual time and rules evaluation time if evaluation interval was lower than the execution time for rules within the group.

  • BUGFIX: vmalert: reset evaluation timestamp after modifying group interval. Before, the rule evaluation time could lag after such a change.

  • BUGFIX: vmselect: fix timestamp alignment for Prometheus querying API if time argument is less than 10m from the beginning of Unix epoch.

  • BUGFIX: vmagent: close HTTP connections to service discovery servers when they are no longer needed. This should prevent possible connection exhaustion in some cases. See this issue.

  • BUGFIX: vmagent: do not show relabel debug links at the /targets page when vmagent runs with -promscrape.dropOriginalLabels command-line flag, since in this case it doesn’t keep the original labels needed for relabel debug. See this issue.

  • BUGFIX: vminsert: fixed decoding of label values with slash when accepting data via pushgateway protocol. This fixes Prometheus golang client compatibility. See this issue.

  • BUGFIX: MetricsQL: properly parse binary operations with reserved words on the right side such as foo + (on{bar="baz"}). Previously such queries could lead to panic. See this issue.

  • BUGFIX: Official Grafana dashboards for VictoriaMetrics: display cache usage for all components on panel Cache usage % by type for cluster dashboard. Before, only vmstorage caches were shown.
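The MetricsQL "or" filters entry in this release states that {env="prod",job="a" or env="dev",job="b"} selects series matching either group of filters. A minimal Python sketch of that matching rule follows. It covers only equality filters, and the matches_selector helper is hypothetical.

```python
def matches_selector(labels, filter_groups):
    """Return True if labels satisfy at least one group of filters.

    filter_groups is a list of dicts; each dict is one `or` branch and
    holds label="value" equality filters that must all match.
    Hypothetical illustration limited to equality filters.
    """
    return any(
        all(labels.get(name) == value for name, value in group.items())
        for group in filter_groups
    )
```

For the selector above, filter_groups would be [{"env": "prod", "job": "a"}, {"env": "dev", "job": "b"}]: a series with env="prod" and job="b" matches neither branch and is not selected.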

v1.91.3#

Released at 2023-06-30

  • SECURITY: upgrade Go builder from Go1.20.4 to Go1.20.5. See the list of issues addressed in Go1.20.5.

  • BUGFIX: vmagent: fix possible panic at shutdown when stream aggregation is enabled. See this pull request for details.

  • BUGFIX: vmagent: fixed service name detection for consulagent service discovery in case of a difference in service name and service id. See this issue for details.

  • BUGFIX: vmalert: retry all errors except 4XX status codes while pushing via remote-write to the remote storage. Previously, errors like broken connection could prevent vmalert from retrying the request.

  • BUGFIX: vmalert: properly interrupt retry attempts on vmalert shutdown. Before, vmalert could have waited for all retries to finish for shutdown.

  • BUGFIX: vmbackupmanager: fix an issue with vmbackupmanager not being able to restore data from a backup stored in GCS. See this issue for details.

  • BUGFIX: VictoriaMetrics cluster: properly return error from /api/v1/query and /api/v1/query_range at vmselect when the -search.maxSamplesPerQuery or -search.maxSamplesPerSeries limit is exceeded. Previously incomplete response could be returned without the error if vmselect runs with -replicationFactor greater than 1. See this pull request.

  • BUGFIX: storage: prevent from possible crashloop after the migration from versions below v1.90.0 to newer versions. See this issue for details.

  • BUGFIX: vmui: fix a memory leak issue associated with chart updates. See this pull request.

  • BUGFIX: vmbackupmanager: fix removing storage data dir before restoring from backup.

  • BUGFIX: vmselect: wait for all vmstorage nodes to respond when the -replicationFactor flag is set to a value greater than 1. Before, vmselect could skip waiting for the slowest replicas to respond. This could result in issues illustrated here. Now, this optimization is disabled by default and can be re-enabled by passing -search.skipSlowReplicas cmd-line flag to vmselect. See more details here.
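The vmalert remote-write entry in this release retries all errors except responses with 4XX status codes. The decision can be sketched as a small predicate. The should_retry function is hypothetical, and it uses None to represent a request that failed before any response arrived (for example a broken connection).

```python
from typing import Optional


def should_retry(status_code: Optional[int]) -> bool:
    """Decide whether a failed remote-write push should be retried.

    status_code is None when no HTTP response was received.
    Hypothetical illustration; not vmalert's actual code.
    """
    if status_code is None:
        return True               # network-level error: retry
    if 200 <= status_code < 300:
        return False              # success: nothing to retry
    # 4XX means the request itself is bad, so retrying cannot help.
    return not (400 <= status_code < 500)
```

Under this rule a 503 response or a dropped connection is retried, while a 400 response is not.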

v1.91.2#

Released at 2023-06-02

  • BUGFIX: vmalert: fix nil map assignment panic in runtime introduced in this change.

v1.91.1#

Released at 2023-06-01

  • FEATURE: vmagent: add follow_redirects at service discovery level of scrape configuration. See this issue. Thanks to @Haleygo for the pull request.

  • FEATURE: vmselect: Decreases startup time for vmselect with a big number of vmstorage nodes. See this issue. Thanks to @Haleygo for the pull request.

  • BUGFIX: vmalert: properly form path to static assets in web UI if -http.pathPrefix is set. See this issue.

  • BUGFIX: vmalert: Properly set datasource query params. See this issue. Thanks to @gsakun for the pull request.

  • BUGFIX: vmalert: properly return empty slices instead of nil for /api/v1/rules for groups with present name but absent rules. See this issue.

  • BUGFIX: vmauth: Properly handle LOCAL command for proxy protocol. See this issue.

  • BUGFIX: vmbackupmanager: Fixes crash on startup. See this issue.

  • BUGFIX: vmui: fix bug with custom URL in global settings not respecting tenantID change. See this issue.
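The vmalert /api/v1/rules entry in this release is about returning an empty list instead of a missing value for groups without rules. In JSON the difference is [] versus null, and clients that iterate over the field typically handle only the former. The Python sketch below shows the distinction; the render_group helper and its field names are hypothetical.

```python
import json


def render_group(name, rules=None):
    """Serialize a rule group, always emitting a list for "rules".

    Hypothetical illustration; the field names are assumptions.
    """
    # Normalize a missing value to an empty list so that the JSON
    # output contains [] rather than null.
    return json.dumps({"name": name, "rules": rules if rules is not None else []})
```

Without the normalization, json.dumps would emit "rules": null for a group with no rules.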

v1.91.0#

Released at 2023-05-18

  • SECURITY: upgrade Go builder from Go1.20.3 to Go1.20.4. See the list of issues addressed in Go1.20.4.

  • SECURITY: serve /robots.txt content to disallow indexing of the exposed instances by search engines. See this issue for details.

  • FEATURE: update docker compose environment to V2 in respect to V1 deprecation notice from June 2023. See Migrate to Compose V2.

  • FEATURE: deprecate -bigMergeConcurrency command-line flag, since improper configuration for this flag frequently led to uncontrolled growth of unmerged parts, which, in turn, could lead to queries slowdown and increased CPU usage. The concurrency for background merges can be controlled via -smallMergeConcurrency command-line flag, though it isn’t recommended to change this flag in general case.

  • FEATURE: do not execute the incoming request if it has been canceled by the client before the execution start. See this pull request.

  • FEATURE: support time formats with timezones. For example, 2024-01-02+02:00 means January 2, 2024 at +02:00 time zone. See these docs.

  • FEATURE: expose process_* metrics at /metrics page of all the VictoriaMetrics components under Windows OS. See this pull request.

  • FEATURE: reduce the amounts of unimportant INFO logging during VictoriaMetrics startup / shutdown. This should improve visibility for potentially important logs.

  • FEATURE: upgrade base docker image (alpine) from 3.17.3 to 3.18.0. See alpine 3.18.0 release notes.

  • FEATURE: VictoriaMetrics cluster: do not pollute logs with cannot read hello: cannot read message with size 11: EOF messages at vmstorage during TCP health checks performed by Consul or other services. See this issue.

  • FEATURE: vmagent: support the ability to filter consul_sd_configs targets in more optimal way via new filter option. See this feature request.

  • FEATURE: vmagent: add support for consulagent_sd_configs. See this feature request.

  • FEATURE: vmagent: emit a warning if too small value is passed to -remoteWrite.maxDiskUsagePerURL command-line flag. See this issue.

  • FEATURE: vmalert: add support of recursive globs for -rule and -rule.templates command-line flags by using ** in the glob pattern. See this issue.

  • FEATURE: vmalert: add ability to specify custom per-group HTTP headers sent to the configured notifiers. See this issue. Thanks to @Haleygo for the pull request.

  • FEATURE: vmalert: detect alerting rules which don’t match any series. See these docs and this feature request.

  • FEATURE: vmalert: support loading rules via HTTP URL. See this issue. Thanks to @Haleygo for the pull request.

  • FEATURE: vmalert: add buttons for filtering groups/rules with errors or with no-match warning in web UI for page /groups. See this issue.

  • FEATURE: vmalert: do not retry remote-write requests for responses with 4XX status codes. This aligns with Prometheus remote write specification. Thanks to @MichaHoffmann for the pull request.

  • FEATURE: vmauth: add ability to filter incoming requests by IP. See these docs and this feature request.

  • FEATURE: vmauth: add ability to proxy requests to the specified backends for unauthorized users. See this feature request.

  • FEATURE: vmauth: add ability to specify default route for unmatched requests. See this feature request.

  • FEATURE: vmauth: retry POST requests on the remaining backends if the currently selected backend isn’t reachable. See this issue.

  • FEATURE: vmui: add ability to compare the data for the previous day with the data for the current day at Cardinality Explorer. See this feature request.

  • FEATURE: vmui: display histograms as heatmaps in Metrics explorer. See this feature request.

  • FEATURE: vmui: add WITH template playground. See this feature request.

  • FEATURE: vmui: add ability to debug relabeling. See this feature request.

  • FEATURE: vmui: add the ability to copy and execute queries listed at top queries page. Also make the query duration column more human-readable. See this feature request and this pull request.

  • FEATURE: vmui: increase default font size for better readability.

  • FEATURE: vmui: cardinality explorer: bring back the table with labels containing the highest number of unique label values. See issue.

  • FEATURE: vmui: add notification icon for queries that do not match any time series. A warning icon appears next to the query field when the executed query does not match any time series. See this feature request.

  • FEATURE: vmbackup: add -s3StorageClass command-line flag for setting the storage class for AWS S3 backups. See this issue. Thanks to @justcompile for the pull request.

  • FEATURE: vmbackup: store backup creation and completion time in backup_complete.ignore file of backup contents. This allows determining the exact timestamp when the backup was created and completed.

  • FEATURE: vmbackupmanager: add created_at field to the output of /api/v1/backups API and vmbackupmanager backup list command. See this doc for data format details.

  • FEATURE: vmbackupmanager: add commands for locking/unlocking backups against deletion by retention policy. See this doc for data format details.

  • FEATURE: vmctl: add support for different time formats for --vm-native-filter-time-start and --vm-native-filter-time-end command-line flags. See this issue.

  • FEATURE: vmctl: set default value for --vm-native-step-interval command-line flag to month. This enables time-based chunking of data based on monthly step value when using native migration mode. See this issue.

  • BUGFIX: reduce the probability of sudden increase in the number of small parts on systems with small number of CPU cores.

  • BUGFIX: reduce the possibility of increased CPU usage when data with timestamps older than one hour is ingested into VictoriaMetrics. This reduces spikes for the graph sum(rate(vm_slow_per_day_index_inserts_total)). See this pull request.

  • BUGFIX: fix possible infinite loop during indexdb rotation when -retentionTimezoneOffset command-line flag is set and the local timezone is not UTC. See this issue. Thanks to @faceair for the fix.

  • BUGFIX: do not panic at Windows during snapshot deletion. Instead, delete the snapshot on the next restart. See this comment for details.

  • BUGFIX: change the max allowed value for -memory.allowedPercent from 100 to 200. See this issue.

  • BUGFIX: properly limit the number of OpenTSDB HTTP concurrent requests specified via -maxConcurrentInserts command-line flag. See this issue. Thanks to @zouxiang1993 for the fix.

  • BUGFIX: do not ignore trailing empty field in CSV lines when importing data in CSV format. See this issue.

  • BUGFIX: disallow " chars when parsing Prometheus label names, since they aren’t allowed by Prometheus text exposition format. Previously this could result in silent incorrect parsing of incorrect Prometheus labels such as foo{"bar"="baz"} or {foo:"bar",baz="aaa"}. See this issue.

  • BUGFIX: VictoriaMetrics cluster: prevent from possible panic when the number of vmstorage nodes increases when automatic vmstorage discovery is enabled.

  • BUGFIX: MetricsQL: fix a panic when the duration in the query contains uppercase M suffix. Such a suffix isn’t allowed in durations, since it clashes with the million suffix, e.g. it isn’t clear whether rate(metric[5M]) means rate over 5 minutes, 5 months or 5 million seconds. See this and this issues.

  • BUGFIX: vmagent: properly handle the vm_promscrape_config_last_reload_successful metric after config reload. See this issue.

  • BUGFIX: vmagent: add __meta_kubernetes_endpoints_name label for all ports discovered from endpoint. Previously, ports not matched by Service did not have this label. See this issue for details. Thanks to @thunderbird86 for discovering and fixing the issue.

  • BUGFIX: vmalert: retry failed read request on the closed connection one more time. This improves rules execution reliability when connection between vmalert and datasource closes unexpectedly.

  • BUGFIX: vmalert: properly display an error when using query function for templating value of -external.alert.source flag. See this issue.

  • BUGFIX: vmalert: properly return empty slices instead of nil for /api/v1/rules and /api/v1/alerts API handlers. See this issue.

  • BUGFIX: vmauth: do not return invalid auth credentials in http response by default, since it may be logged by client. See this issue.

  • BUGFIX: vmui: fix the display of the tenant selector. See this issue.

  • BUGFIX: vmui: fix UI freeze when the query returns non-histogram series alongside histogram series.

  • BUGFIX: vmui: fix the text display on buttons in Safari 16.4.

  • BUGFIX: alerts-health: update threshold for TooHighMemoryUsage alert from 90% to 80%, since 90% is too high for production environments.

  • BUGFIX: vmbackup: fix compatibility with Windows OS. See this issue.

  • BUGFIX: vmctl: fix performance issue when migrating data from VictoriaMetrics according to these docs. Add the ability to speed up the data migration via --vm-native-disable-retries command-line flag. See this issue.

  • BUGFIX: stream aggregation: fix bug with duplicated labels during stream aggregation via single-node VictoriaMetrics. See this issue.
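
The time format with timezone offsets mentioned in this release can be illustrated with a short Python sketch. This is not VictoriaMetrics code; it only shows which instant a value such as 2024-01-02+02:00 denotes, under the assumption that a bare date refers to midnight in the given offset. The helper name parse_date_with_offset is hypothetical.

```python
from datetime import datetime, timezone


def parse_date_with_offset(value: str) -> datetime:
    """Parse a YYYY-MM-DD date followed by a +HH:MM or -HH:MM offset."""
    # %z accepts offsets written with a colon (such as +02:00) on Python 3.7+.
    # Assumption: a date without a time part means midnight in that offset.
    return datetime.strptime(value, "%Y-%m-%d%z")


local_midnight = parse_date_with_offset("2024-01-02+02:00")
as_utc = local_midnight.astimezone(timezone.utc)
print(as_utc.isoformat())
```

Under that assumption, midnight of January 2, 2024 at +02:00 is the same instant as 22:00 UTC on January 1, 2024.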

v1.90.0#

Released at 2023-04-06

Update note: this release contains a backwards-incompatible change in storage data format, so the previous versions of VictoriaMetrics will exit with the unexpected number of substrings in the part name error when trying to run them on the data created by v1.90.0 or newer versions. The solution is to upgrade to v1.90.0 or newer releases.

  • SECURITY: upgrade base docker image (alpine) from 3.17.2 to 3.17.3. See alpine 3.17.3 release notes.

  • SECURITY: upgrade Go builder from Go1.20.2 to Go1.20.3. See the list of issues addressed in Go1.20.3.

  • FEATURE: open source Graphite Render API. This API allows using VictoriaMetrics as a drop-in replacement for Graphite at both data ingestion and querying sides and reducing infrastructure costs by up to 10x compared to Graphite. See this case study as an example.

  • FEATURE: release Windows binaries for single-node VictoriaMetrics, VictoriaMetrics cluster, vmbackup and vmrestore. See this, this and this issues. This release of VictoriaMetrics for Windows cannot delete snapshots due to Windows constraints. See this comment for details. This issue should be resolved in future releases.

  • FEATURE: log metrics with truncated labels if the length of label value in the ingested metric exceeds -maxLabelValueLen. This should simplify debugging for this case.

  • FEATURE: vmagent: show target URL when debugging target relabeling. This should simplify target relabel debugging a bit. See this pull request.

  • FEATURE: vmagent: add support for VictoriaMetrics remote write protocol when sending / receiving data to / from Kafka. This protocol allows saving egress network bandwidth costs when sending data from vmagent to Kafka located in another datacenter or availability zone. See this feature request.

  • FEATURE: vmagent: add -kafka.consumer.topic.concurrency command-line flag. It controls the number of Kafka consumer workers to use by vmagent. It should eliminate the need to start multiple vmagent instances to improve data transfer rate. See this feature request.

  • FEATURE: vmagent: add support for Kafka producer and consumer on arm64 machines. See this issue.

  • FEATURE: vmagent: delete unused buffered data at -remoteWrite.tmpDataPath directory when there is no matching -remoteWrite.url to send this data to. See this feature request.

  • FEATURE: vmagent: add the ability for hot reloading of stream aggregation configs. See these docs and this feature request.

  • FEATURE: check the contents of -relabelConfig and -streamAggr.config files additionally to -promscrape.config when single-node VictoriaMetrics runs with -dryRun command-line flag. This aligns the behaviour of single-node VictoriaMetrics with vmagent behaviour for -dryRun command-line flag.

  • FEATURE: vmui: automatically draw a heatmap graph when the query selects a single histogram. This simplifies analyzing histograms. See this feature request.

  • FEATURE: vmui: add support for drag’n’drop and paste from clipboard in the “Trace analyzer” page. See this pull request.

  • FEATURE: vmui: hide messages longer than 3 lines in the trace. You can view the full message by clicking on the show more button. See this pull request.

  • FEATURE: vmui: add the ability to manually input date and time when selecting a time range. See this pull request.

  • FEATURE: vmui: improve usability and the search process in cardinality explorer, making the process more straightforward for users. See this pull request.

  • FEATURE: vmui: add the ability to collapse/expand the legend. See this pull request.

  • FEATURE: vmui: add tips for working with the graph and legend. See this pull request.

  • FEATURE: vmui: add apply and cancel buttons to settings popup. See this issue.

  • FEATURE: vmctl: automatically disable progress bar when TTY isn’t available. See this issue.

  • FEATURE: vmauth: add -configCheckInterval command-line flag, which can be used for automatic re-reading the -auth.config file. See this feature request.

  • BUGFIX: prevent slow snapshot creation under high data ingestion rate. See this issue.

  • BUGFIX: vmauth: suppress proxy protocol parsing errors in case of EOF. Usually, the error is caused by health checks and is not a sign of an actual error.

  • BUGFIX: vmui: fix displaying errors for each query. See this issue.

  • BUGFIX: vmbackup: fix snapshot not being deleted in case of error during backup. See this issue.

  • BUGFIX: stream aggregation: suppress series after dedup error message in logs when -remoteWrite.streamAggr.dedupInterval command-line flag is set at vmagent or when -streamAggr.dedupInterval command-line flag is set at single-node VictoriaMetrics.

  • BUGFIX: allow using dashes and dots in environment variables names referred in config files via %{ENV-VAR.SYNTAX}. See these docs and this issue.

  • BUGFIX: restore query performance scalability on hosts with a big number of CPU cores. The scalability has been reduced in v1.86.0. See this issue.

  • BUGFIX: MetricsQL: properly convert VictoriaMetrics histogram buckets to Prometheus histogram buckets when the VictoriaMetrics histogram contains zero buckets. Previously these buckets were ignored, and this could lead to missing Prometheus histogram buckets after the conversion. Thanks to @zklapow for the fix.

  • BUGFIX: vmagent: fix CPU and memory usage spikes when files pointed by file_sd_config cannot be re-read. See this issue.

  • BUGFIX: prevent unexpected merges on start-up when -storage.minFreeDiskSpaceBytes is set. See the issue.

  • BUGFIX: properly support comma-separated filters inside retention filters. See this issue.

  • BUGFIX: verify response code when fetching configuration files via HTTP. See this issue.

  • BUGFIX: vmalert: replace empty labels with "" instead of "<no value>" during templating, as Prometheus does. See this issue.

  • BUGFIX: vmctl: properly pass multiple filters from --vm-native-filter-match command-line flag to the data source. Previously filters from --vm-native-filter-match were only used to discover the metric names, and only metric name filters like __name__="metric_name" were taken into account, while the remaining filters were ignored. For example, --vm-native-filter-match={foo="bar",baz="abc"} could find metric_name{foo="bar",baz="abc"}, after which the filter was treated as --vm-native-filter-match={__name__="metric_name"}, i.e. the foo="bar",baz="abc" filter was ignored. See this issue.
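
The %{ENV-VAR.SYNTAX} placeholders mentioned in the bugfix list above can be pictured with the following Python sketch. It is only an illustration of placeholder names that contain dashes and dots, not the VictoriaMetrics implementation. The pattern, the function name expand_placeholders and the choice to leave unknown placeholders untouched are all assumptions made for this example.

```python
import re

# Hypothetical pattern: a placeholder name may contain letters, digits,
# underscores, dashes and dots.
PLACEHOLDER = re.compile(r"%\{([A-Za-z0-9_.\-]+)\}")


def expand_placeholders(text: str, env: dict) -> str:
    """Replace %{NAME} with env[NAME]; unknown names are left as-is (assumption)."""
    def substitute(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, text)


config = "url: http://%{VM-HOST.NAME}:8428/api/v1/write"
print(expand_placeholders(config, {"VM-HOST.NAME": "victoria"}))
```

With the sample mapping, the placeholder is replaced by victoria, while a placeholder missing from the mapping would be kept verbatim in this sketch.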

v1.89.1#

Released at 2023-03-12

  • BUGFIX: prevent from possible cannot unmarshal timeseries from rollupResultCache panic after the upgrade to v1.89.0.

v1.89.0#

Released at 2023-03-12

Update note: this release can crash with cannot unmarshal timeseries from rollupResultCache panic after the upgrade from the previous releases. This issue can be fixed by removing caches stored on disk according to these docs. Another option is to upgrade to v1.89.1.

  • SECURITY: upgrade Go builder from Go1.20.1 to Go1.20.2. See the list of issues addressed in Go1.20.2.

  • FEATURE: vmctl: increase the default value for --remote-read-http-timeout command-line option from 30s (30 seconds) to 5m (5 minutes). This reduces the probability of timeout errors when migrating big number of time series. See this pull request.

  • FEATURE: vmctl: migrate series one-by-one in vm-native mode. This allows better tracking the migration progress and resuming the migration process from the last migrated time series. See this pull request and this feature request.

  • FEATURE: vmctl: add --vm-native-src-headers and --vm-native-dst-headers command-line flags, which can be used for setting custom HTTP headers during vm-native migration mode. Thanks to @baconmania for the pull request.

  • FEATURE: vmctl: add --vm-native-src-bearer-token and --vm-native-dst-bearer-token command-line flags, which can be used for setting Bearer token headers for the source and the destination storage during vm-native migration mode. See this feature request.

  • FEATURE: vmctl: add --vm-native-disable-http-keep-alive command-line flag to allow vmctl to use non-persistent HTTP connections in vm-native migration mode. Thanks to @baconmania for the pull request.

  • FEATURE: vmalert: log the number of configuration files found for each specified -rule command-line flag.

  • FEATURE: vmalert enterprise: concurrently read config files from S3, GCS or S3-compatible object storage. This significantly improves config load speed for cases when there are thousands of files to read from the object storage.

  • BUGFIX: vmstorage: fix a bug, which could lead to incomplete or empty results for heavy queries selecting tens of thousands of time series. See this pull request.

  • BUGFIX: vmselect: reduce memory usage and CPU usage when performing heavy queries. See this issue.

  • BUGFIX: prevent from possible invalid memory address or nil pointer dereference panic during background merge. The issue has been introduced at v1.85.0. See this issue.

  • BUGFIX: prevent from possible SIGBUS crash on ARM architectures (Raspberry Pi), which deny unaligned access to 8-byte words. Thanks to @oliverpool for narrowing down the issue and for the initial attempt to fix it.

  • BUGFIX: VictoriaMetrics cluster: always return is_partial: true in partial responses. Previously partial responses could be returned as non-partial in some cases.

  • BUGFIX: VictoriaMetrics cluster: properly take into account -rpc.disableCompression command-line flag at vmstorage. It was ignored since v1.78.0. See this pull request.

  • BUGFIX: vmagent: fix panic when writing data to Kafka. The panic has been introduced in v1.88.0.

  • BUGFIX: vmui: stop showing Please enter a valid Query and execute it error message on the first load of vmui.

  • BUGFIX: vmui: properly process Run in VMUI button click in VictoriaMetrics datasource plugin for Grafana.

  • BUGFIX: vmui: fix the display of the selected value for dropdowns on Explore page.

  • BUGFIX: vmui: do not send step param for instant queries. See this issue.

  • BUGFIX: vmauth: fix cannot serve http panic when plain HTTP request is sent to vmauth configured to accept requests over proxy protocol-encoded request (e.g. when vmauth runs with -httpListenAddr.useProxyProtocol command-line flag). The issue has been introduced at v1.87.0 when implementing this feature.

  • BUGFIX: vmgateway: properly parse RSA public key discovered via JWK endpoint.

v1.88.1#

Released at 2023-02-27

  • FEATURE: add -snapshotCreateTimeout flag to allow configuring timeout for snapshot process. See this issue.

  • FEATURE: expose vm_http_requests_total and vm_http_request_errors_total metrics for snapshot/* paths at VictoriaMetrics cluster vmstorage and VictoriaMetrics Single. See this issue.

  • FEATURE: vmgateway: add the ability to discover keys for JWT verification via OpenID discovery endpoint. See these docs.

  • FEATURE: add -internStringDisableCache command-line flag for disabling the cache for interned strings. This flag may be useful in some cases for reducing memory usage at the cost of higher CPU usage.

  • FEATURE: add -internStringCacheExpireDuration command-line flag for controlling the lifetime of cached interned strings.

  • BUGFIX: MetricsQL: fix panic when executing the query aggr_func(rollup*(some_value)). The panic has been introduced in v1.88.0.

  • BUGFIX: vmagent: use the provided -remoteWrite.* auth options when determining whether the remote storage supports VictoriaMetrics remote write protocol. Previously the auth options were ignored. This was preventing from automatic switch to VictoriaMetrics remote write protocol.

  • BUGFIX: vmagent: do not register vm_promscrape_config_* metrics if -promscrape.config flag is not used. Previously those metrics were registered and never updated, which was confusing and could trigger false-positive alerts.

  • BUGFIX: vmctl: skip measurements with no fields when migrating data from influxdb. See this issue.

  • BUGFIX: delete failed snapshot contents from disk on failed attempt to create snapshot. Previously failed snapshot contents could remain on disk in incomplete state. See this issue.

v1.88.0#

Released at 2023-02-24

  • SECURITY: upgrade base docker image (alpine) from 3.17.1 to 3.17.2. See alpine 3.17.2 release notes.

  • SECURITY: upgrade Go builder from Go1.20.0 to Go1.20.1. See the list of issues addressed in Go1.20.1.

  • FEATURE: vmagent: add support for VictoriaMetrics remote write protocol. This protocol allows saving egress network bandwidth costs when sending data from vmagent to VictoriaMetrics located in another datacenter or availability zone. This also allows reducing disk IO under high load when vmagent starts queuing the collected data to disk when the remote storage is temporarily unavailable or cannot keep up with the data ingestion rate. See this feature request.

  • FEATURE: vmagent: add support for Kuma Control Plane targets discovery aka kuma_sd_configs. See this issue.

  • FEATURE: vmgateway: add the ability to verify JWT signature via JWKS endpoint. See these docs.

  • FEATURE: vmauth: add the ability to limit the number of concurrent requests on a per-user basis via -maxConcurrentPerUserRequests command-line flag and via max_concurrent_requests config option. See this feature request and these docs.

  • FEATURE: vmauth: automatically retry failing GET requests on all the configured backends. Previously the backend error has been immediately returned to the client without retrying the request on the remaining backends.

  • FEATURE: vmauth: choose the backend with the minimum number of concurrently executed requests among the configured backends in a round-robin manner for serving the incoming requests. This allows spreading the load among backends more evenly, while improving the response time.

  • FEATURE: vmalert enterprise: add ability to read alerting and recording rules from S3, GCS or S3-compatible object storage. See these docs.

  • FEATURE: vmctl: automatically retry requests to remote storage if up to 5 errors occur during the data migration process. This should help continuing the data migration process on temporary errors. Previously vmctl was stopping after the first error. See this feature request.

  • FEATURE: MetricsQL: support optional 2nd argument min, max or avg for rollup, rollup_delta, rollup_deriv, rollup_increase, rollup_rate and rollup_scrape_interval function. If the second argument is passed, then the function returns only the selected aggregation type. This change can be useful for situations where only one type of rollup calculation is needed. For example, rollup_rate(requests_total[1i], "max") would return only the max increase rates for requests_total metric per each interval between adjacent points on the graph. See this article for details.

  • FEATURE: MetricsQL: support optional 2nd argument open, low, high, close for rollup_candlestick function. If the second argument is passed, then the function returns only the selected aggregation type.

  • FEATURE: MetricsQL: add share(q) aggregate function.

  • FEATURE: MetricsQL: add mad_over_time(m[d]) function for calculating the median absolute deviation over raw samples on the lookbehind window d. See this feature request.

  • FEATURE: MetricsQL: add range_mad(q) function for calculating the median absolute deviation over points per each time series returned by q.

  • FEATURE: MetricsQL: add range_zscore(q) function for calculating z-score over points per each time series returned from q.

  • FEATURE: MetricsQL: add range_trim_outliers(k, q) function for dropping outliers located farther than k*range_mad(q) from the range_median(q). This should help removing outliers during query time at this issue.

  • FEATURE: MetricsQL: add range_trim_zscore(z, q) function for dropping outliers located farther than z*range_stddev(q) from range_avg(q). This should help removing outliers during query time at this issue.

  • FEATURE: vmui: show median instead of avg in graph tooltip and line legend, since median is more tolerant against spikes. See this issue.

  • FEATURE: add -search.maxSeriesPerAggrFunc command-line flag, which can be used for limiting the number of time series MetricsQL aggregate functions can return in a single query. This flag can be useful for preventing OOMs when count_values function is improperly used.

  • FEATURE: vmui: small UX improvements for mobile view. See this feature request and this pull request.

  • FEATURE: add -search.logQueryMemoryUsage command-line flag for logging queries, which need more memory than specified by this command-line flag. See this feature request. Thanks to @michal-kralik for the idea and the initial implementation.

  • FEATURE: allow setting zero value for -search.latencyOffset command-line flag. This may be needed in some cases. Previously the minimum supported value for -search.latencyOffset command-line flag was 1s.

  • BUGFIX: vmagent: immediately cancel in-flight scrape requests during configuration reload when stream parsing mode is disabled. Previously vmagent could wait for long time until all the in-flight requests are completed before reloading the configuration. This could significantly slow down configuration reload. See this issue.

  • BUGFIX: vmagent: do not wait for 2 seconds after the first unsuccessful attempt to scrape the target before performing the next attempt. This should improve scrape speed when the target closes http keep-alive connection between scrapes. See this and this issues.

  • BUGFIX: vmagent: fix Azure service discovery inside Azure Container App. See this issue. Thanks to @MattiasAng for the fix!

  • BUGFIX: do not put auxiliary directories scheduled for removal into snapshots. This should prevent from cannot create hard links from ...must-remove... errors when making snapshots / backups. See this issue.

  • BUGFIX: prevent from possible data ingestion slowdown and query performance slowdown during background merges of big parts on systems with small number of CPU cores (1 or 2 CPU cores). The issue has been introduced in v1.85.0 when implementing this feature. See also this issue.

  • BUGFIX: properly parse timestamps in milliseconds when ingesting data via OpenTSDB telnet put protocol. Previously timestamps in milliseconds were mistakenly multiplied by 1000. Thanks to @Droxenator for the pull request.

  • BUGFIX: MetricsQL: do not add extrapolated points outside the real points when using interpolate() function. See this issue.
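
The outlier-related MetricsQL functions added in this release (range_mad, range_zscore, range_trim_outliers and range_trim_zscore) follow directly from the formulas given in their descriptions. The Python sketch below mirrors those formulas on a plain list of points. It is not the MetricsQL implementation: it assumes population standard deviation, and it drops trimmed points from the list, which may differ from how the real functions represent removed points.

```python
import statistics


def range_mad(points):
    """Median absolute deviation of the points around their median."""
    med = statistics.median(points)
    return statistics.median([abs(p - med) for p in points])


def range_zscore(points):
    """Z-score per point: (value - avg) / stddev (population stddev assumed)."""
    avg = statistics.fmean(points)
    std = statistics.pstdev(points)  # a constant series would give std == 0
    return [(p - avg) / std for p in points]


def range_trim_outliers(k, points):
    """Keep points located no farther than k * MAD from the median."""
    med = statistics.median(points)
    limit = k * range_mad(points)
    return [p for p in points if abs(p - med) <= limit]


def range_trim_zscore(z, points):
    """Keep points located no farther than z * stddev from the average."""
    avg = statistics.fmean(points)
    limit = z * statistics.pstdev(points)
    return [p for p in points if abs(p - avg) <= limit]


print(range_trim_outliers(3, [1, 2, 3, 4, 100]))
print(range_trim_zscore(1.5, [2, 4, 4, 4, 5, 5, 7, 9]))
```

In the first sample the median is 3 and the MAD is 1, so the point 100 is dropped. The same median-of-deviations calculation, applied to raw samples on a lookbehind window, is what mad_over_time is described as computing.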

v1.87.12

Released at 2023-12-10

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since the v1.87.0 release.

  • SECURITY: upgrade base docker image (Alpine) from 3.18.4 to 3.19.0. See alpine 3.19.0 release notes.

  • SECURITY: upgrade Go builder from Go1.21.4 to Go1.21.5. See the list of issues addressed in Go1.21.5.

  • BUGFIX: vmalert: sanitize label names before sending the alert notification to Alertmanager. Before, vmalert would send notifications with labels containing characters not supported by Alertmanager validator, resulting into validation errors like msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar".

  • BUGFIX: properly escape < character in responses returned via /federate endpoint. See this issue.
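
The label name sanitizing mentioned in this release can be sketched as follows. This is a hypothetical helper, not vmalert's actual code. It assumes the Prometheus rule that label names match [a-zA-Z_][a-zA-Z0-9_]* and replaces every other character with an underscore; prefixing names that start with a digit is an additional assumption of this sketch.

```python
import re

# Characters outside this set are not allowed in Prometheus-style label names.
INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def sanitize_label_name(name: str) -> str:
    """Replace unsupported characters in a label name with underscores."""
    sanitized = INVALID_CHARS.sub("_", name)
    # Assumption: a name starting with a digit gets an underscore prefix.
    if sanitized and sanitized[0].isdigit():
        sanitized = "_" + sanitized
    return sanitized


print(sanitize_label_name("foo.bar"))
```

For the foo.bar label from the error message above, the sketch produces foo_bar, which passes the label name pattern.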

v1.87.11#

Released at 2023-11-14

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since the v1.87.0 release.

  • SECURITY: upgrade Go builder from Go1.21.3 to Go1.21.4. See the list of issues addressed in Go1.21.4.

  • BUGFIX: vmagent: properly apply relabeling with regexps, which start and end with .+ or .* and which contain alternate sub-regexps. For example, .+;|;.+ or .*foo|bar|baz.*. Previously such regexps were improperly parsed, which could result in unexpected relabeling results. See this issue.

  • BUGFIX: fix panic, which could occur when query tracing is enabled. See this issue.

  • BUGFIX: vmstorage: log warning about switching to ReadOnly mode only on state change. Before, vmstorage would log this warning every 1s. See this issue for details.
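
The relabeling regexps from the vmagent bugfix above are easy to misread, because alternation binds weaker than concatenation: .*foo|bar|baz.* consists of three separate branches rather than a prefix, a choice and a suffix. The Python sketch below shows the expected matching behaviour. It assumes Prometheus-style relabeling, where the regex is anchored at both ends, and it uses Python's re module rather than the regexp engine used by vmagent, so it only illustrates the semantics.

```python
import re


def matches(regex: str, value: str) -> bool:
    """Return True if the whole value matches the regex (anchored at both ends)."""
    return re.fullmatch(regex, value) is not None


# Branches of .*foo|bar|baz.* : ".*foo", "bar" and "baz.*"
for value in ["xfoo", "bar", "bazx", "xbar", "foox"]:
    print(value, matches(r".*foo|bar|baz.*", value))

# Branches of .+;|;.+ : ".+;" and ";.+"
for value in ["a;", ";b", "a;b"]:
    print(value, matches(r".+;|;.+", value))
```

Here xbar and foox do not match the first regex, and a;b does not match the second one, since no single branch covers the whole value.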

v1.87.10#

Released at 2023-10-16

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since the v1.87.0 release.

v1.87.9#

Released at 2023-09-10

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since the v1.87.0 release.

  • SECURITY: upgrade Go builder from Go1.21.0 to Go1.21.1. See the list of issues addressed in Go1.21.1.

  • BUGFIX: vminsert enterprise: properly parse /insert/multitenant/* urls, which have been broken since v1.93.2. See this issue.

  • BUGFIX: properly build production armv5 binaries for GOARCH=arm. This has been broken after the upgrading of Go builder to Go1.21.0. See this issue.

  • BUGFIX: vmselect: return 503 Service Unavailable status code when partial responses are denied and some of vmstorage nodes are temporarily unavailable. Previously 422 Unprocessable Entity status code was mistakenly returned in this case, which could prevent automatic recovery by re-sending the request to a healthy cluster replica in another availability zone.

  • BUGFIX: vmalert: fix the bug when Group’s params fields with multiple values were overriding each other instead of adding up. The bug was introduced in this commit starting from v1.87.7. See this issue.

v1.87.8#

Released at 2023-09-01

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since the v1.87.0 release.

  • BUGFIX: build: fix Docker builds for old Docker releases. See this issue.
  • BUGFIX: vmselect: correctly handle requests with /select/multitenant prefix. Such requests must be rejected since vmselect does not support multitenancy endpoint. Previously, such requests were causing panic. See this issue.
  • BUGFIX: vminsert: properly check for read-only state at vmstorage. Previously it wasn’t properly checked, which could lead to increased resource usage and data ingestion slowdown when some of vmstorage nodes are in read-only mode. See this issue.
  • BUGFIX: vminsert: properly close broken vmstorage connection during read-only state checks at vmstorage. Previously it wasn’t properly closed, which prevented restoring vmstorage node from read-only mode. See this issue.
  • BUGFIX: vmstorage: prevent from breaking vmselect -> vmstorage RPC communication when vmstorage returns an empty label name at /api/v1/labels request. See this issue.
  • BUGFIX: do not allow starting VictoriaMetrics components with improperly set boolean command-line flags in the form -boolFlagName value, since this leads to silent incomplete flags’ parsing. This form should be replaced with -boolFlagName=value. See this issue.
  • BUGFIX: properly replace : chars in label names with _ when -usePromCompatibleNaming command-line flag is passed to vmagent, vminsert or single-node VictoriaMetrics. This addresses this comment.
  • BUGFIX: vmbackup: correctly check if specified -dst belongs to specified -storageDataPath. See this issue.
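
A check of whether one path belongs to another, as in the vmbackup bugfix above, is easy to get wrong with a plain string prefix comparison. The Python sketch below shows one way to do such a check. It is not vmbackup's implementation, and it assumes that both values are plain absolute POSIX paths, which is a simplification. The function name is_inside is hypothetical.

```python
import posixpath


def is_inside(path: str, directory: str) -> bool:
    """Return True if path equals directory or is located below it."""
    # Normalize both paths so that ".." segments and duplicate slashes
    # cannot hide the real location.
    path = posixpath.normpath(path)
    directory = posixpath.normpath(directory)
    # Compare whole path components instead of raw string prefixes.
    return posixpath.commonpath([path, directory]) == directory


print(is_inside("/data/backups", "/data"))
print(is_inside("/data2/backups", "/data"))
```

A naive "/data2/backups".startswith("/data") check would report the second path as belonging to /data, while the component-wise comparison does not.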

v1.87.7#

Released at 2023-08-12

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since the v1.87.0 release.

  • SECURITY: upgrade Go builder from Go1.20.4 to Go1.21.0.

  • SECURITY: upgrade base docker image (Alpine) from 3.18.2 to 3.18.3. See alpine 3.18.3 release notes.

  • BUGFIX: vmselect: fix timestamp alignment for Prometheus querying API if time argument is less than 10m from the beginning of Unix epoch.

  • BUGFIX: vminsert: fixed decoding of label values with slash when accepting data via pushgateway protocol. This fixes Prometheus golang client compatibility. See this issue.

  • BUGFIX: vmagent: properly validate scheme for proxy_url field at the scrape config. See this issue for details.

  • BUGFIX: vmagent: close HTTP connections to service discovery servers when they are no longer needed. This should prevent possible connection exhaustion in some cases. See this issue.

  • BUGFIX: vmagent: properly apply if filters during relabeling. Previously the if filter could improperly work. See this issue and this pull request.

  • BUGFIX: vmagent: fix possible panic at shutdown when stream aggregation is enabled. See this pull request for details.

  • BUGFIX: vmagent: use local scrape timestamps for the scraped metrics unless honor_timestamps: true option is explicitly set at scrape_config. This fixes gaps for metrics collected from cadvisor or similar exporters, which export metrics with invalid timestamps. See this issue and this comment for details.

  • BUGFIX: vmauth: properly handle LOCAL command for proxy protocol. See this issue.

  • BUGFIX: VictoriaMetrics cluster: properly return error from /api/v1/query and /api/v1/query_range at vmselect when the -search.maxSamplesPerQuery or -search.maxSamplesPerSeries limit is exceeded. Previously incomplete response could be returned without the error if vmselect runs with -replicationFactor greater than 1. See this pull request.

  • BUGFIX: vmalert: correctly calculate evaluation time for rules. Before, there was a low probability for discrepancy between actual time and rules evaluation time if evaluation interval was lower than the execution time for rules within the group.

  • BUGFIX: vmalert: reset evaluation timestamp after modifying group interval. Before, there could be latency in rule evaluation time.

  • BUGFIX: vmalert: Properly set datasource query params. See this issue. Thanks to @gsakun for the pull request.

  • BUGFIX: vmalert: Properly form path to static assets in WEB UI if -http.pathPrefix is set. See this issue.

  • BUGFIX: vmalert: properly return empty slices instead of nil for /api/v1/rules for groups with present name but absent rules. See this issue.

  • BUGFIX: vmctl: interrupt explore procedure in influx mode if vmctl found no numeric fields.

  • BUGFIX: vmctl: fix panic in case --remote-read-filter-time-start flag is not set for remote-read mode. This flag is now required to use remote-read mode. See this issue.
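
The vmagent timestamp change in this release (local scrape timestamps unless honor_timestamps: true is set) comes down to a simple selection rule. The sketch below only illustrates that rule; the function name and parameters are hypothetical and do not mirror vmagent internals.

```python
from typing import Optional


def sample_timestamp_ms(
    exported_ts_ms: Optional[int],
    scrape_ts_ms: int,
    honor_timestamps: bool = False,
) -> int:
    """Pick the timestamp to store for a scraped sample.

    The timestamp exported by the target is used only when honor_timestamps
    is explicitly enabled and the target actually exported one. Otherwise
    the local scrape time is used.
    """
    if honor_timestamps and exported_ts_ms is not None:
        return exported_ts_ms
    return scrape_ts_ms
```

With the default `honor_timestamps=False`, an exporter that reports stale or invalid timestamps no longer produces gaps, since the local scrape time is always used.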

v1.87.6#

Released at 2023-05-18

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release

  • SECURITY: upgrade Go builder from Go1.20.3 to Go1.20.4. See the list of issues addressed in Go1.20.4.

  • SECURITY: upgrade base docker image (alpine) from 3.17.3 to 3.18.0. See alpine 3.18.0 release notes.

  • SECURITY: serve /robots.txt content to disallow indexing of the exposed instances by search engines. See this issue for details.

  • BUGFIX: reduce the probability of sudden increase in the number of small parts on systems with small number of CPU cores.

  • BUGFIX: reduce the possibility of increased CPU usage when data with timestamps older than one hour is ingested into VictoriaMetrics. This reduces spikes for the graph sum(rate(vm_slow_per_day_index_inserts_total)). See this pull request.

  • BUGFIX: do not ignore trailing empty field in CSV lines when importing data in CSV format. See this issue.

  • BUGFIX: disallow " chars when parsing Prometheus label names, since they aren’t allowed by Prometheus text exposition format. Previously this could result in silent incorrect parsing of incorrect Prometheus labels such as foo{"bar"="baz"} or {foo:"bar",baz="aaa"}. See this issue.

  • BUGFIX: MetricsQL: fix a panic when the duration in the query contains uppercase M suffix. Such a suffix isn’t allowed in durations, since it clashes with a million suffix, e.g. it isn’t clear whether rate(metric[5M]) means rate over 5 minutes, 5 months or 5 million seconds. See this and this issues.

  • BUGFIX: VictoriaMetrics cluster: prevent a possible panic when the number of vmstorage nodes increases while automatic vmstorage discovery is enabled.

  • BUGFIX: properly limit the number of OpenTSDB HTTP concurrent requests specified via -maxConcurrentInserts command-line flag. See this issue. Thanks to @zouxiang1993 for the fix.

  • BUGFIX: vmalert: properly return empty slices instead of nil for /api/v1/rules and /api/v1/alerts API handlers. See this issue.

  • BUGFIX: vmagent: add __meta_kubernetes_endpoints_name label for all ports discovered from endpoint. Previously, ports not matched by Service did not have this label. See this issue for details. Thanks to @thunderbird86 for discovering and fixing the issue.

  • BUGFIX: fix possible infinite loop during indexdb rotation when -retentionTimezoneOffset command-line flag is set and the local timezone is not UTC. See this issue. Thanks to @faceair for the fix.

  • BUGFIX: vmauth: do not return invalid auth credentials in http response by default, since it may be logged by client. See this issue.

  • BUGFIX: alerts-health: update threshold for TooHighMemoryUsage alert from 90% to 80%, since 90% is too high for production environments.

  • BUGFIX: vmagent: properly handle the vm_promscrape_config_last_reload_successful metric after config reload. See this issue.

  • BUGFIX: stream aggregation: fix bug with duplicated labels during stream aggregation via single-node VictoriaMetrics. See this issue.

  • BUGFIX: stream aggregation: suppress series after dedup error message in logs when -remoteWrite.streamAggr.dedupInterval command-line flag is set at vmagent or when -streamAggr.dedupInterval command-line flag is set at single-node VictoriaMetrics.
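
The Prometheus label name fix in this release rejects names containing " chars. A minimal sketch of such a check follows. The function name and error messages are hypothetical, and the real parser validates more than this.

```python
def check_label_name(name: str) -> None:
    """Raise ValueError for label names that must not be accepted.

    Only the rule named in the changelog entry is covered: a label name
    must not contain double quotes. An empty name is rejected as well.
    """
    if not name:
        raise ValueError("label name cannot be empty")
    if '"' in name:
        raise ValueError(f"label name {name!r} contains a double quote")
```

With this check, the label name `"bar"` from the malformed input `foo{"bar"="baz"}` raises an error instead of being parsed silently.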

v1.87.5#

Released at 2023-04-06

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release

v1.87.4#

Released at 2023-03-25

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release

  • BUGFIX: prevent slow snapshot creation under high data ingestion rate. See this issue.
  • BUGFIX: vmauth: suppress proxy protocol parsing errors in case of EOF. Usually, the error is caused by health checks and is not a sign of an actual error.
  • BUGFIX: vmbackup: fix snapshot not being deleted in case of error during backup. See this issue.
  • BUGFIX: allow using dashes and dots in environment variables names referred in config files via %{ENV-VAR.SYNTAX}. See these docs and this issue.
  • BUGFIX: restore query performance scalability on hosts with a big number of CPU cores. The scalability has been reduced in v1.86.0. See this issue.
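
The %{ENV-VAR.SYNTAX} placeholder expansion mentioned in this release can be sketched as follows. This is an illustration only: the exact set of characters allowed in names and the behavior for unset variables are assumptions, and `expand_env` is a hypothetical helper.

```python
import re
from typing import Mapping

# Assumed name syntax: letters, digits, underscores, dashes and dots.
_PLACEHOLDER = re.compile(r"%\{([A-Za-z0-9_.\-]+)\}")


def expand_env(text: str, env: Mapping[str, str]) -> str:
    """Replace every %{NAME} placeholder in text with env[NAME]."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            # Assumption: an unset variable is treated as an error.
            raise KeyError(f"missing environment variable {name!r}")
        return env[name]

    return _PLACEHOLDER.sub(replace, text)
```

In real use `env` would typically be `os.environ`. A plain mapping keeps the sketch easy to test.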

v1.87.3#

Released at 2023-03-12

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release

  • SECURITY: upgrade Go builder from Go1.20.1 to Go1.20.2. See the list of issues addressed in Go1.20.2.

  • BUGFIX: vmstorage: fix a bug, which could lead to incomplete or empty results for heavy queries selecting tens of thousands of time series. See this pull request.

  • BUGFIX: vmselect: reduce memory usage and CPU usage when performing heavy queries. See this issue.

  • BUGFIX: prevent a possible invalid memory address or nil pointer dereference panic during background merge. The issue has been introduced at v1.85.0. See this issue.

  • BUGFIX: prevent a possible SIGBUS crash on ARM architectures (Raspberry Pi), which deny unaligned access to 8-byte words. Thanks to @oliverpool for narrowing down the issue and for the initial attempt to fix it.

  • BUGFIX: VictoriaMetrics cluster: always return is_partial: true in partial responses. Previously partial responses could be returned as non-partial in some cases.

  • BUGFIX: VictoriaMetrics cluster: properly take into account -rpc.disableCompression command-line flag at vmstorage. It was ignored since v1.78.0. See this pull request.

  • BUGFIX: vmagent: do not register vm_promscrape_config_* metrics if -promscrape.config flag is not used. Previously those metrics were registered and never updated, which was confusing and could trigger false-positive alerts.

  • BUGFIX: vmctl: skip measurements with no fields when migrating data from influxdb. See this issue.

  • BUGFIX: vmauth: fix cannot serve http panic when plain HTTP request is sent to vmauth configured to accept requests over proxy protocol-encoded request (e.g. when vmauth runs with -httpListenAddr.useProxyProtocol command-line flag). The issue has been introduced at v1.87.0 when implementing this feature.

v1.87.2#

Released at 2023-02-24

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release

v1.87.1#

Released at 2023-02-09

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release

  • FEATURE: vmalert: alerts state restore procedure was changed to become asynchronous. It doesn’t block groups start anymore which significantly improves vmalert’s startup time. This also means that -remoteRead.ignoreRestoreErrors command-line flag becomes deprecated now and will have no effect if configured. While previously state restore attempt was made for all the loaded alerting rules, now it is called only for alerts which became active after the first evaluation. See this issue.

  • FEATURE: vmui: optimize VMUI for use from smartphones and tablets. See this feature request.

  • FEATURE: vmui: add ability to search tenants in the drop-down list for the tenant selector. See this feature request.

  • FEATURE: vmui: add avg/min/max/last values to line legends and tooltips for graphs. See this feature request.

  • FEATURE: vmui: hide the default per-job resource usage dashboard if a custom dashboard exists at the directory specified via -vmui.customDashboardsPath command-line flag. See this feature request.

  • BUGFIX: vmagent: fix panic in HashiCorp Nomad service discovery. Thanks to @mr-karan for the pull request.

  • BUGFIX: vmalert: fix display of rules number per-group for groups with identical names in UI.

  • BUGFIX: vmalert: prevent disabling state updates tracking per rule via setting values < 1. The minimum number of update states to track is now set to 1.

  • BUGFIX: vmalert: properly update debug and update_entries_limit rule’s params on config’s hot-reload.

  • BUGFIX: properly initialize the vm_concurrent_insert_current metric before exposing it. Previously this metric could be left uninitialized in some cases, e.g. its value was zero. This could lead to false alerts for the query avg_over_time(vm_concurrent_insert_current[1m]) >= vm_concurrent_insert_capacity. See this issue.

  • BUGFIX: vmagent: immediately cancel in-flight scrape requests during configuration reload when using stream parsing mode. Previously vmagent could wait for long time until all the in-flight requests are completed before reloading the configuration. This could significantly slow down configuration reload. See this issue.

  • BUGFIX: vmgateway: do not validate JWT signature if no public keys are provided. Previously this could result in the error setting up jwt verification error.

v1.87.0#

Released at 2023-02-01

v1.87.x is a line of LTS releases. It contains important up-to-date bugfixes. The v1.87.x line will be supported for at least 12 months since v1.87.0 release

  • FEATURE: stream aggregation: add the ability to de-duplicate input samples before aggregation via -streamAggr.dedupInterval and -remoteWrite.streamAggr.dedupInterval command-line options.

  • FEATURE: vmui: add dark mode - it can be selected via settings menu in the top right corner. See this pull request.

  • FEATURE: vmui: improve visual appearance of the top menu. See this feature request.

  • FEATURE: vmui: embed fonts into binary instead of loading them from external sources. This allows using vmui in full from isolated networks without access to Internet. Thanks to @ScottKevill for the pull request.

  • FEATURE: vmui: add ability to switch between tenants by selecting the needed tenant in the drop-down list at the top right corner of the UI. See this pull request.

  • FEATURE: vmagent: reduce memory usage when sending stale markers for targets, which expose big number of metrics. See this and this issues.

  • FEATURE: vmagent: add __meta_kubernetes_pod_container_id meta-label to the targets discovered via kubernetes_sd_configs. This label has been added in Prometheus starting from v2.42.0. See this feature request.

  • FEATURE: vmagent: add __meta_azure_machine_size meta-label to the targets discovered via azure_sd_configs. This label has been added in Prometheus starting from v2.42.0. See this pull request.

  • FEATURE: vmauth: allow limiting the number of concurrent requests sent to vmauth via -maxConcurrentRequests command-line flag. This allows controlling memory usage of vmauth and the resource usage of backends behind vmauth. See this feature request. Thanks to @dmitryk-dk for the initial implementation.

  • FEATURE: allow using VictoriaMetrics components behind proxies, which communicate with the backend via proxy protocol. See this feature request. For example, vmauth accepts proxy protocol connections when it starts with -httpListenAddr.useProxyProtocol command-line flag.

  • FEATURE: add -internStringMaxLen command-line flag, which can be used for fine-tuning RAM vs CPU usage in certain workloads. For example, if the stored time series contain long labels, then it may be useful reducing the -internStringMaxLen in order to reduce memory usage at the cost of increased CPU usage. See this issue.

  • FEATURE: provide GOARCH=386 binaries for single-node VictoriaMetrics, vmagent, vmalert, vmauth, vmbackup and vmrestore components at releases page. See this feature request. Thanks to @denisgolius for the pull request.

  • BUGFIX: fix a bug, which could prevent background merges for the previous partitions until restart if the storage didn’t have enough disk space for final deduplication and down-sampling.

  • BUGFIX: fix a bug, which could lead to increased CPU usage and disk IO usage when adding data to previous months and when the deduplication or downsampling is enabled. See this pull request.

  • BUGFIX: VictoriaMetrics cluster: propagate all the timeout-related errors from vmstorage to vmselect. Previously some timeout errors weren’t returned from vmstorage to vmselect. Instead, vmstorage could log the error and close the connection to vmselect, so vmselect was logging cryptic errors such as cannot execute funcName="..." on vmstorage "...": EOF.

  • BUGFIX: vmui: add support for time zone selection for older versions of browsers. See this pull request.

  • BUGFIX: vmagent: update API version for ec2_sd_configs to fix the issue with missing __meta_ec2_availability_zone_id attribute.

  • BUGFIX: vmagent: properly return 200 OK HTTP status code when importing data via Pushgateway protocol. See this issue.

  • BUGFIX: vmagent: do not add exported_ prefix to scraped metric names, which clash with the automatically generated metric names if honor_labels: true option is set in the scrape_config. See this and this issues.

  • BUGFIX: vmauth: allow re-entering authorization info in the web browser if the entered info was incorrect. Previously it was non-trivial to do via the web browser, since vmauth was returning 400 Bad Request instead of 401 Unauthorized http response code.

  • BUGFIX: vmauth: always log the client address and the requested URL on proxying errors. Previously some errors could miss this information.

  • BUGFIX: vmbackup: fix snapshot not being deleted after backup completion. This issue could result in unnecessary snapshots being stored; such snapshots must be deleted manually. See this issue.

  • BUGFIX: VictoriaMetrics cluster: fix panic on top-level vmselect nodes of multi-level setup when the -replicationFactor flag is set and request contains trace query parameter. See this issue.

v1.86.2#

Released at 2023-01-18

  • SECURITY: vmbackup: do not expose basic auth passwords from -snapshot.createURL and -snapshot.deleteURL command-line flags in logs. Thanks to @toanju for the pull request.

  • FEATURE: vmui: add ability to show custom dashboards at vmui by specifying a path to a directory with dashboard config files via -vmui.customDashboardsPath command-line flag. See this feature request and these docs.

  • FEATURE: vmui: apply the step globally to all the displayed graphs. See this feature request.

  • FEATURE: vmui: improve the appearance of graph lines by using more visually distinct colors. See this feature request.

  • BUGFIX: do not slow down concurrently executed queries during assisted merges, since assisted merges already prioritize data ingestion over queries. The probability of assisted merges has been increased starting from v1.85.0 because of internal refactoring. This could result in slowed down queries when there are plenty of free CPU resources. See this and this issues.

  • BUGFIX: reduce the increased CPU usage at vmselect to v1.85.3 level when processing heavy queries. See this issue.

  • BUGFIX: retention filters: fix FATAL: cannot locate metric name for metricID=...: EOF panic, which could occur when retention filters are enabled.

  • BUGFIX: vmagent: properly cancel in-flight service discovery requests for consul_sd_configs and nomad_sd_configs when the service list changes. See this issue.

  • BUGFIX: vmagent: dockerswarm_sd_configs: apply filters only to objects of the specified role. Previously filters were applied to all the objects, which could cause errors when different types of objects were used with filters that were not compatible with them. See this issue.

  • BUGFIX: vmagent: suppress all the scrape errors when -promscrape.suppressScrapeErrors is enabled. Previously some scrape errors were logged even if -promscrape.suppressScrapeErrors flag was set.

  • BUGFIX: vmagent: consistently put the scrape url with scrape target labels to all error logs for failed scrapes. Previously some failed scrapes were logged without this information.

  • BUGFIX: vmagent: do not send stale markers to remote storage for series exceeding the configured series limit. See this issue.

  • BUGFIX: vmagent: properly apply series limit when staleness tracking is disabled.

  • BUGFIX: vmagent: reduce memory usage spikes when big number of scrape targets disappear at once. See this issue. Thanks to @lzfhust for the initial fix.

  • BUGFIX: Pushgateway import: properly return 200 OK HTTP response code. See this issue.

  • BUGFIX: MetricsQL: properly parse M and Mi suffixes as 1e6 multipliers in 1M and 1Mi numeric constants. See this issue. The issue has been introduced in v1.86.0.

  • BUGFIX: vmui: properly display range query results at Table view. For example, up[5m] query now shows all the raw samples for the last 5 minutes for the up metric at the Table view. See this issue.

v1.86.1#

Released at 2023-01-10

  • BUGFIX: return correct query results over time series with gaps. The issue has been introduced in v1.86.0.
  • BUGFIX: properly take into account the timeout passed by vmselect to vmstorage during query execution. This issue could result in the following error logs at vmstorage under load: cannot process vmselect request: cannot execute "search_v7": couldn't start executing the request in 0.000 seconds, since -search.maxConcurrentRequests=... concurrent requests are already executed. The issue has been introduced in v1.86.0.

v1.86.0#

Released at 2023-01-10

It is recommended to upgrade to VictoriaMetrics v1.86.1, because v1.86.0 contains a bug, which could lead to incorrect query results over time series with gaps.

Update note 1: This release changes the logic behind -maxConcurrentInserts command-line flag. Previously this flag was limiting the number of concurrent connections established from clients, which send data to VictoriaMetrics. Some of these connections could be temporarily idle. Such connections do not take significant CPU and memory resources, so there is no need in limiting their count. The new logic takes into account only those connections, which actively ingest new data to VictoriaMetrics and to vmagent. This means that the default -maxConcurrentInserts value should handle cases, which could require increasing the value in the previous releases. So it is recommended trying to remove the explicitly set -maxConcurrentInserts command-line flag after upgrading to this release and verifying whether this reduces CPU and memory usage.
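
The new -maxConcurrentInserts semantics (limiting active insert requests rather than open connections) can be sketched with a semaphore. This is only an illustration of the idea under stated assumptions: `InsertLimiter`, `handle_insert` and `timeout_s` are hypothetical names, not VictoriaMetrics APIs.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class InsertLimiter:
    """Limit the number of inserts that are actively being processed."""

    def __init__(self, max_concurrent_inserts: int) -> None:
        self._sem = threading.BoundedSemaphore(max_concurrent_inserts)

    def handle_insert(self, process: Callable[[], T], timeout_s: float = 1.0) -> T:
        # An idle connection never reaches this point, so it does not take
        # a slot. A slot is held only while data is actually being ingested.
        if not self._sem.acquire(timeout=timeout_s):
            raise TimeoutError("too many concurrent inserts")
        try:
            return process()
        finally:
            self._sem.release()
```

Because idle connections do not hold a slot in this model, a modest limit can serve many mostly idle clients, which is why the note suggests trying the default value after the upgrade.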

Update note 2: The vm_concurrent_addrows_current and vm_concurrent_addrows_capacity metrics exported by vmstorage are replaced with vm_concurrent_insert_current and vm_concurrent_insert_capacity metrics in order to be consistent with the corresponding metrics exported by vminsert. Please update queries in dashboards and alerting rules with new metric names if old metric names are used there.
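
Updating queries for the renamed metrics can be scripted. The helper below is a hypothetical sketch that rewrites the two old metric names inside a query string. It works on plain text and does not parse MetricsQL, so results should be reviewed before use.

```python
import re

# Old name -> new name, as listed in the update note.
RENAMED_METRICS = {
    "vm_concurrent_addrows_current": "vm_concurrent_insert_current",
    "vm_concurrent_addrows_capacity": "vm_concurrent_insert_capacity",
}

# Word boundaries keep longer names that merely contain an old name untouched.
_OLD_NAMES = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMED_METRICS) + r")\b"
)


def rename_metrics(query: str) -> str:
    """Return query with the old vmstorage metric names replaced."""
    return _OLD_NAMES.sub(lambda m: RENAMED_METRICS[m.group(1)], query)
```

The same function can be applied to each expression in dashboard JSON or alerting rule files after loading them with a suitable parser.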

  • FEATURE: vmagent: add support for aggregation of incoming samples by time and by labels. See these docs and this feature request.

  • FEATURE: vmagent: reduce memory usage when scraping big number of targets without the need to enable stream parsing mode.

  • FEATURE: vmagent: add support for Prometheus-compatible target discovery for HashiCorp Nomad services via nomad_sd_configs. See this feature request. Thanks to @mr-karan for the implementation.

  • FEATURE: vmagent: automatically pre-fetch metric_relabel_configs and the target labels when clicking on the debug metrics relabeling link at the http://vmagent:8429/targets page at the particular target. See these docs.

  • FEATURE: vmui: add ability to explore metrics exported by a particular job / instance. See these docs and this feature request.

  • FEATURE: allow passing partial RFC3339 date/time to time, start and end query args at querying APIs and export APIs. For example, 2022 is equivalent to 2022-01-01T00:00:00Z, while 2022-01-30T14 is equivalent to 2022-01-30T14:00:00Z. See these docs.

  • FEATURE: MetricsQL: allow using unicode letters in identifiers. For example, температура{город="Киев"} is a valid MetricsQL expression now. Previously every non-ASCII letter had to be escaped with \ char when used inside a MetricsQL expression: \т\е\м\п\е\р\а\т\у\р\а{\г\о\р\о\д="Киев"}. Now both expressions are equivalent. Thanks to @hzwwww for the pull request.

  • FEATURE: relabeling: add support for keepequal and dropequal relabeling actions, which are supported by Prometheus starting from v2.41.0. These relabeling actions are almost identical to keep_if_equal and drop_if_equal relabeling actions supported by VictoriaMetrics since v1.38.0 - see these docs - so it is recommended sticking to keep_if_equal and drop_if_equal actions instead of switching to keepequal and dropequal.

  • FEATURE: csvimport: support empty values for imported metrics. See this issue.

  • FEATURE: vmalert: allow configuring the default number of stored rule’s update states in memory via global -rule.updateEntriesLimit command-line flag or per-rule via rule’s update_entries_limit configuration param. See these docs and this pull request.

  • FEATURE: improve the logic behind -maxConcurrentInserts command-line flag. Previously this flag was limiting the number of concurrent connections from clients, which write data to VictoriaMetrics or vmagent. Some of these connections could be idle for some time. These connections do not need significant amounts of CPU and memory, so there is no sense in limiting their count. The updated logic behind -maxConcurrentInserts limits the number of active insert requests, not counting idle connections.

  • FEATURE: protect all the http endpoints with -httpAuth.* command-line flag. Previously endpoints protected by -*AuthKey command-line flags weren’t protected by -httpAuth.*. This could complicate the proper security setup. See this issue.

  • FEATURE: VictoriaMetrics cluster: add -maxConcurrentInserts and -insert.maxQueueDuration command-line flags to vmstorage, so they could be tuned if needed in the same way as at vminsert nodes.

  • FEATURE: VictoriaMetrics cluster: limit the number of concurrently executed requests at vmstorage proportionally to the number of available CPU cores, since every request can saturate a single CPU core at vmstorage. Previously a single vmstorage could accept and start processing arbitrary number of concurrent requests received from big number of vmselect nodes. This could result in increased RAM, CPU and disk IO usage or even in an out of memory crash at vmstorage side under high load. The limit can be fine-tuned if needed via -search.maxConcurrentRequests command-line flag at vmstorage according to these docs. vmstorage now exposes the following additional metrics at http://vmstorage:8482/metrics page:

    • vm_vmselect_concurrent_requests_capacity - the maximum number of requests allowed to execute concurrently
    • vm_vmselect_concurrent_requests_current - the current number of concurrently executed requests
    • vm_vmselect_concurrent_requests_limit_reached_total - the total number of requests, which were put in the wait queue when -search.maxConcurrentRequests concurrent requests are being executed
    • vm_vmselect_concurrent_requests_limit_timeout_total - the total number of canceled requests because they were sitting in the wait queue for more than -search.maxQueueDuration
  • BUGFIX: vmui: properly update the step value in url after the step input field has been manually changed. This allows preserving the proper step when copy-n-pasting the url to another instance of web browser. See this issue.

  • BUGFIX: vmui: properly update tooltip when quickly hovering multiple lines on the graph. See this issue.

  • BUGFIX: properly parse floating-point numbers without integer or fractional parts such as .123 and 20. during data import. See this issue.

  • BUGFIX: MetricsQL: properly parse durations with uppercase suffixes such as 10S, 5MS, 1W, etc. See this issue.

  • BUGFIX: vmagent: fix a panic during target discovery when vmagent runs with -promscrape.dropOriginalLabels command-line flag. See this issue. The bug has been introduced in v1.85.0.

  • BUGFIX: vmagent: dockerswarm_sd_configs: properly encode filters field. See this issue.

  • BUGFIX: vmagent: fix possible resource leak after hot reload of the updated consul_sd_configs. See this issue.

  • BUGFIX: vmagent: fix a panic in gce_sd_configs when the discovered instance has zero labels. See this issue. The issue has been introduced in v1.85.0.

  • BUGFIX: properly return label names starting from uppercase such as CamelCaseLabel from /api/v1/labels. See this issue.

  • BUGFIX: fix opentsdb HTTP endpoint not respecting -httpAuth.* flags. See this issue.

  • BUGFIX: consistently select the sample with the biggest value out of samples with identical timestamps during querying when the deduplication is enabled according to this feature request. Previously random samples could be selected during querying.
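
The tie-breaking rule from the last entry (prefer the biggest value among samples with identical timestamps) can be sketched as follows. This is an illustration only: `pick_max_per_timestamp` is a hypothetical helper, NaN values are not handled, and the actual deduplication also depends on the configured dedup interval.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[int, float]  # (timestamp in milliseconds, value)


def pick_max_per_timestamp(samples: Iterable[Sample]) -> List[Sample]:
    """Keep one sample per timestamp, preferring the biggest value."""
    best: Dict[int, float] = {}
    for ts, value in samples:
        if ts not in best or value > best[ts]:
            best[ts] = value
    # Return samples ordered by timestamp so the result is deterministic.
    return sorted(best.items())
```

Because the choice depends only on the values and not on arrival order, repeated queries over the same data return the same sample.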

Previous releases#

See changes for older releases here.