This document provides a comprehensive reference for all metrics exposed by vlagent at the http://localhost:9429/metrics endpoint.
These metrics follow the Prometheus exposition format and can be used for monitoring, alerting, and performance analysis of log collection and remote write operations.
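As an illustration of consuming this endpoint, the following is a minimal sketch of a parser for the text exposition format. It is not part of vlagent. The function name `parse_metrics` is hypothetical, and the sketch handles only plain sample lines (it does not unescape label values).

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: name="value" (escapes are kept as-is).
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples.

    Hypothetical helper for illustration; not a complete parser.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

The input text could be fetched with, for example, `urllib.request.urlopen("http://localhost:9429/metrics").read().decode()`, assuming vlagent listens on the address given above.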
HTTP Request Metrics #
vl_http_requests_total #
Type: Counter
Labels:
path: /select/logsql/query, /insert/jsonline, /insert/loki/api/v1/push, etc.
format: json, protobuf
Description: HTTP requests arriving at VictoriaLogs endpoints. Counts all requests immediately when received, before any validation or authentication happens.
vl_http_errors_total #
Type: Counter
Labels:
path: endpoint path
format: request format when applicable (e.g. protobuf)
Description: Errors encountered while processing requests for the given endpoint. The counter is incremented by endpoint handlers on non-trivial processing errors (e.g. request decoding/parsing failures or query execution errors) and may differ from the number of HTTP responses with error status. For line-oriented ingestion (e.g. /insert/jsonline), it is incremented per invalid log line inside a request.
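Because errors can be counted per invalid log line while requests are counted once, a per-path error ratio may exceed 1. The sketch below shows one way to compute it from counter increases over the same time window. The function name and the input dictionaries are hypothetical.

```python
def error_ratio_by_path(requests, errors):
    """Return errors / requests for every path that received requests.

    Both arguments map a `path` label value to the counter increase
    observed over the same time window (hypothetical input shape).
    """
    ratios = {}
    for path, request_count in requests.items():
        if request_count <= 0:
            continue  # no traffic in the window, so the ratio is undefined
        ratios[path] = errors.get(path, 0.0) / request_count
    return ratios
```

A value above 1 for a line-oriented path such as /insert/jsonline is therefore not necessarily a bug. It can mean that some requests carried several invalid lines.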
vl_http_request_duration_seconds #
Type: Summary
Labels:
path: endpoint path
Description: Complete time spent processing each HTTP request from start to finish. Includes all processing steps: parsing request data, validating parameters, storing logs, and sending responses. Captured when requests complete successfully.
Data Ingestion Metrics #
vl_rows_ingested_total #
Type: Counter
Labels:
type: jsonline, loki, elasticsearch, datadog, opentelemetry, journald, syslog, kubernetes_logs
Description: Log entries successfully parsed and added to the processing pipeline. Counts all entries that pass initial validation, including debug entries that are processed but not stored when debug=1 is used.
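Counters such as this one are usually turned into a per-second rate. The following is a minimal sketch that derives a rate from two scrapes and treats a decrease as a counter reset (for example, after a process restart). The function name and the reset handling are assumptions for illustration, not vlagent behavior.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Per-second rate of a counter between two scrapes.

    Hypothetical helper: a decrease is treated as a counter reset,
    in which case the current value is taken as the whole increase.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if curr_value < prev_value:
        increase = curr_value  # counter restarted from zero
    else:
        increase = curr_value - prev_value
    return increase / interval_seconds
```

For example, `counter_rate(100, 160, 30)` corresponds to 60 ingested rows over 30 seconds.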
vl_bytes_ingested_total #
Type: Counter
Labels:
type: jsonline, loki, elasticsearch, datadog, opentelemetry, journald, syslog, kubernetes_logs
Description: Estimated JSON size of ingested log entry fields. Calculated using field name lengths and values to provide consistent volume measurement across different input formats like JSON, Loki, or syslog.
vl_rows_dropped_total #
Type: Counter
Labels:
reason: debug, too_many_fields, too_big_timestamp, too_small_timestamp, invalid_cri_line
Description: Log entries rejected for specific reasons. debug counts entries processed with debug=1 (parsed but not stored). too_many_fields counts entries exceeding -insert.maxFieldsPerLine. too_small_timestamp counts entries older than -retentionPeriod. too_big_timestamp counts entries newer than -futureRetention. invalid_cri_line counts entries that fail to parse as CRI-compatible log lines in the Kubernetes Collector mode.
vl_insert_flush_duration_seconds #
Type: Summary
Labels:
type: ingestion protocol
Description: Time taken to flush accumulated logs from memory buffers to storage. Triggered when buffers fill up or during periodic flushes (every ~1 second with jitter). High values suggest storage bottlenecks or slow disk performance.
vl_too_long_lines_skipped_total #
Type: Counter
Description: Log lines exceeding -insert.maxLineSizeBytes (default 256KB) during parsing. Lines are skipped to prevent memory exhaustion. A rising count indicates malformed data, overly verbose logs, or a need to increase the size limit.
Remote Write Metrics #
vlagent_remotewrite_block_size_bytes #
Type: Histogram
Description: Size of data blocks after zstd compression, measured before transmission to remote storage. Blocks are flushed from memory buffers when they reach -remoteWrite.maxBlockSize or during periodic flushes every -remoteWrite.flushInterval.
vlagent_remotewrite_block_size_rows #
Type: Histogram
Description: Number of log entries batched together in each block before compression and transmission. Observed when pending logs are flushed to remote storage. Useful for judging batching efficiency and tuning buffer sizes.
vlagent_remotewrite_duration_seconds #
Type: Histogram
Labels:
url: remote storage URL
Description: Time spent on individual HTTP requests to remote storage endpoints. Measured from request start to response completion, including network latency and remote processing time.
vlagent_remotewrite_bytes_sent_total #
Type: Counter
Labels:
url: remote storage URL
Description: Compressed bytes successfully transmitted to remote storage. Bytes are counted only after a 2xx HTTP response is received, which makes this metric reliable for monitoring actual data throughput.
vlagent_remotewrite_blocks_sent_total #
Type: Counter
Labels:
url: remote storage URL
Description: Number of compressed data blocks successfully delivered to remote storage. Blocks are counted only after a 2xx HTTP response is received, so the value reflects the actual number of delivered batches.
vlagent_remotewrite_requests_total #
Type: Counter
Labels:
url: remote storage URL
status_code: HTTP status code
Description: HTTP requests to remote storage endpoints grouped by response status. Successful requests use status_code="2XX", while failed requests show specific status codes like 400, 500, etc.
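Since successful requests share the single label value 2XX, the share of failed requests can be derived by subtracting that series from the total. A minimal sketch, assuming the input maps each status_code label value to a counter increase over one window (the function name and input shape are hypothetical):

```python
def failed_request_share(requests_by_status):
    """Fraction of remote write requests that did not get a 2XX status.

    `requests_by_status` maps a status_code label value (for example
    "2XX", "400", "500") to the counter increase over one window.
    """
    total = sum(requests_by_status.values())
    if total == 0:
        return 0.0  # no requests in the window
    successful = requests_by_status.get("2XX", 0.0)
    return (total - successful) / total
```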
vlagent_remotewrite_errors_total #
Type: Counter
Labels:
url: remote storage URL
Description: Failed HTTP requests to remote storage due to network errors, timeouts, or connection failures. Counted before retry attempts, so it detects connectivity issues that trigger the retry mechanism.
vlagent_remotewrite_pending_data_bytes #
Type: Gauge
Labels:
path: file path
url: remote storage URL
Description: Total bytes in persistent queue awaiting transmission to remote storage. Includes both in-memory buffers and disk-based queue files. Growing values suggest ingestion rate exceeds transmission capacity or remote storage issues.
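One simple way to spot the growing backlog described above is to check whether consecutive gauge samples keep increasing. The sketch below is an illustrative heuristic only. The function name and the `min_points` threshold are assumptions, and a production alert would more likely use a trend over a longer window.

```python
def backlog_is_growing(samples, min_points=3):
    """Return True if every sample is larger than the one before it.

    `samples` is a list of pending-bytes gauge values ordered by time.
    Fewer than `min_points` samples is treated as not enough evidence.
    """
    if len(samples) < min_points:
        return False
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))
```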
vlagent_remotewrite_pending_inmemory_blocks #
Type: Gauge
Labels:
path: file path
url: remote storage URL
Description: Compressed data blocks held in memory before writing to persistent queue files. Higher values consume more memory but provide faster access than disk-based queue. Controlled by memory limits and queue configuration.
vlagent_remotewrite_queue_blocked #
Type: Gauge
Labels:
path: file path
url: remote storage URL
Description: Queue write status where 1 means blocked and 0 means accepting data. Becomes 1 when persistent queue reaches -remoteWrite.maxDiskUsagePerURL limit, causing new data to be dropped to prevent disk exhaustion.
vlagent_remotewrite_rate_limit_reached_total #
Type: Counter
Labels:
url: remote storage URL
Description: Number of times transmission was throttled by the -remoteWrite.rateLimit setting. Incremented when data blocks wait for the rate limiter before transmission, which shows that the configured bandwidth limit is actively controlling send rates.
vlagent_remotewrite_rate_limit #
Type: Gauge
Labels:
url: remote storage URL
Description: Configured transmission rate limit from -remoteWrite.rateLimit setting in bytes per second. Shows 0 when rate limiting is disabled, otherwise displays the maximum allowed transmission speed.
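This gauge can be combined with a measured send rate to see how close transmission is to the configured limit. A minimal sketch follows. The function name is hypothetical, and it assumes the send rate is expressed in the same unit the limit applies to, which should be verified before relying on the result.

```python
def rate_limit_utilization(bytes_sent_per_second, rate_limit):
    """Fraction of the configured rate limit currently in use.

    Returns None when `rate_limit` is 0, since the gauge reports 0
    when rate limiting is disabled.
    """
    if rate_limit <= 0:
        return None
    return bytes_sent_per_second / rate_limit
```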
vlagent_remotewrite_packets_dropped_total #
Type: Counter
Labels:
url: remote storage URL
Description: Data blocks permanently discarded due to unrecoverable errors. Incremented when remote storage returns a 400 or 404 status code, which indicates a client error or invalid data that cannot be retried.
vlagent_remotewrite_retries_count_total #
Type: Counter
Labels:
url: remote storage URL
Description: Retry attempts for failed HTTP requests to remote storage. Retries are triggered by network errors or server failures and use exponential backoff delays controlled by -remoteWrite.retryMinInterval and -remoteWrite.retryMaxTime.
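To make the backoff behavior concrete, the following is a generic sketch of an exponential backoff schedule. It is not vlagent's implementation. Doubling the delay on each attempt and treating the second value as an upper bound on each delay are both assumptions made for illustration, and the function and parameter names are hypothetical.

```python
def backoff_delays(min_interval, max_interval, attempts):
    """Generic exponential backoff schedule (illustrative only).

    Starts at `min_interval` seconds, doubles after every attempt
    (assumed factor), and never exceeds `max_interval` (assumed cap).
    """
    delays = []
    delay = min_interval
    for _ in range(attempts):
        delays.append(min(delay, max_interval))
        delay *= 2
    return delays
```

With a 1 second minimum and a 60 second cap, eight attempts would wait 1, 2, 4, 8, 16, 32, 60, and 60 seconds under these assumptions.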
vlagent_remotewrite_send_duration_seconds_total #
Type: Counter
Labels:
url: remote storage URL
Description: Cumulative time spent in HTTP request processing to remote storage. Includes all request attempts, retries, and network time. Shows total transmission overhead for capacity planning and bottleneck analysis.
vlagent_remotewrite_queues #
Type: Gauge
Labels:
url: remote storage URL
Description: Number of parallel transmission workers configured via -remoteWrite.queues flag. Higher values provide more concurrent transmission capacity but consume additional memory and connection resources.
Error and Network Metrics #
vl_errors_total #
Type: Counter
Labels:
type: syslog
Description: Syslog parsing errors encountered during log line processing. Counts individual syslog messages that fail to parse due to malformed timestamps, invalid priorities, or other RFC3164/RFC5424 format violations. Useful for monitoring syslog data quality.
vl_udp_requests_total #
Type: Counter
Labels:
type: syslog
Description: UDP packets received at syslog endpoints configured via -syslog.listenAddr.udp. Total network traffic volume to syslog UDP listeners regardless of content validity.
vl_udp_errors_total #
Type: Counter
Labels:
type: syslog
Description: UDP network errors at syslog endpoints, including temporary network failures, connection resets, and socket read failures. Excludes parsing errors, which are tracked separately. Useful for detecting UDP network connectivity issues.
Grafana Dashboards #
VictoriaLogs provides official Grafana dashboards that utilize these metrics: