Configuration #

The operator is set up using environment variables and command-line flags. Most environment variables control settings related to Resources, such as CPU and memory defaults and image versions. Command-line flags configure the operator itself, for example leader election, TLS, webhook validation, and rate limits.

Environment variables #

Run this command (available from v0.57.0) to see all environment variables your operator supports:

      OPERATOR_POD_NAME=$(kubectl get pod -l "app.kubernetes.io/name=victoria-metrics-operator"  -n vm -o jsonpath="{.items[0].metadata.name}");
kubectl exec -n vm "$OPERATOR_POD_NAME" -- /app --printDefaults 2>&1

# Output:
# KEY                   DEFAULT        REQUIRED    DESCRIPTION
# VM_METRICS_VERSION    v1.117.0       false       
# VM_LOGS_VERSION       v1.21.0        false 
# ... 
    
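The full table is long, so it can help to narrow it down to one component. The sketch below defines a hypothetical helper, `filter_defaults` (not part of the operator), that keeps the header row plus every row whose KEY starts with a given prefix. It assumes the table layout shown above, with the header on the first line and the variable name in the first whitespace-separated column.

```shell
# Hypothetical helper, not part of the operator: keep the header row and
# every row whose first column (KEY) starts with the given prefix.
filter_defaults() {
  awk -v prefix="$1" 'NR == 1 || index($1, prefix) == 1'
}

# Usage against a live operator (OPERATOR_POD_NAME as set above):
#   kubectl exec -n vm "$OPERATOR_POD_NAME" -- /app --printDefaults 2>&1 \
#     | filter_defaults VM_VMSINGLEDEFAULT_
```

If the operator prints log lines before the table, the first line will not be the header and the `NR == 1` condition would need adjusting.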

These are the environment variables of the latest operator release:

Environment variables
VM_METRICS_VERSION: `v1.124.0` #
VM_LOGS_VERSION: `v1.28.0` #
VM_ANOMALY_VERSION: `v1.25.2` #
VM_USECUSTOMCONFIGRELOADER: `false` # enables custom config reloader for vmauth and vmagent, it should speed-up config reloading process.
VM_CONTAINERREGISTRY: `-` # container registry name prefix, e.g. docker.io
VM_CUSTOMCONFIGRELOADERIMAGE: `victoriametrics/operator:config-reloader-v0.62.0` #
VM_PSPAUTOCREATEENABLED: `false` #
VM_CONFIG_RELOADER_LIMIT_CPU: `unlimited` # defines global resource.limits.cpu for all config-reloader containers
VM_CONFIG_RELOADER_LIMIT_MEMORY: `unlimited` # defines global resource.limits.memory for all config-reloader containers
VM_CONFIG_RELOADER_REQUEST_CPU: `-` # defines global resource.requests.cpu for all config-reloader containers
VM_CONFIG_RELOADER_REQUEST_MEMORY: `-` # defines global resource.requests.memory for all config-reloader containers
VM_VLOGSDEFAULT_IMAGE: `victoriametrics/victoria-logs` #
VM_VLOGSDEFAULT_VERSION: `${VM_LOGS_VERSION}` #
VM_VLOGSDEFAULT_PORT: `9428` #
VM_VLOGSDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VLOGSDEFAULT_RESOURCE_LIMIT_MEM: `1500Mi` #
VM_VLOGSDEFAULT_RESOURCE_LIMIT_CPU: `1200m` #
VM_VLOGSDEFAULT_RESOURCE_REQUEST_MEM: `500Mi` #
VM_VLOGSDEFAULT_RESOURCE_REQUEST_CPU: `150m` #
VM_VLAGENTDEFAULT_IMAGE: `victoriametrics/vlagent` #
VM_VLAGENTDEFAULT_VERSION: `${VM_LOGS_VERSION}` #
VM_VLAGENTDEFAULT_PORT: `9429` #
VM_VLAGENTDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VLAGENTDEFAULT_RESOURCE_LIMIT_MEM: `500Mi` #
VM_VLAGENTDEFAULT_RESOURCE_LIMIT_CPU: `200m` #
VM_VLAGENTDEFAULT_RESOURCE_REQUEST_MEM: `200Mi` #
VM_VLAGENTDEFAULT_RESOURCE_REQUEST_CPU: `50m` #
VM_VLSINGLEDEFAULT_IMAGE: `victoriametrics/victoria-logs` #
VM_VLSINGLEDEFAULT_VERSION: `${VM_LOGS_VERSION}` #
VM_VLSINGLEDEFAULT_PORT: `9428` #
VM_VLSINGLEDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VLSINGLEDEFAULT_RESOURCE_LIMIT_MEM: `1500Mi` #
VM_VLSINGLEDEFAULT_RESOURCE_LIMIT_CPU: `1200m` #
VM_VLSINGLEDEFAULT_RESOURCE_REQUEST_MEM: `500Mi` #
VM_VLSINGLEDEFAULT_RESOURCE_REQUEST_CPU: `150m` #
VM_VMALERTDEFAULT_IMAGE: `victoriametrics/vmalert` #
VM_VMALERTDEFAULT_VERSION: `${VM_METRICS_VERSION}` #
VM_VMALERTDEFAULT_CONFIGRELOADIMAGE: `jimmidyson/configmap-reload:v0.3.0` #
VM_VMALERTDEFAULT_PORT: `8080` #
VM_VMALERTDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VMALERTDEFAULT_RESOURCE_LIMIT_MEM: `500Mi` #
VM_VMALERTDEFAULT_RESOURCE_LIMIT_CPU: `200m` #
VM_VMALERTDEFAULT_RESOURCE_REQUEST_MEM: `200Mi` #
VM_VMALERTDEFAULT_RESOURCE_REQUEST_CPU: `50m` #
VM_VMALERTDEFAULT_CONFIGRELOADERCPU: `10m` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_CPU instead
VM_VMALERTDEFAULT_CONFIGRELOADERMEMORY: `25Mi` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_MEMORY instead
VM_VMSERVICESCRAPEDEFAULT_ENFORCEENDPOINTSLICES: `false` # Use endpointslices instead of endpoints as discovery role for vmservicescrape when generate scrape config for vmagent.
VM_VMAGENTDEFAULT_IMAGE: `victoriametrics/vmagent` #
VM_VMAGENTDEFAULT_VERSION: `${VM_METRICS_VERSION}` #
VM_VMAGENTDEFAULT_CONFIGRELOADIMAGE: `quay.io/prometheus-operator/prometheus-config-reloader:v0.82.1` #
VM_VMAGENTDEFAULT_PORT: `8429` #
VM_VMAGENTDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VMAGENTDEFAULT_RESOURCE_LIMIT_MEM: `500Mi` #
VM_VMAGENTDEFAULT_RESOURCE_LIMIT_CPU: `200m` #
VM_VMAGENTDEFAULT_RESOURCE_REQUEST_MEM: `200Mi` #
VM_VMAGENTDEFAULT_RESOURCE_REQUEST_CPU: `50m` #
VM_VMAGENTDEFAULT_CONFIGRELOADERCPU: `10m` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_CPU instead
VM_VMAGENTDEFAULT_CONFIGRELOADERMEMORY: `25Mi` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_MEMORY instead
VM_VMANOMALYDEFAULT_IMAGE: `victoriametrics/vmanomaly` #
VM_VMANOMALYDEFAULT_VERSION: `${VM_ANOMALY_VERSION}` #
VM_VMANOMALYDEFAULT_CONFIGRELOADIMAGE: `quay.io/prometheus-operator/prometheus-config-reloader:v0.82.1` #
VM_VMANOMALYDEFAULT_PORT: `8490` #
VM_VMANOMALYDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VMANOMALYDEFAULT_RESOURCE_LIMIT_MEM: `500Mi` #
VM_VMANOMALYDEFAULT_RESOURCE_LIMIT_CPU: `200m` #
VM_VMANOMALYDEFAULT_RESOURCE_REQUEST_MEM: `200Mi` #
VM_VMANOMALYDEFAULT_RESOURCE_REQUEST_CPU: `50m` #
VM_VMANOMALYDEFAULT_CONFIGRELOADERCPU: `10m` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_CPU instead
VM_VMANOMALYDEFAULT_CONFIGRELOADERMEMORY: `25Mi` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_MEMORY instead
VM_VMSINGLEDEFAULT_IMAGE: `victoriametrics/victoria-metrics` #
VM_VMSINGLEDEFAULT_VERSION: `${VM_METRICS_VERSION}` #
VM_VMSINGLEDEFAULT_PORT: `8429` #
VM_VMSINGLEDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_MEM: `1500Mi` #
VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_CPU: `1200m` #
VM_VMSINGLEDEFAULT_RESOURCE_REQUEST_MEM: `500Mi` #
VM_VMSINGLEDEFAULT_RESOURCE_REQUEST_CPU: `150m` #
VM_VMCLUSTERDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_IMAGE: `victoriametrics/vmselect` #
VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_VERSION: `${VM_METRICS_VERSION}-cluster` #
VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_PORT: `8481` #
VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_RESOURCE_LIMIT_MEM: `1000Mi` #
VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_RESOURCE_LIMIT_CPU: `500m` #
VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_RESOURCE_REQUEST_MEM: `500Mi` #
VM_VMCLUSTERDEFAULT_VMSELECTDEFAULT_RESOURCE_REQUEST_CPU: `100m` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_IMAGE: `victoriametrics/vmstorage` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_VERSION: `${VM_METRICS_VERSION}-cluster` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_VMINSERTPORT: `8400` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_VMSELECTPORT: `8401` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_PORT: `8482` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_RESOURCE_LIMIT_MEM: `1500Mi` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_RESOURCE_LIMIT_CPU: `1000m` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_RESOURCE_REQUEST_MEM: `500Mi` #
VM_VMCLUSTERDEFAULT_VMSTORAGEDEFAULT_RESOURCE_REQUEST_CPU: `250m` #
VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_IMAGE: `victoriametrics/vminsert` #
VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_VERSION: `${VM_METRICS_VERSION}-cluster` #
VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_PORT: `8480` #
VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_RESOURCE_LIMIT_MEM: `500Mi` #
VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_RESOURCE_LIMIT_CPU: `500m` #
VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_RESOURCE_REQUEST_MEM: `200Mi` #
VM_VMCLUSTERDEFAULT_VMINSERTDEFAULT_RESOURCE_REQUEST_CPU: `150m` #
VM_VMALERTMANAGER_CONFIGRELOADERIMAGE: `jimmidyson/configmap-reload:v0.3.0` #
VM_VMALERTMANAGER_CONFIGRELOADERCPU: `10m` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_CPU instead
VM_VMALERTMANAGER_CONFIGRELOADERMEMORY: `25Mi` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_MEMORY instead
VM_VMALERTMANAGER_ALERTMANAGERDEFAULTBASEIMAGE: `prom/alertmanager` #
VM_VMALERTMANAGER_ALERTMANAGERVERSION: `v0.28.1` #
VM_VMALERTMANAGER_LOCALHOST: `127.0.0.1` #
VM_VMALERTMANAGER_USEDEFAULTRESOURCES: `true` #
VM_VMALERTMANAGER_RESOURCE_LIMIT_MEM: `256Mi` #
VM_VMALERTMANAGER_RESOURCE_LIMIT_CPU: `100m` #
VM_VMALERTMANAGER_RESOURCE_REQUEST_MEM: `56Mi` #
VM_VMALERTMANAGER_RESOURCE_REQUEST_CPU: `30m` #
VM_DISABLESELFSERVICESCRAPECREATION: `false` #
VM_VMBACKUP_IMAGE: `victoriametrics/vmbackupmanager` #
VM_VMBACKUP_VERSION: `${VM_METRICS_VERSION}-enterprise` #
VM_VMBACKUP_PORT: `8300` #
VM_VMBACKUP_USEDEFAULTRESOURCES: `true` #
VM_VMBACKUP_RESOURCE_LIMIT_MEM: `500Mi` #
VM_VMBACKUP_RESOURCE_LIMIT_CPU: `500m` #
VM_VMBACKUP_RESOURCE_REQUEST_MEM: `200Mi` #
VM_VMBACKUP_RESOURCE_REQUEST_CPU: `150m` #
VM_VMAUTHDEFAULT_IMAGE: `victoriametrics/vmauth` #
VM_VMAUTHDEFAULT_VERSION: `${VM_METRICS_VERSION}` #
VM_VMAUTHDEFAULT_CONFIGRELOADIMAGE: `quay.io/prometheus-operator/prometheus-config-reloader:v0.82.1` #
VM_VMAUTHDEFAULT_PORT: `8427` #
VM_VMAUTHDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VMAUTHDEFAULT_RESOURCE_LIMIT_MEM: `300Mi` #
VM_VMAUTHDEFAULT_RESOURCE_LIMIT_CPU: `200m` #
VM_VMAUTHDEFAULT_RESOURCE_REQUEST_MEM: `100Mi` #
VM_VMAUTHDEFAULT_RESOURCE_REQUEST_CPU: `50m` #
VM_VMAUTHDEFAULT_CONFIGRELOADERCPU: `10m` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_CPU instead
VM_VMAUTHDEFAULT_CONFIGRELOADERMEMORY: `25Mi` # Deprecated: use VM_CONFIG_RELOADER_REQUEST_MEMORY instead
VM_VLCLUSTERDEFAULT_USEDEFAULTRESOURCES: `true` #
VM_VLCLUSTERDEFAULT_VLSELECTDEFAULT_IMAGE: `victoriametrics/victoria-logs` #
VM_VLCLUSTERDEFAULT_VLSELECTDEFAULT_VERSION: `${VM_LOGS_VERSION}` #
VM_VLCLUSTERDEFAULT_VLSELECTDEFAULT_PORT: `9471` #
VM_VLCLUSTERDEFAULT_VLSELECTDEFAULT_RESOURCE_LIMIT_MEM: `1024Mi` #
VM_VLCLUSTERDEFAULT_VLSELECTDEFAULT_RESOURCE_LIMIT_CPU: `1000m` #
VM_VLCLUSTERDEFAULT_VLSELECTDEFAULT_RESOURCE_REQUEST_MEM: `256Mi` #
VM_VLCLUSTERDEFAULT_VLSELECTDEFAULT_RESOURCE_REQUEST_CPU: `100m` #
VM_VLCLUSTERDEFAULT_VLSTORAGEDEFAULT_IMAGE: `victoriametrics/victoria-logs` #
VM_VLCLUSTERDEFAULT_VLSTORAGEDEFAULT_VERSION: `${VM_LOGS_VERSION}` #
VM_VLCLUSTERDEFAULT_VLSTORAGEDEFAULT_PORT: `9491` #
VM_VLCLUSTERDEFAULT_VLSTORAGEDEFAULT_RESOURCE_LIMIT_MEM: `2048Mi` #
VM_VLCLUSTERDEFAULT_VLSTORAGEDEFAULT_RESOURCE_LIMIT_CPU: `1000m` #
VM_VLCLUSTERDEFAULT_VLSTORAGEDEFAULT_RESOURCE_REQUEST_MEM: `512Mi` #
VM_VLCLUSTERDEFAULT_VLSTORAGEDEFAULT_RESOURCE_REQUEST_CPU: `200m` #
VM_VLCLUSTERDEFAULT_VLINSERTDEFAULT_IMAGE: `victoriametrics/victoria-logs` #
VM_VLCLUSTERDEFAULT_VLINSERTDEFAULT_VERSION: `${VM_LOGS_VERSION}` #
VM_VLCLUSTERDEFAULT_VLINSERTDEFAULT_PORT: `9481` #
VM_VLCLUSTERDEFAULT_VLINSERTDEFAULT_RESOURCE_LIMIT_MEM: `1024Mi` #
VM_VLCLUSTERDEFAULT_VLINSERTDEFAULT_RESOURCE_LIMIT_CPU: `1000m` #
VM_VLCLUSTERDEFAULT_VLINSERTDEFAULT_RESOURCE_REQUEST_MEM: `256Mi` #
VM_VLCLUSTERDEFAULT_VLINSERTDEFAULT_RESOURCE_REQUEST_CPU: `100m` #
VM_ENABLEDPROMETHEUSCONVERTER_PODMONITOR: `true` #
VM_ENABLEDPROMETHEUSCONVERTER_SERVICESCRAPE: `true` #
VM_ENABLEDPROMETHEUSCONVERTER_PROMETHEUSRULE: `true` #
VM_ENABLEDPROMETHEUSCONVERTER_PROBE: `true` #
VM_ENABLEDPROMETHEUSCONVERTER_ALERTMANAGERCONFIG: `true` #
VM_ENABLEDPROMETHEUSCONVERTER_SCRAPECONFIG: `true` #
VM_FILTERCHILDLABELPREFIXES: `-` #
VM_FILTERCHILDANNOTATIONPREFIXES: `-` #
VM_PROMETHEUSCONVERTERADDARGOCDIGNOREANNOTATIONS: `false` # adds compare-options and sync-options for prometheus objects converted by operator. It helps to properly use converter with ArgoCD
VM_ENABLEDPROMETHEUSCONVERTEROWNERREFERENCES: `false` #
VM_FILTERPROMETHEUSCONVERTERLABELPREFIXES: `-` # allows filtering for converted labels, labels with matched prefix will be ignored
VM_FILTERPROMETHEUSCONVERTERANNOTATIONPREFIXES: `-` # allows filtering for converted annotations, annotations with matched prefix will be ignored
VM_CLUSTERDOMAINNAME: `-` # Defines domain name suffix for in-cluster addresses. The most common ClusterDomainName is .cluster.local
VM_APPREADYTIMEOUT: `80s` # Defines deadline for deployment/statefulset to transition into ready state
VM_PODWAITREADYTIMEOUT: `80s` # Defines single pod deadline to wait for transition to ready state
VM_PODWAITREADYINTERVALCHECK: `5s` # Defines poll interval for pods ready check at statefulset rollout update
VM_FORCERESYNCINTERVAL: `60s` # configures force resync interval for VMAgent, VMAlert, VMAlertmanager and VMAuth.
VM_ENABLESTRICTSECURITY: `false` # EnableStrictSecurity adds a default `securityContext` to pods and containers created by the operator.
  Default PodSecurityContext includes:
  1. RunAsNonRoot: true
  2. RunAsUser/RunAsGroup/FSGroup: 65534. '65534' refers to 'nobody' in all the default images used, such as alpine and busybox. If you use a custom image, make sure '65534' is a valid UID in it, or specify SecurityContext.
  3. FSGroupChangePolicy: onRootMismatch. If KubeVersion>=1.20, `FSGroupChangePolicy="onRootMismatch"` is used to skip the recursive permission change when the root of the volume already has the correct permissions.
  4. SeccompProfile: type: RuntimeDefault. Uses the `RuntimeDefault` seccomp profile, which is defined by the container runtime, instead of the Unconfined (seccomp disabled) mode.
  Default container SecurityContext includes:
  1. AllowPrivilegeEscalation: false
  2. ReadOnlyRootFilesystem: true
  3. Capabilities: drop: - all
  `EnableStrictSecurity` is turned off by default, see https://github.com/VictoriaMetrics/operator/issues/749 for details.
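
Several defaults in the list reference another variable, for example `${VM_METRICS_VERSION}-cluster`. The sketch below only illustrates how such a value would resolve if the references are expanded the way a shell expands `${NAME}`. That is an assumption about the operator's behaviour, not something this page states; the version values are the defaults listed above.

```shell
# Illustration only: mimic how a default such as `${VM_METRICS_VERSION}-cluster`
# would resolve if references are expanded like shell variables (assumption).
VM_METRICS_VERSION="v1.124.0"
VM_LOGS_VERSION="v1.28.0"

vmselect_version="${VM_METRICS_VERSION}-cluster"
vmbackup_version="${VM_METRICS_VERSION}-enterprise"
vlsingle_version="${VM_LOGS_VERSION}"

printf '%s\n' "$vmselect_version" "$vmbackup_version" "$vlsingle_version"
```

Under that assumption, overriding `VM_METRICS_VERSION` alone would change the version of every component whose default references it.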

Modify environment variables #

To add environment variables to the operator, use the following Kustomize-based approach. This method assumes the operator was installed using the Quick Start guide. Alternatively, you can edit the manifest file directly. If you used Helm, apply the changes using Helm's values configuration.

The example below customizes the default CPU and memory limits for the VMSingle resource. The commands create a patch, add-operator-envs/patch.yaml, that adds environment variables to the operator deployment, and a configuration file, add-operator-envs/kustomization.yaml, that applies the patch. They then call kustomize build to rewrite the operator-and-crds.yaml file with the applied changes:

      mkdir -p add-operator-envs;

cat <<'EOF' > add-operator-envs/patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vm-operator
  namespace: vm
spec:
  template:
    spec:
      containers:
      - name: manager
        env:
        - name: VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_MEM
          value: "3000Mi"
        - name: VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_CPU
          value: "2400m"
EOF

cat <<'EOF' > add-operator-envs/kustomization.yaml
resources:
  - ../operator-and-crds.yaml

patches:
  - path: patch.yaml
    target:
      kind: Deployment
      name: vm-operator
EOF

kustomize build add-operator-envs -o operator-and-crds.yaml --load-restrictor=LoadRestrictionsNone;
cat operator-and-crds.yaml | grep -E -A 1 "VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_MEM|VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_CPU";

# Output:
#        - name: VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_MEM
#          value: 3000Mi
#        - name: VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_CPU
#          value: 2400m
    

Apply the changes to the operator deployment:

      kubectl apply -f operator-and-crds.yaml;
kubectl -n vm rollout status deployment vm-operator --watch=true;

# Output:
# Waiting for deployment "vm-operator" rollout to finish: 1 old replicas are pending termination...
# Waiting for deployment "vm-operator" rollout to finish: 1 old replicas are pending termination...
# deployment "vm-operator" successfully rolled out
    

Run this command to print modified environment variables:

      kubectl get deployment -n vm vm-operator \
    -o jsonpath='{range .spec.template.spec.containers[?(@.name=="manager")].env[*]}{.name}{"\n"}{end}';

# Output:
# WATCH_NAMESPACE
# VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_MEM
# VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_CPU
    
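The patch file from the example above can also be generated for an arbitrary set of variables. The sketch below defines a hypothetical helper, `render_env_patch` (not part of the operator or Kustomize), that prints the same patch structure for any `NAME=VALUE` pairs. It assumes the deployment name `vm-operator`, the namespace `vm`, and the container name `manager` used above, and it does not escape quotes inside values.

```shell
# Hypothetical helper: print a patch that sets the given NAME=VALUE pairs
# on the "manager" container of the vm-operator deployment.
render_env_patch() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vm-operator
  namespace: vm
spec:
  template:
    spec:
      containers:
      - name: manager
        env:
EOF
  for pair in "$@"; do
    # Split on the first "=" only, so values may contain "=" themselves.
    printf '        - name: %s\n          value: "%s"\n' "${pair%%=*}" "${pair#*=}"
  done
}

# Usage:
#   render_env_patch VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_MEM=3000Mi \
#     VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_CPU=2400m > add-operator-envs/patch.yaml
```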

Flags #

Run this command to see all flags your operator supports:

      OPERATOR_POD_NAME=$(kubectl get pod -l "app.kubernetes.io/name=victoria-metrics-operator"  -n vm -o jsonpath="{.items[0].metadata.name}");
kubectl exec -n vm "$OPERATOR_POD_NAME" -- /app --help 2>&1;

# Output:
# Usage of /app:
#   -client.burst int
#       defines K8s client burst (default 100)
# ...
    

These are the flags of the latest operator release:

      Usage of bin/operator:
  -client.burst int
    	defines K8s client burst (default 100)
  -client.qps int
    	defines K8s client QPS. The value should be increased for the cluster with large number of objects > 10_000. (default 50)
  -controller.cacheSyncTimeout duration
    	controls timeout for caches to be synced. (default 3m0s)
  -controller.disableCRDOwnership
    	disables CRD ownership add to cluster wide objects, must be disabled for clusters, lower than v1.16.0
  -controller.disableCacheFor string
    	disables client for cache for API resources. Supported objects - namespace,pod,service,secret,configmap,deployment,statefulset (default "configmap,secret")
  -controller.disableReconcileFor string
    	disables reconcile controllers for given list of comma separated CRD names. For example - VMCluster,VMSingle,VMAuth.Note, child controllers still require parent object CRDs.
  -controller.maxConcurrentReconciles int
    	Configures number of concurrent reconciles. It should improve performance for clusters with many objects. (default 15)
  -controller.prometheusCRD.resyncPeriod duration
    	Configures resync period for prometheus CRD converter. Disabled by default
  -controller.statusLastUpdateTimeTTL duration
    	Configures TTL for LastUpdateTime status.conditions fields. It's used to detect stale parent objects on child objects. Like VMAlert->VMRule .status.Conditions.Type (default 1h0m0s)
  -default.kubernetesVersion.major uint
    	Major version of kubernetes server, if operator cannot parse actual kubernetes response (default 1)
  -default.kubernetesVersion.minor uint
    	Minor version of kubernetes server, if operator cannot parse actual kubernetes response (default 21)
  -disableSecretKeySpaceTrim
    	disables trim of space at Secret/Configmap value content. It's a common mistake to put new line to the base64 encoded secret value.
  -health-probe-bind-address string
    	The address the probes (health, ready) binds to. (default ":8081")
  -leader-elect
    	Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.
  -leader-elect-id string
    	Defines the name of the resource that leader election will use for holding the leader lock. (default "57410f0d.victoriametrics.com")
  -leader-elect-lease-duration duration
    	Defines the duration that non-leader candidates will wait to force acquire leadership. This is measured against time of last observed ack. (default 15s)
  -leader-elect-namespace string
    	Defines optional namespace name in which the leader election resource will be created. By default, uses in-cluster namespace name.
  -leader-elect-renew-deadline duration
    	Defines the duration that the acting controlplane will retry refreshing leadership lock before giving up. (default 10s)
  -loggerJSONFields string
    	Allows renaming fields in JSON formatted logsExample: "ts:timestamp,msg:message" renames "ts" to "timestamp" and "msg" to "message".Supported fields: ts, level, caller, msg
  -metrics-bind-address string
    	The address the metric endpoint binds to. (default ":8080")
  -mtls.CAName string
    	Optional name of TLS Root CA for verifying client certificates at the corresponding -metrics-bind-address when -mtls.enable is enabled. By default the host system TLS Root CA is used for client certificate verification.  (default "clietCA.crt")
  -mtls.enable
    	Whether to require valid client certificate for https requests to the corresponding -metrics-bind-address. This flag works only if -tls.enable flag is set.
  -pprof-addr string
    	The address for pprof/debug API. Empty value disables server (default ":8435")
  -printDefaults
    	print all variables with their default values and exit
  -printFormat string
    	output format for --printDefaults. Can be table, json, yaml or list (default "table")
  -tls.certDir string
    	root directory for metrics webserver cert, key and mTLS CA. (default "/tmp/k8s-metrics-server/serving-certs")
  -tls.certName string
    	name of metric server Tls certificate inside tls.certDir. Default -  (default "tls.crt")
  -tls.enable
    	enables secure tls (https) for metrics webserver.
  -tls.keyName string
    	name of metric server Tls key inside tls.certDir. Default - tls.key (default "tls.key")
  -version
    	Show operator version
  -webhook.certDir string
    	root directory for webhook cert and key (default "/tmp/k8s-webhook-server/serving-certs/")
  -webhook.certName string
    	name of webhook server Tls certificate inside tls.certDir (default "tls.crt")
  -webhook.enable
    	adds webhook server, you must mount cert and key or use cert-manager
  -webhook.keyName string
    	name of webhook server Tls key inside tls.certDir (default "tls.key")
  -webhook.port int
    	port to start webhook server on (default 9443)
  -zap-devel
    	Development Mode defaults(encoder=consoleEncoder,logLevel=Debug,stackTraceLevel=Warn). Production Mode defaults(encoder=jsonEncoder,logLevel=Info,stackTraceLevel=Error)
  -zap-encoder value
    	Zap log encoding (one of 'json' or 'console')
  -zap-log-level value
    	Zap Level to configure the verbosity of logging. Can be one of 'debug', 'info', 'error', 'panic'or any integer value > 0 which corresponds to custom debug levels of increasing verbosity
    	Note: warn is missing by design due to warn level not being supported by controller-runtime
    	See: https://dave.cheney.net/2015/11/05/lets-talk-about-logging and https://github.com/kubernetes-sigs/controller-runtime/issues/2002 for more information.
  -zap-stacktrace-level value
    	Zap Level at and above which stacktraces are captured (one of 'info', 'error', 'panic').
  -zap-time-encoding value
    	Zap time encoding (one of 'epoch', 'millis', 'nano', 'iso8601', 'rfc3339' or 'rfc3339nano'). Defaults to 'epoch'.
    

Modify flags #

To add flags to the operator, use the following Kustomize-based approach. This method assumes the operator was installed using the Quick Start guide. Alternatively, you can edit the manifest file directly. If you used Helm, apply the changes using Helm's values configuration.

The example below shows how to change the log level. The commands create a patch, add-operator-flag/patch.yaml, that adds a command-line argument to the operator deployment, and a configuration file, add-operator-flag/kustomization.yaml, that applies the patch. They then call kustomize build to rewrite the operator-and-crds.yaml file with the applied changes:

      mkdir -p add-operator-flag;

cat <<'EOF' > add-operator-flag/patch.yaml
- op: add
  path: /spec/template/spec/containers/0/args/-
  value: '-zap-log-level=debug'
EOF

cat <<'EOF' > add-operator-flag/kustomization.yaml
resources:
  - ../operator-and-crds.yaml

patches:
  - path: patch.yaml
    target:
      kind: Deployment
      name: vm-operator
EOF

kustomize build add-operator-flag -o operator-and-crds.yaml --load-restrictor=LoadRestrictionsNone;
cat operator-and-crds.yaml | grep "zap-log-level";

# Output:
#        - -zap-log-level=debug
    

Apply the changes to the operator deployment:

      kubectl apply -f operator-and-crds.yaml;
kubectl -n vm rollout status deployment vm-operator --watch=true;

# Output:
# Waiting for deployment "vm-operator" rollout to finish: 1 old replicas are pending termination...
# Waiting for deployment "vm-operator" rollout to finish: 1 old replicas are pending termination...
# deployment "vm-operator" successfully rolled out
    

Run this command to print the flags of the modified operator deployment:

      kubectl get deployment -n vm vm-operator \
  -o jsonpath='{range .spec.template.spec.containers[?(@.name=="manager")]}{.args[*]}{end}{"\n"}'

# Output:
# --leader-elect --health-probe-bind-address=:8081 --metrics-bind-address=:8080 -zap-log-level=debug
    
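When several flags need to be added at once, the patch can contain one `add` operation per flag. The sketch below defines a hypothetical helper, `render_flag_patch` (not part of the operator or Kustomize), that prints such a patch in the same format as add-operator-flag/patch.yaml above. It assumes, as the example above does, that the operator is the first container in the pod template.

```shell
# Hypothetical helper: print one "add" operation per flag, each appending
# the flag to the args of the first container in the pod template.
render_flag_patch() {
  for flag in "$@"; do
    printf '%s\n' \
      "- op: add" \
      "  path: /spec/template/spec/containers/0/args/-" \
      "  value: '$flag'"
  done
}

# Usage:
#   render_flag_patch -zap-log-level=debug -client.qps=100 \
#     > add-operator-flag/patch.yaml
```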

Scrape operator metrics #

To collect the operator metrics, you can create a VMServiceScrape resource. Configure it to collect metrics from the pods that match the operator labels. Apply the scrape config in the vm namespace, the same namespace where the operator is running.

The example below works if you installed the operator using the Quick Start - Operator section. You may need to update the match labels and/or the namespace to fit your own operator setup.

      cat <<'EOF' > operator-scrape.yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
  name: operator-service-scrape
  namespace: vm
spec:
  selector:
    matchLabels:
      # You might need to change the labels below
      app.kubernetes.io/instance: default
      app.kubernetes.io/name: victoria-metrics-operator
  endpoints:
    - port: http
  # Uncomment the lines below if the VMServiceScrape is applied in  a namespace
  # different from the one where the operator is running.
  # namespaceSelector:
  #  matchNames:
  #    - default
EOF

kubectl apply -f operator-scrape.yaml;
kubectl wait -n vm --for=jsonpath='{.status.updateStatus}'=operational vmservicescrape/operator-service-scrape;

# Output:
# vmservicescrape.operator.victoriametrics.com/operator-service-scrape created
# vmservicescrape.operator.victoriametrics.com/operator-service-scrape condition met
    

You can check whether the operator metrics are collected correctly by using the vmagent UI. Note that it may take a minute or two for vmagent to load the new scrape config and begin collecting the metrics.

You can find instructions for accessing the vmagent UI in the Quick Start - Scraping section.

Conversion of prometheus-operator objects #

You can read detailed instructions about configuring the conversion of prometheus-operator objects in this document.

Helm-charts #

In Helm charts some important configuration parameters are implemented as separate flags in values.yaml:

victoria-metrics-k8s-stack #

For possible values, refer to parameters.

Also, check out the possible ENV variables for configuring operator behaviour. ENV variables can be set in the victoria-metrics-operator.env section.

      # values.yaml

victoria-metrics-operator:
  image:
    # -- Image repository
    repository: victoriametrics/operator
    # -- Image tag
    tag: v0.35.0
    # -- Image pull policy
    pullPolicy: IfNotPresent

  # -- Tells helm to remove CRD after chart remove
  cleanupCRD: true
  cleanupImage:
    repository: gcr.io/google_containers/hyperkube
    tag: v1.18.0
    pullPolicy: IfNotPresent

  operator:
    # -- By default, operator converts prometheus-operator objects.
    disable_prometheus_converter: false
    # -- Compare-options and sync-options for prometheus objects converted by operator for properly use with ArgoCD
    prometheus_converter_add_argocd_ignore_annotations: false
    # -- Enables ownership reference for converted prometheus-operator objects,
    # it will remove corresponding victoria-metrics objects in case of deletion prometheus one.
    enable_converter_ownership: false
    # -- By default, operator creates psp for its objects.
    psp_auto_creation_enabled: true
    # -- Enables custom config-reloader, bundled with operator.
    # It should reduce  vmagent and vmauth config sync-time and make it predictable.
    useCustomConfigReloader: false

  # -- extra settings for the operator deployment. full list Ref: https://docs.victoriametrics.com/operator/vars
  env:
    # -- default version for vmsingle
    - name: VM_VMSINGLEDEFAULT_VERSION
      value: v1.43.0
    # -- container registry name prefix, e.g. docker.io
    - name: VM_CONTAINERREGISTRY
      value: ""
    # -- image for custom reloader (see the useCustomConfigReloader parameter)
    - name: VM_CUSTOMCONFIGRELOADERIMAGE
      value: victoriametrics/operator:config-reloader-v0.32.0

  # By default, the operator will watch all the namespaces
  # If you want to override this behavior, specify the namespace it needs to watch separated by a comma.
  # Ex: my_namespace1,my_namespace2
  watchNamespace: ""

  # Count of operator instances (can be increased for HA mode)
  replicaCount: 1

  # -- VM operator log level
  # -- possible values: info and error.
  logLevel: "info"

  # -- Resource object
  resources:
    {}
    # limits:
    #   cpu: 120m
    #   memory: 320Mi
    # requests:
    #   cpu: 80m
    #   memory: 120Mi
    

victoria-metrics-operator #

For possible values, refer to parameters.

Also, check out the possible ENV variables for configuring operator behaviour. ENV variables can be set in the env section.

      # values.yaml

image:
  # -- Image repository
  repository: victoriametrics/operator
  # -- Image tag
  tag: v0.35.0
  # -- Image pull policy
  pullPolicy: IfNotPresent

operator:
  # -- By default, operator converts prometheus-operator objects.
  disable_prometheus_converter: false
  # -- Compare-options and sync-options for prometheus objects converted by operator for properly use with ArgoCD
  prometheus_converter_add_argocd_ignore_annotations: false
  # -- Enables ownership reference for converted prometheus-operator objects,
  # it will remove corresponding victoria-metrics objects in case of deletion prometheus one.
  enable_converter_ownership: false
  # -- By default, operator creates psp for its objects.
  psp_auto_creation_enabled: true
  # -- Enables custom config-reloader, bundled with operator.
  # It should reduce  vmagent and vmauth config sync-time and make it predictable.
  useCustomConfigReloader: false

# -- extra settings for the operator deployment. full list Ref: https://docs.victoriametrics.com/operator/vars
env:
  # -- default version for vmsingle
  - name: VM_VMSINGLEDEFAULT_VERSION
    value: v1.43.0
  # -- container registry name prefix, e.g. docker.io
  - name: VM_CONTAINERREGISTRY
    value: ""
  # -- image for custom reloader (see the useCustomConfigReloader parameter)
  - name: VM_CUSTOMCONFIGRELOADERIMAGE
    value: victoriametrics/operator:config-reloader-v0.32.0

# By default, the operator will watch all the namespaces
# If you want to override this behavior, specify the namespace it needs to watch separated by a comma.
# Ex: my_namespace1,my_namespace2
watchNamespace: ""

# Count of operator instances (can be increased for HA mode)
replicaCount: 1

# -- VM operator log level
# -- possible values: info and error.
logLevel: "info"

# -- Resource object
resources:
  {}
  # limits:
  #   cpu: 120m
  #   memory: 320Mi
  # requests:
  #   cpu: 80m
  #   memory: 120Mi
    
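As a minimal sketch of the `env` section in practice, the commands below write a small values file that raises the default VMSingle memory limit, reusing the variable and value from the Kustomize example earlier on this page. The file name operator-values.yaml is an assumption, and the release and chart names in the commented `helm upgrade` line are placeholders to replace with your own.

```shell
# Write a values override for the victoria-metrics-operator chart.
# The file name is arbitrary (assumption for this sketch).
cat <<'EOF' > operator-values.yaml
env:
  - name: VM_VMSINGLEDEFAULT_RESOURCE_LIMIT_MEM
    value: "3000Mi"
EOF

cat operator-values.yaml

# Apply it with your own release and chart names (placeholders):
#   helm upgrade <release> <chart> -n vm -f operator-values.yaml
```

For the victoria-metrics-k8s-stack chart, the same `env` list would be nested under the `victoria-metrics-operator` key, as described above.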

Namespaced mode #

By default, the operator watches all namespaces, but it can be configured to watch only a specific namespace or a set of namespaces.

If you want to override this behavior, specify the namespace:

in the WATCH_NAMESPACE environment variable.
in the watchNamespace field in the values.yaml file of helm-charts.

The operator supports comma separated namespace names for this setting.
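
A minimal sketch of building that value from a list of namespaces is shown below. The helper `join_namespaces` and the namespace names `monitoring` and `logging` are hypothetical and serve only as an illustration.

```shell
# Hypothetical helper: join namespace names with commas, the format used by
# the WATCH_NAMESPACE variable and the watchNamespace Helm value.
join_namespaces() {
  # The subshell keeps the IFS change local; "$*" joins the arguments
  # with the first character of IFS.
  ( IFS=,; printf '%s\n' "$*" )
}

WATCH_NAMESPACE=$(join_namespaces monitoring logging)
echo "$WATCH_NAMESPACE"
```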

If namespaced mode is enabled, the operator uses a limited set of features:

it cannot make any cluster-wide API calls.
it cannot assign RBAC permissions for vmagent. This must be done manually via the serviceAccount for vmagent.
it ignores namespaceSelector fields in CRD objects and uses the WATCH_NAMESPACE value for object matching.

In each watched namespace, the operator must have a set of required permissions. An example can be found in this file.

Monitoring of cluster components #

By default, the operator creates a VMServiceScrape object for each component that it manages.

You can disable this behaviour with the VM_DISABLESELFSERVICESCRAPECREATION environment variable:

      VM_DISABLESELFSERVICESCRAPECREATION=true
    

Also, you can override the default self-scraping configuration with the ServiceScrapeSpec field in each deployable resource (vmcluster/select, vmcluster/insert, vmcluster/storage, vmagent, vmalert, vmalertmanager, vmauth, vmsingle).

CRD Validation #

The operator supports a validation admission webhook (docs).

It checks the resource configuration and returns errors to the caller before the resource is created in the Kubernetes API. This should reduce errors and simplify debugging.

Validation hooks at operator side must be enabled with flags:

      # --webhook.certDir, --webhook.keyName and --webhook.certName are optional; the values shown are the defaults.
./operator \
    --webhook.enable \
    --webhook.certDir=/tmp/k8s-webhook-server/serving-certs/ \
    --webhook.keyName=tls.key \
    --webhook.certName=tls.crt
    

You have to mount the correct certificates at the given directory. This can be simplified with cert-manager and the following kustomize command:

      kustomize build config/deployments/webhook/

Requirements #

Valid certificate with key must be provided to operator
Valid CABundle must be added to the ValidatingWebhookConfiguration
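
The first requirement can be partially checked before enabling the webhook. The sketch below defines a hypothetical pre-flight helper, `check_webhook_certs` (not part of the operator), that verifies the certificate and key files exist and are non-empty. Its defaults mirror the -webhook.certDir, -webhook.certName and -webhook.keyName defaults from the flags list. It does not validate the certificate contents or the CABundle.

```shell
# Hypothetical pre-flight check: verify that the certificate and key files
# exist and are non-empty in the directory the webhook flags point to.
check_webhook_certs() {
  cert_dir="${1:-/tmp/k8s-webhook-server/serving-certs/}"
  cert_name="${2:-tls.crt}"
  key_name="${3:-tls.key}"
  for f in "$cert_name" "$key_name"; do
    if [ ! -s "${cert_dir%/}/$f" ]; then
      echo "missing or empty: ${cert_dir%/}/$f" >&2
      return 1
    fi
  done
  echo "webhook certificate files found in ${cert_dir%/}"
}

# Usage, from a shell that can see the mounted directory:
#   check_webhook_certs /tmp/k8s-webhook-server/serving-certs/
```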

Useful links #

k8s admission webhooks

The following legacy links are retained for historical reference.

List of command-line flags #

Moved to operator/configuration/#flags
