Victoria Metrics Operator
Prerequisites #
- Install the following packages: git, kubectl, helm, helm-docs. See this tutorial.
- PV support on underlying infrastructure.
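A quick way to confirm that the tools listed above are on PATH is a small shell check. This is a minimal sketch; the function name `missing_tools` is illustrative and not part of the chart.

```shell
#!/bin/sh
# Print the name of every listed tool that is not found on PATH.
# `missing_tools` is a hypothetical helper name, not part of the chart.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the prerequisites named above; prints nothing when all are installed.
missing_tools git kubectl helm helm-docs
```

Any name printed by the last line still needs to be installed before continuing.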
ArgoCD issues #
When running the operator using ArgoCD without cert-manager (.Values.admissionWebhooks.certManager.enabled: false), webhook certificates are re-rendered on each sync, because ArgoCD does not respect the Helm lookup function. To prevent this, update your operator Application spec.syncPolicy and spec.ignoreDifferences with the following:
apiVersion: argoproj.io/v1alpha1
kind: Application
...
spec:
  ...
  destination:
    ...
    namespace: <operator-namespace>
  ...
  syncPolicy:
    syncOptions:
      # https://argo-cd.readthedocs.io/en/stable/user-guide/sync-options/#respect-ignore-difference-configs
      # argocd must also ignore difference during apply stage
      # otherwise it'll silently override changes and cause a problem
      - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: ""
      kind: Secret
      name: <fullname>-validation
      namespace: <operator-namespace>
      jsonPointers:
        - /data
    - group: admissionregistration.k8s.io
      kind: ValidatingWebhookConfiguration
      name: <fullname>-admission
      jqPathExpressions:
        - '.webhooks[]?.clientConfig.caBundle'
where <fullname> is the output of {{ include "vm-operator.fullname" }} for your setup.
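Once the fullname of a release is known, the two resource names can be filled in mechanically. The sketch below renders the ignoreDifferences entries for a given fullname and namespace. The helper name `render_ignore_differences` and the example arguments are assumptions made for illustration; use the values of your own release.

```shell
#!/bin/sh
# Render the ignoreDifferences entries for a given fullname and namespace.
# `render_ignore_differences` is a hypothetical helper; the resource names
# follow the <fullname>-validation / <fullname>-admission pattern above.
render_ignore_differences() {
  fullname="$1"
  namespace="$2"
  cat <<EOF
ignoreDifferences:
  - group: ""
    kind: Secret
    name: ${fullname}-validation
    namespace: ${namespace}
    jsonPointers:
      - /data
  - group: admissionregistration.k8s.io
    kind: ValidatingWebhookConfiguration
    name: ${fullname}-admission
    jqPathExpressions:
      - '.webhooks[]?.clientConfig.caBundle'
EOF
}

# Example values only; replace with the fullname and namespace of your release.
render_ignore_differences vmo-victoria-metrics-operator monitoring
```

The output can be pasted under spec in the Application manifest.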
Upgrade guide #
During a release, an issue with Helm CRD management was discovered. When upgrading from a version earlier than 0.1.3, you have two options:
- use Helm management for CRDs, enabled by default.
- use your own management system; this requires adding the variable: --set createCRD=false.
If you choose Helm management, the following steps must be done before the upgrade:
- define namespace and helm release name variables
export NAMESPACE=default
export RELEASE_NAME=operator
- execute kubectl commands:
kubectl get crd | grep victoriametrics.com | awk '{print $1 }' | xargs -i kubectl label crd {} app.kubernetes.io/managed-by=Helm --overwrite
kubectl get crd | grep victoriametrics.com | awk '{print $1 }' | xargs -i kubectl annotate crd {} meta.helm.sh/release-namespace="$NAMESPACE" meta.helm.sh/release-name="$RELEASE_NAME" --overwrite
- run the helm upgrade command.
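Before relabeling CRDs, it can help to preview the exact commands. The sketch below prints the label and annotate commands from the steps above instead of running them. The function name `crd_adopt_commands` is hypothetical, and the sketch assumes that NAMESPACE and RELEASE_NAME contain no whitespace.

```shell
#!/bin/sh
# Print (do not run) the label/annotate commands for VictoriaMetrics CRDs.
# Reads the output of `kubectl get crd` on stdin and uses its first column.
# `crd_adopt_commands` is a hypothetical helper, not part of the chart.
# Falls back to the example values above when the variables are unset.
crd_adopt_commands() {
  awk -v ns="${NAMESPACE:-default}" -v rel="${RELEASE_NAME:-operator}" '
    /victoriametrics\.com/ {
      printf "kubectl label crd %s app.kubernetes.io/managed-by=Helm --overwrite\n", $1
      printf "kubectl annotate crd %s meta.helm.sh/release-namespace=%s meta.helm.sh/release-name=%s --overwrite\n", $1, ns, rel
    }'
}

# Usage (requires kubectl and cluster access, so it is left commented out):
#   kubectl get crd | crd_adopt_commands        # review the commands
#   kubectl get crd | crd_adopt_commands | sh   # apply them once reviewed
```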
Chart Details #
This chart will do the following:
- Roll out the VictoriaMetrics operator
How to install #
Access a Kubernetes cluster.
Setup chart repository (can be omitted for OCI repositories) #
Add the chart Helm repository with the following commands:
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update
List the versions of the vm/victoria-metrics-operator chart available for installation:
helm search repo vm/victoria-metrics-operator -l
Install victoria-metrics-operator chart #
Export the default values of the victoria-metrics-operator chart to the file values.yaml:
For HTTPS repository
helm show values vm/victoria-metrics-operator > values.yaml
For OCI repository
helm show values oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-operator > values.yaml
Adjust the values in the values.yaml file to suit your environment.
Test the installation with the command:
For HTTPS repository
helm install vmo vm/victoria-metrics-operator -f values.yaml -n NAMESPACE --debug --dry-run
For OCI repository
helm install vmo oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-operator -f values.yaml -n NAMESPACE --debug --dry-run
Install the chart with the command:
For HTTPS repository
helm install vmo vm/victoria-metrics-operator -f values.yaml -n NAMESPACE
For OCI repository
helm install vmo oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-operator -f values.yaml -n NAMESPACE
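The HTTPS and OCI variants above differ only in the chart reference, and NAMESPACE in these commands reads as a placeholder for the target namespace. A small helper can keep the reference in one place; this is a sketch, and the name `chart_ref` is hypothetical.

```shell
#!/bin/sh
# Print the chart reference for the chosen repository type.
# `chart_ref` is a hypothetical helper; the two references are the ones
# used in the commands above.
chart_ref() {
  case "$1" in
    https) printf '%s\n' 'vm/victoria-metrics-operator' ;;
    oci)   printf '%s\n' 'oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-operator' ;;
    *)     printf 'unknown repository type: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Usage (requires helm and cluster access, so it is left commented out):
#   NAMESPACE=monitoring   # example value, replace with your namespace
#   helm install vmo "$(chart_ref oci)" -f values.yaml -n "$NAMESPACE" --debug --dry-run
#   helm install vmo "$(chart_ref oci)" -f values.yaml -n "$NAMESPACE"
```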
Get the list of pods by running this command:
kubectl get pods -A | grep 'vmo'
Get the application by running this command:
helm list -f vmo -n NAMESPACE
See the version history of the vmo application with the command:
helm history vmo -n NAMESPACE
Validation webhook #
It's possible to validate created resources with the operator. For now, you need cert-manager for easy certificate management: https://cert-manager.io/docs/
admissionWebhooks:
  enabled: true
  # what to do in case, when operator not available to validate request.
  certManager:
    # enables cert creation and injection by cert-manager
    enabled: true
How to uninstall #
Remove the application with the command:
helm uninstall vmo -n NAMESPACE
Documentation of Helm Chart #
Install helm-docs following the instructions in this tutorial.
Generate docs with the helm-docs command.
cd charts/victoria-metrics-operator
helm-docs
The markdown generation is entirely go template driven. The tool parses metadata from charts and generates a number of sub-templates that can be referenced in a template file (by default README.md.gotmpl). If no template file is provided, the tool has a default internal template that will generate a reasonably formatted README.
Disabling automatic ServiceAccount token mount #
There are cases when it is required to disable automatic ServiceAccount token mount due to hardening reasons. To disable it, set the following values:
serviceAccount:
  automountServiceAccountToken: false
extraVolumes:
  - name: operator
    projected:
      sources:
        - downwardAPI:
            items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
                path: namespace
        - configMap:
            name: kube-root-ca.crt
        - serviceAccountToken:
            expirationSeconds: 7200
            path: token
extraVolumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: operator
This configuration disables the automatic ServiceAccount token mount and mounts the token explicitly.
Enable hostNetwork on operator #
When running managed Kubernetes such as EKS with a custom CNI solution like Cilium or Calico, the EKS control plane cannot communicate with the CNI’s pod CIDR. In that scenario, the webhook service (i.e. the operator) must run with hostNetwork so that it shares the node’s network namespace.
hostNetwork: true
Parameters #
The following table lists the configurable parameters of the chart and their default values.
Adjust the values in the victoria-metrics-operator/values.yaml file to suit your environment.
Key | Type | Default | Description |
---|---|---|---|
admissionWebhooks | object | | Configures resource validation |
admissionWebhooks.certManager | object | | Enables custom ca bundle, if you are not using cert-manager. In case of custom ca, you have to create secret - {chart-name}-validation with keys: tls.key, tls.crt, ca.crt |
admissionWebhooks.certManager.ca | object | | Certificate Authority parameters |
admissionWebhooks.certManager.cert | object | | Certificate parameters |
admissionWebhooks.certManager.enabled | bool | | Enables cert creation and injection by cert-manager. |
admissionWebhooks.certManager.issuer | object | | If needed, provide own issuer. Operator will create self-signed if empty. |
admissionWebhooks.enabled | bool | | Enables validation webhook. |
admissionWebhooks.policy | string | | What to do in case, when operator not available to validate request. |
affinity | object | | Pod affinity |
annotations | object | | Annotations to be added to the all resources |
crds.cleanup.enabled | bool | | Tells helm to clean up all the vm resources under this release’s namespace when uninstalling |
crds.cleanup.image | object | | Image configuration for CRD cleanup Job |
crds.cleanup.resources | object | | Cleanup hook resources |
crds.enabled | bool | | manages CRD creation. Disables CRD creation only in combination with |
crds.plain | bool | | check if plain or templated CRDs should be created. with this option set to |
env | list | | Extra settings for the operator deployment. Full list here |
envFrom | list | | Specify alternative source for env variables |
extraArgs | object | | Operator container additional commandline arguments |
extraContainers | list | | Extra containers to run in a pod with operator |
extraHostPathMounts | list | | Additional hostPath mounts |
extraLabels | object | | Labels to be added to the all resources |
extraObjects | list | | Add extra specs dynamically to this chart |
extraVolumeMounts | list | | Extra Volume Mounts for the container |
extraVolumes | list | | Extra Volumes for the pod |
fullnameOverride | string | | Overrides the full name of server component resources |
global.cluster.dnsDomain | string | | K8s cluster domain suffix, uses for building storage pods’ FQDN. Details are here |
global.compatibility | object | | Openshift security context compatibility configuration |
global.image.registry | string | | Image registry, that can be shared across multiple helm charts |
global.imagePullSecrets | list | | Image pull secrets, that can be shared across multiple helm charts |
hostNetwork | bool | | Enable hostNetwork on operator deployment |
image | object | | operator image configuration |
image.pullPolicy | string | | Image pull policy |
image.registry | string | | Image registry |
image.repository | string | | Image repository |
image.tag | string | | Image tag override Chart.AppVersion |
imagePullSecrets | list | | Secret to pull images |
lifecycle | object | | Operator lifecycle. See this article for details. |
logLevel | string | | VM operator log level. Possible values: info and error. |
nameOverride | string | | Override chart name |
nodeSelector | object | | Pod’s node selector. Details are here |
operator.disable_prometheus_converter | bool | | By default, operator converts prometheus-operator objects. |
operator.enable_converter_ownership | bool | | Enables ownership reference for converted prometheus-operator objects, it will remove corresponding victoria-metrics objects in case of deletion prometheus one. |
operator.prometheus_converter_add_argocd_ignore_annotations | bool | | Compare-options and sync-options for prometheus objects converted by operator for properly use with ArgoCD |
operator.useCustomConfigReloader | bool | | Enables custom config-reloader, bundled with operator. It should reduce vmagent and vmauth config sync-time and make it predictable. |
podDisruptionBudget | object | | See |
podLabels | object | | extra Labels for Pods only |
podSecurityContext | object | | Pod’s security context. Details are here |
priorityClassName | string | | Name of Priority Class |
probe.liveness | object | | Liveness probe |
probe.readiness | object | | Readiness probe |
probe.startup | object | | Startup probe |
rbac.aggregatedClusterRoles | object | | Create aggregated clusterRoles for CRD readonly and admin permissions |
rbac.aggregatedClusterRoles.labels | object | | Labels attached to according clusterRole |
rbac.create | bool | | Specifies whether the RBAC resources should be created |
replicaCount | int | | Number of operator replicas |
resources | object | | Resource object |
securityContext | object | | Security context to be added to server pods |
service.annotations | object | | Service annotations |
service.clusterIP | string | | Service ClusterIP |
service.externalIPs | string | | Service external IPs. Check here for details |
service.externalTrafficPolicy | string | | Service external traffic policy. Check here for details |
service.healthCheckNodePort | string | | Health check node port for a service. Check here for details |
service.ipFamilies | list | | List of service IP families. Check here for details. |
service.ipFamilyPolicy | string | | Service IP family policy. Check here for details. |
service.labels | object | | Service labels |
service.loadBalancerIP | string | | Service load balancer IP |
service.loadBalancerSourceRanges | list | | Load balancer source range |
service.servicePort | int | | Service port |
service.type | string | | Service type |
service.webhookPort | int | | Service webhook port |
serviceAccount.automountServiceAccountToken | bool | | Whether to automount the service account token. Note that token needs to be mounted manually if this is disabled. |
serviceAccount.create | bool | | Specifies whether a service account should be created |
serviceAccount.name | string | | The name of the service account to use. If not set and create is true, a name is generated using the fullname template |
serviceMonitor | object | | Configures monitoring with serviceScrape. VMServiceScrape must be pre-installed |
terminationGracePeriodSeconds | int | | Graceful pod termination timeout. See this article for details. |
tolerations | list | | Array of tolerations object. Spec is here |
topologySpreadConstraints | list | | Pod Topology Spread Constraints. Spec is here |
watchNamespaces | list | | By default, the operator will watch all the namespaces. If you want to override this behavior, specify the namespace. Operator supports multiple namespaces for watching. |
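List-typed parameters such as watchNamespaces can also be passed on the command line using Helm's {a,b,c} list syntax for --set. The sketch below builds that argument from a list of namespaces. The helper name `watch_namespaces_arg` and the namespace names are assumptions made for illustration.

```shell
#!/bin/sh
# Build a Helm --set argument for the watchNamespaces list.
# `watch_namespaces_arg` is a hypothetical helper; it relies on Helm's
# {a,b,c} list syntax for --set.
watch_namespaces_arg() {
  list=""
  for ns in "$@"; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}$ns"
  done
  printf 'watchNamespaces={%s}\n' "$list"
}

# Example namespaces only.
watch_namespaces_arg monitoring apps
# prints: watchNamespaces={monitoring,apps}
```

The result can be passed as `--set "$(watch_namespaces_arg monitoring apps)"`; the quotes keep the shell from expanding the braces. Setting the list in values.yaml is equally valid.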