Migration Guide
Migrate the Read Operator (zxporter) from the Prometheus-based architecture to the new nodemon-based collection pipeline.
The new zxporter collects all metrics via the nodemon DaemonSet (polling kubelet directly) instead of Prometheus. This guide walks you through upgrading an existing installation.
| | Before | After |
|---|---|---|
| Metrics collection | Prometheus server | nodemon DaemonSet (bundled) |
| Removed | Prometheus, kube-state-metrics, node-exporter, dz-metrics-server | — |
| Unchanged | Cluster token, cluster identity on dashboard | — |
Quick Upgrade
For most clusters, one command is enough:
curl -XPOST -H 'Authorization: Bearer <YOUR_PAT>' \
"https://dakr.devzero.io/dakr/installer-updater" \
| kubectl apply -f -
This command auto-detects your namespace, preserves your ConfigMap/Secret, deploys nodemon, and cleans up old Prometheus resources.
If the quick upgrade doesn't work for your setup, follow the full manual migration below.
Full Manual Migration
Use this when you want full control, or the automated path doesn't work.
- Time required: 10–15 minutes
- Data gap: ~2–5 minutes between deleting old pods and new pods sending data
- Risk: Zero if you save the token correctly in Step 2
Download the new manifest
Get the install command from the DevZero UI. Do NOT run it yet — save the manifest to a file first:
curl -XPOST \
-H "Authorization: Bearer <YOUR_PAT_OR_BEARER_TOKEN>" \
-H "X-Kube-Context-Name: <CLUSTER_NAME>" \
"<DAKR_URL>/dakr/installer-updater" \
> /tmp/new-zxporter.yaml
Verify the file is valid:
wc -l /tmp/new-zxporter.yaml
# Should be 500+ lines. If it's 0 or just an error message, the curl failed.
Back up your current cluster config
Find where the old zxporter is running and save everything you need.
Find the namespace:
export OLD_NS=$(kubectl get deployment -A -l control-plane=controller-manager \
-o jsonpath='{.items[0].metadata.namespace}' 2>/dev/null)
echo "Old zxporter namespace: $OLD_NS"
Save the cluster token (most important; do not lose this):
# Try Secret first (most common)
export CLUSTER_TOKEN=$(kubectl get secret devzero-zxporter-token -n $OLD_NS \
-o jsonpath='{.data.CLUSTER_TOKEN}' 2>/dev/null | base64 -d)
# If empty, try ConfigMap
if [ -z "$CLUSTER_TOKEN" ]; then
export CLUSTER_TOKEN=$(kubectl get configmap devzero-zxporter-env-config -n $OLD_NS \
-o jsonpath='{.data.CLUSTER_TOKEN}' 2>/dev/null)
fi
# If still empty, try PAT token
if [ -z "$CLUSTER_TOKEN" ]; then
export PAT_TOKEN=$(kubectl get secret devzero-zxporter-credentials -n $OLD_NS \
-o jsonpath='{.data.PAT_TOKEN}' 2>/dev/null | base64 -d)
fi
echo "Cluster token: ${CLUSTER_TOKEN:-(not found)}"
echo "PAT token: ${PAT_TOKEN:-(not found)}"
Stop here if both are empty. Go to the DevZero dashboard, find your cluster, and copy the token before continuing.
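The echo commands above print the full token to the terminal, where it can end up in scrollback or a shared screen. A minimal sketch of a masked alternative follows; `mask_token` is a hypothetical helper name, not part of zxporter, and it assumes the token is plain ASCII.

```shell
# Hypothetical helper: print a token with everything but the last 4 characters masked.
mask_token() {
  token="$1"
  if [ -z "$token" ]; then
    echo "(not found)"
    return 0
  fi
  len=${#token}
  if [ "$len" -le 4 ]; then
    # Too short to reveal any part safely.
    echo "****"
  else
    tail4=$(printf '%s' "$token" | tail -c 4)
    echo "****$tail4"
  fi
}
```

For example, `echo "Cluster token: $(mask_token "$CLUSTER_TOKEN")"` confirms that a value was found without displaying it in full.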
Save other config values:
export DAKR_URL=$(kubectl get configmap devzero-zxporter-env-config -n $OLD_NS \
-o jsonpath='{.data.DAKR_URL}')
export CLUSTER_NAME=$(kubectl get configmap devzero-zxporter-env-config -n $OLD_NS \
-o jsonpath='{.data.KUBE_CONTEXT_NAME}')
export K8S_PROVIDER=$(kubectl get configmap devzero-zxporter-env-config -n $OLD_NS \
-o jsonpath='{.data.K8S_PROVIDER}')
export LOG_LEVEL=$(kubectl get configmap devzero-zxporter-env-config -n $OLD_NS \
-o jsonpath='{.data.LOG_LEVEL}')
Verify everything:
echo "Old namespace: $OLD_NS"
echo "Cluster token: ${CLUSTER_TOKEN:-(NOT SET)}"
echo "PAT token: ${PAT_TOKEN:-(NOT SET)}"
echo "DAKR URL: $DAKR_URL"
echo "Cluster name: $CLUSTER_NAME"
echo "Provider: $K8S_PROVIDER"
echo "Log level: ${LOG_LEVEL:-(default)}"
You need at least a token (cluster or PAT), DAKR URL, and cluster name.
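That requirement can be checked mechanically before moving on. The following is a sketch only, assuming the variables exported earlier in this step; `preflight_check` is a hypothetical name introduced here for illustration.

```shell
# Hypothetical helper: verify that the values saved in this step are usable.
# Returns 0 when a token (cluster or PAT), the DAKR URL, and the cluster
# name are all set; otherwise prints what is missing and returns 1.
preflight_check() {
  missing=""
  if [ -z "${CLUSTER_TOKEN:-}" ] && [ -z "${PAT_TOKEN:-}" ]; then
    missing="$missing token"
  fi
  [ -n "${DAKR_URL:-}" ] || missing="$missing DAKR_URL"
  [ -n "${CLUSTER_NAME:-}" ] || missing="$missing CLUSTER_NAME"
  if [ -n "$missing" ]; then
    echo "Missing:$missing" >&2
    return 1
  fi
  echo "Preflight OK"
}
```

Running `preflight_check` after the exports gives a single pass/fail result; if it fails, fix the missing values before deleting anything.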
Validate the new manifest
The installer_updater.yaml does not contain the ConfigMap or Secret (those are handled separately). Verify the things that are in it:
echo "Namespace used:"
grep "namespace:" /tmp/new-zxporter.yaml | sort -u
echo "Images:"
grep "image:" /tmp/new-zxporter.yaml
echo "Resource kinds:"
grep "^kind:" /tmp/new-zxporter.yaml | sort | uniq -c
| Field | What to look for | Problem if wrong |
|---|---|---|
| namespace | devzero-system | Resources go to wrong namespace |
| image | New zxporter/nodemon version (not ttl.sh) | Pods won't start |
| kinds | Deployment, DaemonSet, ServiceAccount, ClusterRole, etc. | Manifest is incomplete |
You will not see CLUSTER_TOKEN, DAKR_URL, or KUBE_CONTEXT_NAME in this file. Those live in the ConfigMap and Secret which you export in Step 4 and restore in Step 5.
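The checks in the table above can be bundled into one function. This is a rough sketch under stated assumptions: `validate_manifest` is a hypothetical name, the expected namespace defaults to devzero-system, and the manifest is assumed to write `kind:` at the start of a line and namespaces unquoted.

```shell
# Hypothetical helper: sanity-check a downloaded manifest file.
# Usage: validate_manifest <file> [expected-namespace]
validate_manifest() {
  file="$1"
  want_ns="${2:-devzero-system}"
  status=0

  # An empty or missing file means the download failed.
  if [ ! -s "$file" ]; then
    echo "FAIL: $file is missing or empty" >&2
    return 1
  fi

  # ttl.sh images are called out as a problem in the table above.
  if grep -q 'image:.*ttl\.sh' "$file"; then
    echo "FAIL: manifest references a ttl.sh image" >&2
    status=1
  fi

  # Both a Deployment and a DaemonSet should be present.
  for kind in Deployment DaemonSet; do
    if ! grep -q "^kind: $kind\$" "$file"; then
      echo "FAIL: no $kind found" >&2
      status=1
    fi
  done

  # Warn about any other namespace (not fatal, since it can be patched).
  other_ns=$(grep -E '^[[:space:]]*namespace:' "$file" \
    | awk '{print $2}' | sort -u | grep -v -x "$want_ns" || true)
  if [ -n "$other_ns" ]; then
    echo "WARN: manifest also uses namespace(s): $other_ns" >&2
  fi

  return $status
}
```

Calling `validate_manifest /tmp/new-zxporter.yaml` returns non-zero on a failed download, a ttl.sh image, or a missing workload kind, and only warns about an unexpected namespace.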
Fix the namespace (if needed):
If the manifest says devzero-zxporter but you want devzero-system:
sed "s|namespace: devzero-zxporter|namespace: devzero-system|g" /tmp/new-zxporter.yaml \
| sed "s|name: devzero-zxporter$|name: devzero-system|g" \
> /tmp/new-zxporter-patched.yaml && mv /tmp/new-zxporter-patched.yaml /tmp/new-zxporter.yaml
Adjust resource requests based on cluster size:
echo "Nodes: $(kubectl get nodes --no-headers | wc -l)"
echo "Pods: $(kubectl get pods -A --no-headers | wc -l)"
| Cluster Size | Nodes | Pods | CPU Request | Memory Request | CPU Limit | Memory Limit |
|---|---|---|---|---|---|---|
| Small | 1–10 | < 100 | 100m | 128Mi | 200m | 256Mi |
| Medium | 10–50 | 100–500 | 200m | 256Mi | 400m | 512Mi |
| Large | 50–200 | 500–2000 | 300m | 512Mi | 600m | 1Gi |
| XL | 200+ | 2000+ | 500m | 1Gi | 1000m | 2Gi |
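The table lookup can be scripted. The sketch below makes two assumptions the table does not state: a shared boundary value (10, 50, or 200 nodes; 500 or 2000 pods) falls in the smaller tier, and when the node count and pod count point to different rows, the larger row wins. `size_tier` is a hypothetical helper name.

```shell
# Hypothetical helper: map node and pod counts to a row of the sizing table.
# Assumptions: boundary values fall in the smaller tier, and the larger of
# the node tier and the pod tier is used.
size_tier() {
  nodes="$1"
  pods="$2"

  if   [ "$nodes" -le 10 ];  then n=1
  elif [ "$nodes" -le 50 ];  then n=2
  elif [ "$nodes" -le 200 ]; then n=3
  else n=4
  fi

  if   [ "$pods" -lt 100 ];  then p=1
  elif [ "$pods" -le 500 ];  then p=2
  elif [ "$pods" -le 2000 ]; then p=3
  else p=4
  fi

  tier=$n
  if [ "$p" -gt "$tier" ]; then tier=$p; fi

  # Output columns: name, CPU request, memory request, CPU limit, memory limit
  case "$tier" in
    1) echo "Small 100m 128Mi 200m 256Mi" ;;
    2) echo "Medium 200m 256Mi 400m 512Mi" ;;
    3) echo "Large 300m 512Mi 600m 1Gi" ;;
    4) echo "XL 500m 1Gi 1000m 2Gi" ;;
  esac
}
```

For instance, `size_tier 120 150` selects the Large row because 120 nodes outranks 150 pods.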
To check current values:
awk '/kind: Deployment/,/^---/' /tmp/new-zxporter.yaml | grep -A4 "resources:"
Export config and clean up old zxporter
Read this before doing anything.
- If old zxporter is in devzero-zxporter (or any dedicated namespace), you can delete the entire namespace.
- If old zxporter is in devzero-system, delete only zxporter resources by name. Do not run kubectl delete all --all; the DAKR operator and other components live here too.
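The choice between the two paths depends only on the namespace name, so it can be made explicit in a script. A minimal sketch follows; `cleanup_path` is a hypothetical name, and treating the Kubernetes built-in namespaces as shared is an extra safety assumption of this sketch rather than something the guide states.

```shell
# Hypothetical helper: decide which cleanup path applies to a namespace.
# Prints "delete-by-name" for shared namespaces and "delete-namespace"
# for a dedicated zxporter namespace.
cleanup_path() {
  case "$1" in
    "")
      # No namespace detected: refuse to guess.
      echo "unknown"; return 1 ;;
    devzero-system|default|kube-system|kube-public|kube-node-lease)
      echo "delete-by-name" ;;
    *)
      echo "delete-namespace" ;;
  esac
}
```

`cleanup_path "$OLD_NS"` prints which path to follow, and fails if the namespace lookup from Step 2 came back empty.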
Export ConfigMap and Secret first (before deleting anything):
export NEW_NS=devzero-system
kubectl get configmap devzero-zxporter-env-config -n $OLD_NS -o yaml \
| grep -v "resourceVersion\|uid\|creationTimestamp\|selfLink\|namespace:" \
| sed "s|^ name:| namespace: $NEW_NS\n name:|" \
> /tmp/zxporter-configmap.yaml
kubectl get secret devzero-zxporter-token -n $OLD_NS -o yaml 2>/dev/null \
| grep -v "resourceVersion\|uid\|creationTimestamp\|selfLink\|namespace:" \
| sed "s|^ name:| namespace: $NEW_NS\n name:|" \
> /tmp/zxporter-secret.yaml 2>/dev/null
kubectl get secret devzero-zxporter-credentials -n $OLD_NS -o yaml 2>/dev/null \
| grep -v "resourceVersion\|uid\|creationTimestamp\|selfLink\|namespace:" \
| sed "s|^ name:| namespace: $NEW_NS\n name:|" \
> /tmp/zxporter-credentials.yaml 2>/dev/null
Uninstall Helm releases (if applicable):
helm uninstall zxporter -n $OLD_NS 2>/dev/null || true
helm uninstall zxporter-nodemon -n $OLD_NS 2>/dev/null || true
Delete cluster-scoped resources:
# ZXporter RBAC
for r in devzero-zxporter-collectionpolicy-editor-role devzero-zxporter-collectionpolicy-viewer-role \
devzero-zxporter-manager-role devzero-zxporter-metrics-auth-role devzero-zxporter-metrics-reader; do
kubectl delete clusterrole "$r" --ignore-not-found
done
for r in devzero-zxporter-manager-rolebinding devzero-zxporter-metrics-auth-rolebinding; do
kubectl delete clusterrolebinding "$r" --ignore-not-found
done
# Prometheus RBAC
for r in prometheus-dz-prometheus-server prometheus-kube-state-metrics; do
kubectl delete clusterrole "$r" --ignore-not-found
kubectl delete clusterrolebinding "$r" --ignore-not-found
done
# Nodemon + Metrics-server RBAC
kubectl delete clusterrole zxporter-nodemon --ignore-not-found
kubectl delete clusterrolebinding zxporter-nodemon --ignore-not-found
kubectl delete clusterrole system:dz-metrics-server-aggregated-reader system:dz-metrics-server --ignore-not-found
kubectl delete clusterrolebinding dz-metrics-server:system:auth-delegator system:dz-metrics-server --ignore-not-found
kubectl delete rolebinding dz-metrics-server-auth-reader -n kube-system --ignore-not-found
kubectl delete priorityclass devzero-zxporter-devzero-zxporter-critical --ignore-not-found
Now pick one of the two cleanup paths:
Safe to delete the entire namespace:
kubectl delete all --all -n $OLD_NS
kubectl delete configmap --all -n $OLD_NS
kubectl delete secret --all -n $OLD_NS
kubectl delete pdb --all -n $OLD_NS
kubectl delete role,rolebinding --all -n $OLD_NS
kubectl delete namespace $OLD_NS
If the namespace is stuck in Terminating:
kubectl get namespace $OLD_NS -o json \
| jq '.spec.finalizers = []' \
| kubectl replace --raw "/api/v1/namespaces/$OLD_NS/finalize" -f -
Verify cleanup:
kubectl get all -n ${OLD_NS} 2>/dev/null \
| grep -iE "zxporter|prometheus-dz|prometheus-kube|dz-metrics|node-exporter" || echo "(none, good)"
kubectl get clusterrole,clusterrolebinding \
| grep -iE "zxporter|prometheus-dz|prometheus-kube" || echo "(none, good)"
Install new zxporter
Create the namespace and restore config:
kubectl create namespace $NEW_NS 2>/dev/null || true
# Restore ConfigMap
kubectl apply -f /tmp/zxporter-configmap.yaml
# Restore Secret (if exported)
if [ -f /tmp/zxporter-secret.yaml ] && [ -s /tmp/zxporter-secret.yaml ]; then
kubectl apply -f /tmp/zxporter-secret.yaml
else
# Create Secret from ConfigMap token value
BACKUP_TOKEN=$(grep "CLUSTER_TOKEN:" /tmp/zxporter-configmap.yaml | head -1 | awk '{print $2}' | tr -d '"')
kubectl create secret generic devzero-zxporter-token -n $NEW_NS \
--from-literal=CLUSTER_TOKEN="$BACKUP_TOKEN"
fi
# Restore credentials secret if it was exported
if [ -f /tmp/zxporter-credentials.yaml ] && [ -s /tmp/zxporter-credentials.yaml ]; then
kubectl apply -f /tmp/zxporter-credentials.yaml
fi
Apply the manifest:
kubectl apply -f /tmp/new-zxporter.yaml
If you install with Helm instead, do not restore the ConfigMap/Secret first. The Helm chart creates its own from the --set values, and pre-existing non-Helm resources will cause failures.
helm dependency update ./helm-chart/zxporter/
helm install zxporter ./helm-chart/zxporter \
--namespace devzero-system --create-namespace \
--set zxporter.useSecretForToken=true \
--set zxporter.clusterToken="$CLUSTER_TOKEN" \
--set zxporter.kubeContextName="$CLUSTER_NAME" \
--set zxporter.k8sProvider="$K8S_PROVIDER" \
--set zxporter.dakrUrl="$DAKR_URL" \
--set zxporter.logLevel="${LOG_LEVEL:-error}" \
--set zxporter-nodemon.provider="$K8S_PROVIDER"
To use a PAT token instead of a cluster token, replace --set zxporter.clusterToken=... with --set zxporter.patToken="$PAT_TOKEN".
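The token swap can be scripted so the same install command works for either credential. This is a sketch only: `token_flag` is a hypothetical helper, it reuses the value names zxporter.clusterToken and zxporter.patToken from the command above, and preferring the cluster token when both are set is an assumption.

```shell
# Hypothetical helper: choose the token value for `helm install --set`.
# Prefers the cluster token when both tokens are available.
token_flag() {
  if [ -n "${CLUSTER_TOKEN:-}" ]; then
    printf '%s\n' "zxporter.clusterToken=$CLUSTER_TOKEN"
  elif [ -n "${PAT_TOKEN:-}" ]; then
    printf '%s\n' "zxporter.patToken=$PAT_TOKEN"
  else
    echo "no token available" >&2
    return 1
  fi
}
```

It would be passed as `--set "$(token_flag)"` in place of the clusterToken line. Helm treats commas in --set values as separators, so a token containing a comma would need escaping.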
Wait for pods to come up
export NS=devzero-system
kubectl rollout status deployment/devzero-zxporter-controller-manager -n $NS --timeout=180s
kubectl rollout status daemonset -l app.kubernetes.io/name=zxporter-nodemon -n $NS --timeout=180s 2>/dev/null || true
kubectl get pods -n $NS -o wide
Expected output:
devzero-zxporter-controller-manager-xxx 1/1 Running 0 30s
devzero-zxporter-controller-manager-yyy 1/1 Running 0 30s
zxporter-nodemon-aaa 2/2 Running 0 30s (one per node)
zxporter-nodemon-bbb 2/2 Running 0 30s
| Problem | Command | Fix |
|---|---|---|
| zxporter 0/1 | kubectl logs deploy/devzero-zxporter-controller-manager -n $NS --tail=20 | invalid token = bad token, connection refused = wrong DAKR URL |
| nodemon 0/2 | kubectl describe pod -n $NS -l app.kubernetes.io/name=zxporter-nodemon | Usually missing ConfigMaps; reapply the manifest |
| ImagePullBackOff | kubectl describe pod -n $NS -l app.kubernetes.io/name=zxporter-nodemon \| grep Image | Wrong image tag |
| CrashLoopBackOff | kubectl logs -n $NS -l app.kubernetes.io/name=zxporter-nodemon -c zxporter-nodemon --tail=20 | Check startup errors |
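Spotting the problem rows by eye gets tedious on clusters with many nodes. The sketch below filters the pod listing down to pods that are not fully ready; `not_ready_pods` is a hypothetical name, and it assumes the default column order of kubectl get pods --no-headers (NAME, READY, STATUS, and so on).

```shell
# Hypothetical helper: list pods that are not fully ready.
# Reads `kubectl get pods --no-headers` output on stdin and prints
# "name ready status" for each problem pod. Exits non-zero if any was found.
not_ready_pods() {
  awk '
    NF == 0 { next }                    # skip blank lines
    {
      split($2, r, "/")                 # READY column, e.g. "1/2"
      if (r[1] != r[2] || $3 != "Running") {
        print $1, $2, $3
        bad = 1
      }
    }
    END { exit bad }
  '
}
```

Piping `kubectl get pods -n $NS --no-headers` into `not_ready_pods` prints nothing and exits 0 when every pod matches the expected output; any line it prints can be looked up in the Problem column.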
Verify data is flowing
Check zxporter logs:
kubectl logs deploy/devzero-zxporter-controller-manager -n $NS --tail=30 \
| grep -E "Successfully sent|container_resource|node_resource|error" \
| tail -10
You should see:
Splitting resources into batches resourceType: container_resource
Successfully sent batch batchSize: 80
Check nodemon is serving metrics:
NODEMON_IP=$(kubectl get pods -n $NS -l app.kubernetes.io/name=zxporter-nodemon \
-o jsonpath='{.items[0].status.podIP}')
kubectl run verify --rm -i --restart=Never --image=curlimages/curl -n $NS \
-- curl -s "http://$NODEMON_IP:6061/v2/container/metrics" | head -c 500
You should see JSON with cpu_usage_nanocores, memory_working_set_bytes, etc.
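The visual check for those field names can be automated. A minimal sketch, assuming the two fields appear as quoted JSON keys in the response; `has_metric_fields` is a hypothetical helper, and it only checks that the names are present, not that the JSON is well formed.

```shell
# Hypothetical helper: check that a nodemon metrics payload (on stdin)
# mentions the expected field names as quoted strings.
has_metric_fields() {
  payload=$(cat)
  for field in cpu_usage_nanocores memory_working_set_bytes; do
    case "$payload" in
      *"\"$field\""*) ;;   # field name found
      *) echo "missing field: $field" >&2; return 1 ;;
    esac
  done
}
```

To use it, pipe the curl output into `has_metric_fields` in place of head -c 500, since a truncated payload may cut off one of the fields.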
Verify on the DevZero dashboard:
- Open the DevZero dashboard and find your cluster
- Cluster overview — CPU/Memory utilization graphs should show data within 2–3 minutes
- Workloads — CPU/Memory columns should be non-zero
- Nodes — Network and disk I/O should be visible
Update DAKR operator (if namespace changed)
Skip this step if your old zxporter was already in devzero-system.
The DAKR operator connects to zxporter's MPA gRPC service using a URL that includes the namespace. If you moved from devzero-zxporter to devzero-system, the operator can't find zxporter.
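Because only the namespace portion of that address changes, the expected value can be derived rather than typed by hand. The sketch below takes the service name and port from the operator.zxporterAddr value used in this step; `mpa_addr` and `addr_matches_ns` are hypothetical helper names.

```shell
# Hypothetical helper: build the in-cluster address of zxporter's MPA service.
# Only the namespace varies; service name and port are fixed.
mpa_addr() {
  ns="${1:-devzero-system}"
  echo "devzero-zxporter-controller-manager-mpa.${ns}.svc.cluster.local:50051"
}

# Returns 0 when a configured address points at the given namespace.
addr_matches_ns() {
  [ "$1" = "$(mpa_addr "$2")" ]
}
```

For example, `addr_matches_ns "$CURRENT_ADDR" devzero-system` tells you whether the operator needs updating, where CURRENT_ADDR is a placeholder for the value you read from the operator's arguments.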
Check current setting:
kubectl get deployment -n dakr-operator -l app.kubernetes.io/name=dakr-operator \
-o jsonpath='{.items[0].spec.template.spec.containers[0].args}' 2>/dev/null \
| tr ',' '\n' | grep zxporter-addr
Verify where the MPA service is now:
kubectl get service -A | grep mpa
Update if namespaces don't match:
helm upgrade dakr <your-dakr-operator-chart> \
--namespace dakr-operator \
--reuse-values \
--set operator.zxporterAddr="devzero-zxporter-controller-manager-mpa.devzero-system.svc.cluster.local:50051"
Verify connection:
kubectl rollout status deployment -n dakr-operator -l app.kubernetes.io/name=dakr-operator --timeout=120s
kubectl logs deployment/$(kubectl get deployment -n dakr-operator -l app.kubernetes.io/name=dakr-operator \
-o jsonpath='{.items[0].metadata.name}') -n dakr-operator --tail=20 \
| grep -iE "mpa|rule.*eval|metrics.*batch"
You should see Initializing Rule Evaluator Controller (unified MPA) and Received metrics batch.
Clean up temp files
rm -f /tmp/new-zxporter.yaml /tmp/zxporter-configmap.yaml /tmp/zxporter-secret.yaml /tmp/zxporter-credentials.yaml
Resources Deleted by This Migration
| Component | Resources |
|---|---|
| Prometheus Server | Deployment, Service, ServiceAccount, ConfigMap, ClusterRole, ClusterRoleBinding (all named prometheus-dz-prometheus-server) |
| Kube-State-Metrics | Deployment, Service, ServiceAccount, ClusterRole, ClusterRoleBinding (all named prometheus-kube-state-metrics) |
| Node-Exporter | DaemonSet, Service, ServiceAccount (all named dz-prometheus-node-exporter) |
| Metrics-Server | Deployment, Service, ServiceAccount (named dz-metrics-server), ClusterRoles (system:dz-metrics-server, system:dz-metrics-server-aggregated-reader) |
FAQ
Will this affect my other Prometheus installation?
No. The migration only deletes resources by exact name in the zxporter namespace.
How long is the data gap?
2–5 minutes between Step 4 (delete) and Step 6 (new pods start sending).
Do I need to change my cluster token?
No. The token is tied to your cluster record, not the namespace.
Can I still use kubectl top after migration?
Yes, if your cluster has a managed metrics-server (EKS/GKE/AKS all do). If it breaks:
kubectl patch apiservice v1beta1.metrics.k8s.io --type merge \
-p '{"spec":{"service":{"name":"metrics-server","namespace":"kube-system","port":443}}}'
What about PROMETHEUS_URL or ENABLE_NODEMON_METRICS in the old ConfigMap?
Ignored. The new binary doesn't read them.
The namespace is stuck in Terminating.
Force-remove the finalizer:
kubectl get namespace $OLD_NS -o json \
| jq '.spec.finalizers = []' \
| kubectl replace --raw "/api/v1/namespaces/$OLD_NS/finalize" -f -
If still stuck, fix the stale v1beta1.metrics.k8s.io APIService:
kubectl patch apiservice v1beta1.metrics.k8s.io --type merge \
-p '{"spec":{"service":{"name":"metrics-server","namespace":"kube-system","port":443}}}'