Kubernetes (Helm)
Ahnlich ships official Helm charts for running both services on Kubernetes:
- ahnlich-db: the in-memory vector store (StatefulSet, client gRPC on port 1369)
- ahnlich-ai: the embedding/AI proxy that sits in front of the DB (StatefulSet, client gRPC on port 1370)
- ahnlich: an umbrella chart that deploys both, pre-wired so the AI proxy talks to the DB out of the box, with an optional in-cluster tracing backend
Reach for Kubernetes when you want self-healing pods, rolling upgrades, persistent volumes managed by the cluster, and horizontal scaling. For a single box or local experimentation, the Docker Compose setup is simpler.
The charts are not published to a Helm/OCI registry at this time. You install them
straight from the source tree, as shown below. (helm repo add is not available yet.)
Prerequisites
- A running Kubernetes cluster and a kubectl context pointing at it. Any cluster works: Rancher Desktop, kind, or minikube locally; EKS / GKE / AKS in production.
- Helm 3.8+ (or 4.x).
- Kubernetes 1.20+ (the Services set appProtocol: grpc; the Gateway API path additionally needs the Gateway API CRDs and a controller, see External access).
- kubectl, configured for the target cluster:
  kubectl config current-context
  kubectl get nodes
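The Helm 3.8+ requirement can be checked before installing. The sketch below is one way to do it; the helper names version_ge and helm_meets_minimum are hypothetical, and it assumes GNU sort (for sort -V) and that helm version --short prints something like v3.14.0+g3fc9f4b.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed when dotted version A is >= B.
# Relies on `sort -V` (version sort) being available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# helm_meets_minimum MIN: compare the installed Helm client against MIN.
helm_meets_minimum() {
  local raw
  raw="$(helm version --short)" || return 1   # e.g. v3.14.0+g3fc9f4b
  raw="${raw#v}"      # drop the leading "v"
  raw="${raw%%+*}"    # drop the "+g<commit>" build metadata
  version_ge "$raw" "$1"
}

# Usage:
#   helm_meets_minimum 3.8.0 || echo "Helm 3.8+ is required" >&2
```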
Get the Charts
There is no chart repository yet, so fetch the charts from GitHub. You only need the
charts/ directory β not the rest of the codebase.
Clone the whole repo:
git clone https://github.com/deven96/ahnlich.git
cd ahnlich
Or grab only charts/ with a sparse checkout (smaller, no source code):
git clone --depth 1 --filter=blob:none --sparse https://github.com/deven96/ahnlich.git
cd ahnlich
git sparse-checkout set charts
Then vendor the umbrella's sub-chart dependencies once:
helm dependency build charts/ahnlich
This resolves the ahnlich-db and ahnlich-ai sub-charts that the umbrella depends on.
Re-run it whenever a sub-chart version changes.
Install (Umbrella Chart)
The umbrella is the recommended path: it installs ahnlich-db and ahnlich-ai together
and wires the AI proxy to the DB automatically.
kubectl create namespace ahnlich
helm install ahnlich charts/ahnlich \
--namespace ahnlich \
--set 'ahnlich-ai.models.supported={all-minilm-l6-v2}'
The umbrella's default DB wiring (ahnlich-ai.db.host=ahnlich-ahnlich-db) assumes the
release is named ahnlich. Install under any other name and the chart fails at
template time with the exact override to use. To use a different name, also pass
--set ahnlich-ai.db.host=<release>-ahnlich-db.
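As a minimal sketch of that override, the wrapper below derives the DB host from the release name and passes it along with the install. The function names ahnlich_db_host and install_umbrella and the release name "vectors" are hypothetical; the <release>-ahnlich-db pattern is the one described above.

```shell
#!/usr/bin/env bash
# ahnlich_db_host RELEASE: the DB Service name the umbrella expects for a release.
ahnlich_db_host() {
  printf '%s-ahnlich-db\n' "$1"
}

# install_umbrella RELEASE [NAMESPACE]: install under any release name,
# passing the matching ahnlich-ai.db.host override.
install_umbrella() {
  local release="$1" namespace="${2:-ahnlich}"
  helm install "$release" charts/ahnlich \
    --namespace "$namespace" \
    --set "ahnlich-ai.db.host=$(ahnlich_db_host "$release")" \
    --set 'ahnlich-ai.models.supported={all-minilm-l6-v2}'
}

# Usage (hypothetical release name):
#   install_umbrella vectors ahnlich
```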
The AI proxy downloads its embedding models on first start. The example overrides
ahnlich-ai.models.supported to just all-minilm-l6-v2 so the initial model pull is
quick. Drop the --set to get the chart default (all-minilm-l6-v2, resnet-50), or
list whatever models you need.
The AI pod blocks on a wait-for-db init container until the DB is reachable, so it is
normal for ahnlich-ahnlich-ai-0 to sit in Init:0/1 for a moment while the DB comes up,
and then to stay not Ready while the model downloads. Watch progress with:
kubectl get pods -n ahnlich -w
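In scripts or CI, where watching interactively is not an option, blocking on the rollout is a possible alternative. This sketch assumes the release is named ahnlich (so the StatefulSets are ahnlich-ahnlich-db and ahnlich-ahnlich-ai); the function name wait_for_ahnlich and the 10m default timeout are illustrative choices, not chart settings.

```shell
#!/usr/bin/env bash
# wait_for_ahnlich [NAMESPACE] [TIMEOUT]: block until both StatefulSets
# have rolled out, or fail as soon as one of them times out.
wait_for_ahnlich() {
  local namespace="${1:-ahnlich}" timeout="${2:-10m}"
  local sts
  for sts in ahnlich-ahnlich-db ahnlich-ahnlich-ai; do
    kubectl rollout status "statefulset/$sts" \
      -n "$namespace" --timeout="$timeout" || return 1
  done
}

# Usage:
#   wait_for_ahnlich ahnlich 15m && echo "both services are ready"
```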
Verify the Deployment
Once both StatefulSets report ready:
kubectl get pods,sts -n ahnlich
Send a PING to each service from inside its pod:
# DB
kubectl exec -n ahnlich sts/ahnlich-ahnlich-db -- \
sh -c "echo PING | ahnlich-cli ahnlich --agent db --host 127.0.0.1 --port 1369 --no-interactive"
# AI
kubectl exec -n ahnlich sts/ahnlich-ahnlich-ai -- \
sh -c "echo PING | ahnlich-cli ahnlich --agent ai --host 127.0.0.1 --port 1370 --no-interactive"
Prove the full AI → DB path by creating a store through the AI proxy and confirming it exists in the DB:
# Create a store via AI (AI proxies the write to DB)
kubectl exec -n ahnlich sts/ahnlich-ahnlich-ai -- \
sh -c "echo 'CREATESTORE smoke QUERYMODEL all-minilm-l6-v2 INDEXMODEL all-minilm-l6-v2' | ahnlich-cli ahnlich --agent ai --host 127.0.0.1 --port 1370 --no-interactive"
# Confirm it landed in DB
kubectl exec -n ahnlich sts/ahnlich-ahnlich-db -- \
sh -c "echo LISTSTORES | ahnlich-cli ahnlich --agent db --host 127.0.0.1 --port 1369 --no-interactive"
The LISTSTORES output should include smoke.
Connect from Clients
In-cluster: other workloads in the cluster reach the services by their Service DNS:
ahnlich-ahnlich-db.ahnlich.svc.cluster.local:1369
ahnlich-ahnlich-ai.ahnlich.svc.cluster.local:1370
Point an Ahnlich client at the AI proxy's address (or the DB address for raw vector operations).
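The in-cluster addresses follow a regular pattern, which the sketch below composes for a given release and namespace. The function name ahnlich_endpoint is hypothetical; it assumes the default cluster.local cluster domain and that the AI Service follows the same <release>-ahnlich-<service> naming as the DB.

```shell
#!/usr/bin/env bash
# ahnlich_endpoint SERVICE [RELEASE] [NAMESPACE]
# SERVICE is "db" (port 1369) or "ai" (port 1370).
ahnlich_endpoint() {
  local service="$1" release="${2:-ahnlich}" namespace="${3:-ahnlich}" port
  case "$service" in
    db) port=1369 ;;
    ai) port=1370 ;;
    *)  echo "unknown service: $service (expected db or ai)" >&2; return 1 ;;
  esac
  printf '%s-ahnlich-%s.%s.svc.cluster.local:%s\n' \
    "$release" "$service" "$namespace" "$port"
}

# Usage:
#   ahnlich_endpoint ai
#   ahnlich_endpoint db ahnlich ahnlich
```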
From your laptop: port-forward for local testing:
kubectl port-forward -n ahnlich svc/ahnlich-ahnlich-ai 1370:1370
# now connect a client to 127.0.0.1:1370
For permanent external access, see External access below.
Configuration
Override any value with --set key=value or, better for real deployments, a values file:
helm install ahnlich charts/ahnlich -n ahnlich -f my-values.yaml
Umbrella values are namespaced per sub-chart: set DB values under ahnlich-db.* and AI
values under ahnlich-ai.*, e.g. --set ahnlich-db.persistence.size=50Gi.
Common knobs:
| What | Key | Default |
|---|---|---|
| DB snapshot PVC size | ahnlich-db.persistence.size | 10Gi |
| AI model cache PVC size | ahnlich-ai.models.cache.size | 20Gi |
| PVC StorageClass | ahnlich-db.persistence.storageClass, ahnlich-ai.models.cache.storageClass | "" (cluster default) |
| DB snapshot interval (ms) | ahnlich-db.persistence.intervalMs | 30000 |
| Models served by AI | ahnlich-ai.models.supported | [all-minilm-l6-v2, resnet-50] |
| Container resources | ahnlich-db.resources, ahnlich-ai.resources | {} (unset) |
| Log level | ahnlich-db.logLevel, ahnlich-ai.logLevel | binary default (info,hf_hub=warn) |
| Extra env vars | ahnlich-db.env, ahnlich-ai.env | [] |
| Bulk env from ConfigMap/Secret | ahnlich-db.envFrom, ahnlich-ai.envFrom | [] |
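To illustrate how the dotted keys in the table map onto a values file, the sketch below writes a small my-values.yaml. The function name write_ahnlich_values is hypothetical, and the values are examples taken from the table (with a larger DB PVC), not recommendations.

```shell
#!/usr/bin/env bash
# write_ahnlich_values [PATH]: write an example umbrella values file.
# Each dotted key (e.g. ahnlich-db.persistence.size) becomes nested YAML.
write_ahnlich_values() {
  local path="${1:-my-values.yaml}"
  cat > "$path" <<'EOF'
ahnlich-db:
  persistence:
    size: 50Gi
    intervalMs: 30000
ahnlich-ai:
  models:
    supported:
      - all-minilm-l6-v2
    cache:
      size: 20Gi
EOF
}

# Usage:
#   write_ahnlich_values my-values.yaml
#   helm install ahnlich charts/ahnlich -n ahnlich -f my-values.yaml
```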
Persistence is on by default for both services: each writes a periodic snapshot
(db.dat / ai.dat) to its PVC every persistence.intervalMs (default 30s) and reloads
it on startup. Data survives pod restarts, but an ungraceful kill can lose up to one
interval's worth of the most recent writes. The full set of values, with descriptions,
lives in each chart's README.
ahnlich-ai's image bundles the CUDA ONNX Runtime. To run embeddings on GPU nodes, add
nvidia.com/gpu to ahnlich-ai.resources.limits and a matching ahnlich-ai.nodeSelector
(the cluster needs the NVIDIA device plugin).
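A values fragment for GPU scheduling might look like the sketch below. The function name write_gpu_values is hypothetical, and the nodeSelector label (accelerator: nvidia) is a placeholder assumption; substitute whatever label your GPU nodes actually carry.

```shell
#!/usr/bin/env bash
# write_gpu_values [PATH] [GPU_COUNT]: write a values fragment that requests
# NVIDIA GPUs for ahnlich-ai and pins it to GPU nodes.
write_gpu_values() {
  local path="${1:-gpu-values.yaml}" gpus="${2:-1}"
  cat > "$path" <<EOF
ahnlich-ai:
  resources:
    limits:
      nvidia.com/gpu: ${gpus}
  nodeSelector:
    # Hypothetical label; use the label that marks your GPU nodes.
    accelerator: nvidia
EOF
}

# Usage:
#   write_gpu_values gpu-values.yaml 1
#   helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values -f gpu-values.yaml
```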
--log-level, not RUST_LOG
Set verbosity via ahnlich-db.logLevel / ahnlich-ai.logLevel (mapped to the binary's
--log-level). Setting RUST_LOG through env has no effect.
External Access
The umbrella imposes no global external-access pattern; each sub-chart's service.type,
ingress.*, and gateway.* blocks work independently and apply to both ahnlich-db
and ahnlich-ai. Typically you expose only the AI proxy (it fronts the DB), but the
same knobs expose the DB on 1369 if you need raw vector access from outside.
LoadBalancer Service (Simplest)
helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values \
--set ahnlich-ai.service.type=LoadBalancer
Your cloud provider assigns an external IP (locally, Rancher Desktop / k3s servicelb
uses the node IP). Connect a gRPC client to <EXTERNAL-IP>:1370. Set
ahnlich-db.service.type=LoadBalancer likewise to expose DB on 1369.
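Once the Service is of type LoadBalancer, the assigned address can be read from its status. The sketch below is one way to do that; the function name ahnlich_external_addr is hypothetical, and it reads both the ip and hostname fields because some providers publish a hostname instead of an IP.

```shell
#!/usr/bin/env bash
# ahnlich_external_addr [SERVICE] [NAMESPACE] [PORT]
# Print <address>:<port> for a LoadBalancer Service, or fail if the
# provider has not assigned an address yet.
ahnlich_external_addr() {
  local svc="${1:-ahnlich-ahnlich-ai}" namespace="${2:-ahnlich}" port="${3:-1370}" addr
  addr="$(kubectl get svc "$svc" -n "$namespace" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')"
  [ -n "$addr" ] || { echo "no external address assigned yet" >&2; return 1; }
  printf '%s:%s\n' "$addr" "$port"
}

# Usage:
#   ahnlich_external_addr                          # AI proxy on 1370
#   ahnlich_external_addr ahnlich-ahnlich-db ahnlich 1369
```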
Gateway API (GRPCRoute): Recommended for gRPC
When you set gateway.enabled=true, the chart contributes only a GRPCRoute that
attaches the ahnlich-ai / ahnlich-db Service to a Gateway you already run. The chart
does not install a gateway, and intentionally so: the gateway is shared cluster
infrastructure that you own and operate. Before enabling it, your cluster must already
have:
- A Gateway API controller: the component that actually proxies traffic (Envoy Gateway, NGINX Gateway Fabric, Contour, Istio, or your cloud's implementation).
- The Gateway API CRDs (gateway.networking.k8s.io), normally installed with the controller.
- A GatewayClass registered by that controller.
- A Gateway with a listener on the port your clients connect to, using a gRPC-capable protocol (HTTP for h2c, or HTTPS with TLS).
The chart binds to that Gateway through gateway.parentRefs (and sectionName to pick a
listener). If any of the above is missing, the GRPCRoute is created but stays unattached
and no traffic flows. If you don't run a Gateway, use the
LoadBalancer Service path instead.
The walkthrough below uses Envoy Gateway as a concrete, working example; any
conformant controller follows the same shape (install controller → GatewayClass →
Gateway → enable the chart's route).
- Install a controller:
  helm install eg oci://docker.io/envoyproxy/gateway-helm \
    -n envoy-gateway-system --create-namespace
  kubectl wait --for=condition=available deploy --all -n envoy-gateway-system --timeout=180s
- Create a GatewayClass and a Gateway with a gRPC (HTTP/h2c) listener on the AI port:
  # gateway.yaml
  apiVersion: gateway.networking.k8s.io/v1
  kind: GatewayClass
  metadata:
    name: eg
  spec:
    controllerName: gateway.envoyproxy.io/gatewayclass-controller
  ---
  apiVersion: gateway.networking.k8s.io/v1
  kind: Gateway
  metadata:
    name: ahnlich-eg
    namespace: ahnlich
  spec:
    gatewayClassName: eg
    listeners:
      - name: ai
        protocol: HTTP   # h2c: gRPC over cleartext HTTP/2
        port: 1370
        allowedRoutes:
          namespaces:
            from: Same

  kubectl apply -f gateway.yaml
- Point the chart's GRPCRoute at that listener:
  helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values \
    --set ahnlich-ai.service.type=ClusterIP \
    --set ahnlich-ai.gateway.enabled=true \
    --set 'ahnlich-ai.gateway.parentRefs[0].name=ahnlich-eg' \
    --set 'ahnlich-ai.gateway.parentRefs[0].sectionName=ai'

  sectionName binds the route to a named listener. We set it explicitly here; it is strictly required only when the Gateway has multiple listeners (e.g. a DB 1369 and an AI 1370 listener side by side). Keep the Service at ClusterIP so it doesn't contend for the node port the Gateway now owns.

  Overriding a multi-Gateway parentRefs
  parentRefs is a list with a one-element default. Under --set / --reuse-values Helm merges it by index (it patches [0], it does not replace the list). To attach to several parent Gateways, pass the whole list via -f values.yaml instead of --set.
- Verify and connect:
  kubectl get gateway ahnlich-eg -n ahnlich   # PROGRAMMED=True, ADDRESS assigned
  kubectl get grpcroute -n ahnlich            # Accepted / ResolvedRefs True

  Point your client at the Gateway's address on the listener port.
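For the multi-Gateway case, where parentRefs should be passed as a whole list through a values file, a sketch could look like this. The function name write_gateway_values and the second Gateway (ahnlich-eg-internal) are hypothetical and only illustrate a list with more than one element.

```shell
#!/usr/bin/env bash
# write_gateway_values [PATH]: write a values fragment that replaces the
# whole parentRefs list instead of patching it by index with --set.
write_gateway_values() {
  local path="${1:-gateway-values.yaml}"
  cat > "$path" <<'EOF'
ahnlich-ai:
  service:
    type: ClusterIP
  gateway:
    enabled: true
    parentRefs:
      - name: ahnlich-eg
        sectionName: ai
      # Hypothetical second Gateway, shown only to illustrate a longer list.
      - name: ahnlich-eg-internal
        sectionName: ai
EOF
}

# Usage:
#   write_gateway_values gateway-values.yaml
#   helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values -f gateway-values.yaml
```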
Ingress
helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values \
--set ahnlich-ai.ingress.enabled=true \
--set ahnlich-ai.ingress.className=<your-ingress-class> \
--set ahnlich-ai.ingress.host=ai.example.com
Ahnlich speaks gRPC (HTTP/2). Through a typical Ingress controller without h2c/TLS the
controller answers HTTP/1.1 and the client fails with an invalid compression flag /
500 error. For real use, enable TLS (ahnlich-ai.ingress.tls.enabled=true, with a cert
via cert-manager or a Secret) and set the controller's gRPC backend hint; for
nginx-ingress that is nginx.ingress.kubernetes.io/backend-protocol: GRPC. The
Gateway API path above is the more reliable choice for gRPC. ingress and gateway
are mutually exclusive.
Tracing
Enable the bundled in-cluster Jaeger backend and send both services' traces to it:
helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values \
--set tracing.enabled=true \
--set tracing.backend.enabled=true \
--set ahnlich-db.tracing.enabled=true \
--set ahnlich-ai.tracing.enabled=true
# open the Jaeger UI
kubectl port-forward -n ahnlich svc/ahnlich-tracing-backend 16686:16686
The bundled Jaeger all-in-one runs as a single replica with in-memory span storage and
no auth. For production, set tracing.backend.enabled=false and point
ahnlich-db.tracing.otelEndpoint / ahnlich-ai.tracing.otelEndpoint at your own OTLP
collector (Tempo, Honeycomb, an OpenTelemetry Collector, etc.).
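The production setup described above could be applied with a helper like the following sketch. The function name use_external_collector and the collector address in the usage note are hypothetical; the value keys are the ones named in this section.

```shell
#!/usr/bin/env bash
# use_external_collector ENDPOINT: turn off the bundled Jaeger backend and
# send both services' traces to your own OTLP collector.
use_external_collector() {
  local endpoint="$1"
  [ -n "$endpoint" ] || { echo "usage: use_external_collector <otlp-endpoint>" >&2; return 1; }
  helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values \
    --set tracing.backend.enabled=false \
    --set ahnlich-db.tracing.enabled=true \
    --set ahnlich-ai.tracing.enabled=true \
    --set "ahnlich-db.tracing.otelEndpoint=$endpoint" \
    --set "ahnlich-ai.tracing.otelEndpoint=$endpoint"
}

# Usage (hypothetical collector address):
#   use_external_collector http://otel-collector.observability.svc.cluster.local:4317
```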
Installing the Sub-Charts Individually
You don't have to use the umbrella. Install either service on its own, which is useful when
ahnlich-db and ahnlich-ai live in different namespaces or you manage them separately.
# DB only
helm install my-db charts/ahnlich-db -n ahnlich
# AI only: db.host is required and must resolve to a reachable DB Service
helm install my-ai charts/ahnlich-ai -n ahnlich \
--set db.host=my-db \
--set 'models.supported={all-minilm-l6-v2}'
With the standalone charts there is no enforced release name; the AI's db.host is
whatever you point it at.
Operations
Upgrade: change values and re-apply:
helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values --set ahnlich-db.persistence.size=50Gi
Uninstall, then clean up the PVCs (they are retained by default, holding the cached models and persisted data):
helm uninstall ahnlich -n ahnlich
kubectl delete pvc -n ahnlich -l app.kubernetes.io/instance=ahnlich
Operational caveats:
- Toggling persistence.enabled between upgrades fails. A StatefulSet's volumeClaimTemplates is immutable. Flipping persistence on or off requires helm uninstall then a fresh install (then clean up orphaned PVCs as above).
- :latest images pull Always. When an image tag is latest, the chart sets imagePullPolicy: Always so helm upgrade actually pulls the newest digest. Pin a real tag (--set ahnlich-db.image.tag=<version>) for reproducible deploys; tagged images default to IfNotPresent.
Cluster (Raft) Mode: In Development
The charts already carry the wiring for multi-replica Raft clusters (a headless Service, per-replica RocksDB volumes, a PodDisruptionBudget, and bootstrap/join logic), enabled like this:
# NOT yet functional β shown for reference only
helm install ahnlich charts/ahnlich -n ahnlich \
--set ahnlich-db.cluster.enabled=true \
--set ahnlich-ai.cluster.enabled=true \
--set 'ahnlich-ai.models.supported={all-minilm-l6-v2}'
Cluster/Raft mode is still in development. The server binary does not yet accept the
--cluster-* flags the chart passes, so enabling it will not produce a working cluster.
It is also excluded from the project's automated tests. Run the standalone (default)
mode for now and track the repository for cluster-mode availability.
Troubleshooting
- AI pod stuck in Init:0/1 (wait-for-db). The init container blocks until the DB Service port is reachable. Check the DB pod is running and that ahnlich-ai.db.host resolves to the DB Service (<release>-ahnlich-db under the umbrella). Inspect with kubectl logs -n ahnlich ahnlich-ahnlich-ai-0 -c wait-for-db.
- helm install fails with a release-name error. You installed under a name other than ahnlich without overriding ahnlich-ai.db.host. Use the override the error message prints, or name the release ahnlich.
- AI takes a long time to become ready on first start. It is downloading models. Restrict ahnlich-ai.models.supported to only what you need, and give the model cache PVC (ahnlich-ai.models.cache.size) enough room. The cache persists across restarts.
- AI pod stays Running but not Ready for a while on first install. Expected: a startupProbe holds liveness and readiness until the embedding model has downloaded and the server answers a PING, so the pod is not killed mid-download. The default tolerance is 5 minutes (ahnlich-ai.probes.startup.periodSeconds × failureThreshold). If a large model or a slow network needs longer, raise ahnlich-ai.probes.startup.failureThreshold; otherwise the pod restarts once the window is exceeded. The model cache persists on the PVC, so later restarts are fast.
- Inspect a wedged install:
  kubectl get pods,svc,pvc,events -n ahnlich
  kubectl logs -n ahnlich sts/ahnlich-ahnlich-db --tail 200
  kubectl logs -n ahnlich sts/ahnlich-ahnlich-ai --tail 200
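For the startup-probe item above, the tolerance window is periodSeconds × failureThreshold, so the threshold needed for a longer window is a ceiling division. The helper below is a small sketch of that arithmetic; the function name startup_failure_threshold is hypothetical, and the numbers in the usage note are examples rather than the chart's defaults.

```shell
#!/usr/bin/env bash
# startup_failure_threshold WINDOW_SECONDS PERIOD_SECONDS
# Print the smallest failureThreshold such that
# periodSeconds * failureThreshold >= WINDOW_SECONDS.
startup_failure_threshold() {
  local window="$1" period="$2"
  [ "$period" -gt 0 ] || { echo "periodSeconds must be positive" >&2; return 1; }
  echo $(( (window + period - 1) / period ))
}

# Usage: a 15-minute window with a 10-second period (example values):
#   threshold="$(startup_failure_threshold 900 10)"
#   helm upgrade ahnlich charts/ahnlich -n ahnlich --reuse-values \
#     --set "ahnlich-ai.probes.startup.failureThreshold=$threshold"
```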