Demo setup
Follow these steps once you have completed Installation and your cluster meets the Prerequisites. You will run a minimal httpbin workload behind an ingress or a gateway, create an ElastiService for it, apply it, and curl through the edge to observe scale-from-zero and idle scale-down.
For NGINX, Istio, or Envoy Gateway install and routing details, use the Gateway and ingress integrations guides.
1. Deploy a Target Application
Before creating an ElastiService, you need a target Deployment, a Service, and a route from your ingress or gateway. The example below uses NGINX Ingress; for other edges, apply only the Deployment and Service from this manifest, then add the route described in the Istio or Envoy Gateway guide.
Create a file named target-deployment.yaml:
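A minimal sketch of what this file could contain, written from the shell with a heredoc. The resource names, the `default` namespace, the `kennethreitz/httpbin` image, the `nginx` ingress class, and the host-less catch-all rule are assumptions for illustration; adapt them to your cluster.

```shell
# Write a sketch of the target manifest: Deployment, Service, and NGINX Ingress.
# All names, the namespace, and the image are illustrative assumptions.
cat > target-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: target-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: target-deployment
  template:
    metadata:
      labels:
        app: target-deployment
    spec:
      containers:
        - name: httpbin
          image: kennethreitz/httpbin
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: target-deployment
  namespace: default
spec:
  selector:
    app: target-deployment
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: target-ingress
  namespace: default
  annotations:
    # Overrides the Host header NGINX sends upstream.
    nginx.ingress.kubernetes.io/upstream-vhost: target-deployment.default.svc.cluster.local
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: target-deployment
                port:
                  number: 80
EOF
```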
The upstream-vhost annotation sets the routing header the resolver expects. See NGINX Ingress integration and Request routing.
Apply the manifest:
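Assuming the file is in the current directory and `kubectl`'s current context points at the demo cluster, the apply step might look like this:

```shell
# Assumes kubectl is installed and its current context is the demo cluster.
MANIFEST=target-deployment.yaml
kubectl apply -f "$MANIFEST"
```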
Verify the target is running:
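One way to check, assuming the Deployment is named `target-deployment`, lives in the `default` namespace, and labels its pods `app=target-deployment` (all hypothetical names; substitute your own):

```shell
# Hypothetical names; replace with the ones from your manifest.
NAMESPACE=default
TARGET=target-deployment

# Block until the rollout completes or the timeout expires.
kubectl rollout status "deployment/$TARGET" -n "$NAMESPACE" --timeout=120s

# Show the deployment and the pods behind it.
kubectl get deployment "$TARGET" -n "$NAMESPACE"
kubectl get pods -n "$NAMESPACE" -l "app=$TARGET"
```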
2. Define an ElastiService
You turn scale-to-zero on for a workload by creating an ElastiService object: it points at your Kubernetes Service and scale target, defines Prometheus triggers, and (optionally) links a KEDA ScaledObject so KubeElasti can pause it while the workload is at zero. For a full walkthrough of every field and option, read the Configuration guide.
The manifest below matches the httpbin example from the previous step. Save it as elasti-service.yaml; you will apply it in the next section.
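A rough sketch of such a manifest follows. The `apiVersion`, the exact field names, the Prometheus address, the query, and the threshold are assumptions that this page does not confirm; check each of them against the Configuration guide before applying. The optional KEDA link is left commented out.

```shell
# Write a sketch of the ElastiService manifest.
# apiVersion, field names, and all values are assumptions; verify before use.
cat > elasti-service.yaml <<'EOF'
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: target-elastiservice
  namespace: default
spec:
  # Kubernetes Service that fronts the workload.
  service: target-deployment
  # Assumed field: replicas to bring up when traffic arrives at zero.
  minTargetReplicas: 1
  # Seconds without traffic before the workload is scaled down.
  cooldownPeriod: 5
  # Workload to scale.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: deployments
    name: target-deployment
  # Prometheus trigger; query and address are placeholders.
  triggers:
    - type: prometheus
      metadata:
        query: sum(rate(nginx_ingress_controller_nginx_process_requests_total[1m])) or vector(0)
        serverAddress: http://kube-prometheus-stack-prometheus.monitoring.svc.cluster.local:9090
        threshold: "0.5"
  # Optional: KEDA ScaledObject to pause while the workload is at zero.
  # autoscaler:
  #   name: target-scaled-object
  #   type: keda
EOF
```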
If your ingress, load balancer, or platform sends health checks while the workload is at zero replicas, add probeResponse rules for those paths. Requests that match are served by the resolver and do not scale the deployment up.
3. Apply the KubeElasti service configuration
Apply the configuration to your Kubernetes cluster:
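Assuming the manifest was saved as `elasti-service.yaml` in the current directory, as in the previous step:

```shell
# Assumes the KubeElasti CRDs are already installed in the cluster.
ELASTI_MANIFEST=elasti-service.yaml
kubectl apply -f "$ELASTI_MANIFEST"
```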
If the workload receives no traffic, the target deployment is scaled down to 0 replicas.
4. Test the setup
Port-forward your edge Service to localhost (commands per integration):
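For NGINX Ingress, a sketch of the port-forward, assuming the controller was installed with the common `ingress-nginx-controller` Service name in the `ingress-nginx` namespace (your install may use different names). The command blocks, so run it in its own terminal.

```shell
# Assumption: default ingress-nginx Service name and namespace.
LOCAL_PORT=8080
kubectl port-forward -n ingress-nginx svc/ingress-nginx-controller "$LOCAL_PORT:80"
```

For Istio or Envoy Gateway, substitute the Service name and namespace from the corresponding integration guide.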
Start a watch on the target deployment.
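In a second terminal, something like the following streams replica changes as they happen. The deployment name and namespace are the hypothetical ones used for illustration here.

```shell
# Hypothetical names; replace with your own.
NAMESPACE=default
TARGET=target-deployment
# --watch keeps the command running and prints a line on every change.
kubectl get deployment "$TARGET" -n "$NAMESPACE" --watch
```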
Send a request to the service.
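A sketch of the request, assuming the edge is forwarded to local port 8080 and the route has no host rule. `/headers` is a standard httpbin endpoint; the generous timeout leaves room for the scale-from-zero delay.

```shell
# Assumption: the edge Service is port-forwarded to localhost:8080.
LOCAL_PORT=8080
URL="http://localhost:${LOCAL_PORT}/headers"
# -v prints the response status and headers alongside the body.
curl -v --max-time 60 "$URL"
```

If your route matches on a hostname, add it with `-H "Host: <your-host>"`.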
The watch should show a pod being created as the deployment scales up to 1 replica, and the curl command should return a response from the target service.
The target deployment should scale back down to 0 replicas once there has been no traffic for cooldownPeriod seconds.
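To confirm the scale-down without watching by hand, a small polling helper can be sketched as below. `wait_for_zero` is a hypothetical function, not part of KubeElasti; it assumes scale-down shows up as `.spec.replicas` reaching 0 on the deployment.

```shell
# Hypothetical helper: poll until the deployment's desired replicas reach 0.
# Usage: wait_for_zero <deployment> [namespace] [attempts] [interval-seconds]
wait_for_zero() {
  deployment="$1"
  namespace="${2:-default}"
  attempts="${3:-30}"
  interval="${4:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Read the desired replica count from the deployment spec.
    replicas="$(kubectl get deployment "$deployment" -n "$namespace" \
      -o jsonpath='{.spec.replicas}')"
    if [ "$replicas" = "0" ]; then
      echo "scaled to zero"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "still at ${replicas:-unknown} replicas" >&2
  return 1
}
```

For example, `wait_for_zero target-deployment default` checks every 10 seconds for up to 30 attempts; choose values that comfortably exceed your cooldownPeriod.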