Demo setup¶
Follow these steps after Installation and when your cluster meets the Prerequisites. You will run a minimal httpbin workload behind ingress, attach an ElastiService, apply it, and curl through the ingress to see scale-from-zero and idle scale-down.
1. Deploy a Target Application¶
Before creating an ElastiService, you need a target deployment, service, and ingress for KubeElasti to manage. Below is a sample httpbin application you can use.
Create a file named target-deployment.yaml:
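Below is a minimal sketch of such a manifest. The image `kennethreitz/httpbin`, the host `httpbin.local`, the `default` namespace, and the `nginx` ingress class are assumptions for this demo; adjust them to your cluster.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
  template:
    metadata:
      labels:
        app: httpbin
    spec:
      containers:
        - name: httpbin
          image: kennethreitz/httpbin:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: default
spec:
  selector:
    app: httpbin
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: httpbin
  namespace: default
  annotations:
    # Sets the Host header NGINX sends upstream; the resolver reads it
    # to decide which target service the request belongs to.
    nginx.ingress.kubernetes.io/upstream-vhost: httpbin.default.svc.cluster.local
spec:
  ingressClassName: nginx
  rules:
    - host: httpbin.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: httpbin
                port:
                  number: 80
```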
The nginx.ingress.kubernetes.io/upstream-vhost annotation above is
important for elasti: it sets the Host header on the request NGINX
forwards upstream, and the resolver reads that header to decide which
target service to route to. See
Resolver Architecture > Request Routing
for details on how routing works and how to override the header.
Apply it:
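Assuming you saved the manifest under the filename from the previous step:

```shell
kubectl apply -f target-deployment.yaml
```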
Verify the target is running:
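For example, assuming the sample uses the name `httpbin` and the label `app=httpbin` in the `default` namespace:

```shell
# Pod should reach Running state
kubectl get pods -l app=httpbin

# Service and ingress should exist
kubectl get svc,ingress httpbin
```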
2. Define an ElastiService¶
You enable scale-to-zero for a workload by creating an ElastiService object: it points at your Kubernetes Service and scale target, defines Prometheus triggers, and (optionally) links a KEDA ScaledObject so KubeElasti can pause it while the workload is at zero. For a full walkthrough of every field and option, read the Configuration guide.
The manifest below matches the httpbin example from the previous step. Save it as elasti-service.yaml; you will apply it in the next section.
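The sketch below illustrates the shape of such a manifest. The API group, field names (`cooldownPeriod`, `minTargetReplicas`, `scaleTargetRef`, `triggers`), Prometheus address, and query are assumptions based on the Configuration guide; verify them against the CRD installed in your cluster before applying.

```yaml
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: httpbin
  namespace: default
spec:
  service: httpbin          # the Service whose traffic KubeElasti intercepts
  minTargetReplicas: 1      # replicas to restore on scale-from-zero
  cooldownPeriod: 300       # seconds without traffic before scaling to zero
  scaleTargetRef:
    apiVersion: apps/v1
    kind: deployments
    name: httpbin
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        query: sum(rate(nginx_ingress_controller_requests{service="httpbin"}[1m])) or vector(0)
        threshold: "0.5"
```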
If your ingress, load balancer, or platform sends health checks while the workload is at zero replicas, add probeResponse rules for those paths. Requests that match are served by the resolver and do not scale the deployment up.
3. Apply the KubeElasti service configuration¶
Apply the configuration to your Kubernetes cluster:
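Assuming the filename from the previous step:

```shell
kubectl apply -f elasti-service.yaml

# Confirm the object was created; the resource name below is an
# assumption -- check `kubectl api-resources` if it does not match.
kubectl get elastiservice httpbin
```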
Once applied, KubeElasti scales the deployment down to 0 replicas when it receives no traffic.
4. Test the setup¶
You can test the setup by sending requests through your ingress controller. First, port-forward the ingress controller service to your local machine:
```shell
# For NGINX
kubectl port-forward svc/nginx-ingress-ingress-nginx-controller -n nginx 8080:80

# For Istio
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
```
Start a watch on the target deployment.
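For example, assuming the `httpbin` deployment from step 1:

```shell
kubectl get deployment httpbin -w
```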
Send a request to the service.
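With the port-forward active, curl through the ingress. The host `httpbin.local` is a placeholder; use the host defined in your ingress rule:

```shell
curl -v -H "Host: httpbin.local" http://localhost:8080/get
```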
You should see the deployment scale from 0 to 1 replica, and the curl command should return a response from the target service.
After cooldownPeriod seconds without traffic, the target service scales back down to 0 replicas.