Simple load balancing
Decide how to load balance incoming requests to backend services.
About simple load balancing
Kgateway supports multiple load balancing algorithms for selecting backend services to forward incoming requests to. By default, incoming requests are forwarded to the instance with the least requests. You can change this behavior and instead use a round robin algorithm to forward the request to a backend service.
Least request
The least request load balancer algorithm generally selects the host with the fewest requests. The following rules apply:
- If no `localityType` is set, the request is sent to the host with the fewest requests.
- If a `localityType` is set, a weighted round robin approach is used, in which higher weighted endpoints are considered more often in the round robin rotation to achieve the selected weight.
Round robin
The round robin load balancer algorithm selects a backend host in round robin order. If a `localityType` is also defined, a weighted round robin algorithm is used, in which higher weighted endpoints are considered more often in the round robin rotation to achieve the selected weight.
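
For illustration only, the following sketch shows how a locality-aware policy might look. The `localityType` field name and its `WeightedLb` value are assumptions based on the kgateway v1alpha1 API reference, not something this guide configures; check the API docs for your version before you use it.

```yaml
# Hypothetical sketch: the localityType field and the WeightedLb value are
# assumptions; verify them against the kgateway API reference for your version.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: BackendConfigPolicy
metadata:
  name: httpbin-locality-lb
  namespace: httpbin
spec:
  targetRefs:
    - name: httpbin
      group: ""
      kind: Service
  loadBalancer:
    localityType: WeightedLb   # assumed value that enables the weighted round robin behavior described above
    roundRobin: {}
```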
Random
The random load balancer algorithm selects a random host that is available. This load balancing algorithm usually performs better than the round robin algorithm if no health checks are configured for the backend host.
Other load balancing options
Learn about other load balancing options that you can set in the load balancer policy.
Healthy panic threshold
By default, kgateway considers only services that are healthy and available when load balancing incoming requests among backend services. If the number of healthy backend services becomes too low, you can instruct kgateway to disregard the backend health status and load balance requests among all hosts, or none, by using the `healthy_panic_threshold` setting. If not set, the threshold defaults to 50%. To disable panic mode, set this field to 0.
To learn more about this setting and when to use it, see the Envoy documentation.
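
As an illustration, the following sketch lowers the panic threshold for the httpbin Service. It assumes that the BackendConfigPolicy exposes this Envoy setting as `loadBalancer.healthyPanicThreshold` (a percentage); check the kgateway API docs for the exact field name in your version.

```yaml
# Hypothetical sketch: assumes the panic threshold is exposed as
# loadBalancer.healthyPanicThreshold in the BackendConfigPolicy API.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: BackendConfigPolicy
metadata:
  name: httpbin-panic-threshold
  namespace: httpbin
spec:
  targetRefs:
    - name: httpbin
      group: ""
      kind: Service
  loadBalancer:
    healthyPanicThreshold: 25   # disregard health status when fewer than 25% of hosts are healthy
```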
Update merge window
Sometimes, your deployments might have health checks and metadata updates that use a lot of CPU and memory. In such cases, you can use the `update_merge_window` setting so that kgateway merges all updates that occur within a specific timeframe into one update. For more information about this setting, see the Envoy documentation. If not set, the update merge window defaults to 1000ms. To disable the update merge window, set this field to 0s.
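
For illustration, the following sketch sets a longer merge window for the httpbin Service. It assumes that the BackendConfigPolicy exposes this Envoy setting as `loadBalancer.updateMergeWindow`; check the kgateway API docs for the exact field name in your version.

```yaml
# Hypothetical sketch: assumes the merge window is exposed as
# loadBalancer.updateMergeWindow in the BackendConfigPolicy API.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: BackendConfigPolicy
metadata:
  name: httpbin-merge-window
  namespace: httpbin
spec:
  targetRefs:
    - name: httpbin
      group: ""
      kind: Service
  loadBalancer:
    updateMergeWindow: 2s   # batch health check and metadata updates that arrive within 2 seconds
```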
Before you begin
- Follow the Get started guide to install kgateway.
- Follow the Sample app guide to create a gateway proxy with an HTTP listener and deploy the httpbin sample app.
- Get the external address of the gateway and save it in an environment variable.
  ```sh
  export INGRESS_GW_ADDRESS=$(kubectl get svc -n kgateway-system http -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
  echo $INGRESS_GW_ADDRESS
  ```
  If your environment does not assign an external load balancer address, you can port-forward the gateway proxy instead.
  ```sh
  kubectl port-forward deployment/http -n kgateway-system 8080:8080
  ```
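
Optionally, send a test request to confirm that the gateway routes to the httpbin app before you change any load balancing settings. This sketch assumes that the HTTPRoute from the sample app guide matches the `www.example.com` host on listener port 8080; adjust the host and port to your setup, and use `localhost:8080` if you set up port-forwarding instead.

```sh
# Assumption: the sample HTTPRoute matches host www.example.com on port 8080.
curl -i http://$INGRESS_GW_ADDRESS:8080/headers -H "host: www.example.com"
```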
Set up a load balancing algorithm
Define the load balancing algorithm that you want to use for your backend app in a BackendConfigPolicy.
- Create a BackendConfigPolicy to configure your load balancing algorithm for the httpbin app. The following examples show the least request, round robin, and random algorithms; apply only the one that you want to use.

  Least request:

  ```yaml
  kubectl apply -f- <<EOF
  kind: BackendConfigPolicy
  apiVersion: gateway.kgateway.dev/v1alpha1
  metadata:
    name: httpbin-lb-policy
    namespace: httpbin
  spec:
    targetRefs:
      - name: httpbin
        group: ""
        kind: Service
    loadBalancer:
      leastRequest:
        choiceCount: 3
        slowStart:
          window: 10s
          aggression: "1.5"
          minWeightPercent: 10
  EOF
  ```
  Review the following table to understand this configuration.

  | Setting | Description |
  | --- | --- |
  | `choiceCount` | The number of randomly selected, available backend hosts to consider when choosing the host with the fewest requests. Defaults to 2. |
  | `slowStart` | When you add new backend hosts to the pool of endpoints to route to, slow start mode progressively increases the amount of traffic that is routed to those backends. This can be useful for services that require warm-up time to serve full production loads, to prevent request timeouts, loss of data, and a deteriorated user experience. |
  | `slowStart.window` | The duration of the slow start window. If set, a newly created backend endpoint remains in slow start mode from its creation time until the duration of the slow start window has elapsed. |
  | `slowStart.aggression` | The rate of traffic increase over the duration of the slow start window. Defaults to 1.0, so that the endpoint receives a linearly increasing amount of traffic. For more information about fine-tuning this value, see the API docs. |
  | `slowStart.minWeightPercent` | The minimum percentage of weight that an endpoint must have in the calculation of aggression. This helps prevent weights that are so small that endpoints receive no traffic during the slow start window. Defaults to 10%. |

  Round robin:

  ```yaml
  kubectl apply -f- <<EOF
  kind: BackendConfigPolicy
  apiVersion: gateway.kgateway.dev/v1alpha1
  metadata:
    name: httpbin-lb-policy
    namespace: httpbin
  spec:
    targetRefs:
      - name: httpbin
        group: ""
        kind: Service
    loadBalancer:
      roundRobin:
        slowStart:
          window: 10s
          aggression: "1.5"
          minWeightPercent: 10
  EOF
  ```
  Review the following table to understand this configuration.

  | Setting | Description |
  | --- | --- |
  | `slowStart` | When you add new backend hosts to the pool of endpoints to route to, slow start mode progressively increases the amount of traffic that is routed to those backends. This can be useful for services that require warm-up time to serve full production loads, to prevent request timeouts, loss of data, and a deteriorated user experience. |
  | `slowStart.window` | The duration of the slow start window. If set, a newly created backend endpoint remains in slow start mode from its creation time until the duration of the slow start window has elapsed. |
  | `slowStart.aggression` | The rate of traffic increase over the duration of the slow start window. Defaults to 1.0, so that the endpoint receives a linearly increasing amount of traffic. For more information about fine-tuning this value, see the API docs. |
  | `slowStart.minWeightPercent` | The minimum percentage of weight that an endpoint must have in the calculation of aggression. This helps prevent weights that are so small that endpoints receive no traffic during the slow start window. Defaults to 10%. |

  Random: Note that no further settings are required because the load balancing is random.
  ```yaml
  kubectl apply -f- <<EOF
  kind: BackendConfigPolicy
  apiVersion: gateway.kgateway.dev/v1alpha1
  metadata:
    name: httpbin-lb-policy
    namespace: httpbin
  spec:
    targetRefs:
      - name: httpbin
        group: ""
        kind: Service
    loadBalancer:
      random: {}
  EOF
  ```
- Verify that your configuration is applied by reviewing the Envoy configuration.
  - Port forward the `http` deployment on port 19000.
    ```sh
    kubectl port-forward deploy/http -n kgateway-system 19000 &
    ```
  - Open the `config_dump` endpoint.
    ```sh
    open http://localhost:19000/config_dump
    ```
  - Search for the `dynamic_warming_clusters` section, and verify that the policy that you set is listed in the `lb_policy` field. For example, the following output shows the `LEAST_REQUEST` policy, with the `choice_count` field set to `3`. For a command line alternative, see the sketch after these steps.
    ```json
    "dynamic_warming_clusters": [
      {
        "version_info": "10665027509180737706",
        "cluster": {
          "@type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
          "name": "kube_httpbin_httpbin_8000",
          "type": "EDS",
          "eds_cluster_config": {
            "eds_config": {
              "ads": {},
              "resource_api_version": "V3"
            }
          },
          "connect_timeout": "5s",
          "lb_policy": "LEAST_REQUEST",
          "metadata": {},
          "common_lb_config": {
            "consistent_hashing_lb_config": {}
          },
          "ignore_health_on_host_removal": true,
          "least_request_lb_config": {
            "choice_count": 3
          }
        },
    ...
    ```
  - Stop port-forwarding the `http` deployment.
    ```sh
    lsof -ti:19000 | xargs kill -9
    ```
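
As an alternative to opening the config dump in a browser, you can filter it from the command line. The following sketch re-establishes the port-forward and filters the output with `curl` and `grep`, which are assumptions about your local tooling rather than part of this guide.

```sh
# Port-forward the Envoy admin interface of the http deployment.
kubectl port-forward deploy/http -n kgateway-system 19000 &

# Filter the config dump for the load balancing policy and choice count.
curl -s http://localhost:19000/config_dump | grep -E '"lb_policy"|"choice_count"'

# Stop the port-forward when you are done.
lsof -ti:19000 | xargs kill -9
```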
Cleanup
You can remove the resources that you created in this guide.

```sh
kubectl delete BackendConfigPolicy httpbin-lb-policy -n httpbin
```