Retries
Specify the number of times and duration for the gateway to try a connection to an unresponsive backend service.
The Kubernetes Gateway API provides a way to configure retries on your HTTPRoutes. You might commonly use retries alongside Timeouts so that requests can still succeed when a backend service is temporarily unavailable.
About
A retry policy sets the number of times that a failed request is retried. This setting helps keep requests from failing when your apps are temporarily unavailable. With retries, calls are retried a certain number of times before they are considered failed. Retries can enhance your app’s availability by making sure that calls don’t fail permanently because of transient problems, such as a temporarily overloaded service or network.
For more information, see the Gateway API docs.
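The retry behavior described above can be sketched as a small client-side loop. This is only an illustration of the concept, not how the gateway implements retries: `retry_call` is a hypothetical helper name, and the fixed wait between tries is a simplifying assumption.

```shell
# Illustrative sketch only: retry_call is a hypothetical helper, not part of
# kgateway or the Gateway API.
# Usage: retry_call <retries> <backoff_seconds> <command> [args...]
# Runs the command once, then up to <retries> more times if it keeps failing.
retry_call() {
  local retries=$1 backoff=$2
  shift 2
  local attempt=0
  until "$@"; do
    if [ "$attempt" -ge "$retries" ]; then
      # The retry budget is used up, so the call is considered failed.
      echo "giving up after $attempt retries" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    echo "retry $attempt of $retries in ${backoff}s" >&2
    sleep "$backoff"
  done
}
```

For example, `retry_call 3 1 curl -sf localhost:8080/reviews/1 -H "host: retry.example"` would try the request up to four times in total, waiting one second between tries.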
Before you begin
-
Follow the Get started guide to install kgateway.
-
Follow the Sample app guide to create a gateway proxy with an HTTP listener and deploy the httpbin sample app.
-
Get the external address of the gateway and save it in an environment variable.
export INGRESS_GW_ADDRESS=$(kubectl get svc -n kgateway-system http -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
echo $INGRESS_GW_ADDRESS
Alternatively, if your gateway does not have an external address, such as in a local test cluster, port-forward the gateway proxy on port 8080.
kubectl port-forward deployment/http -n kgateway-system 8080:8080
Step 1: Set up your environment for retries
To use retries, you need to install the experimental channel of the Kubernetes Gateway API CRDs. You can also set up two things that help you test retries: a sample app that can simulate a failure and an access log policy that tracks whether the request was retried.
-
Install the experimental Kubernetes Gateway API CRDs.
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.1/experimental-install.yaml
-
Install a sample app for which you can simulate a failure. This guide uses the Bookinfo reviews app, to which you later add a sleep command.
kubectl apply -f- <<EOF
---
apiVersion: v1
kind: Service
metadata:
  name: reviews
  namespace: default
  labels:
    app: reviews
    service: reviews
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: reviews
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bookinfo-reviews
  namespace: default
  labels:
    account: reviews
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews-v1
  namespace: default
  labels:
    app: reviews
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reviews
      version: v1
  template:
    metadata:
      labels:
        app: reviews
        version: v1
    spec:
      serviceAccountName: bookinfo-reviews
      containers:
      - name: reviews
        image: docker.io/istio/examples-bookinfo-reviews-v1:1.20.3
        imagePullPolicy: IfNotPresent
        env:
        - name: LOG_DIR
          value: "/tmp/logs"
        ports:
        - containerPort: 9080
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: wlp-output
          mountPath: /opt/ibm/wlp/output
      volumes:
      - name: wlp-output
        emptyDir: {}
      - name: tmp
        emptyDir: {}
EOF
-
Apply an access log policy to the gateway that tracks whether requests are retried. The key log field in the following example is response_flags, which is used to verify that the request was retried. For more information, see the Access logging guide and the Envoy access logs response flags docs.
kubectl apply -f- <<EOF
apiVersion: gateway.kgateway.dev/v1alpha1
kind: HTTPListenerPolicy
metadata:
  name: access-logs
  namespace: kgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: http
  accessLog:
  - fileSink:
      path: /dev/stdout
      jsonFormat:
        start_time: "%START_TIME%"
        method: "%REQ(:METHOD)%"
        path: "%REQ(:PATH)%"
        response_code: "%RESPONSE_CODE%"
        response_flags: "%RESPONSE_FLAGS%"
        upstream_host: "%UPSTREAM_HOST%"
        upstream_cluster: "%UPSTREAM_CLUSTER%"
EOF
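Because the access log policy writes each entry as one JSON object, the response_flags value can be pulled out of a log line with standard text tools. The following is a minimal sketch that assumes one JSON entry per line; `response_flags_of` is a hypothetical helper name, not a kgateway command.

```shell
# Hypothetical helper for this guide, not part of kgateway.
# Reads JSON access log lines on stdin and prints the value of the
# response_flags field for each line that has one.
response_flags_of() {
  sed -n 's/.*"response_flags": *"\([^"]*\)".*/\1/p'
}
```

A possible usage, after you send requests later in this guide, is `kubectl logs -n kgateway-system -l gateway.networking.k8s.io/gateway-name=http | tail -1 | response_flags_of`.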
Step 2: Set up retries
Set up retries to the reviews app on the HTTPRoute resource.
-
Create an HTTPRoute resource to specify your retry rules.
kubectl apply -f- <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: retry
  namespace: default
spec:
  hostnames:
  - retry.example
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: http
    namespace: kgateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: ""
      kind: Service
      name: reviews
      port: 9080
    retry:
      attempts: 3
      backoff: 1s
    timeouts:
      request: "20s"
EOF
| Field | Description |
|---|---|
| hostnames | The hostnames to match the request, such as retry.example. |
| parentRefs | The gateway to which the request is sent. In this example, you select the http gateway that you set up before you began. |
| rules | The rules to apply to requests. |
| matches | The conditions to match the request. In this example, you match any requests to the reviews app with /. |
| path | The path to match the request. In this example, you match the / path prefix, which includes the /reviews/1 path that you request later. |
| backendRefs | The backend service to which the request is sent. In this example, you select the reviews service that you set up in the previous step. |
| retry.attempts | The number of times to retry the request. In this example, you retry the request 3 times. |
| retry.backoff | The duration to wait before retrying the request. In this example, you wait 1 second before retrying the request. |
| timeouts | The duration to wait before the request times out. This value is higher than the backoff value so that the request can be retried before it times out. In this example, you set the timeout to 20 seconds. |
-
Verify that the gateway proxy is configured to retry the request.
-
Port-forward the gateway proxy on port 19000.
kubectl port-forward deployment/http -n kgateway-system 19000
-
Get the configuration of your gateway proxy as a config dump.
curl -X POST 127.0.0.1:19000/config_dump\?include_eds > gateway-config.json
-
Open the config dump and find the route configuration for the kube_default_reviews_9080 Envoy cluster on the listener~8080~retry_example virtual host. Verify that the retry policy is set as you configured it.
Example jq command:
jq '.configs[] | select(."@type" == "type.googleapis.com/envoy.admin.v3.RoutesConfigDump") | .dynamic_route_configs[].route_config.virtual_hosts[] | select(.routes[].route.cluster == "kube_default_reviews_9080")' gateway-config.json
Example output:
{
  "name": "listener~8080~retry_example",
  "domains": [
    "retry.example"
  ],
  "routes": [
    {
      "match": {
        "prefix": "/"
      },
      "route": {
        "cluster": "kube_default_reviews_9080",
        "timeout": "20s",
        "retry_policy": {
          "retry_on": "gateway-error,connect-failure,reset",
          "num_retries": 3,
          "per_try_timeout": "1s",
          "retriable_status_codes": [
            404
          ],
          "retry_back_off": {
            "base_interval": "0.025s"
          }
        },
        "cluster_not_found_response_code": "INTERNAL_SERVER_ERROR"
      },
      "name": "listener~8080~retry_example-route-0-httproute-retry-default-0-0-matcher-0"
    }
  ]
}
...
-
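Send a request to the reviews app. Verify that the request succeeds. Use the first command if you saved the gateway's external address, or the second command if you port-forwarded the gateway proxy on port 8080.
curl -vi http://$INGRESS_GW_ADDRESS:8080/reviews/1 -H "host: retry.example:8080"
curl -vi localhost:8080/reviews/1 -H "host: retry.example"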
Example output for a successful response:
HTTP/1.1 200 OK
...
{"id": "1","podname": "reviews-v1-598b896c9d-l7d8l","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}
-
Check the gateway’s access logs to verify that the request was not retried.
kubectl logs -n kgateway-system -l gateway.networking.k8s.io/gateway-name=http | tail -1
Example output: Note that the response_flags field is -, which means that the request was not retried.
{
  "method": "GET",
  "path": "/reviews/1",
  "response_code": 200,
  "response_flags": "-",
  "start_time": "2025-06-16T17:24:04.268Z",
  "upstream_cluster": "kube_default_reviews_9080",
  "upstream_host": "10.244.0.24:9080"
}
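As a quick alternative to reading the full jq output when you verify the gateway proxy configuration, the retry count in the config dump can also be checked with standard text tools. The following sketch assumes that gateway-config.json from the config dump step is in the current directory and that the first num_retries entry in the dump belongs to the retry route. `first_num_retries` is a hypothetical helper name, not a kgateway tool.

```shell
# Hypothetical helper for this guide, not part of kgateway.
# Prints the first num_retries value found in an Envoy config dump file.
# Assumption: the first match belongs to the route that you want to check.
first_num_retries() {
  grep -o '"num_retries": *[0-9][0-9]*' "$1" | head -n 1 | grep -o '[0-9][0-9]*$'
}
```

For example, `[ "$(first_num_retries gateway-config.json)" = "3" ] && echo "retry policy applied"` prints a message only if the dump contains the 3 retries that the HTTPRoute configures. If several routes define retry policies, the jq command from the verification step is the more reliable check.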
Step 3: Trigger a retry
Simulate a failure for the reviews app so that you can verify that the request is retried.
-
Send the reviews app to sleep, to simulate an app failure.
kubectl -n default patch deploy reviews-v1 --patch '{"spec":{"template":{"spec":{"containers":[{"name":"reviews","command":["sleep","20h"]}]}}}}'
-
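Send another request to the reviews app. This time, the request fails. Use the first command if you saved the gateway's external address, or the second command if you port-forwarded the gateway proxy on port 8080.
curl -vi http://$INGRESS_GW_ADDRESS:8080/reviews/1 -H "host: retry.example:8080"
curl -vi localhost:8080/reviews/1 -H "host: retry.example"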
Example output:
HTTP/1.1 503 Service Unavailable
...
upstream connect error or disconnect/reset before headers. retried and the latest reset reason: remote connection failure, transport failure reason: delayed connect error: Connection refused
-
Check the gateway’s access logs to verify that the request was retried.
kubectl logs -n kgateway-system -l gateway.networking.k8s.io/gateway-name=http | tail -1 | jq
Example output: Note that the response_flags field now has the following values:
URX means UpstreamRetryLimitExceeded, which verifies that the request was retried.
UF means UpstreamConnectionFailure, which verifies that the request failed.
{
  "method": "GET",
  "path": "/reviews/1",
  "response_code": 503,
  "response_flags": "URX,UF",
  "start_time": "2025-06-16T17:26:07.287Z",
  "upstream_cluster": "kube_default_reviews_9080",
  "upstream_host": "10.244.0.25:9080"
}
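To make the comma-separated response_flags value easier to read, the flags can be expanded into their long names. The following is a minimal sketch that covers only the flags that appear in this guide; `explain_flags` is a hypothetical helper name, and any other flag is passed through with a pointer to the Envoy docs.

```shell
# Hypothetical helper for this guide, not part of kgateway or Envoy.
# Usage: explain_flags "URX,UF"
# Prints one line per flag in the comma-separated response_flags value.
explain_flags() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r flag; do
    case "$flag" in
      -)   printf '%s\n' "-: no response flags set" ;;
      URX) printf '%s\n' "URX: UpstreamRetryLimitExceeded" ;;
      UF)  printf '%s\n' "UF: UpstreamConnectionFailure" ;;
      *)   printf '%s\n' "$flag: see the Envoy access logs response flags docs" ;;
    esac
  done
}
```

Calling `explain_flags "URX,UF"` with the value from the example output prints one line for the exhausted retry limit and one line for the upstream connection failure.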
Cleanup
You can remove the resources that you created in this guide.
-
Delete the HTTPRoute resource.
kubectl delete httproute retry -n default
-
Delete the reviews app.
kubectl delete deploy reviews-v1 -n default
kubectl delete svc reviews -n default
kubectl delete sa bookinfo-reviews -n default
-
Delete the access log policy.
kubectl delete httplistenerpolicy access-logs -n kgateway-system