Backend health checks

Automatically monitor the status of backends by configuring health checks.

Health checks periodically and automatically assess the readiness of the Backend to receive requests. You can configure several settings, such as health thresholds and check intervals, that kgateway uses to determine whether a service is marked as healthy or unhealthy. For more information, see the Envoy health checking documentation.

Before you begin

  1. Follow the Get started guide to install kgateway.

  2. Follow the Sample app guide to create a gateway proxy with an HTTP listener and deploy the httpbin sample app.

  3. Get the external address of the gateway and save it in an environment variable.

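    # Option 1: cloud provider load balancer. Save the external address.
    export INGRESS_GW_ADDRESS=$(kubectl get svc -n kgateway-system http -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
    echo $INGRESS_GW_ADDRESS

    # Option 2: local testing without an external address. Port-forward the
    # gateway proxy instead; this command blocks until you stop it.
    kubectl port-forward deployment/http -n kgateway-system 8080:8080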

Configure a health check for a Backend

In the healthCheck section of a BackendConfigPolicy resource, specify how kgateway performs health checks for a Backend or Kubernetes service. The following example configures a simple set of HTTP health check settings for a service to get you started.

apiVersion: gateway.kgateway.dev/v1alpha1
kind: BackendConfigPolicy
metadata:
  name: healthcheck-policy
  namespace: kgateway-system
spec:
  targetRefs:
    - name: my-backend
      group: ""
      kind: Service
  healthCheck:
    healthyThreshold: 1
    http:
      path: /status/200
    interval: 30s
    timeout: 10s
    unhealthyThreshold: 1

Review the following table to understand this configuration.

| Setting | Description |
| --- | --- |
| healthyThreshold | The number of successful health checks required before a Backend is marked as healthy. Note that during startup, only a single successful health check is required to mark a Backend healthy. |
| http | Configuration for an HTTP health check. |
| http.host | The host header in the HTTP health check request. If unset, defaults to the name of the Backend that this health check is associated with. |
| http.path | The path on your app that you want kgateway to send the health check request to. |
| http.method | The HTTP method for the health check to use. If unset, defaults to GET. |
| interval | The amount of time between sending health checks to the Backend. You can increase this value to ensure that you don’t overload your Backend service. |
| timeout | The time to wait for a health check response. If the timeout is reached, the health check is considered unsuccessful. |
| unhealthyThreshold | The number of unsuccessful health checks required before a Backend is marked unhealthy. Note that for HTTP health checking, if a Backend responds with 503 Service Unavailable, this threshold is ignored and the Backend is immediately considered unhealthy. |
| grpc | Optional configuration for a gRPC health check. The example omits this field because the Backend is not a gRPC service. |
| grpc.authority | The authority header in the gRPC health check request. If unset, defaults to the name of the Backend that this health check is associated with. |
| grpc.serviceName | Optional: Name of the gRPC service to check. The example omits this field because the Backend is not a gRPC service. |
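
The HTTP example does not use the grpc settings, so the following sketch shows how they might look for a gRPC Backend. It only writes a manifest to a local file so that you can review it before applying. The policy name, the my-grpc-backend Service, and the my.package.MyService service name are hypothetical placeholders, and the sketch assumes that the grpc block takes the place of the http block under healthCheck.

```shell
# Write a hypothetical gRPC health check policy to a local file for review.
# The quoted 'EOF' prevents the shell from expanding anything in the manifest.
cat > grpc-healthcheck-policy.yaml <<'EOF'
apiVersion: gateway.kgateway.dev/v1alpha1
kind: BackendConfigPolicy
metadata:
  name: grpc-healthcheck-policy        # hypothetical policy name
  namespace: kgateway-system
spec:
  targetRefs:
    - name: my-grpc-backend            # hypothetical gRPC Service
      group: ""
      kind: Service
  healthCheck:
    healthyThreshold: 1
    grpc:                              # assumed to replace the http block
      serviceName: my.package.MyService  # hypothetical gRPC service name
    interval: 30s
    timeout: 10s
    unhealthyThreshold: 1
EOF

# After you review the file and replace the placeholders, apply it:
# kubectl apply -f grpc-healthcheck-policy.yaml
```

The grpc.authority field is left unset here, so it falls back to the default that the table describes.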

Verify the health check configuration

To try out an active health check policy, create a BackendConfigPolicy for the httpbin sample app and then check the endpoint status in the Envoy service directory.

  1. Create a BackendConfigPolicy resource that configures a health check on the httpbin path /status/503. This path always returns a 503 Service Unavailable HTTP response code, which kgateway interprets as a failing request.

    kubectl apply -f- <<EOF
    apiVersion: gateway.kgateway.dev/v1alpha1
    kind: BackendConfigPolicy
    metadata:
      name: httpbin-healthcheck
      namespace: httpbin
    spec:
      targetRefs:
        - name: httpbin
          group: ""
          kind: Service
      healthCheck:
        healthyThreshold: 1
        http:
          path: /status/503
        interval: 2s
        timeout: 1s
        unhealthyThreshold: 1
    EOF

  2. Check the endpoint in the Envoy service directory.

    1. Port-forward the http gateway deployment on port 19000.

      kubectl port-forward deploy/http -n kgateway-system 19000 &

    2. Send an HTTP GET request to the /clusters endpoint.

      curl -X GET 127.0.0.1:19000/clusters

    3. In the output, search for /failed_active_hc, which indicates that the Backend failed its active health check. For example, you might see a line such as the following.

      httpbin_httpbin::10.XX.X.XX:8080::health_flags::/failed_active_hc

  3. Check the Envoy logs for health check failures.

    1. Get the logs for the http gateway deployment. Because the -f flag follows the log stream, the command keeps running; press Ctrl+C to stop it after a few seconds.

      kubectl logs -f deploy/http -n kgateway-system > gateway-proxy.log

    2. In the gateway-proxy.log output file, search for events such as health_check_failure_event, as shown in the following example log line.

      {"health_checker_type":"HTTP","host":{"socket_address":{"protocol":"TCP","address":"10.XX.X.XX","port_value":8080,"resolver_name":"","ipv4_compat":false}},"cluster_name":"httpbin_httpbin","timestamp":"2024-08-20T18:13:47.577Z","health_check_failure_event":{"failure_type":"ACTIVE","first_check":false},"metadata":{"filter_metadata":{"envoy.lb":{"version":"v1","app":"httpbin","pod-template-hash":"f46cc8b9b"}},"typed_filter_metadata":{}},"locality":{"region":"","zone":"","sub_zone":""}}
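
The two manual searches in the steps above can be sketched as small shell helpers. This is a minimal sketch, not part of kgateway: the function names failed_active_hc_hosts and count_hc_failures are hypothetical, and the parsing assumes that /clusters lines use the cluster::address::health_flags::flags layout shown in the example line with IPv4 addresses.

```shell
# Print "<cluster> <address>" for every host whose health flags include
# failed_active_hc. Reads the plain-text /clusters output on stdin.
# Assumption: fields are separated by "::" as in the example line above;
# IPv6 addresses that contain "::" would not parse correctly.
failed_active_hc_hosts() {
  awk -F'::' '$3 == "health_flags" && $4 ~ /failed_active_hc/ { print $1 " " $2 }'
}

# Print the number of log lines in the given file that contain a
# health_check_failure_event entry.
count_hc_failures() {
  # grep -c exits non-zero when the count is 0, so mask the exit status.
  grep -c '"health_check_failure_event"' "$1" || true
}
```

With the port-forward from step 2 still running, you could pipe the output of curl -s 127.0.0.1:19000/clusters into failed_active_hc_hosts, and run count_hc_failures gateway-proxy.log against the log file from step 3.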

Cleanup

You can remove the resources that you created in this guide.

    kubectl delete BackendConfigPolicy httpbin-healthcheck -n httpbin
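
The verification steps start a port-forward in the background with &, and deleting the policy does not stop that process. The following sketch shows one way to stop it. The stop_background helper is hypothetical, and it assumes that you saved the process ID (for example with PF_PID=$!) immediately after starting the port-forward in the same shell.

```shell
# Stop a background process that was started from this shell and wait for
# it to exit. Takes the process ID as the only argument.
stop_background() {
  target_pid="$1"
  # kill fails if the process already exited; that case is fine to ignore.
  kill "$target_pid" 2>/dev/null || true
  # wait reaps the child; a non-zero status after a kill is expected.
  wait "$target_pid" 2>/dev/null || true
}

# Example usage, assuming the PID was captured when the port-forward started:
#   kubectl port-forward deploy/http -n kgateway-system 19000 &
#   PF_PID=$!
#   ...
#   stop_background "$PF_PID"
```

If you did not capture the process ID, running kill %1 in an interactive shell stops the first background job instead.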