API reference
Packages
gateway.kgateway.dev/v1alpha1
Resource Types
- AgentgatewayBackend
- AgentgatewayPolicy
- Backend
- BackendConfigPolicy
- DirectResponse
- GatewayExtension
- GatewayParameters
- HTTPListenerPolicy
- TrafficPolicy
AIBackend
AIBackend specifies the AI backend configuration
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| provider LLMProvider | provider specifies configuration for how to reach the configured LLM provider. | | |
| groups PriorityGroup array | groups specifies a list of groups in priority order, where each group defines a set of LLM providers. The priority determines the priority of the backend endpoints chosen. Note: provider names must be unique across all providers in all priority groups. Backend policies may target a specific provider by name using targetRefs[].sectionName. Example configuration with two priority groups:<br />groups:<br />- providers:<br /> - azureopenai:<br /> deploymentName: gpt-4o-mini<br /> apiVersion: 2024-02-15-preview<br /> endpoint: ai-gateway.openai.azure.com<br />- providers:<br /> - azureopenai:<br /> deploymentName: gpt-4o-mini-2<br /> apiVersion: 2024-02-15-preview<br /> endpoint: ai-gateway-2.openai.azure.com<br /> policies:<br /> auth:<br /> secretRef:<br /> name: azure-secret | | MaxItems: 32 MinItems: 1 |
AIPromptEnrichment
AIPromptEnrichment defines the config to enrich requests sent to the LLM provider by appending and prepending system prompts.
Prompt enrichment allows you to add additional context to the prompt before sending it to the model. Unlike RAG or other dynamic context methods, prompt enrichment is static and is applied to every request.
Note: Some providers, including Anthropic, do not support SYSTEM role messages, and instead have a dedicated
system field in the input JSON. In this case, use the defaults setting to set the system field.
The following example prepends a system prompt of `Answer all questions in French.` and appends `Describe the painting as if you were a famous art critic from the 17th century.` to each request that is sent to the `openai` HTTPRoute.

    metadata:
      name: openai-opt
      namespace: kgateway-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: openai
      ai:
        promptEnrichment:
          prepend:
          - role: SYSTEM
            content: "Answer all questions in French."
          append:
          - role: USER
            content: "Describe the painting as if you were a famous art critic from the 17th century."

Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| prepend Message array | A list of messages to be prepended to the prompt sent by the client. | | |
| append Message array | A list of messages to be appended to the prompt sent by the client. | | |
AIPromptGuard
AIPromptGuard configures prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response.
This example rejects any request prompts that contain the string “credit card”, and masks any credit card numbers in the response.

    promptGuard:
      request:
      - response:
          message: "Rejected due to inappropriate content"
        regex:
          action: REJECT
          matches:
          - pattern: "credit card"
            name: "CC"
      response:
      - regex:
          builtins:
          - CREDIT_CARD
          action: MASK

Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| request PromptguardRequest array | Prompt guards to apply to requests sent by the client. | | MaxItems: 8 MinItems: 1 |
| response PromptguardResponse array | Prompt guards to apply to responses returned by the LLM provider. | | MaxItems: 8 MinItems: 1 |
APIKeyAuthenticationMode
Underlying type: string
Validation:
- Enum: [Strict Optional]
Appears in:
| Field | Description |
|---|---|
Strict |
A valid API Key must be present. This is the default option. |
Optional |
If an API Key exists, validate it. Warning: this allows requests without an API Key! |
AWSGuardrailConfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
identifier string |
GuardrailIdentifier is the identifier of the Guardrail policy to use for the backend. | ||
version string |
GuardrailVersion is the version of the Guardrail policy to use for the backend. |
AWSLambdaPayloadTransformMode
Underlying type: string
AWSLambdaPayloadTransformMode defines the transformation mode for the payload in the request before it is sent to the AWS Lambda function.
Validation:
- Enum: [None Envoy]
Appears in:
| Field | Description |
|---|---|
None |
AWSLambdaPayloadTransformNone indicates that the payload will not be transformed using Envoy’s built-in transformation before it is sent to the Lambda function. Note: Transformation policies configured on the route will still apply. |
Envoy |
AWSLambdaPayloadTransformEnvoy indicates that the payload will be transformed using Envoy’s built-in transformation. Refer to https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/aws_lambda_filter#configuration-as-a-listener-filter for more details on how Envoy transforms the payload. |
AccessLog
AccessLog represents the top-level access log configuration.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
fileSink FileSink |
Output access logs to local file | ||
grpcService AccessLogGrpcService |
Send access logs to gRPC service | ||
openTelemetry OpenTelemetryAccessLogService |
Send access logs to an OTel collector | ||
filter AccessLogFilter |
Filter access logs configuration | MaxProperties: 1 MinProperties: 1 |
AccessLogFilter
AccessLogFilter represents the top-level filter structure. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-accesslogfilter
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
andFilter FilterType array |
Performs a logical “and” operation on the result of each individual filter. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-andfilter |
MaxProperties: 1 MinItems: 2 MinProperties: 1 |
|
orFilter FilterType array |
Performs a logical “or” operation on the result of each individual filter. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-orfilter |
MaxProperties: 1 MinItems: 2 MinProperties: 1 |
AccessLogGrpcService
AccessLogGrpcService represents the gRPC service configuration for access logs. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/access_loggers/grpc/v3/als.proto#envoy-v3-api-msg-extensions-access-loggers-grpc-v3-httpgrpcaccesslogconfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| backendRef BackendRef | The backend gRPC service. Can be any type of supported backend (Kubernetes Service, kgateway Backend, etc.). | | |
| authority string | The :authority header in the gRPC request. If this field is not set, the authority header value will be cluster_name. Note that this authority does not override the SNI. The SNI is provided by the transport socket of the cluster. | | |
| maxReceiveMessageLength integer | Maximum gRPC message size that is allowed to be received. If a message over this limit is received, the gRPC stream is terminated with the RESOURCE_EXHAUSTED error. Defaults to 0, which means unlimited. | | Minimum: 1 |
| skipEnvoyHeaders boolean | This provides gRPC client level control over Envoy-generated headers. If false, the header will be sent but it can be overridden by a per-stream option. If true, the header will be removed and cannot be overridden by a per-stream option. Defaults to false. | | |
| timeout Duration | The timeout for the gRPC request. This is the timeout for a specific request. | | |
| initialMetadata HeaderValue array | Additional metadata to include in streams initiated to the GrpcService. This can be used for scenarios in which additional ad hoc authorization headers (e.g. x-foo-bar: baz-key) are to be injected. | | |
| retryPolicy RetryPolicy | Indicates the retry policy for re-establishing the gRPC stream. If max interval is not provided, it will be set to ten times the provided base interval. | | |
| logName string | Name of the log stream. | | |
| additionalRequestHeadersToLog string array | Additional request headers to log in the access log. | | |
| additionalResponseHeadersToLog string array | Additional response headers to log in the access log. | | |
| additionalResponseTrailersToLog string array | Additional response trailers to log in the access log. | | |
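As a rough illustration of how these fields fit together, the fragment below sketches a gRPC access log sink set through the `grpcService` field of AccessLog. The Service name `accesslog-collector`, the port, the log name, and the header name are hypothetical placeholders. The `name` and `port` keys under `backendRef` assume the Gateway API BackendRef shape, and the parent resource that holds this fragment is not shown.

```yaml
# Sketch of an AccessLog entry that uses the gRPC sink.
# All names and values below are illustrative, not defaults.
grpcService:
  backendRef:
    name: accesslog-collector      # hypothetical Service name
    port: 8083                     # hypothetical port
  logName: gateway-access          # hypothetical log stream name
  timeout: 5s                      # per-request timeout
  maxReceiveMessageLength: 4194304 # 4 MiB, chosen only for illustration
  additionalRequestHeadersToLog:
  - x-request-id                   # hypothetical header
```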
Action
Underlying type: string
Action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default.
Appears in:
| Field | Description |
|---|---|
MASK |
Mask the matched data in the request. |
REJECT |
Reject the request if the regex matches content in the request. |
AgentAPIKeyAuthentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| mode APIKeyAuthenticationMode | Validation mode for API key authentication. | Strict | Enum: [Strict Optional] |
| secretRef LocalObjectReference | secretRef references a Kubernetes Secret storing a set of API Keys. If there are many keys, secretSelector can be used instead. | | |
| secretSelector SecretSelector | secretSelector selects multiple Secrets containing API Keys. If the same key is defined in multiple Secrets, the behavior is undefined. | | |

For both secretRef and secretSelector, each entry in the Secret represents one API Key. The key is an arbitrary identifier. The value can be either:
- A string, representing the API Key.
- A JSON object with two fields, key and metadata. key contains the API Key. metadata contains arbitrary JSON metadata associated with the key, which may be used by other policies. For example, you may write an authorization policy allow apiKey.group == 'sales'.

Example:

    apiVersion: v1
    kind: Secret
    metadata:
      name: api-key
    stringData:
      client1: |
        {
          "key": "k-123",
          "metadata": {
            "group": "sales",
            "created_at": "2024-10-01T12:00:00Z"
          }
        }
      client2: "k-456"

AgentAccessLog
accessLogs specifies how per-request access logs are emitted.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| filter CELExpression | filter specifies a CEL expression that is used to filter logs. A log will only be emitted if the expression evaluates to `true`. | | MaxLength: 16384 MinLength: 1 |
| attributes AgentLogTracingFields | attributes specifies customizations to the key-value pairs that are logged. | | |
AgentAttributeAdd
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
|||
expression CELExpression |
MaxLength: 16384 MinLength: 1 |
AgentBasicAuthentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| mode BasicAuthenticationMode | Validation mode for basic authentication. | Strict | Enum: [Strict Optional] |
| realm string | realm specifies the realm to return in the WWW-Authenticate header for failed authentication requests. If unset, "Restricted" will be used. | | |
| users string array | users provides an inline list of username/password pairs that will be accepted. Each entry represents one line of the htpasswd format: https://httpd.apache.org/docs/2.4/programs/htpasswd.html. | | MaxItems: 256 MinItems: 1 |
| secretRef LocalObjectReference | secretRef references a Kubernetes Secret storing the .htaccess file. The Secret must have a key named '.htaccess', and should contain the complete .htaccess file. | | |

Note: for both users and secretRef, passwords should be the hash of the password, not the raw password. Use htpasswd or a similar command to generate a hash. MD5, bcrypt, crypt, and SHA-1 are supported.

Example for users:

    users:
    - "user1:$apr1$ivPt0D4C$DmRhnewfHRSrb3DQC.WHC."
    - "user2:$2y$05$r3J4d3VepzFkedkd/q1vI.pBYIpSqjfN0qOARV3ScUHysatnS0cL2"

Example for secretRef:

    apiVersion: v1
    kind: Secret
    metadata:
      name: basic-auth
    stringData:
      .htaccess: |
        alice:$apr1$3zSE0Abt$IuETi4l5yO87MuOrbSE4V.
        bob:$apr1$Ukb5LgRD$EPY2lIfY.A54jzLELNIId/

AgentCSRFPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| additionalOrigins string array | additionalOrigins specifies additional source origins that will be allowed in addition to the destination origin. The Origin consists of a scheme and a host, with an optional port, and takes the form `<scheme>://<host>(:<port>)`. | | MaxItems: 16 MinItems: 1 |
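The following is a minimal sketch of a CSRF policy, assuming the policy is attached through the `traffic.csrf` field of an AgentgatewayPolicy as described elsewhere in this reference. The policy name, the route name `my-route`, and both origins are hypothetical. The `group`/`kind`/`name` keys under `targetRefs` are assumed to follow the same shape as the TrafficPolicy example earlier on this page.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: csrf-example               # hypothetical policy name
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: my-route                 # hypothetical route
  traffic:
    csrf:
      # Origins allowed in addition to the destination origin.
      # Each value takes the form <scheme>://<host>(:<port>).
      additionalOrigins:
      - https://app.example.com
      - https://admin.example.com:8443
```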
AgentCorsPolicy
Appears in:
AgentDynamicForwardProxyBackend
Appears in:
AgentExtAuthBody
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| maxSize integer | maxSize specifies the maximum size, in bytes, of the body that will be buffered and sent to the authorization server. If the body is larger than maxSize, the request is rejected. | | Minimum: 1 |
AgentExtAuthPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendObjectReference |
backendRef references the External Authorization server to reach. Supported types: Service and Backend. |
||
forwardBody AgentExtAuthBody |
forwardBody configures whether to include the HTTP body in the request. If enabled, the request body will be buffered. |
||
contextExtensions object (keys:string, values:string) |
contextExtensions specifies additional arbitrary key-value pairs to send to the authorization server. | MaxProperties: 64 |
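As a sketch of how these fields combine, the fragment below configures external authorization under `spec.traffic.extAuth` of an AgentgatewayPolicy. The Service name `ext-authz`, its port, the body size limit, and the `environment` key are hypothetical. The `name` and `port` keys assume the Gateway API BackendObjectReference shape.

```yaml
# Fragment of an AgentgatewayPolicy spec; values are illustrative.
traffic:
  extAuth:
    backendRef:
      name: ext-authz              # hypothetical authorization Service
      port: 9000                   # hypothetical port
    forwardBody:
      maxSize: 8192                # buffer and forward bodies up to 8 KiB
    contextExtensions:
      environment: staging         # hypothetical key-value pair
```

Because forwardBody buffers the request body, a small maxSize keeps memory use bounded at the cost of rejecting larger requests.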
AgentExtProcPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendObjectReference |
backendRef references the External Processor server to reach. Supported types: Service and Backend. |
AgentHeaderName
Underlying type: string
AgentHeaderName is the name of a header.
Validation:
- MaxLength: 256
- MinLength: 1
- Pattern:
^:?[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$
Appears in:
AgentHeaderTransformation
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name AgentHeaderName |
the name of the header to add. | MaxLength: 256 MinLength: 1 Pattern: ^:?[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$ |
|
value CELExpression |
value is the CEL expression to apply to generate the output value for the header. | MaxLength: 16384 MinLength: 1 |
AgentHostnameRewrite
Underlying type: string
Appears in:
| Field | Description |
|---|---|
Auto |
|
None |
AgentHostnameRewriteConfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mode AgentHostnameRewrite |
mode sets the hostname rewrite mode. The following may be specified: * Auto: automatically set the Host header based on the destination. * None: do not rewrite the Host header. The original Host header will be passed through. This setting defaults to Auto when connecting to hostname-based Backend types, and None otherwise (for Service or IP-based Backends). |
AgentJWKS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| remote AgentRemoteJWKS | remote specifies how to reach the JSON Web Key Set from a remote address. | | |
| inline string | inline specifies an inline JSON Web Key Set used to validate the signature of the JWT. | | MaxLength: 65536 MinLength: 2 |
AgentJWTAuthentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mode JWTAuthenticationMode |
validation mode for JWT authentication. | Strict | Enum: [Strict Optional Permissive] |
providers AgentJWTProvider array |
MaxItems: 64 MinItems: 1 |
AgentJWTProvider
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| issuer string | issuer identifies the IdP that issued the JWT. This corresponds to the 'iss' claim (https://tools.ietf.org/html/rfc7519#section-4.1.1). | | |
| audiences string array | audiences specifies the list of audiences that are allowed access. This corresponds to the 'aud' claim (https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.3). If unset, any audience is allowed. | | MaxItems: 64 MinItems: 1 |
| jwks AgentJWKS | jwks defines the JSON Web Key Set used to validate the signature of the JWT. | | |
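To make the relationship between AgentJWTAuthentication, AgentJWTProvider, and AgentJWKS concrete, here is a hedged sketch of an AgentJWTAuthentication value. The issuer URL and audience are hypothetical, and the inline key set is an empty placeholder that must be replaced with a real JSON Web Key Set. The field under which this value is attached to a policy is not covered in this part of the reference, so only the fragment is shown.

```yaml
# Sketch of an AgentJWTAuthentication value; all values are illustrative.
mode: Strict                       # the documented default
providers:
- issuer: https://idp.example.com  # hypothetical issuer, matched against 'iss'
  audiences:
  - my-api                         # hypothetical audience, matched against 'aud'
  jwks:
    # Placeholder only: supply the provider's actual key set here.
    inline: '{"keys":[]}'
```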
AgentLocalRateLimitPolicy
AgentLocalRateLimitPolicy represents a policy for local rate limiting. It defines the configuration for rate limiting using a token bucket mechanism.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| requests integer | requests specifies the number of HTTP requests per unit of time that are allowed. Requests exceeding this limit will fail with a 429 error. | | Minimum: 1 |
| tokens integer | tokens specifies the number of LLM tokens per unit of time that are allowed. Requests exceeding this limit will fail with a 429 error. Both input and output tokens are counted. However, token counts are not known until the request completes. As a result, token-based rate limits will apply to future requests only. | | Minimum: 1 |
| unit LocalRateLimitUnit | unit specifies the unit of time over which requests are limited. | | Enum: [Seconds Minutes Hours] |
| burst integer | burst specifies an allowance of requests above the requests-per-unit limit that should be allowed within a short period of time. | | |
AgentLogTracingFields
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
remove string array |
remove lists the default fields that should be removed. For example, “http.method”. | MaxItems: 32 MinItems: 1 |
|
add AgentAttributeAdd array |
add specifies additional key-value pairs to be added to each entry. The value is a CEL expression. If the CEL expression fails to evaluate, the pair will be excluded. |
MinItems: 1 |
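The sketch below shows how AgentAccessLog and AgentLogTracingFields might be used together through the `frontend.accessLog` field of an AgentgatewayPolicy. The policy and Gateway names are hypothetical, and the CEL variable `response.code` in the filter is an assumption made for illustration; check which variables the CEL environment actually exposes. The `targetRefs` keys are assumed to follow the shape used in the TrafficPolicy example earlier on this page.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: access-log-example         # hypothetical policy name
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway                  # frontend policies can only target a Gateway
    name: my-gateway               # hypothetical Gateway
  frontend:
    accessLog:
      # Emit a log entry only when the expression evaluates to true.
      filter: "response.code >= 400"   # 'response.code' is an assumed variable
      attributes:
        remove:
        - http.method              # drop a default field
        add:
        - name: environment        # hypothetical attribute
          expression: '"staging"'  # CEL string literal
```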
AgentRateLimit
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
local AgentLocalRateLimitPolicy array |
Local defines a local rate limiting policy. | MaxItems: 16 MinItems: 1 |
|
global AgentRateLimitPolicy |
Global defines a global rate limiting policy using an external service. |
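As an illustration of local rate limiting, the fragment below sets one request-based limit and one token-based limit under `spec.traffic.rateLimit.local`. The numbers are arbitrary, and placing `requests` and `tokens` in separate list entries is an assumption about how the two kinds of limit are meant to be expressed.

```yaml
# Fragment of an AgentgatewayPolicy spec; values are illustrative.
traffic:
  rateLimit:
    local:
    # Allow 100 HTTP requests per minute, with a short-term allowance of 20 more.
    - requests: 100
      unit: Minutes
      burst: 20
    # Allow 50000 LLM tokens (input plus output) per hour.
    - tokens: 50000
      unit: Hours
```

Since token counts are only known after a request completes, the token-based entry limits subsequent requests rather than the request that crosses the threshold.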
AgentRateLimitDescriptor
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
entries AgentRateLimitDescriptorEntry array |
entries are the individual components that make up this descriptor. | MaxItems: 16 MinItems: 1 |
|
unit RateLimitUnit |
unit defines what to use as the cost function. If unspecified, Requests is used. | Enum: [Requests Tokens] |
AgentRateLimitPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendObjectReference |
backendRef references the Rate Limit server to reach. Supported types: Service and Backend. |
||
domain string |
domain specifies the domain under which this limit should apply. This is an arbitrary string that enables a rate limit server to distinguish between different applications. |
||
descriptors AgentRateLimitDescriptor array |
Descriptors define the dimensions for rate limiting. These values are passed to the rate limit service which applies configured limits based on them. Each descriptor represents a single rate limit rule with one or more entries. |
MaxItems: 16 MinItems: 1 |
AgentRemoteJWKS
Appears in:
AgentStaticBackend
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
host string |
host to connect to. | ||
port integer |
port to connect to. | Maximum: 65535 Minimum: 1 |
AgentTimeouts
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
request Duration |
request specifies a timeout for an individual request from the gateway to a backend. This covers the time from when the request first starts being sent from the gateway to when the full response has been received from the backend. |
AgentTracing
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| backendRef BackendObjectReference | backendRef references the OTLP server to reach. Supported types: Service and Backend. | | |
| protocol TracingProtocol | protocol specifies the OTLP protocol variant to use. | HTTP | Enum: [HTTP GRPC] |
| attributes AgentLogTracingFields | attributes specifies customizations to the key-value pairs that are included in the trace. | | |
| randomSampling CELExpression | randomSampling is an expression to determine the amount of random sampling. Random sampling will initiate a new trace span if the incoming request does not have a trace initiated already. This should evaluate to a float between 0.0 and 1.0, or a boolean (true/false). If unspecified, random sampling is disabled. | | MaxLength: 16384 MinLength: 1 |
| clientSampling CELExpression | clientSampling is an expression to determine the amount of client sampling. Client sampling determines whether to initiate a new trace span if the incoming request already has a trace. This should evaluate to a float between 0.0 and 1.0, or a boolean (true/false). If unspecified, client sampling is 100% enabled. | | MaxLength: 16384 MinLength: 1 |
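A hedged sketch of a tracing configuration follows, placed under `spec.frontend.tracing` of an AgentgatewayPolicy. The collector Service name, port, and attribute are hypothetical. Note that this reference marks the `tracing` field of AgentgatewayPolicyFrontend as not currently implemented, so treat this purely as an illustration of the field shapes.

```yaml
# Fragment of an AgentgatewayPolicy spec; values are illustrative.
frontend:
  tracing:
    backendRef:
      name: otel-collector         # hypothetical OTLP collector Service
      port: 4317                   # hypothetical port
    protocol: GRPC
    # CEL expressions: sample 10% of requests that arrive without a trace,
    # and always continue traces started by the client.
    randomSampling: "0.1"
    clientSampling: "true"
    attributes:
      add:
      - name: cluster              # hypothetical attribute
        expression: '"dev"'        # CEL string literal
```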
AgentTransform
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| set AgentHeaderTransformation array | set is a list of headers and the value they should be set to. | | MaxItems: 16 MinItems: 1 |
| add AgentHeaderTransformation array | add is a list of headers to add to the request and what that value should be set to. If a header with the same name already exists, the value is appended as an extra entry. | | MaxItems: 16 MinItems: 1 |
| remove AgentHeaderName array | remove is a list of header names to remove from the request/response. | | MaxItems: 16 MaxLength: 256 MinItems: 1 MinLength: 1 Pattern: `^:?[A-Za-z0-9!#$%&'*+\-.^_\x60\|~]+$` |
| body CELExpression | body controls manipulation of the HTTP body. | | MaxLength: 16384 MinLength: 1 |
AgentTransformationPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
request AgentTransform |
request is used to modify the request path. | ||
response AgentTransform |
response is used to modify the response path. |
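The fragment below sketches a transformation under `spec.traffic.transformation` that sets and removes request headers and adds a response header. The header names are hypothetical, and the values are plain CEL string literals so that the example does not depend on any particular request variables.

```yaml
# Fragment of an AgentgatewayPolicy spec; header names are illustrative.
traffic:
  transformation:
    request:
      set:
      - name: x-gateway            # hypothetical header, overwritten if present
        value: '"agentgateway"'    # CEL string literal
      remove:
      - x-internal-debug           # hypothetical header to strip
    response:
      add:
      - name: x-served-by          # hypothetical header, appended if present
        value: '"gateway"'         # CEL string literal
```

Use `set` when the header must end up with exactly one value, and `add` when existing values should be preserved.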
Agentgateway
Agentgateway configures the agentgateway dataplane integration to be enabled if the agentgateway GatewayClass is used.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
enabled boolean |
Whether to enable the extension. | ||
logLevel string |
Log level for the agentgateway. Defaults to info. Levels include “trace”, “debug”, “info”, “error”, “warn”. See: https://docs.rs/tracing/latest/tracing/struct.Level.html |
||
image Image |
The agentgateway container image. See https://kubernetes.io/docs/concepts/containers/images for details. Default values, which may be overridden individually: registry: ghcr.io/agentgateway repository: agentgateway tag: pullPolicy: IfNotPresent |
||
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
env EnvVar array |
The container environment variables. | ||
customConfigMapName string |
Name of the custom configmap to use instead of the default generated one. When set, the agent gateway will use this configmap instead of creating the default one. The configmap must contain a ‘config.yaml’ key with the agent gateway configuration. |
||
extraVolumeMounts VolumeMount array |
Additional volume mounts to add to the container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#volumemount-v1-core for details. |
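As a small sketch of the Agentgateway settings above, the fragment below enables the integration and raises the log level. The environment variable is hypothetical, and the resource requests are arbitrary. The `name`/`value` and `requests` keys assume the standard Kubernetes EnvVar and ResourceRequirements shapes. The parent field that holds this fragment is not shown in this part of the reference.

```yaml
# Sketch of an Agentgateway value; all values are illustrative.
enabled: true
logLevel: debug                    # documented default is info
env:
- name: EXAMPLE_VAR                # hypothetical variable
  value: "1"
resources:
  requests:
    cpu: 100m
    memory: 128Mi
```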
AgentgatewayBackend
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
AgentgatewayBackend |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec AgentgatewayBackendSpec |
spec defines the desired state of AgentgatewayBackend. | ||
status AgentgatewayBackendStatus |
status defines the current state of AgentgatewayBackend. |
AgentgatewayBackendSpec
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| static AgentStaticBackend | static represents a static hostname. | | |
| ai AIBackend | ai represents an LLM backend. | | |
| mcp MCPBackend | mcp represents an MCP backend. | | |
| dynamicForwardProxy AgentDynamicForwardProxyBackend | dynamicForwardProxy configures the proxy to dynamically send requests to the destination based on the incoming request HTTP host header, or TLS SNI for TLS traffic. Note: this Backend type enables users to trigger the proxy to send requests to arbitrary destinations. Proper access controls must be put in place when using this backend type. | | |
| policies AgentgatewayPolicyBackendFull | policies controls policies for communicating with this backend. Policies may also be set in AgentgatewayPolicy; policies are merged on a field-level basis, with policies on the Backend (this field) taking precedence. | | |
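For orientation, a minimal AgentgatewayBackend that uses the `static` backend type could look like the sketch below. The resource name, host, and port are hypothetical.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: static-example             # hypothetical backend name
spec:
  static:
    host: api.example.com          # hypothetical host
    port: 443                      # must be between 1 and 65535
```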
AgentgatewayBackendStatus
AgentgatewayBackendStatus defines the observed state of AgentgatewayBackend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
conditions Condition array |
Conditions is the list of conditions for the backend. | MaxItems: 8 |
AgentgatewayKeepalive
TCP Keepalive settings
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
retries integer |
retries specifies the maximum number of keep-alive probes to send before dropping the connection. If unset, this defaults to 9. |
Maximum: 64 Minimum: 1 |
|
time Duration |
time specifies the number of seconds a connection needs to be idle before keep-alive probes start being sent. If unset, this defaults to 180s. |
||
interval Duration |
interval specifies the number of seconds between keep-alive probes. If unset, this defaults to 180s. |
AgentgatewayPolicy
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
AgentgatewayPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec AgentgatewayPolicySpec |
spec defines the desired state of AgentgatewayPolicy. | ||
status PolicyStatus |
status defines the current state of AgentgatewayPolicy. |
AgentgatewayPolicyBackendAI
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
ai BackendAI |
ai specifies settings for AI workloads. This is only applicable when connecting to a Backend of type ‘ai’. |
AgentgatewayPolicyBackendFull
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
ai BackendAI |
ai specifies settings for AI workloads. This is only applicable when connecting to a Backend of type ‘ai’. | ||
mcp BackendMCP |
mcp specifies settings for MCP workloads. This is only applicable when connecting to a Backend of type ‘mcp’. |
AgentgatewayPolicyBackendMCP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mcp BackendMCP |
mcp specifies settings for MCP workloads. This is only applicable when connecting to a Backend of type ‘mcp’. |
AgentgatewayPolicyBackendSimple
Appears in:
- AgentgatewayPolicyBackendAI
- AgentgatewayPolicyBackendFull
- AgentgatewayPolicyBackendMCP
- OpenAIModeration
AgentgatewayPolicyFrontend
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| tcp FrontendTCP | tcp defines settings on managing incoming TCP connections. | | |
| tls FrontendTLS | tls defines settings on managing incoming TLS connections. | | |
| http FrontendHTTP | http defines settings on managing incoming HTTP requests. | | |
| accessLog AgentAccessLog | accessLog contains access logging configuration. | | |
| tracing AgentTracing | tracing contains various settings for the OpenTelemetry tracer. Note: not currently implemented. | | |
AgentgatewayPolicySpec
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| targetRefs LocalPolicyTargetReferenceWithSectionName array | targetRefs specifies the target resources by reference to attach the policy to. | | MaxItems: 16 MinItems: 1 |
| targetSelectors LocalPolicyTargetSelectorWithSectionName array | targetSelectors specifies the target selectors to select resources to attach the policy to. | | MaxItems: 16 MinItems: 1 |
| frontend AgentgatewayPolicyFrontend | frontend defines settings for how to handle incoming traffic. A frontend policy can only target a Gateway. Listener and ListenerSet are not valid targets. When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. For example, if policy A sets 'tcp' and 'tls', and policy B sets 'tls', the effective policy would be 'tcp' from policy A, and 'tls' from policy B. | | |
| traffic AgentgatewayPolicyTraffic | traffic defines settings for how to process traffic. A traffic policy can target a Gateway (optionally, with a sectionName indicating the listener), ListenerSet, or Route (optionally, with a sectionName indicating the route rule). When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. Precedence is given to more precise policies: Gateway < Listener < Route < Route Rule. For example, if policy A sets 'timeouts' and 'retries', and policy B sets 'retries', the effective policy would be 'timeouts' from policy A, and 'retries' from policy B. | | |
| backend AgentgatewayPolicyBackendFull | backend defines settings for how to connect to destination backends. A backend policy can target a Gateway (optionally, with a sectionName indicating the listener), ListenerSet, Route (optionally, with a sectionName indicating the route rule), or a Service/Backend (optionally, with a sectionName indicating the port (for Service) or sub-backend (for Backend)). Note that a backend policy applies when connecting to a specific destination backend. Targeting a higher level resource, like Gateway, is just a way to easily apply a policy to a group of backends. When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. Precedence is given to more precise policies: Gateway < Listener < Route < Route Rule < Backend/Service. For example, if a Gateway policy sets 'tcp' and 'tls', and a Backend policy sets 'tls', the effective policy would be 'tcp' from the Gateway, and 'tls' from the Backend. | | |
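The field-level merge described for `traffic` can be illustrated with two policies, sketched below under the assumption that one targets a Gateway and the other targets an HTTPRoute attached to it. All resource names, header names, and limits are hypothetical, and the `targetRefs` keys are assumed to follow the shape used in the TrafficPolicy example earlier on this page.

```yaml
# Policy A: attached to the Gateway, sets 'transformation' and 'rateLimit'.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: gateway-defaults           # hypothetical
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: my-gateway               # hypothetical
  traffic:
    transformation:
      request:
        set:
        - name: x-gateway          # hypothetical header
          value: '"agentgateway"'  # CEL string literal
    rateLimit:
      local:
      - requests: 100
        unit: Minutes
---
# Policy B: attached to a Route, sets only 'rateLimit'.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: route-override             # hypothetical
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: my-route                 # hypothetical
  traffic:
    rateLimit:
      local:
      - requests: 10
        unit: Seconds
```

Under the documented precedence (Gateway < Route), requests on `my-route` would get `transformation` from policy A and `rateLimit` from policy B. Because the merge is not deep, policy A's `rateLimit` entry is replaced as a whole rather than combined with policy B's.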
AgentgatewayPolicyTraffic
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
phase PolicyPhase |
The phase to apply the traffic policy to. If the phase is PreRouting, the targetRef must be a Gateway or a Listener. PreRouting is typically used only when a policy needs to influence the routing decision. Even when using PostRouting mode, the policy can target the Gateway/Listener. This is a helper for applying the policy to all routes under that Gateway/Listener, and follows the merging logic described above. Note: PreRouting and PostRouting rules do not merge together. These are independent execution phases. That is, all PreRouting rules will merge and execute, then all PostRouting rules will merge and execute. If unset, this defaults to PostRouting. |
Enum: [PreRouting PostRouting] |
|
transformation AgentTransformationPolicy |
transformation is used to mutate and transform requests and responses before forwarding them to the destination. |
||
extProc AgentExtProcPolicy |
extProc specifies the external processing configuration for the policy. | ||
extAuth AgentExtAuthPolicy |
extAuth specifies the external authentication configuration for the policy. This controls what external server to send requests to for authentication. |
||
rateLimit AgentRateLimit |
rateLimit specifies the rate limiting configuration for the policy. This controls the rate at which requests are allowed to be processed. |
||
cors AgentCorsPolicy |
cors specifies the CORS configuration for the policy. | ||
csrf AgentCSRFPolicy |
csrf specifies the Cross-Site Request Forgery (CSRF) policy for this traffic policy. The CSRF policy has the following behavior: * Safe methods (GET, HEAD, OPTIONS) are automatically allowed. * Requests without Sec-Fetch-Site or Origin headers are assumed to be same-origin or non-browser requests and are allowed. * Otherwise, the Sec-Fetch-Site header is checked, with a fallback to comparing the Origin header to the Host header. |
||
headerModifiers HeaderModifiers |
headerModifiers defines the policy to modify request and response headers. | ||
hostRewrite AgentHostnameRewriteConfig |
hostRewrite specifies how to rewrite the Host header for requests. If the HTTPRoute urlRewrite filter already specifies a host rewrite, this setting is ignored. |
Enum: [Auto None] |
|
timeouts AgentTimeouts |
timeouts defines the timeouts for requests. It is applicable to HTTPRoutes and ignored for other targeted kinds. |
||
retry Retry |
retry defines the policy for retrying requests. | ||
authorization Authorization |
authorization specifies the access rules based on roles and permissions. If multiple authorization rules are applied across different policies (at the same or different attachment points), all rules are merged. |
||
jwtAuthentication AgentJWTAuthentication |
jwtAuthentication authenticates users based on JWT tokens. | ||
basicAuthentication AgentBasicAuthentication |
basicAuthentication authenticates users based on the “Basic” authentication scheme (RFC 7617), where a username and password are encoded in the request. |
||
apiKeyAuthentication AgentAPIKeyAuthentication |
apiKeyAuthentication authenticates users based on a configured API Key. |
AlwaysOnConfig
Underlying type: struct{}
AlwaysOnConfig specifies the AlwaysOn sampler.
Appears in:
AnthropicConfig
AnthropicConfig settings for the Anthropic LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. |
AnyValue
AnyValue is used to represent any type of attribute value. AnyValue may contain a primitive value, such as a string or integer, or an arbitrary nested object containing arrays, key-value lists, and primitives. This is limited to string and nested values, as OTel only supports them.
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
stringValue string |
|||
arrayValue AnyValue array |
TODO: Add support for ArrayValue && KvListValue | MaxProperties: 1 MinProperties: 1 |
AppProtocol
Underlying type: string
AppProtocol defines the application protocol to use when communicating with the backend.
Validation:
- Enum: [http2 grpc grpc-web kubernetes.io/h2c kubernetes.io/ws]
Appears in:
| Field | Description |
|---|---|
http2 |
AppProtocolHttp2 is the http2 app protocol. |
grpc |
AppProtocolGrpc is the grpc app protocol. |
grpc-web |
AppProtocolGrpcWeb is the grpc-web app protocol. |
kubernetes.io/h2c |
AppProtocolKubernetesH2C is the kubernetes.io/h2c app protocol. |
kubernetes.io/ws |
AppProtocolKubernetesWs is the kubernetes.io/ws app protocol. |
Authorization
Authorization defines the configuration for role-based access control.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
policy AuthorizationPolicy |
Policy specifies the Authorization rule to evaluate. A policy matches when any of the conditions evaluates to true. |
||
action AuthorizationPolicyAction |
Action defines whether the rule allows or denies the request if matched. If unspecified, the default is “Allow”. |
Allow | Enum: [Allow Deny] |
AuthorizationPolicy
AuthorizationPolicy defines a single Authorization rule.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
matchExpressions CELExpression array |
MatchExpressions defines a set of conditions that must be satisfied for the rule to match. These expressions should be in the form of a Common Expression Language (CEL) expression. |
MaxItems: 256 MaxLength: 16384 MinItems: 1 MinLength: 1 |
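As a rough illustration of how Authorization, AuthorizationPolicy, and AuthorizationPolicyAction fit together, the manifest below sketches an AgentgatewayPolicy with one Allow rule. Only the `traffic.authorization` fields come from this reference. The resource names, the shape of `targetRefs`, and the CEL expression (including the `request.method` variable) are assumptions made for illustration; consult the CEL reference linked under CELExpression for the variables that are actually available.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: allow-get                 # hypothetical name
spec:
  targetRefs:                     # assumed shape; not documented in this section
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: example-route           # hypothetical route
  traffic:
    authorization:
      action: Allow               # the default when omitted
      policy:
        matchExpressions:         # 1 to 256 entries; the policy matches when any is true
        - 'request.method == "GET"'   # hypothetical CEL expression
```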
AuthorizationPolicyAction
Underlying type: string
AuthorizationPolicyAction defines the action to take when the authorization policy matches.
Appears in:
| Field | Description |
|---|---|
Allow |
AuthorizationPolicyActionAllow allows the request when the policy matches. |
Deny |
AuthorizationPolicyActionDeny denies the request when the policy matches. |
AwsAuth
AwsAuth specifies the authentication method to use for the backend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
type AwsAuthType |
Type specifies the authentication method to use for the backend. | Enum: [Secret] |
|
secretRef LocalObjectReference |
SecretRef references a Kubernetes Secret containing the AWS credentials. The Secret must have keys “accessKey”, “secretKey”, and optionally “sessionToken”. |
AwsAuthType
Underlying type: string
AwsAuthType specifies the authentication method to use for the backend.
Appears in:
| Field | Description |
|---|---|
Secret |
AwsAuthTypeSecret uses credentials stored in a Kubernetes Secret. |
AwsBackend
AwsBackend is the AWS backend configuration.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
lambda AwsLambda |
Lambda configures the AWS lambda service. | ||
accountId string |
AccountId is the AWS account ID to use for the backend. | MaxLength: 12 MinLength: 1 Pattern: ^[0-9]\{12\}$ |
|
auth AwsAuth |
Auth specifies an explicit AWS authentication method for the backend. When omitted, the following credential providers are tried in order, stopping when one of them returns an access key ID and a secret access key (the session token is optional): 1. Environment variables: when the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN are set. 2. AssumeRoleWithWebIdentity API call: when the environment variables AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN are set. 3. EKS Pod Identity: when the environment variable AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE is set. See the Envoy docs for more info: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/aws_request_signing_filter#credentials |
||
region string |
Region is the AWS region to use for the backend. Defaults to us-east-1 if not specified. |
us-east-1 | MaxLength: 63 MinLength: 1 Pattern: ^[a-z0-9-]+$ |
AwsLambda
AwsLambda configures the AWS lambda service.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
endpointURL string |
EndpointURL is the URL or domain for the Lambda service. This is primarily useful for testing and development purposes. When omitted, the default lambda hostname will be used. |
MaxLength: 2048 Pattern: ^https?://[-a-zA-Z0-9@:%.+~#?&/=]+$ |
|
functionName string |
FunctionName is the name of the Lambda function to invoke. | Pattern: ^[A-Za-z0-9-_]\{1,140\}$ |
|
invocationMode string |
InvocationMode defines how to invoke the Lambda function. Defaults to Sync. |
Sync | Enum: [Sync Async] |
qualifier string |
Qualifier is the alias or version for the Lambda function. Valid values include a numeric version (e.g. “1”), an alias name (alphanumeric plus “-” or “_”), or the special literal “$LATEST”. |
$LATEST | Pattern: ^(\$LATEST|[0-9]+|[A-Za-z0-9-_]\{1,128\})$ |
payloadTransformMode AWSLambdaPayloadTransformMode |
PayloadTransformMode specifies the payload transformation mode applied to the request before it is sent to the Lambda function. Defaults to Envoy. |
Envoy | Enum: [None Envoy] |
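The AwsBackend, AwsAuth, and AwsLambda fields above can be combined into a single Backend resource. The following is a minimal sketch, not a verified configuration: the resource name, the Secret name, the function name, and the placeholder account ID are hypothetical. Per the BackendSpec description, the AWS backend type is only supported with Envoy-based gateways.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  name: lambda-backend            # hypothetical name
spec:
  type: AWS
  aws:
    accountId: "000000000000"     # placeholder; quoted so it stays a 12-digit string
    region: us-east-1             # the default when omitted
    auth:
      type: Secret
      secretRef:
        name: aws-creds           # hypothetical Secret with accessKey and secretKey keys
    lambda:
      functionName: my-function   # hypothetical function name
      qualifier: "$LATEST"        # the default; a numeric version or an alias also works
      invocationMode: Sync        # the default; Async is the other option
      payloadTransformMode: Envoy # the default; None is the other option
```

When `auth` is omitted, the credential provider chain described in the auth field (environment variables, AssumeRoleWithWebIdentity, EKS Pod Identity) applies instead.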
AzureOpenAIConfig
AzureOpenAIConfig settings for the Azure OpenAI LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
endpoint string |
The endpoint for the Azure OpenAI API to use, such as my-endpoint.openai.azure.com. If the scheme is included, it is stripped. |
MinLength: 1 |
|
deploymentName string |
The name of the Azure OpenAI model deployment to use. For more information, see the Azure OpenAI model docs. This is required if ApiVersion is not ‘v1’. For v1, the model can be set in the request. |
MinLength: 1 |
|
apiVersion string |
The version of the Azure OpenAI API to use. For more information, see the Azure OpenAI API version reference. If unset, defaults to “v1”. |
Backend
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
Backend |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec BackendSpec |
|||
status BackendStatus |
BackendAuthPassthrough
Appears in:
BackendConfigPolicy
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
BackendConfigPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec BackendConfigPolicySpec |
|||
status PolicyStatus |
BackendConfigPolicySpec
BackendConfigPolicySpec defines the desired state of BackendConfigPolicy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
targetRefs LocalPolicyTargetReference array |
TargetRefs specifies the target references to attach the policy to. | MaxItems: 16 MinItems: 1 |
|
targetSelectors LocalPolicyTargetSelector array |
TargetSelectors specifies the target selectors to select resources to attach the policy to. | ||
connectTimeout Duration |
The timeout for new network connections to hosts in the cluster. | ||
perConnectionBufferLimitBytes integer |
Soft limit on the size of the cluster’s connections read and write buffers. If unspecified, an implementation-defined default is applied (1MiB). |
Minimum: 0 |
|
tcpKeepalive TCPKeepalive |
Configure OS-level TCP keepalive checks. | ||
commonHttpProtocolOptions CommonHttpProtocolOptions |
Additional options when handling HTTP requests upstream, applicable to both HTTP1 and HTTP2 requests. |
||
http1ProtocolOptions Http1ProtocolOptions |
Additional options when handling HTTP1 requests upstream. | ||
http2ProtocolOptions Http2ProtocolOptions |
Http2ProtocolOptions contains the options necessary to configure HTTP/2 backends. Note: Http2ProtocolOptions can only be applied to HTTP/2 backends. See Envoy documentation for more details. |
||
tls TLS |
TLS contains the options necessary to configure a backend to use TLS origination. See Envoy documentation for more details. |
||
loadBalancer LoadBalancer |
LoadBalancer contains the options necessary to configure the load balancer. | ||
healthCheck HealthCheck |
HealthCheck contains the options necessary to configure the health check. | ||
outlierDetection OutlierDetection |
OutlierDetection contains the options necessary to configure passive health checking. |
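To show how a few of these fields look together, here is a hedged sketch of a BackendConfigPolicy that tunes connection behavior for one Service. The resource and Service names are hypothetical, and the `group`/`kind`/`name` shape of `targetRefs` is an assumption, since LocalPolicyTargetReference is not documented in this section. The numeric values are arbitrary examples.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: BackendConfigPolicy
metadata:
  name: upstream-tuning           # hypothetical name
spec:
  targetRefs:                     # 1 to 16 entries; field shape assumed
  - group: ""
    kind: Service
    name: example-svc             # hypothetical Service
  connectTimeout: 5s
  perConnectionBufferLimitBytes: 1048576   # 1MiB, matching the documented default
  commonHttpProtocolOptions:
    idleTimeout: 10m              # defaults to 1 hour when omitted
    maxHeadersCount: 100          # the documented default
    maxRequestsPerConnection: 1000   # 0 or unset means unlimited
```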
BackendSpec
BackendSpec defines the desired state of Backend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
type BackendType |
Type indicates the type of the backend to be used. | Enum: [AWS Static DynamicForwardProxy] |
|
aws AwsBackend |
Aws is the AWS backend configuration. The Aws backend type is only supported with envoy-based gateways, it is not supported in agentgateway. |
||
static StaticBackend |
Static is the static backend configuration. | ||
dynamicForwardProxy DynamicForwardProxyBackend |
DynamicForwardProxy is the dynamic forward proxy backend configuration. |
BackendStatus
BackendStatus defines the observed state of Backend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
conditions Condition array |
Conditions is the list of conditions for the backend. | MaxItems: 8 |
BackendType
Underlying type: string
BackendType indicates the type of the backend.
Appears in:
| Field | Description |
|---|---|
AWS |
BackendTypeAWS is the type for AWS backends. |
Static |
BackendTypeStatic is the type for static backends. |
DynamicForwardProxy |
BackendTypeDynamicForwardProxy is the type for dynamic forward proxy backends. |
BasicAuthenticationMode
Underlying type: string
Validation:
- Enum: [Strict Optional]
Appears in:
| Field | Description |
|---|---|
Strict |
A valid username and password must be present. This is the default option. |
Optional |
If a username and password exist, validate them. Warning: this allows requests without a username! |
BedrockConfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
region string |
Region is the AWS region to use for the backend. Defaults to us-east-1 if not specified. |
us-east-1 | MaxLength: 63 MinLength: 1 Pattern: ^[a-z0-9-]+$ |
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. |
||
guardrail AWSGuardrailConfig |
Guardrail configures the Guardrail policy to use for the backend. See https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html If not specified, the AWS Guardrail policy will not be used. |
BodyParseBehavior
Underlying type: string
BodyParseBehavior defines how the body should be parsed. If set to AsJson and the body is not JSON, the filter does not perform the transformation.
Validation:
- Enum: [AsString AsJson]
Appears in:
| Field | Description |
|---|---|
AsString |
BodyParseBehaviorAsString will parse the body as a string. |
AsJson |
BodyParseBehaviorAsJSON will parse the body as a json object. |
BodyTransformation
BodyTransformation controls how the body should be parsed and transformed.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
parseAs BodyParseBehavior |
ParseAs defines what auto formatting should be applied to the body. This can make interacting with keys within a json body much easier if AsJson is selected. |
AsString | Enum: [AsString AsJson] |
value InjaTemplate |
Value is the template to apply to generate the output value for the body. Only Inja templates are supported. |
Buffer
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
maxRequestSize Quantity |
MaxRequestSize sets the maximum size in bytes of a message body to buffer. Requests exceeding this size will receive HTTP 413. Example format: “1Mi”, “512Ki”, “1Gi” |
||
disable PolicyDisable |
Disable the buffer filter. Can be used to disable buffer policies applied at a higher level in the config hierarchy. |
BuiltIn
Underlying type: string
BuiltIn regex patterns for specific types of strings in prompts.
For example, if you specify CREDIT_CARD, any credit card numbers
in the request or response are matched.
Validation:
- Enum: [SSN CREDIT_CARD PHONE_NUMBER EMAIL]
Appears in:
| Field | Description |
|---|---|
SSN |
Default regex matching for Social Security numbers. |
CREDIT_CARD |
Default regex matching for credit card numbers. |
PHONE_NUMBER |
Default regex matching for phone numbers. |
EMAIL |
Default regex matching for email addresses. |
CELExpression
Underlying type: string
A Common Expression Language (CEL) expression. See https://agentgateway.dev/docs/reference/cel/ for more info.
Validation:
- MaxLength: 16384
- MinLength: 1
Appears in:
- AgentAccessLog
- AgentAttributeAdd
- AgentHeaderTransformation
- AgentRateLimitDescriptorEntry
- AgentTracing
- AgentTransform
- AuthorizationPolicy
CELFilter
CELFilter filters requests based on Common Expression Language (CEL).
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
match string |
The CEL expression to evaluate. Access logs are only emitted when the CEL expression evaluates to true. See: https://www.envoyproxy.io/docs/envoy/v1.33.0/xds/type/v3/cel.proto.html#common-expression-language-cel-proto |
CSRFPolicy
CSRFPolicy can be used to set the percentage of requests for which the CSRF filter is enabled, to enable a shadow-only mode in which policies are evaluated and tracked but not enforced, and to add source origins that are allowed in addition to the destination origin.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
percentageEnabled integer |
Specifies the percentage of requests for which the CSRF filter is enabled. | Maximum: 100 Minimum: 0 |
|
percentageShadowed integer |
Specifies that CSRF policies will be evaluated and tracked, but not enforced. | Maximum: 100 Minimum: 0 |
|
additionalOrigins StringMatcher array |
Specifies additional source origins that will be allowed in addition to the destination origin. | MaxItems: 16 |
CommonAccessLogGrpcService
Common configuration for gRPC access logs. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/access_loggers/grpc/v3/als.proto#envoy-v3-api-msg-extensions-access-loggers-grpc-v3-commongrpcaccesslogconfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendRef |
The backend gRPC service. Can be any type of supported backend (Kubernetes Service, kgateway Backend, etc..) | ||
authority string |
The :authority header in the grpc request. If this field is not set, the authority header value will be cluster_name. Note that this authority does not override the SNI. The SNI is provided by the transport socket of the cluster. |
||
maxReceiveMessageLength integer |
Maximum gRPC message size that is allowed to be received. If a message over this limit is received, the gRPC stream is terminated with the RESOURCE_EXHAUSTED error. Defaults to 0, which means unlimited. |
Minimum: 1 |
|
skipEnvoyHeaders boolean |
This provides gRPC client-level control over Envoy-generated headers. If false, the header is sent but can be overridden by a per-stream option. If true, the header is removed and cannot be overridden by a per-stream option. Defaults to false. | ||
timeout Duration |
The timeout for the gRPC request. This is the timeout for a specific request. | ||
initialMetadata HeaderValue array |
Additional metadata to include in streams initiated to the GrpcService. This can be used for scenarios in which additional ad hoc authorization headers (e.g. x-foo-bar: baz-key) are to be injected. |
||
retryPolicy RetryPolicy |
Indicates the retry policy for re-establishing the gRPC stream. If the max interval is not provided, it is set to ten times the provided base interval. |
||
logName string |
The name of the log stream. |
CommonGrpcService
Common gRPC service configuration created by setting `envoy_grpc` as the gRPC client. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/grpc_service.proto#envoy-v3-api-msg-config-core-v3-grpcservice Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/grpc_service.proto#envoy-v3-api-msg-config-core-v3-grpcservice-envoygrpc
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendRef |
The backend gRPC service. Can be any type of supported backend (Kubernetes Service, kgateway Backend, etc..) | ||
authority string |
The :authority header in the grpc request. If this field is not set, the authority header value will be cluster_name. Note that this authority does not override the SNI. The SNI is provided by the transport socket of the cluster. |
||
maxReceiveMessageLength integer |
Maximum gRPC message size that is allowed to be received. If a message over this limit is received, the gRPC stream is terminated with the RESOURCE_EXHAUSTED error. Defaults to 0, which means unlimited. |
Minimum: 1 |
|
skipEnvoyHeaders boolean |
This provides gRPC client-level control over Envoy-generated headers. If false, the header is sent but can be overridden by a per-stream option. If true, the header is removed and cannot be overridden by a per-stream option. Defaults to false. | ||
timeout Duration |
The timeout for the gRPC request. This is the timeout for a specific request. | ||
initialMetadata HeaderValue array |
Additional metadata to include in streams initiated to the GrpcService. This can be used for scenarios in which additional ad hoc authorization headers (e.g. x-foo-bar: baz-key) are to be injected. |
||
retryPolicy RetryPolicy |
Indicates the retry policy for re-establishing the gRPC stream. If the max interval is not provided, it is set to ten times the provided base interval. |
CommonHttpProtocolOptions
CommonHttpProtocolOptions are options that are applicable to both HTTP1 and HTTP2 requests. See Envoy documentation for more details.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
idleTimeout Duration |
The idle timeout for connections. The idle timeout is defined as the period in which there are no active requests. When the idle timeout is reached, the connection is closed. If the connection is an HTTP/2 downstream connection, a drain sequence occurs prior to closing the connection. Note that request-based timeouts mean that HTTP/2 PINGs will not keep the connection alive. If not specified, this defaults to 1 hour. To disable idle timeouts, explicitly set this to 0. Disabling this timeout has a high likelihood of yielding connection leaks due to lost TCP FIN packets, etc. |
||
maxHeadersCount integer |
Specifies the maximum number of headers that the connection will accept. If not specified, the default of 100 is used. Requests that exceed this limit will receive a 431 response for HTTP/1.x and cause a stream reset for HTTP/2. |
Minimum: 0 |
|
maxStreamDuration Duration |
Total duration to keep alive an HTTP request/response stream. If the time limit is reached the stream will be reset independent of any other timeouts. If not specified, this value is not set. |
||
maxRequestsPerConnection integer |
Maximum requests for a single upstream connection. If set to 0 or unspecified, defaults to unlimited. |
Minimum: 0 |
ComparisonFilter
Underlying type: struct{Op Op `json:"op"`; Value int32 `json:"value"`}
ComparisonFilter represents a filter based on a comparison. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-comparisonfilter
Appears in:
Cookie
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
Name of the cookie. | MinLength: 1 |
|
path string |
Path is the name of the path for the cookie. | ||
ttl Duration |
TTL specifies the time to live of the cookie. If specified, a cookie with the TTL will be generated if the cookie is not present. If the TTL is present and zero, the generated cookie will be a session cookie. |
||
secure boolean |
Secure specifies whether the cookie is secure. If true, the cookie will only be sent over HTTPS. |
||
httpOnly boolean |
HttpOnly specifies whether the cookie is HTTP only, i.e. not accessible to JavaScript. | ||
sameSite string |
SameSite controls cross-site sending of cookies. Supported values are Strict, Lax, and None. |
Enum: [Strict Lax None] |
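For orientation, the fields of a Cookie object might be filled in as sketched below. This section does not show which parent field holds the Cookie, so the fragment lists only the object's own fields. The cookie name and all values are hypothetical.

```yaml
# Fields of a single Cookie object (parent field not shown in this section)
name: session-affinity   # hypothetical cookie name; must be non-empty
path: /
ttl: 30m                 # a zero TTL would produce a session cookie
secure: true             # only sent over HTTPS
httpOnly: true           # not accessible to JavaScript
sameSite: Lax            # one of Strict, Lax, None
```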
CorsPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
disable PolicyDisable |
Disable the CORS filter. Can be used to disable CORS policies applied at a higher level in the config hierarchy. |
CustomAttribute
Describes attributes for the active span. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/type/tracing/v3/custom_tag.proto#envoy-v3-api-msg-type-tracing-v3-customtag
Validation:
- MaxProperties: 2
- MinProperties: 1
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
The name of the attribute | ||
literal CustomAttributeLiteral |
A literal attribute value. | ||
environment CustomAttributeEnvironment |
An environment attribute value. | ||
requestHeader CustomAttributeHeader |
A request header attribute value. | ||
metadata CustomAttributeMetadata |
A metadata attribute value. |
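The MaxProperties: 2 validation means that each CustomAttribute carries a name plus at most one value source. The list below is a sketch of three such entries, one per simple source type. The attribute names, header name, environment variable, and values are all hypothetical, and the parent field that holds the list is not shown in this section.

```yaml
# A list of CustomAttribute entries (parent field not shown in this section)
- name: deployment.environment   # hypothetical attribute name
  literal:
    value: staging               # static value
- name: client.request-id        # hypothetical attribute name
  requestHeader:
    name: x-request-id           # header to read
    defaultValue: unknown        # used when the header does not exist
- name: pod.name                 # hypothetical attribute name
  environment:
    name: POD_NAME               # environment variable to read
    defaultValue: unknown        # used when the variable is not found
```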
CustomAttributeEnvironment
Environment type attribute with environment name and default value. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/type/tracing/v3/custom_tag.proto#type-tracing-v3-customtag-environment
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
Environment variable name to obtain the value to populate the attribute value. | ||
defaultValue string |
When the environment variable is not found, the attribute value will be populated with this default value if specified, otherwise no attribute will be populated. |
CustomAttributeHeader
Header type attribute with header name and default value. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/type/tracing/v3/custom_tag.proto#type-tracing-v3-customtag-header
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
Header name to obtain the value to populate the attribute value. | ||
defaultValue string |
When the header does not exist, the attribute value will be populated with this default value if specified, otherwise no attribute will be populated. |
CustomAttributeLiteral
Literal type attribute with a static value. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/type/tracing/v3/custom_tag.proto#type-tracing-v3-customtag-literal
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
value string |
Static literal value to populate the attribute value. |
CustomAttributeMetadata
Metadata type attribute using MetadataKey to retrieve the protobuf value from Metadata, and populate the attribute value with the canonical JSON representation of it. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/type/tracing/v3/custom_tag.proto#type-tracing-v3-customtag-metadata
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
kind MetadataKind |
Specify what kind of metadata to obtain attribute value from | Enum: [Request Route Cluster Host] |
|
metadataKey MetadataKey |
Metadata key to define the path to retrieve the attribute value. | ||
defaultValue string |
When no valid metadata is found, the attribute value would be populated with this default value if specified, otherwise no attribute would be populated. |
CustomResponse
CustomResponse configures a response to return to the client if request content
is matched against a regex pattern and the action is REJECT.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
message string |
A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. |
The request was rejected due to inappropriate content | |
statusCode integer |
The status code to return to the client. Defaults to 403. | 403 | Maximum: 599 Minimum: 200 |
DirectResponse
DirectResponse contains configuration for defining direct response routes.
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
DirectResponse |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec DirectResponseSpec |
|||
status DirectResponseStatus |
DirectResponseSpec
DirectResponseSpec describes the desired state of a DirectResponse.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
status integer |
StatusCode defines the HTTP status code to return for this route. | Maximum: 599 Minimum: 200 |
|
body string |
Body defines the content to be returned in the HTTP response body. The maximum length of the body is restricted to prevent excessively large responses. If this field is omitted, no body is included in the response. |
MaxLength: 4096 MinLength: 1 |
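A complete DirectResponse resource needs little more than a status code. The following sketch uses a hypothetical resource name and body text; the `status` and `body` fields and their limits come from the table above.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: DirectResponse
metadata:
  name: maintenance               # hypothetical name
spec:
  status: 503                     # must be between 200 and 599
  body: "Service temporarily unavailable"   # optional; 1 to 4096 characters
```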
DirectResponseStatus
DirectResponseStatus defines the observed state of a DirectResponse.
Appears in:
DurationFilter
Underlying type: ComparisonFilter
DurationFilter filters based on request duration. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-durationfilter
Appears in:
DynamicForwardProxyBackend
DynamicForwardProxyBackend is the dynamic forward proxy backend configuration.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
enableTls boolean |
EnableTls enables TLS. When true, the backend will be configured to use TLS. System CA will be used for validation. The hostname will be used for SNI and auto SAN validation. |
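Putting BackendSpec and DynamicForwardProxyBackend together, a dynamic forward proxy Backend could look like the sketch below. The resource name is hypothetical; the remaining fields are the ones documented in this reference.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  name: dfp-backend               # hypothetical name
spec:
  type: DynamicForwardProxy
  dynamicForwardProxy:
    enableTls: true               # system CA for validation; hostname used for SNI and SAN checks
```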
EnvironmentResourceDetectorConfig
Underlying type: struct{}
EnvironmentResourceDetectorConfig specifies the EnvironmentResourceDetector.
Appears in:
EnvoyBootstrap
EnvoyBootstrap configures the Envoy proxy instance that is provisioned from a Kubernetes Gateway.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
logLevel string |
Envoy log level. Options include “trace”, “debug”, “info”, “warn”, “error”, “critical” and “off”. Defaults to “info”. See https://www.envoyproxy.io/docs/envoy/latest/start/quick-start/run-envoy#debugging-envoy for more information. |
||
componentLogLevels object (keys:string, values:string) |
Envoy log levels for specific components. The keys are component names and the values are one of “trace”, “debug”, “info”, “warn”, “error”, “critical”, or “off”, e.g. yaml<br /> componentLogLevels:<br /> upstream: debug<br /> connection: trace<br /> These will be converted to the --component-log-level Envoy argument value. See https://www.envoyproxy.io/docs/envoy/latest/start/quick-start/run-envoy#debugging-envoy for more information. Note: the keys and values cannot be empty, but they are not otherwise validated. |
EnvoyContainer
EnvoyContainer configures the container running Envoy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
bootstrap EnvoyBootstrap |
Initial envoy configuration. | ||
image Image |
The envoy container image. See https://kubernetes.io/docs/concepts/containers/images for details. Default values, which may be overridden individually: registry: quay.io/solo-io repository: envoy-wrapper tag: pullPolicy: IfNotPresent |
||
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
env EnvVar array |
The container environment variables. | ||
extraVolumeMounts VolumeMount array |
Additional volume mounts to add to the container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#volumemount-v1-core for details. |
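The following fragment sketches how the EnvoyContainer fields might be combined. It shows only the EnvoyContainer object itself; where it nests inside a parent resource is not covered in this section. The resource sizes and the environment variable are illustrative assumptions.

```yaml
# EnvoyContainer fragment (parent field not shown)
bootstrap:
  logLevel: info
  componentLogLevels:
    upstream: debug
    connection: trace
image:
  # Each default may be overridden individually.
  registry: quay.io/solo-io
  repository: envoy-wrapper
  pullPolicy: IfNotPresent
resources:
  requests:
    cpu: 100m       # hypothetical sizing
    memory: 128Mi   # hypothetical sizing
env:
- name: EXAMPLE_SETTING   # hypothetical variable
  value: "1"
```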
EnvoyHealthCheck
EnvoyHealthCheck represents configuration for Envoy’s health check filter. The filter will be configured in No pass through mode, and will only match requests with the specified path.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
path string |
Path defines the exact path that will be matched for health check requests. | MaxLength: 2048 Pattern: ^/[-a-zA-Z0-9@:%.+~#?&/=_]+$ |
ExtAuthBufferSettings
ExtAuthBufferSettings configures how the request body should be buffered.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
maxRequestBytes integer |
MaxRequestBytes sets the maximum size of a message body to buffer. Requests exceeding this size will receive HTTP 413 and not be sent to the auth service. |
Minimum: 1 |
|
allowPartialMessage boolean |
AllowPartialMessage determines if partial messages should be allowed. When true, requests will be sent to the auth service even if they exceed maxRequestBytes. The default behavior is false. |
false | |
packAsBytes boolean |
PackAsBytes determines if the body should be sent as raw bytes. When true, the body is sent as raw bytes in the raw_body field. When false, the body is sent as UTF-8 string in the body field. The default behavior is false. |
false |
ExtAuthPolicy
ExtAuthPolicy configures external authentication/authorization for a route. This policy will determine the ext auth server to use and how to talk to it. Note that most of these fields are passed along as is to Envoy. For more details on particular fields please see the Envoy ExtAuth documentation. https://raw.githubusercontent.com/envoyproxy/envoy/f910f4abea24904aff04ec33a00147184ea7cffa/api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
extensionRef NamespacedObjectReference |
ExtensionRef references the GatewayExtension that should be used for auth. | ||
withRequestBody ExtAuthBufferSettings |
WithRequestBody allows the request body to be buffered and sent to the auth service. Warning: buffering has implications for streaming and therefore performance. |
||
contextExtensions object (keys:string, values:string) |
Additional context for the auth service. | ||
disable PolicyDisable |
Disable all external auth filters. Can be used to disable external auth policies applied at a higher level in the config hierarchy. |
ExtAuthProvider
ExtAuthProvider defines the configuration for an ExtAuth provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
grpcService ExtGrpcService |
GrpcService is the GRPC service that will handle the auth. | ||
failOpen boolean |
FailOpen determines if requests are allowed when the ext auth service is unavailable. Defaults to false, meaning requests will be denied if the ext auth service is unavailable. |
false | |
clearRouteCache boolean |
ClearRouteCache determines if the route cache should be cleared to allow the external authentication service to correctly affect routing decisions. |
false | |
withRequestBody ExtAuthBufferSettings |
WithRequestBody allows the request body to be buffered and sent to the auth service. Warning: buffering has implications for streaming and therefore performance. |
||
statusOnError integer |
StatusOnError sets the HTTP status response code that is returned to the client when the auth server returns an error or cannot be reached. Must be in the range of 100-511 inclusive. The default matches the deny response code of 403 Forbidden. |
403 | Maximum: 511 Minimum: 100 |
statPrefix string |
StatPrefix is an optional prefix to include when emitting stats from the extauthz filter, enabling different instances of the filter to have unique stats. |
MinLength: 1 |
ExtGrpcService
ExtGrpcService defines the GRPC service that will handle the processing.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendRef |
BackendRef references the backend GRPC service. | ||
authority string |
Authority is the authority header to use for the GRPC service. | ||
requestTimeout Duration |
RequestTimeout is the timeout for the gRPC request. This is the timeout for a specific request. | ||
retry GRPCRetryPolicy |
Retry specifies the retry policy for gRPC streams associated with the service. |
ExtProcPolicy
ExtProcPolicy defines the configuration for the Envoy External Processing filter.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
extensionRef NamespacedObjectReference |
ExtensionRef references the GatewayExtension that should be used for external processing. | ||
processingMode ProcessingMode |
ProcessingMode defines how the filter should interact with the request/response streams. | ||
disable PolicyDisable |
Disable all external processing filters. Can be used to disable external processing policies applied at a higher level in the config hierarchy. |
ExtProcProvider
ExtProcProvider defines the configuration for an ExtProc provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
grpcService ExtGrpcService |
GrpcService is the GRPC service that will handle the processing. | ||
failOpen boolean |
FailOpen determines if requests are allowed when the ext proc service is unavailable. Defaults to true, meaning requests are allowed upstream even if the ext proc service is unavailable. |
true | |
processingMode ProcessingMode |
ProcessingMode defines how the filter should interact with the request/response streams. | ||
messageTimeout Duration |
MessageTimeout is the timeout for each message sent to the external processing server. | ||
maxMessageTimeout Duration |
MaxMessageTimeout specifies the upper bound of override_message_timeout that may be sent from the external processing server. The default value is 0, which effectively disables the override_message_timeout API. |
||
statPrefix string |
StatPrefix is an optional prefix to include when emitting stats from the extproc filter, enabling different instances of the filter to have unique stats. |
MinLength: 1 |
|
routeCacheAction ExtProcRouteCacheAction |
RouteCacheAction describes the route cache action to be taken when an external processor response is received in response to request headers. The default behavior is “FromResponse” which will only clear the route cache when an external processing response has the clear_route_cache field set. |
FromResponse | Enum: [FromResponse Clear Retain] |
metadataOptions MetadataOptions |
MetadataOptions allows configuring the metadata namespaces to be forwarded to or received from the external processing server. |
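A rough sketch of an ExtProcProvider, written as it might appear under the `extProc` field of a GatewayExtension spec. The service name, port, timeouts, and stat prefix are assumptions chosen for illustration; `backendRef` is assumed to use the usual Gateway API `name` and `port` fields.

```yaml
extProc:
  grpcService:
    backendRef:
      name: ext-proc-service   # hypothetical Service name
      port: 9000               # hypothetical port
    requestTimeout: 2s
  # Defaults to true; set to false to reject requests when the processor is down.
  failOpen: false
  messageTimeout: 500ms
  # A non-zero value lets the processor request a longer per-message timeout.
  maxMessageTimeout: 2s
  statPrefix: example-extproc  # hypothetical prefix
  # One of FromResponse (default), Clear, or Retain.
  routeCacheAction: Retain
```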
ExtProcRouteCacheAction
Underlying type: string
Appears in:
| Field | Description |
|---|---|
FromResponse |
RouteCacheActionFromResponse is the default behavior, which clears the route cache only when the clear_route_cache field is set in an external processor response. |
Clear |
RouteCacheActionClear always clears the route cache irrespective of the clear_route_cache field in the external processor response. |
Retain |
RouteCacheActionRetain never clears the route cache irrespective of the clear_route_cache field in the external processor response. |
FieldDefault
FieldDefault provides default values for specific fields in the JSON request body sent to the LLM provider. These defaults are merged with the user-provided request to ensure missing fields are populated.
User input fields here refer to the fields in the JSON request body that a client sends when making a request to the LLM provider.
Defaults set here do not override those user-provided values unless you explicitly set override to true.
Example: Setting a default system field for Anthropic, which does not support system role messages:
defaults:
  - field: "system"
    value: "answer all questions in French"
Example: Setting a default temperature and overriding max_tokens:
defaults:
  - field: "temperature"
    value: "0.5"
  - field: "max_tokens"
    value: "100"
    override: true
Example: Setting custom list fields:
defaults:
  - field: "custom_integer_list"
    value: [1,2,3]
overrides:
  - field: "custom_string_list"
    value: ["one","two","three"]
Note: The field values correspond to keys in the JSON request body, not fields in this CRD.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
field string |
The name of the field. | MinLength: 1 |
|
value JSON |
The field default value, which can be any JSON Data Type. |
FileSink
FileSink represents the file sink configuration for access logs.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
path string |
the file path to which the file access logging service will sink | ||
stringFormat string |
The format string that Envoy uses to format the log lines. See https://www.envoyproxy.io/docs/envoy/v1.33.0/configuration/observability/access_log/usage#format-strings |
||
jsonFormat RawExtension |
The format object that Envoy uses to emit the logs in a structured way. See https://www.envoyproxy.io/docs/envoy/v1.33.0/configuration/observability/access_log/usage#format-dictionaries |
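The two fragments below sketch a FileSink in each format style, one using `stringFormat` and one using `jsonFormat`. The output path and the JSON key names are illustrative assumptions; the `%...%` tokens are standard Envoy access log command operators. Each fragment sets only one of the two format fields, which is a cautious choice rather than a rule stated in this section.

```yaml
# Plain-text log lines
path: /dev/stdout
stringFormat: "[%START_TIME%] %REQ(:METHOD)% %REQ(:PATH)% %RESPONSE_CODE%\n"
---
# Structured (JSON) log lines; key names are arbitrary
path: /dev/stdout
jsonFormat:
  start_time: "%START_TIME%"
  method: "%REQ(:METHOD)%"
  path: "%REQ(:PATH)%"
  response_code: "%RESPONSE_CODE%"
```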
FilterType
FilterType represents the type of filter to apply (only one of these should be set). Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#envoy-v3-api-msg-config-accesslog-v3-accesslogfilter
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
statusCodeFilter StatusCodeFilter |
|||
durationFilter DurationFilter |
|||
notHealthCheckFilter boolean |
Filters for requests that are not health check requests. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-nothealthcheckfilter |
||
traceableFilter boolean |
Filters for requests that are traceable. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-traceablefilter |
||
headerFilter HeaderFilter |
|||
responseFlagFilter ResponseFlagFilter |
|||
grpcStatusFilter GrpcStatusFilter |
|||
celFilter CELFilter |
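Because FilterType is validated with MinProperties and MaxProperties of 1, each filter object carries exactly one key. The two fragments below are a sketch of valid shapes, using only fields documented in this reference.

```yaml
# Log only requests that are not health checks
notHealthCheckFilter: true
---
# Log only gRPC requests that ended with one of these statuses
grpcStatusFilter:
  statuses:
  - UNAVAILABLE
  - DEADLINE_EXCEEDED
```

Setting two keys in the same object, for example both `notHealthCheckFilter` and `grpcStatusFilter`, would fail validation.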
FrontendHTTP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
maxBufferSize integer |
maxBufferSize defines the maximum size HTTP body that will be buffered into memory. Bodies will only be buffered for policies which require buffering. If unset, this defaults to 2mb. |
Minimum: 1 |
|
http1MaxHeaders integer |
http1MaxHeaders defines the maximum number of headers that are allowed in HTTP/1.1 requests. If unset, this defaults to 100. |
Maximum: 4096 Minimum: 1 |
|
http1IdleTimeout Duration |
http1IdleTimeout defines the timeout before an unused connection is closed. If unset, this defaults to 10 minutes. |
||
http2WindowSize integer |
http2WindowSize indicates the initial window size for stream-level flow control for received data. | Minimum: 1 |
|
http2ConnectionWindowSize integer |
http2ConnectionWindowSize indicates the initial window size for connection-level flow control for received data. | Minimum: 1 |
|
http2FrameSize integer |
http2FrameSize sets the maximum frame size to use. If unset, this defaults to 16kb. |
Maximum: 1677215 Minimum: 16384 |
|
http2KeepaliveInterval Duration |
|||
http2KeepaliveTimeout Duration |
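A FrontendHTTP fragment that stays within the documented validation bounds might look like this. The values are illustrative, and `maxBufferSize` is assumed here to be expressed in bytes, which this section does not state explicitly.

```yaml
# FrontendHTTP fragment (parent field not shown)
maxBufferSize: 1048576   # assumed to be bytes; must be at least 1
http1MaxHeaders: 200     # allowed range: 1 to 4096
http1IdleTimeout: 5m
http2FrameSize: 32768    # must be at least 16384
```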
FrontendTCP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
keepalive AgentgatewayKeepalive |
keepalive defines settings for enabling TCP keepalives on the connection. |
FrontendTLS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
handshakeTimeout Duration |
handshakeTimeout specifies the deadline for a TLS handshake to complete. If unset, this defaults to 15s. |
GRPCRetryBackoff
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
baseInterval Duration |
BaseInterval specifies the base interval used with a fully jittered exponential back-off between retries. | ||
maxInterval Duration |
MaxInterval specifies the maximum interval between retry attempts. Defaults to 10 times the BaseInterval if not set. |
GRPCRetryPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
attempts integer |
Attempts specifies the number of retry attempts for a request. Defaults to 1 attempt if not set. A value of 0 effectively disables retries. |
1 | Minimum: 0 |
backoff GRPCRetryBackoff |
Backoff specifies the retry backoff strategy. If not set, a default backoff with a base interval of 1000ms is used. The default max interval is 10 times the base interval. |
GatewayExtension
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
GatewayExtension |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec GatewayExtensionSpec |
|||
status GatewayExtensionStatus |
GatewayExtensionSpec
GatewayExtensionSpec defines the desired state of GatewayExtension.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
type GatewayExtensionType |
Deprecated: Setting this field has no effect. Type indicates the type of the GatewayExtension to be used. |
Enum: [ExtAuth ExtProc RateLimit JWTProviders] |
|
extAuth ExtAuthProvider |
ExtAuth configuration for ExtAuth extension type. | ||
extProc ExtProcProvider |
ExtProc configuration for ExtProc extension type. | ||
rateLimit RateLimitProvider |
RateLimit configuration for RateLimit extension type. | ||
jwtProviders NamedJWTProvider array |
JWTProviders configures named JWT providers. If multiple providers are specified for a given JWT policy, the providers are OR-ed together, so a token that validates against any one of the providers is accepted. |
MaxItems: 32 |
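As a sketch, a GatewayExtension that configures an external auth provider could be written as follows. The resource and Service names, the port, and the timeout values are assumptions for illustration; `backendRef` is assumed to use the usual Gateway API `name` and `port` fields. The deprecated `type` field is left out because setting it has no effect.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: GatewayExtension
metadata:
  name: ext-authz          # hypothetical name
  namespace: default       # hypothetical namespace
spec:
  extAuth:
    grpcService:
      backendRef:
        name: ext-authz-service   # hypothetical Service name
        port: 9000                # hypothetical port
      requestTimeout: 1s
      retry:
        attempts: 3
        backoff:
          baseInterval: 250ms
          # Defaults to 10 times baseInterval when omitted.
          maxInterval: 2500ms
    # Deny requests when the auth service is unavailable (the default).
    failOpen: false
    # Returned when the auth server errors or is unreachable; range 100-511.
    statusOnError: 503
    withRequestBody:
      maxRequestBytes: 8192
    statPrefix: ext-authz         # hypothetical prefix
```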
GatewayExtensionStatus
GatewayExtensionStatus defines the observed state of GatewayExtension.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
conditions Condition array |
Conditions is the list of conditions for the GatewayExtension. | MaxItems: 8 |
GatewayExtensionType
Underlying type: string
GatewayExtensionType indicates the type of the GatewayExtension.
Appears in:
| Field | Description |
|---|---|
ExtAuth |
GatewayExtensionTypeExtAuth is the type for ExtAuth extensions. |
ExtProc |
GatewayExtensionTypeExtProc is the type for ExtProc extensions. |
RateLimit |
GatewayExtensionTypeRateLimit is the type for RateLimit extensions. |
JWTProviders |
GatewayExtensionTypeJWTProvider is the type for JWT provider extensions. |
GatewayParameters
A GatewayParameters contains configuration that is used to dynamically provision kgateway’s data plane (Envoy or agentgateway proxy instance), based on a Kubernetes Gateway.
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
GatewayParameters |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec GatewayParametersSpec |
|||
status GatewayParametersStatus |
GatewayParametersSpec
A GatewayParametersSpec describes the type of environment/platform in which the proxy will be provisioned.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
kube KubernetesProxyConfig |
The proxy will be deployed on Kubernetes. | ||
selfManaged SelfManagedGateway |
The proxy will be self-managed and not auto-provisioned. |
GatewayParametersStatus
The current conditions of the GatewayParameters. This is not currently implemented.
Appears in:
GeminiConfig
GeminiConfig settings for the Gemini LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
Optional: Override the model name, such as gemini-2.5-pro. If unset, the model name is taken from the request. |
GracefulShutdownSpec
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
enabled boolean |
Enable grace period before shutdown to finish current requests while Envoy health checks fail to e.g. notify external load balancers. NOTE: This will not have any effect if you have not defined health checks via the health check filter | ||
sleepTimeSeconds integer |
Time (in seconds) for the preStop hook to wait before allowing Envoy to terminate | Maximum: 31536000 Minimum: 0 |
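A GracefulShutdownSpec fragment is small; the sketch below uses an arbitrary 15-second wait. As noted in the table, enabling it has no effect unless health checks are defined through the health check filter.

```yaml
# GracefulShutdownSpec fragment (parent field not shown)
enabled: true
sleepTimeSeconds: 15   # illustrative; allowed range is 0 to 31536000
```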
GrpcStatusFilter
GrpcStatusFilter filters gRPC requests based on their response status. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#enum-config-accesslog-v3-grpcstatusfilter-status
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
statuses GrpcStatus array |
Enum: [OK CANCELED UNKNOWN INVALID_ARGUMENT DEADLINE_EXCEEDED NOT_FOUND ALREADY_EXISTS PERMISSION_DENIED RESOURCE_EXHAUSTED FAILED_PRECONDITION ABORTED OUT_OF_RANGE UNIMPLEMENTED INTERNAL UNAVAILABLE DATA_LOSS UNAUTHENTICATED] MinItems: 1 |
||
exclude boolean |
HTTPListenerPolicy
HTTPListenerPolicy is intended to be used for configuring the Envoy HttpConnectionManager and any other config or policy
that should map 1-to-1 with a given HTTP listener, such as the Envoy health check HTTP filter.
Currently these policies can only be applied per Gateway but support for Listener attachment may be added in the future.
See https://github.com/kgateway-dev/kgateway/issues/11786 for more details.
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
HTTPListenerPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec HTTPListenerPolicySpec |
|||
status PolicyStatus |
HTTPListenerPolicySpec
HTTPListenerPolicySpec defines the desired state of a HTTP listener policy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
targetRefs LocalPolicyTargetReference array |
TargetRefs specifies the target resources by reference to attach the policy to. | MaxItems: 16 MinItems: 1 |
|
targetSelectors LocalPolicyTargetSelector array |
TargetSelectors specifies the target selectors to select resources to attach the policy to. | ||
accessLog AccessLog array |
AccessLoggingConfig contains various settings for Envoy’s access logging service. See here for more information: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto |
MaxItems: 16 |
|
tracing Tracing |
Tracing contains various settings for Envoy’s OpenTelemetry tracer. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/trace/v3/opentelemetry.proto.html |
||
upgradeConfig UpgradeConfig |
UpgradeConfig contains configuration for HTTP upgrades like WebSocket. See here for more information: https://www.envoyproxy.io/docs/envoy/v1.34.1/intro/arch_overview/http/upgrades.html |
||
useRemoteAddress boolean |
UseRemoteAddress determines whether to use the remote address for the original client. Note: If this field is omitted, it will fall back to the default value of “true”, which we set for all Envoy HCMs. Thus, setting this explicitly to true is unnecessary (but will not cause any harm). When true, Envoy will use the remote address of the connection as the client address. When false, Envoy will use the X-Forwarded-For header to determine the client address. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto#envoy-v3-api-field-extensions-filters-network-http-connection-manager-v3-httpconnectionmanager-use-remote-address |
||
xffNumTrustedHops integer |
XffNumTrustedHops is the number of additional ingress proxy hops from the right side of the X-Forwarded-For HTTP header to trust when determining the origin client’s IP address. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto#envoy-v3-api-field-extensions-filters-network-http-connection-manager-v3-httpconnectionmanager-xff-num-trusted-hops |
Minimum: 0 |
|
serverHeaderTransformation ServerHeaderTransformation |
ServerHeaderTransformation determines how the server header is transformed. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto#envoy-v3-api-field-extensions-filters-network-http-connection-manager-v3-httpconnectionmanager-server-header-transformation |
Enum: [Overwrite AppendIfAbsent PassThrough] |
|
streamIdleTimeout Duration |
StreamIdleTimeout is the idle timeout for HTTP streams. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto#envoy-v3-api-field-extensions-filters-network-http-connection-manager-v3-httpconnectionmanager-stream-idle-timeout |
||
idleTimeout Duration |
IdleTimeout is the idle timeout for connections. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/protocol.proto#envoy-v3-api-msg-config-core-v3-httpprotocoloptions |
||
healthCheck EnvoyHealthCheck |
HealthCheck configures Envoy health checks | ||
preserveHttp1HeaderCase boolean |
PreserveHttp1HeaderCase determines whether to preserve the case of HTTP1 request headers. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/header_casing |
||
acceptHttp10 boolean |
AcceptHTTP10 determines whether to accept incoming HTTP/1.0 and HTTP/0.9 requests. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/protocol.proto#config-core-v3-http1protocoloptions |
||
defaultHostForHttp10 string |
DefaultHostForHttp10 specifies a default host for HTTP/1.0 requests. This is highly suggested if acceptHttp10 is true and a no-op if acceptHttp10 is false. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/protocol.proto#config-core-v3-http1protocoloptions |
MinLength: 1 |
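Putting several of these fields together, an HTTPListenerPolicy attached to a Gateway might look like the sketch below. The policy and Gateway names and the timeout values are illustrative assumptions; `targetRefs` entries are assumed to use the Gateway API `group`, `kind`, and `name` fields.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: HTTPListenerPolicy
metadata:
  name: edge-listener-settings   # hypothetical name
  namespace: default             # hypothetical namespace
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: http-gateway           # hypothetical Gateway name
  # Trust one additional proxy hop from the right of X-Forwarded-For.
  xffNumTrustedHops: 1
  # One of Overwrite, AppendIfAbsent, or PassThrough.
  serverHeaderTransformation: PassThrough
  streamIdleTimeout: 5m
  idleTimeout: 1h
  healthCheck:
    # Exact-match path; must satisfy the EnvoyHealthCheck path pattern.
    path: /healthz
  preserveHttp1HeaderCase: true
```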
HTTPVersion
Underlying type: string
Appears in:
| Field | Description |
|---|---|
HTTP1 |
|
HTTP2 |
HashPolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
header Header |
Header specifies a header’s value as a component of the hash key. | ||
cookie Cookie |
Cookie specifies a given cookie as a component of the hash key. | ||
sourceIP SourceIP |
SourceIP specifies whether to use the request’s source IP address as a component of the hash key. | ||
terminal boolean |
If Terminal is set and a hash key is available after evaluating this policy, Envoy skips the subsequent policies and uses the key as it is. This is useful for defining “fallback” policies and limiting the time Envoy spends generating hash keys. |
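The fallback behavior of `terminal` can be sketched with a list of two header-based hash policies. The header names are hypothetical, and the parent field that holds this list is not shown because it is documented elsewhere.

```yaml
# List of HashPolicy entries (parent field not shown)
- header:
    name: x-session-id   # hypothetical header
  # If this header yields a hash key, stop here and skip the next policy.
  terminal: true
- header:
    name: x-user-id      # hypothetical header; used only as a fallback
```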
Header
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
Name is the name of the header to use as a component of the hash key. | MinLength: 1 |
HeaderFilter
HeaderFilter filters requests based on headers. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-headerfilter
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
header HTTPHeaderMatch |
HeaderModifiers
HeaderModifiers can be used to define the policy to modify request and response headers.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
request HTTPHeaderFilter |
Request modifies request headers. | ||
response HTTPHeaderFilter |
Response modifies response headers. |
HeaderName
Underlying type: string
Appears in:
HeaderSource
HeaderSource configures how to retrieve a JWT from a header
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
header string |
Header is the name of the header. For example, “Authorization”. | MaxLength: 2048 MinLength: 1 |
|
prefix string |
Prefix before the token. For example, “Bearer ”. | MaxLength: 2048 MinLength: 1 |
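For the common case of a bearer token, a HeaderSource fragment could be written as below. Note the trailing space in the prefix, so that only the token itself remains after the prefix is removed.

```yaml
# HeaderSource fragment (parent field not shown)
header: Authorization
prefix: "Bearer "
```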
HeaderTransformation
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name HeaderName |
Name is the name of the header to interact with. | ||
value InjaTemplate |
Value is the Inja template to apply to generate the output value for the header. |
HeaderValue
Header name/value pair. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/base.proto#envoy-v3-api-msg-config-core-v3-headervalue
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
key string |
Header name. | ||
value string |
Header value. |
HealthCheck
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
timeout Duration |
Timeout is the time to wait for a health check response. If the timeout is reached the health check attempt will be considered a failure. |
||
interval Duration |
Interval is the time between health checks. | ||
unhealthyThreshold integer |
UnhealthyThreshold is the number of consecutive failed health checks that will be considered unhealthy. Note that for HTTP health checks, if a host responds with a code not in ExpectedStatuses or RetriableStatuses, this threshold is ignored and the host is considered immediately unhealthy. |
Minimum: 0 |
|
healthyThreshold integer |
HealthyThreshold is the number of healthy health checks required before a host is marked healthy. Note that during startup, only a single successful health check is required to mark a host healthy. |
Minimum: 0 |
|
http HealthCheckHttp |
Http contains the options to configure the HTTP health check. | ||
grpc HealthCheckGrpc |
Grpc contains the options to configure the gRPC health check. |
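A HealthCheck fragment for an HTTP health check might look like the following sketch. The intervals, thresholds, and path are illustrative assumptions. Only `http` is set here; a gRPC check would use the `grpc` field instead.

```yaml
# HealthCheck fragment (parent field not shown)
timeout: 2s
interval: 10s
# Three consecutive failures mark the host unhealthy.
unhealthyThreshold: 3
# Two successes mark it healthy again (one is enough during startup).
healthyThreshold: 2
http:
  path: /healthz   # hypothetical path
  method: GET      # GET is also the default when unset
```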
HealthCheckGrpc
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
serviceName string |
ServiceName is the optional name of the service to check. | ||
authority string |
Authority is the authority header used to make the gRPC health check request. If unset, the name of the cluster this health check is associated with will be used. |
HealthCheckHttp
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
host string |
Host is the value of the host header in the HTTP health check request. If unset, the name of the cluster this health check is associated with will be used. |
||
path string |
Path is the HTTP path requested. | ||
method string |
Method is the HTTP method to use. If unset, GET is used. |
Enum: [GET HEAD POST PUT DELETE OPTIONS TRACE PATCH] |
Host
Host defines a static backend host.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
host string |
Host is the host name to use for the backend. | MinLength: 1 |
|
port integer |
Port is the port to use for the backend. |
Http1ProtocolOptions
See Envoy documentation for more details.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
enableTrailers boolean |
Enables trailers for HTTP/1. By default the HTTP/1 codec drops proxied trailers. Note: Trailers must also be enabled at the gateway level in order for this option to take effect |
||
preserveHttp1HeaderCase boolean |
PreserveHttp1HeaderCase determines whether to preserve the case of HTTP1 response headers. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/header_casing |
||
overrideStreamErrorOnInvalidHttpMessage boolean |
Allows invalid HTTP messaging. When this option is false, then Envoy will terminate HTTP/1.1 connections upon receiving an invalid HTTP message. However, when this option is true, then Envoy will leave the HTTP/1.1 connection open where possible. |
Http2ProtocolOptions
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
initialStreamWindowSize Quantity |
InitialStreamWindowSize is the initial window size for the stream. Valid values range from 65535 (2^16 - 1, HTTP/2 default) to 2147483647 (2^31 - 1, HTTP/2 maximum). Defaults to 268435456 (256 * 1024 * 1024). Values can be specified with units like “64Ki”. |
||
initialConnectionWindowSize Quantity |
InitialConnectionWindowSize is similar to InitialStreamWindowSize, but for the connection level. Same range and default value as InitialStreamWindowSize. Values can be specified with units like “64Ki”. |
||
maxConcurrentStreams integer |
The maximum number of concurrent streams that the connection can have. | Minimum: 0 |
|
overrideStreamErrorOnInvalidHttpMessage boolean |
Allows invalid HTTP messaging and headers. When disabled (default), then the whole HTTP/2 connection is terminated upon receiving invalid HEADERS frame. When enabled, only the offending stream is terminated. |
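Since the window size fields are Quantity values, they accept unit suffixes. The fragment below is a sketch with illustrative values that fall inside the documented range of 65535 to 2147483647.

```yaml
# Http2ProtocolOptions fragment (parent field not shown)
initialStreamWindowSize: 1Mi       # 1048576 bytes
initialConnectionWindowSize: 4Mi   # 4194304 bytes
maxConcurrentStreams: 100
# Terminate only the offending stream on an invalid HEADERS frame.
overrideStreamErrorOnInvalidHttpMessage: true
```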
Image
A container image. See https://kubernetes.io/docs/concepts/containers/images for details.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
registry string |
The image registry. | ||
repository string |
The image repository (name). | ||
tag string |
The image tag. | ||
digest string |
The hash digest of the image, e.g. sha256:12345... |
||
pullPolicy PullPolicy |
The image pull policy for the container. See https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy for details. |
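For illustration, an Image fragment might look like the sketch below. The registry, repository, and tag are hypothetical; `IfNotPresent` is one of the standard Kubernetes pull policies.

```yaml
# Image fragment (registry, repository, and tag are hypothetical)
registry: registry.example.com
repository: my-proxy
tag: v1.0.0
pullPolicy: IfNotPresent
```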
InjaTemplate
Underlying type: string
Appears in:
InsecureTLSMode
Underlying type: string
Appears in:
| Field | Description |
|---|---|
All |
InsecureTLSModeInsecure disables all TLS verification |
Hostname |
InsecureTLSModeHostname enables verifying the CA certificate, but disables verification of the hostname/SAN. Note this is still, generally, very “insecure” as the name suggests. |
IstioContainer
IstioContainer configures the container running the istio-proxy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
image Image |
The container image. See https://kubernetes.io/docs/concepts/containers/images for details. |
||
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
logLevel string |
Log level for istio-proxy. Options include “info”, “debug”, “warning”, and “error”. Default level is info Default is “warning”. |
||
istioDiscoveryAddress string |
The address of the istio discovery service. Defaults to “istiod.istio-system.svc:15012”. | ||
istioMetaMeshId string |
The mesh id of the istio mesh. Defaults to “cluster.local”. | ||
istioMetaClusterId string |
The cluster id of the istio cluster. Defaults to “Kubernetes”. |
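The following sketch spells out the documented defaults for the discovery address, mesh ID, and cluster ID, and adds a resource request. The resource values are assumptions chosen only for illustration.

```yaml
# IstioContainer fragment (resource values are illustrative assumptions)
istioDiscoveryAddress: istiod.istio-system.svc:15012  # documented default
istioMetaMeshId: cluster.local                        # documented default
istioMetaClusterId: Kubernetes                        # documented default
resources:
  requests:
    cpu: 100m
    memory: 128Mi
```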
IstioIntegration
IstioIntegration configures the Istio integration settings used by kgateway’s data plane
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
istioProxyContainer IstioContainer |
Configuration for the container running istio-proxy. Note that if Istio integration is not enabled, the istio container will not be injected into the gateway proxy deployment. |
||
customSidecars Container array |
Override the default Istio sidecar in gateway-proxy with a custom container. |
JWKS
JWKS (JSON Web Key Set) configures the source for the JWKS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
local LocalJWKS |
LocalJWKS configures getting the public keys to validate the JWT from a Kubernetes configmap, or inline (raw string) JWKS. |
JWTAuthentication
JWTAuthentication defines the providers used to configure JWT authentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
extensionRef NamespacedObjectReference |
ExtensionRef references a GatewayExtension that provides the jwt providers |
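A short sketch of the reference, assuming a GatewayExtension named `jwt-providers` exists; both the name and the namespace are hypothetical.

```yaml
# JWTAuthentication fragment (names are hypothetical)
extensionRef:
  name: jwt-providers
  namespace: gateway-system  # optional; defaults to the namespace of the parent object
```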
JWTAuthenticationMode
Underlying type: string
Validation:
- Enum: [Strict Optional Permissive]
Appears in:
| Field | Description |
|---|---|
Strict |
A valid token, issued by a configured issuer, must be present. This is the default option. |
Optional |
If a token exists, validate it. Warning: this allows requests without a JWT token! |
Permissive |
Requests are never rejected. This is useful for usage of claims in later steps (authorization, logging, etc). Warning: this allows requests without a JWT token! |
JWTClaimToHeader
JWTClaimToHeader allows copying verified claims to headers sent upstream
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
Name is the JWT claim name, for example, “sub”. | MaxLength: 2048 MinLength: 1 |
|
header string |
Header is the header the claim will be copied to, for example, “x-sub”. | MaxLength: 2048 MinLength: 1 |
JWTProvider
JWTProvider configures the JWT Provider
If multiple providers are specified for a given JWT policy, the providers will be OR-ed together and will allow validation to any of the providers.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
issuer string |
Issuer of the JWT. The ‘iss’ claim of the JWT must match this. | MaxLength: 2048 |
|
audiences string array |
Audiences is the list of audiences to be used for the JWT provider. If specified, an incoming JWT must have an ‘aud’ claim, and it must be in this list. If not specified, the audiences will not be checked in the token. |
MaxItems: 32 MinItems: 1 |
|
tokenSource JWTTokenSource |
TokenSource configures where to find the JWT of the current provider. | ||
claimsToHeaders JWTClaimToHeader array |
ClaimsToHeaders is the list of claims to headers to be used for the JWT provider. Optionally set the claims from the JWT payload that you want to extract and add as headers to the request before the request is forwarded to the upstream destination. Note: if ClaimsToHeaders is set, the Envoy route cache will be cleared. This allows the JWT filter to correctly affect routing decisions. |
MaxItems: 32 MinItems: 1 |
|
jwks JWKS |
JWKS is the source for the JSON Web Keys to be used to validate the JWT. | ||
keepToken KeepToken |
KeepToken configures if the token is forwarded upstream. If Remove, the header containing the token will be removed. If Forward, the header containing the token will be forwarded upstream. |
Remove | Enum: [Forward Remove] |
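Putting the fields together, a provider entry might look like the sketch below. The issuer, audience, query parameter, and ConfigMap name are hypothetical; the claim and header names reuse the “sub” and “x-sub” examples from JWTClaimToHeader.

```yaml
# JWTProvider fragment (issuer, audience, and ConfigMap name are hypothetical)
issuer: https://issuer.example.com
audiences:
- my-api
tokenSource:
  queryParameter: access_token  # set exactly one of header or queryParameter
claimsToHeaders:
- name: sub
  header: x-sub
jwks:
  local:
    configMapRef:
      name: my-jwks
keepToken: Remove               # the default; use Forward to pass the token upstream
```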
JWTTokenSource
JWTTokenSource configures the source for the JWT. Exactly one of HeaderSource or QueryParameter must be specified.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
header HeaderSource |
HeaderSource configures retrieving token from a header | ||
queryParameter string |
QueryParameter configures retrieving token from the query parameter |
KeepToken
Underlying type: string
KeepToken configures if the token is forwarded upstream.
Appears in:
| Field | Description |
|---|---|
Forward |
|
Remove |
KubernetesProxyConfig
KubernetesProxyConfig configures the set of Kubernetes resources that will be provisioned for a given Gateway.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
deployment ProxyDeployment |
Use a Kubernetes deployment as the proxy workload type. Currently, this is the only supported workload type. |
||
envoyContainer EnvoyContainer |
Configuration for the container running Envoy. If agentgateway is enabled, the EnvoyContainer values will be ignored. |
||
sdsContainer SdsContainer |
Configuration for the container running the Secret Discovery Service (SDS). | ||
podTemplate Pod |
Configuration for the pods that will be created. | ||
service Service |
Configuration for the Kubernetes Service that exposes the proxy over the network. |
||
serviceAccount ServiceAccount |
Configuration for the Kubernetes ServiceAccount used by the proxy pods. | ||
istio IstioIntegration |
Configuration for the Istio integration. | ||
stats StatsConfig |
Configuration for the stats server. | ||
agentgateway Agentgateway |
Configure the agentgateway integration. If agentgateway is disabled, the EnvoyContainer values will be used by default to configure the data plane proxy. |
||
omitDefaultSecurityContext boolean |
OmitDefaultSecurityContext is used to control whether or not securityContext fields should be rendered for the various generated Deployments/Containers that are dynamically provisioned by the deployer. When set to true, no securityContexts will be provided and will be left to the user/platform to be provided. This should be enabled on platforms such as Red Hat OpenShift where the securityContext will be dynamically added to enforce the appropriate level of security. |
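As an illustration, the fragment below combines a few of the fields above for a platform that injects its own security context. The label is hypothetical, and the nested keys are taken from the IstioIntegration, IstioContainer, and Pod tables in this reference.

```yaml
# KubernetesProxyConfig fragment (label value is hypothetical)
omitDefaultSecurityContext: true   # let the platform supply securityContext
istio:
  istioProxyContainer:
    istioMetaClusterId: Kubernetes
podTemplate:
  extraLabels:
    team: platform
```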
LLMProvider
LLMProvider specifies the target large language model provider that the backend should route requests to.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
openai OpenAIConfig |
OpenAI provider | ||
azureopenai AzureOpenAIConfig |
Azure OpenAI provider | ||
anthropic AnthropicConfig |
Anthropic provider | ||
gemini GeminiConfig |
Gemini provider | ||
vertexai VertexAIConfig |
Vertex AI provider | ||
bedrock BedrockConfig |
Bedrock provider | ||
host string |
Host specifies the hostname to send the requests to. If not specified, the default hostname for the provider is used. |
||
port integer |
Port specifies the port to send the requests to. | Maximum: 65535 Minimum: 1 |
|
path string |
Path specifies the URL path to use for the LLM provider API requests. This is useful when you need to route requests to a different API endpoint while maintaining compatibility with the original provider’s API structure. If not specified, the default path for the provider is used. |
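A minimal sketch of a single-provider configuration that overrides the host, port, and path. The hostname is hypothetical, and the path shown is only an example of an OpenAI-compatible endpoint.

```yaml
# LLMProvider under the AIBackend provider field (hostname is hypothetical)
provider:
  openai:
    model: gpt-4o-mini
  host: llm-proxy.example.com
  port: 443
  path: /v1/chat/completions
```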
LoadBalancer
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
healthyPanicThreshold integer |
HealthyPanicThreshold configures envoy’s panic threshold percentage between 0-100. Once the number of non-healthy hosts reaches this percentage, envoy disregards health information. See Envoy documentation. |
Maximum: 100 Minimum: 0 |
|
updateMergeWindow Duration |
Batches updates to endpoint health, weight, and metadata that happen within a time window. This helps lower CPU usage when the endpoint change rate is high. Defaults to 1 second. Set to 0 to disable batching and have changes applied immediately. |
||
leastRequest LoadBalancerLeastRequestConfig |
LeastRequest configures the least request load balancer type. | ||
roundRobin LoadBalancerRoundRobinConfig |
RoundRobin configures the round robin load balancer type. | ||
ringHash LoadBalancerRingHashConfig |
RingHash configures the ring hash load balancer type. | ||
maglev LoadBalancerMaglevConfig |
Maglev configures the maglev load balancer type. | ||
random LoadBalancerRandomConfig |
Random configures the random load balancer type. | ||
localityType LocalityType |
LocalityType specifies the locality config type to use. See https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/load_balancing_policies/common/v3/common.proto#envoy-v3-api-msg-extensions-load-balancing-policies-common-v3-localitylbconfig |
Enum: [WeightedLb] |
|
closeConnectionsOnHostSetChange boolean |
If set to true, the load balancer will drain connections when the host set changes. Ring Hash or Maglev can be used to ensure that clients with the same key are routed to the same upstream host. Disruptions can cause new connections with the same key as existing connections to be routed to different hosts. Enabling this feature will cause the load balancer to drain existing connections when the host set changes, ensuring that new connections with the same key are consistently routed to the same host. Connections are not immediately closed, but are allowed to drain before being closed. |
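The sketch below selects the least request load balancer and tunes the shared settings. All values are illustrative assumptions, and only one load balancer type is set.

```yaml
# LoadBalancer fragment (values are illustrative assumptions)
healthyPanicThreshold: 50   # percentage, 0-100
updateMergeWindow: 5s       # batch endpoint updates over five seconds
leastRequest:
  choiceCount: 3            # default is 2
closeConnectionsOnHostSetChange: false
```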
LoadBalancerLeastRequestConfig
LoadBalancerLeastRequestConfig configures the least request load balancer type.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
choiceCount integer |
How many choices to take into account. Defaults to 2. |
2 | |
slowStart SlowStart |
SlowStart configures the slow start configuration for the load balancer. |
LoadBalancerMaglevConfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
useHostnameForHashing boolean |
UseHostnameForHashing specifies whether to use the hostname instead of the resolved IP address for hashing. Defaults to false. |
||
hashPolicies HashPolicy array |
HashPolicies specifies the hash policies for hashing load balancers (RingHash, Maglev). | MaxItems: 16 MinItems: 1 |
LoadBalancerRandomConfig
Appears in:
LoadBalancerRingHashConfig
LoadBalancerRingHashConfig configures the ring hash load balancer type.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
minimumRingSize integer |
MinimumRingSize is the minimum size of the ring. | Minimum: 0 |
|
maximumRingSize integer |
MaximumRingSize is the maximum size of the ring. | Minimum: 0 |
|
useHostnameForHashing boolean |
UseHostnameForHashing specifies whether to use the hostname instead of the resolved IP address for hashing. Defaults to false. |
||
hashPolicies HashPolicy array |
HashPolicies specifies the hash policies for hashing load balancers (RingHash, Maglev). | MaxItems: 16 MinItems: 1 |
LoadBalancerRoundRobinConfig
LoadBalancerRoundRobinConfig configures the round robin load balancer type.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
slowStart SlowStart |
SlowStart configures the slow start configuration for the load balancer. |
LocalJWKS
LocalJWKS configures getting the public keys to validate the JWT from a Kubernetes ConfigMap, or inline (raw string) JWKS.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
inline string |
Inline is the raw JWKS string. This can be an individual key, a key set, or a PEM block public key. |
MaxLength: 16384 MinLength: 1 |
|
configMapRef LocalObjectReference |
ConfigMapRef configures storing the JWK in a Kubernetes ConfigMap in the same namespace as the GatewayExtension. The ConfigMap must have a data key named ‘jwks’ that contains the JWKS. |
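For the ConfigMap option, a sketch of a ConfigMap with the required `jwks` data key follows. The ConfigMap name and key ID are hypothetical, and `<modulus>` is a placeholder rather than a real key.

```yaml
# ConfigMap holding a JWKS under the required 'jwks' key (name and key are placeholders)
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-jwks   # must be in the same namespace as the GatewayExtension
data:
  jwks: |
    {"keys": [{"kty": "RSA", "kid": "example", "use": "sig", "n": "<modulus>", "e": "AQAB"}]}
```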
LocalPolicyTargetReference
Select the object to attach the policy by Group, Kind, and Name. The object must be in the same namespace as the policy. You can target only one object at a time.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io. |
||
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
||
name ObjectName |
The name of the target resource. |
LocalPolicyTargetReferenceWithSectionName
Select the object to attach the policy by Group, Kind, Name and SectionName. The object must be in the same namespace as the policy. You can target only one object at a time.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io. |
||
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
||
name ObjectName |
The name of the target resource. | ||
sectionName SectionName |
The section name of the target resource. |
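For example, a policy could attach to a single listener of a Gateway as sketched below. The Gateway name and listener name are hypothetical.

```yaml
# targetRefs entry selecting one section of a Gateway (names are hypothetical)
targetRefs:
- group: gateway.networking.k8s.io
  kind: Gateway
  name: my-gateway
  sectionName: https
```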
LocalPolicyTargetSelector
LocalPolicyTargetSelector selects the object to attach the policy by Group, Kind, and MatchLabels. The object must be in the same namespace as the policy and match the specified labels. Do not use targetSelectors when reconciliation times are critical, especially if you have a large number of policies that target the same resource. Instead, use targetRefs to attach the policy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io. |
||
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
||
matchLabels object (keys:string, values:string) |
Label selector to select the target resource. |
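A label-based selection might look like the following sketch, assuming the policy exposes a `targetSelectors` list as the description implies. The label is hypothetical.

```yaml
# targetSelectors entry matching HTTPRoutes by label (label is hypothetical)
targetSelectors:
- group: gateway.networking.k8s.io
  kind: HTTPRoute
  matchLabels:
    app: my-app
```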
LocalPolicyTargetSelectorWithSectionName
LocalPolicyTargetSelectorWithSectionName selects the object to attach the policy by Group, Kind, MatchLabels, and optionally SectionName. The object must be in the same namespace as the policy and match the specified labels. Do not use targetSelectors when reconciliation times are critical, especially if you have a large number of policies that target the same resource. Instead, use targetRefs to attach the policy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io. |
||
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
||
matchLabels object (keys:string, values:string) |
Label selector to select the target resource. | ||
sectionName SectionName |
The section name of the target resource. |
LocalRateLimitPolicy
LocalRateLimitPolicy represents a policy for local rate limiting. It defines the configuration for rate limiting using a token bucket mechanism.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
tokenBucket TokenBucket |
TokenBucket represents the configuration for a token bucket local rate-limiting mechanism. It defines the parameters for controlling the rate at which requests are allowed. |
LocalRateLimitUnit
Underlying type: string
Appears in:
| Field | Description |
|---|---|
Seconds |
|
Minutes |
|
Hours |
LocalityType
Underlying type: string
Appears in:
| Field | Description |
|---|---|
WeightedLb |
https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/locality_weight#locality-weighted-load-balancing Locality weighted load balancing enables weighting assignments across different zones and geographical locations by using explicit weights. This field is required to enable locality weighted load balancing. |
MCPBackend
MCPBackend configures mcp backends
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
targets McpTargetSelector array |
Targets is a list of MCPBackend targets to use for this backend. Policies targeting MCPBackend targets must use targetRefs[].sectionName to select the target by name. |
MaxItems: 32 MinItems: 1 |
MCPProtocol
Underlying type: string
MCPProtocol defines the protocol to use for the MCPBackend target
Validation:
- Enum: [StreamableHTTP SSE]
Appears in:
| Field | Description |
|---|---|
StreamableHTTP |
MCPProtocolStreamableHTTP specifies Streamable HTTP must be used as the protocol |
SSE |
MCPProtocolSSE specifies Server-Sent Events (SSE) must be used as the protocol |
McpSelector
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
namespaces LabelSelector |
namespaces is the label selector for the namespaces from which Services are selected. If unset, only the namespace of the AgentgatewayBackend is searched. |
||
services LabelSelector |
services is the label selector for which Services should be selected. |
McpTarget
McpTarget defines a single MCPBackend target configuration.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
host string |
Host is the hostname or IP address of the MCPBackend target. | ||
port integer |
Port is the port number of the MCPBackend target. | Maximum: 65535 Minimum: 1 |
|
path string |
Path is the URL path of the MCPBackend target endpoint. Defaults to “/sse” for SSE protocol or “/mcp” for StreamableHTTP protocol if not specified. |
||
protocol MCPProtocol |
Protocol is the protocol to use for the connection to the MCPBackend target. | Enum: [StreamableHTTP SSE] |
|
policies AgentgatewayPolicyBackendMCP |
policies controls policies for communicating with this backend. Policies may also be set in AgentgatewayPolicy, or in the top level AgentgatewayBackend. Policies are merged on a field-level basis, with order: AgentgatewayPolicy < AgentgatewayBackend < AgentgatewayBackend MCP (this field). |
McpTargetSelector
McpTargetSelector defines the MCPBackend target to use for this backend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name SectionName |
Name of the MCPBackend target. | ||
selector McpSelector |
selector is the label selector used to select Services. If policies are needed on a per-service basis, AgentgatewayPolicy can target the desired Service. |
||
static McpTarget |
static configures a static MCP destination. When connecting to in-cluster Services, it is recommended to use ‘selector’ instead. |
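The two target styles can be sketched side by side as follows. Target names, labels, and the external hostname are hypothetical; `matchLabels` is the standard Kubernetes LabelSelector field.

```yaml
# MCPBackend targets: one selector-based, one static (names and host are hypothetical)
targets:
- name: in-cluster-tools
  selector:
    services:
      matchLabels:
        app: mcp-server
- name: external-tools
  static:
    host: mcp.example.com
    port: 8080
    path: /mcp               # matches the documented default for StreamableHTTP
    protocol: StreamableHTTP
```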
Message
An entry for a message to prepend or append to each prompt.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
role string |
Role of the message. The available roles depend on the backend LLM provider model, such as SYSTEM or USER in the OpenAI API. |
||
content string |
String content of the message. |
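A single message entry might look like this sketch. The content is illustrative, and the list that holds the entry belongs to the prompt enrichment configuration, which is not shown here.

```yaml
# One Message entry (content is illustrative)
- role: SYSTEM
  content: Answer concisely and cite sources where possible.
```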
MetadataKey
MetadataKey provides a way to retrieve values from Metadata using a key and a path.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
key string |
The key name of the Metadata from which to retrieve the Struct | ||
path MetadataPathSegment array |
The path used to retrieve a specific Value from the Struct. This can be either a prefix or a full path, depending on the use case |
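As a sketch, a key with a two-segment path could be written as follows. The metadata key and path segments are hypothetical and chosen only to show the shape.

```yaml
# MetadataKey fragment (key and path segments are hypothetical)
key: envoy.lb
path:
- key: canary
- key: version
```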
MetadataKind
Underlying type: string
Describes different types of metadata sources. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/type/metadata/v3/metadata.proto#envoy-v3-api-msg-type-metadata-v3-metadatakind-request
Validation:
- Enum: [Request Route Cluster Host]
Appears in:
| Field | Description |
|---|---|
Request |
Request kind of metadata. |
Route |
Route kind of metadata. |
Cluster |
Cluster kind of metadata. |
Host |
Host kind of metadata. |
MetadataNamespaces
MetadataNamespaces configures which metadata namespaces to use. See envoy docs for specifics.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
typed string array |
MinItems: 1 |
||
untyped string array |
MinItems: 1 |
MetadataOptions
MetadataOptions allows configuring metadata namespaces to forward or receive from the external processing server.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
forwarding MetadataNamespaces |
Forwarding defines the typed or untyped dynamic metadata namespaces to forward to the external processing server. |
MetadataPathSegment
Underlying type: struct{Key string `json:"key"`}
Specifies a segment in a path for retrieving values from Metadata.
Appears in:
NamedJWTProvider
NamedJWTProvider is a named JWT provider entry.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
Name is the unique name of the JWT provider. | MaxLength: 253 MinLength: 1 |
|
issuer string |
Issuer of the JWT. The ‘iss’ claim of the JWT must match this. | MaxLength: 2048 |
|
audiences string array |
Audiences is the list of audiences to be used for the JWT provider. If specified, an incoming JWT must have an ‘aud’ claim, and it must be in this list. If not specified, the audiences will not be checked in the token. |
MaxItems: 32 MinItems: 1 |
|
tokenSource JWTTokenSource |
TokenSource configures where to find the JWT of the current provider. | ||
claimsToHeaders JWTClaimToHeader array |
ClaimsToHeaders is the list of claims to headers to be used for the JWT provider. Optionally set the claims from the JWT payload that you want to extract and add as headers to the request before the request is forwarded to the upstream destination. Note: if ClaimsToHeaders is set, the Envoy route cache will be cleared. This allows the JWT filter to correctly affect routing decisions. |
MaxItems: 32 MinItems: 1 |
|
jwks JWKS |
JWKS is the source for the JSON Web Keys to be used to validate the JWT. | ||
keepToken KeepToken |
KeepToken configures if the token is forwarded upstream. If Remove, the header containing the token will be removed. If Forward, the header containing the token will be forwarded upstream. |
Remove | Enum: [Forward Remove] |
NamedLLMProvider
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name SectionName |
Name of the provider. Policies can target this provider by name. | ||
policies AgentgatewayPolicyBackendAI |
policies controls policies for communicating with this backend. Policies may also be set in AgentgatewayPolicy, or in the top level AgentgatewayBackend. Policies are merged on a field-level basis, with order: AgentgatewayPolicy < AgentgatewayBackend < AgentgatewayBackend LLM provider (this field). |
||
openai OpenAIConfig |
OpenAI provider | ||
azureopenai AzureOpenAIConfig |
Azure OpenAI provider | ||
anthropic AnthropicConfig |
Anthropic provider | ||
gemini GeminiConfig |
Gemini provider | ||
vertexai VertexAIConfig |
Vertex AI provider | ||
bedrock BedrockConfig |
Bedrock provider | ||
host string |
Host specifies the hostname to send the requests to. If not specified, the default hostname for the provider is used. |
||
port integer |
Port specifies the port to send the requests to. | Maximum: 65535 Minimum: 1 |
|
path string |
Path specifies the URL path to use for the LLM provider API requests. This is useful when you need to route requests to a different API endpoint while maintaining compatibility with the original provider’s API structure. If not specified, the default path for the provider is used. |
NamespacedObjectReference
Select the object by Name and Namespace. You can target only one object at a time.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name ObjectName |
The name of the target resource. | ||
namespace Namespace |
The namespace of the target resource. If not set, defaults to the namespace of the parent object. |
MaxLength: 63 MinLength: 1 Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ |
OpenAIConfig
OpenAIConfig settings for the OpenAI LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. |
OpenAIModeration
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
model specifies the moderation model to use. For example, omni-moderation. |
||
policies AgentgatewayPolicyBackendSimple |
policies controls policies for communicating with OpenAI. |
OpenTelemetryAccessLogService
OpenTelemetryAccessLogService represents the OTel configuration for access logs. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/access_loggers/open_telemetry/v3/logs_service.proto
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
grpcService CommonAccessLogGrpcService |
Send access logs to gRPC service | ||
body string |
OpenTelemetry LogResource fields, following Envoy access logging formatting. | ||
disableBuiltinLabels boolean |
If specified, Envoy will not generate built-in resource labels like log_name, zone_name, cluster_name, node_name. |
OpenTelemetryTracingConfig
OpenTelemetryTracingConfig represents the top-level Envoy’s OpenTelemetry tracer. See here for more information: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/trace/v3/opentelemetry.proto.html
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
grpcService CommonGrpcService |
Send traces to the gRPC service | ||
serviceName string |
The name for the service. This will be populated in the ResourceSpan Resource attributes. Defaults to the Envoy cluster name, i.e. <gateway-name>.<gateway-namespace>. |
||
resourceDetectors ResourceDetector array |
An ordered list of resource detectors. Currently supported values are EnvironmentResourceDetector |
MaxProperties: 1 MinProperties: 1 |
|
sampler Sampler |
Specifies the sampler to be used by the OpenTelemetry tracer. This field can be left empty. In this case, the default Envoy sampling decision is used. Currently supported values are AlwaysOn |
MaxProperties: 1 MinProperties: 1 |
OutlierDetection
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
consecutive5xx integer |
The number of consecutive server-side error responses (for HTTP traffic, 5xx responses; for TCP traffic, connection failures; etc.) before an ejection occurs. Defaults to 5. If this is zero, consecutive 5xx passive health checks will be disabled. In the future, other types of passive health checking might be added, but none will be enabled by default. |
5 | Minimum: 0 |
interval Duration |
The time interval between ejection analysis sweeps. This can result in both new ejections as well as hosts being returned to service. Defaults to 10s. |
10s | |
baseEjectionTime Duration |
The base time that a host is ejected for. The real time is equal to the base time multiplied by the number of times the host has been ejected. Defaults to 30s. |
30s | |
maxEjectionPercent integer |
The maximum % of an upstream cluster that can be ejected due to outlier detection. Defaults to 10%. |
10 | Maximum: 100 Minimum: 0 |
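The fragment below writes out the documented defaults explicitly. The comment on ejection time follows the rule stated above: base time multiplied by the number of times the host has been ejected.

```yaml
# OutlierDetection fragment using the documented defaults
consecutive5xx: 5        # eject after 5 consecutive server-side errors; 0 disables the check
interval: 10s            # time between ejection analysis sweeps
baseEjectionTime: 30s    # a host ejected for the third time stays out for 3 x 30s = 90s
maxEjectionPercent: 10   # at most 10% of the upstream cluster can be ejected
```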
Pod
Configuration for a Kubernetes Pod template.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
extraLabels object (keys:string, values:string) |
Additional labels to add to the Pod object metadata. If the same label is present on Gateway.spec.infrastructure.labels, the Gateway takes precedence. |
||
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the Pod object metadata. If the same annotation is present on Gateway.spec.infrastructure.annotations, the Gateway takes precedence. |
||
securityContext PodSecurityContext |
The pod security context. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#podsecuritycontext-v1-core for details. |
||
imagePullSecrets LocalObjectReference array |
An optional list of references to secrets in the same namespace to use for pulling any of the images used by this Pod spec. See https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod for details. |
||
nodeSelector object (keys:string, values:string) |
A selector which must be true for the pod to fit on a node. See https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ for details. |
||
affinity Affinity |
If specified, the pod’s scheduling constraints. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#affinity-v1-core for details. |
||
tolerations Toleration array |
If specified, the pod’s tolerations. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#toleration-v1-core for details. |
||
gracefulShutdown GracefulShutdownSpec |
If specified, the pod’s graceful shutdown spec. | ||
terminationGracePeriodSeconds integer |
If specified, the pod’s termination grace period in seconds. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#pod-v1-core for details |
Maximum: 31536000 Minimum: 0 |
|
startupProbe Probe |
If specified, the pod’s startup probe. A probe of container startup readiness. Container will only be added to service endpoints if the probe succeeds. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#probe-v1-core for details. |
||
readinessProbe Probe |
If specified, the pod’s readiness probe. Periodic probe of container service readiness. Container will be removed from service endpoints if the probe fails. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#probe-v1-core for details. |
||
livenessProbe Probe |
If specified, the pod’s liveness probe. Periodic probe of container liveness. Container will be restarted if the probe fails. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#probe-v1-core for details. |
||
topologySpreadConstraints TopologySpreadConstraint array |
If specified, the pod’s topology spread constraints. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#topologyspreadconstraint-v1-core for details. |
||
extraVolumes Volume array |
Additional volumes to add to the pod. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#volume-v1-core for details. |
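A short sketch of a Pod template fragment that sets labels, scheduling constraints, and a grace period. The label, toleration key, and values are hypothetical; the toleration fields follow the standard Kubernetes Toleration type.

```yaml
# Pod fragment (labels, toleration, and grace period are illustrative assumptions)
extraLabels:
  team: platform
nodeSelector:
  kubernetes.io/os: linux
tolerations:
- key: dedicated
  operator: Equal
  value: gateway
  effect: NoSchedule
terminationGracePeriodSeconds: 60
```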
PolicyAncestorStatus
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
ancestorRef ParentReference |
AncestorRef corresponds with a ParentRef in the spec that this PolicyAncestorStatus struct describes the status of. |
||
controllerName string |
ControllerName is a domain/path string that indicates the name of the controller that wrote this status. This corresponds with the controllerName field on GatewayClass. Example: “example.net/gateway-controller”. The format of this field is DOMAIN “/” PATH, where DOMAIN and PATH are valid Kubernetes names (https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names). Controllers MUST populate this field when writing status. Controllers should ensure that entries to status populated with their ControllerName are cleaned up when they are no longer necessary. |
||
conditions Condition array |
Conditions describes the status of the Policy with respect to the given Ancestor. | MaxItems: 8 MinItems: 1 |
PolicyDisable
PolicyDisable is used to disable a policy.
Appears in:
PolicyPhase
Underlying type: string
Validation:
- Enum: [PreRouting PostRouting]
Appears in:
| Field | Description |
|---|---|
PreRouting |
|
PostRouting |
Port
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
port integer |
The port number to match on the Gateway | Maximum: 65535 Minimum: 1 |
|
nodePort integer |
The NodePort to be used for the service. If not specified, a random port will be assigned by the Kubernetes API server. |
Maximum: 65535 Minimum: 1 |
PriorityGroup
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
providers NamedLLMProvider array |
providers specifies a list of LLM providers within this group. Each provider is treated equally in terms of priority, with automatic weighting based on health. |
MaxItems: 32 MinItems: 1 |
ProcessingMode
ProcessingMode defines how the filter should interact with the request/response streams
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
requestHeaderMode string |
RequestHeaderMode determines how to handle the request headers | SEND | Enum: [DEFAULT SEND SKIP] |
responseHeaderMode string |
ResponseHeaderMode determines how to handle the response headers | SEND | Enum: [DEFAULT SEND SKIP] |
requestBodyMode string |
RequestBodyMode determines how to handle the request body | NONE | Enum: [NONE STREAMED BUFFERED BUFFERED_PARTIAL FULL_DUPLEX_STREAMED] |
responseBodyMode string |
ResponseBodyMode determines how to handle the response body | NONE | Enum: [NONE STREAMED BUFFERED BUFFERED_PARTIAL FULL_DUPLEX_STREAMED] |
requestTrailerMode string |
RequestTrailerMode determines how to handle the request trailers | SKIP | Enum: [DEFAULT SEND SKIP] |
responseTrailerMode string |
ResponseTrailerMode determines how to handle the response trailers | SKIP | Enum: [DEFAULT SEND SKIP] |
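As a rough illustration, the fragment below buffers the request body and leaves every other mode at its documented default. The parent key `processingMode` is an assumption about where this type is embedded (the embedding type is not shown in this section); the field names and enum values come from the table above.

```yaml
# Hypothetical parent key; check the type that embeds ProcessingMode.
processingMode:
  requestHeaderMode: SEND      # default
  responseHeaderMode: SEND     # default
  requestBodyMode: BUFFERED    # buffer the request body (default is NONE)
  responseBodyMode: NONE       # default
  requestTrailerMode: SKIP     # default
  responseTrailerMode: SKIP    # default
```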
PromptCachingConfig
PromptCachingConfig configures automatic prompt caching for supported LLM providers. Currently only AWS Bedrock supports this feature (Claude 3+ and Nova models).
When enabled, the gateway automatically inserts cache points at strategic locations to reduce API costs. Bedrock charges lower rates for cached tokens (90% discount).
Example:
promptCaching:
  cacheSystem: true    # Cache system prompts
  cacheMessages: true  # Cache conversation history
  cacheTools: false    # Don't cache tool definitions
  minTokens: 1024      # Only cache if ≥1024 tokens
Cost savings example:
- Without caching: 10,000 tokens × $3/MTok = $0.03
- With caching (90% cached): 1,000 × $3/MTok + 9,000 × $0.30/MTok = $0.0057 (81% savings)
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
cacheSystem boolean |
CacheSystem enables caching for system prompts. Inserts a cache point after all system messages. |
true | |
cacheMessages boolean |
CacheMessages enables caching for conversation messages. Caches all messages in the conversation for cost savings. |
true | |
cacheTools boolean |
CacheTools enables caching for tool definitions. Inserts a cache point after all tool specifications. |
false | |
minTokens integer |
MinTokens specifies the minimum estimated token count before caching is enabled. Uses a rough heuristic (word count × 1.3) to estimate tokens. Bedrock requires at least 1,024 tokens for caching to be effective. |
1024 | Minimum: 0 |
PromptguardRequest
PromptguardRequest defines the prompt guards to apply to requests sent by the client.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
response CustomResponse |
A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. |
||
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook to forward requests to for prompt guarding. | ||
openAIModeration OpenAIModeration |
openAIModeration passes prompt data through the OpenAI Moderations endpoint. See https://platform.openai.com/docs/api-reference/moderations for more information. |
PromptguardResponse
PromptguardResponse defines the prompt guards to apply to responses returned by the LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
response CustomResponse |
A custom response message to return to the client. If not specified, defaults to “The response was rejected due to inappropriate content”. |
||
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook to forward responses to for prompt guarding. |
ProxyDeployment
ProxyDeployment configures the Proxy deployment in Kubernetes.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
replicas integer |
The number of desired pods. If omitted, behavior will be managed by the K8s control plane, and will default to 1. If you are using an HPA, make sure to not explicitly define this. K8s reference: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#replicas |
Minimum: 0 |
|
strategy DeploymentStrategy |
The deployment strategy to use to replace existing pods with new ones. The Kubernetes default is a RollingUpdate with 25% maxUnavailable, 25% maxSurge. E.g., to recreate pods, minimizing resources for the rollout but causing downtime: strategy: type: Recreate E.g., to roll out as a RollingUpdate but with non-default parameters: strategy: type: RollingUpdate rollingUpdate: maxSurge: 100% |
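The two strategy examples in the description above can be written out as YAML as follows. This is a sketch of the `strategy` field only; where it is nested depends on the parent resource, which is not shown in this section.

```yaml
# Option 1: recreate pods. Minimizes resources during the rollout but causes downtime.
strategy:
  type: Recreate
---
# Option 2: rolling update with non-default parameters.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 100%
```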
RateLimit
RateLimit defines a rate limiting policy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
local LocalRateLimitPolicy |
Local defines a local rate limiting policy. | ||
global RateLimitPolicy |
Global defines a global rate limiting policy using an external service. |
RateLimitDescriptor
RateLimitDescriptor defines a descriptor for rate limiting. A descriptor is a group of entries that form a single rate limit rule.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
entries RateLimitDescriptorEntry array |
Entries are the individual components that make up this descriptor. When translated to Envoy, these entries combine to form a single descriptor. |
MinItems: 1 |
RateLimitDescriptorEntry
RateLimitDescriptorEntry defines a single entry in a rate limit descriptor. Only one entry type may be specified.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
type RateLimitDescriptorEntryType |
Type specifies what kind of rate limit descriptor entry this is. | Enum: [Generic Header RemoteAddress Path] |
|
generic RateLimitDescriptorEntryGeneric |
Generic contains the configuration for a generic key-value descriptor entry. This field must be specified when Type is Generic. |
||
header string |
Header specifies a request header to extract the descriptor value from. This field must be specified when Type is Header. |
MinLength: 1 |
RateLimitDescriptorEntryType
Underlying type: string
RateLimitDescriptorEntryType defines the type of a rate limit descriptor entry.
Validation:
- Enum: [Generic Header RemoteAddress Path]
Appears in:
| Field | Description |
|---|---|
Generic |
RateLimitDescriptorEntryTypeGeneric represents a generic key-value descriptor entry. |
Header |
RateLimitDescriptorEntryTypeHeader represents a descriptor entry that extracts its value from a request header. |
RemoteAddress |
RateLimitDescriptorEntryTypeRemoteAddress represents a descriptor entry that uses the client’s IP address as its value. |
Path |
RateLimitDescriptorEntryTypePath represents a descriptor entry that uses the request path as its value. |
RateLimitPolicy
RateLimitPolicy defines a global rate limiting policy using an external service.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
descriptors RateLimitDescriptor array |
Descriptors define the dimensions for rate limiting. These values are passed to the rate limit service which applies configured limits based on them. Each descriptor represents a single rate limit rule with one or more entries. |
MinItems: 1 |
|
extensionRef NamespacedObjectReference |
ExtensionRef references a GatewayExtension that provides the global rate limit service. |
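A minimal sketch of a global rate limit, as it might appear under a TrafficPolicy `spec`. The `rateLimit` and `global` keys come from the TrafficPolicySpec and RateLimit tables. The header name and the GatewayExtension name are hypothetical, and the `name` field on `extensionRef` is an assumption, since the fields of NamespacedObjectReference are not listed here.

```yaml
rateLimit:
  global:
    descriptors:
    # One descriptor = one rate limit rule; its entries are combined.
    - entries:
      - type: RemoteAddress        # client IP address
      - type: Header
        header: x-user-id          # hypothetical header name
    - entries:
      - type: Path                 # request path
    extensionRef:
      name: rate-limit-extension   # hypothetical GatewayExtension name
```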
RateLimitProvider
RateLimitProvider defines the configuration for a RateLimit service provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
grpcService ExtGrpcService |
GrpcService is the GRPC service that will handle the rate limiting. | ||
domain string |
Domain identifies a rate limiting configuration for the rate limit service. All rate limit requests must specify a domain, which enables the configuration to be per application without fear of overlap (e.g., “api”, “web”, “admin”). |
||
failOpen boolean |
FailOpen determines if requests are limited when the rate limit service is unavailable. Defaults to true, meaning requests are allowed upstream and not limited if the rate limit service is unavailable. |
true | |
timeout Duration |
Timeout provides an optional timeout value for requests to the rate limit service. For rate limiting, prefer using this timeout rather than setting the generic timeout on the GrpcService. See envoy issue for more info. |
100ms | |
xRateLimitHeaders XRateLimitHeadersStandard |
XRateLimitHeaders configures the standard version to use for X-RateLimit headers emitted. See envoy docs for more info. Disabled by default. |
Off | Enum: [Off DraftVersion03] |
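The fragment below sketches the scalar RateLimitProvider fields with non-default values, to show how they fit together. `grpcService` is left out because its fields are documented under ExtGrpcService rather than in this section; the domain value is only an example.

```yaml
# Fields of RateLimitProvider (grpcService omitted; see ExtGrpcService).
domain: api                        # example domain, as in the description
failOpen: false                    # fail closed when the service is unavailable (default is true)
timeout: 250ms                     # default is 100ms
xRateLimitHeaders: DraftVersion03  # emit X-RateLimit headers (default is Off)
```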
Regex
Regex configures the regular expression (regex) matching for prompt guards and data masking.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
matches string array |
A list of regex patterns to match against the request or response. Matches and built-ins are additive. |
||
builtins BuiltIn array |
A list of built-in regex patterns to match against the request or response. Matches and built-ins are additive. |
Enum: [SSN CREDIT_CARD PHONE_NUMBER EMAIL] |
|
action Action |
The action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default. Defaults to MASK. |
MASK |
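As an illustration, a `regex` block that combines a custom pattern with two built-ins might look like the sketch below. The custom pattern is a made-up example; the field names, built-in names, and the MASK action come from the table above.

```yaml
regex:
  matches:
  - "api-key-[A-Za-z0-9]+"   # hypothetical custom pattern
  builtins:                  # additive with the patterns in matches
  - CREDIT_CARD
  - SSN
  action: MASK               # default
```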
ResourceDetector
ResourceDetector defines the list of supported ResourceDetectors
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
environmentResourceDetector EnvironmentResourceDetectorConfig |
ResponseFlagFilter
ResponseFlagFilter filters based on response flags. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-responseflagfilter
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
flags string array |
MinItems: 1 |
Retry
Retry defines the retry policy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
retryOn RetryOnCondition array |
RetryOn specifies the conditions under which a retry should be attempted. | Enum: [5xx gateway-error reset reset-before-request connect-failure envoy-ratelimited retriable-4xx refused-stream retriable-status-codes http3-post-connect-failure cancelled deadline-exceeded internal resource-exhausted unavailable] MinItems: 1 |
|
attempts integer |
Attempts specifies the number of retry attempts for a request. Defaults to 1 attempt if not set. A value of 0 effectively disables retries. |
1 | Minimum: 0 |
perTryTimeout Duration |
PerTryTimeout specifies the timeout per retry attempt (including the initial attempt). If a global timeout is configured on a route, this timeout must be less than the global route timeout. It is specified as a sequence of decimal numbers, each with optional fraction and a unit suffix, such as “1s” or “500ms”. |
||
statusCodes HTTPRouteRetryStatusCode array |
StatusCodes specifies the HTTP status codes in the range 400-599 that should be retried in addition to the conditions specified in RetryOn. |
MinItems: 1 |
|
backoffBaseInterval Duration |
BackoffBaseInterval specifies the base interval used with a fully jittered exponential back-off between retries. Defaults to 25ms if not set. Given a backoff base interval B and retry number N, the back-off for the retry is in the range [0, (2^N-1)*B]. The backoff interval is capped at a max of 10 times the base interval. E.g., given a value of 25ms, the first retry will be delayed randomly by 0-24ms, the 2nd by 0-74ms, the 3rd by 0-174ms, and so on, and capped to a max of 10 times the base interval (250ms). |
25ms |
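A short sketch of a `retry` block using the fields above. The values are illustrative only, and the status code and timeout are arbitrary choices rather than recommendations.

```yaml
retry:
  retryOn:
  - 5xx
  - connect-failure
  attempts: 3
  perTryTimeout: 2s          # must be less than any route-level request timeout
  statusCodes:
  - 503                      # retried in addition to the retryOn conditions
  # With the 25ms default, per the description above:
  #   retry 1: 0-24ms, retry 2: 0-74ms, retry 3: 0-174ms,
  #   later retries capped at 250ms (10 times the base interval).
  backoffBaseInterval: 25ms
```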
RetryOnCondition
Underlying type: string
RetryOnCondition specifies the condition under which retry takes place.
Validation:
- Enum: [5xx gateway-error reset reset-before-request connect-failure envoy-ratelimited retriable-4xx refused-stream retriable-status-codes http3-post-connect-failure cancelled deadline-exceeded internal resource-exhausted unavailable]
Appears in:
RetryPolicy
Specifies the retry policy of remote data source when fetching fails. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/base.proto#envoy-v3-api-msg-config-core-v3-retrypolicy
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
retryBackOff BackoffStrategy |
Specifies parameters that control retry backoff strategy. The default base interval is 1000 milliseconds and the default maximum interval is 10 times the base interval. |
||
numRetries integer |
Specifies the allowed number of retries. Defaults to 1. | Minimum: 1 |
RouteType
Underlying type: string
RouteType specifies how the AI gateway should process incoming requests based on the URL path and the API format expected.
Validation:
- Enum: [completions messages models passthrough responses anthropic_token_count]
Appears in:
| Field | Description |
|---|---|
completions |
RouteTypeCompletions processes OpenAI /v1/chat/completions format requests |
messages |
RouteTypeMessages processes Anthropic /v1/messages format requests |
models |
RouteTypeModels handles /v1/models endpoint (returns available models) |
passthrough |
RouteTypePassthrough sends requests to upstream as-is without LLM processing |
responses |
RouteTypeResponses processes OpenAI /v1/responses format requests |
anthropic_token_count |
RouteTypeAnthropicTokenCount processes Anthropic /v1/messages/count_tokens format requests |
Sampler
Sampler defines the list of supported Samplers
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
alwaysOnConfig AlwaysOnConfig |
SdsBootstrap
SdsBootstrap configures the SDS instance that is provisioned from a Kubernetes Gateway.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
logLevel string |
Log level for SDS. Options include “info”, “debug”, “warn”, “error”, “panic” and “fatal”. Default level is “info”. |
SdsContainer
SdsContainer configures the container running SDS sidecar.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
image Image |
The SDS container image. See https://kubernetes.io/docs/concepts/containers/images for details. |
||
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
bootstrap SdsBootstrap |
Initial SDS container configuration. |
SecretSelector
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
matchLabels object (keys:string, values:string) |
Label selector to select the target resource. |
SelfManagedGateway
Appears in:
ServerHeaderTransformation
Underlying type: string
ServerHeaderTransformation determines how the server header is transformed.
Appears in:
| Field | Description |
|---|---|
Overwrite |
OverwriteServerHeaderTransformation overwrites the server header. |
AppendIfAbsent |
AppendIfAbsentServerHeaderTransformation appends to the server header if it’s not present. |
PassThrough |
PassThroughServerHeaderTransformation passes through the server header unchanged. |
Service
Configuration for a Kubernetes Service.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
type ServiceType |
The Kubernetes Service type. | Enum: [ClusterIP NodePort LoadBalancer ExternalName] |
|
clusterIP string |
The manually specified IP address of the service, if a randomly assigned IP is not desired. See https://kubernetes.io/docs/concepts/services-networking/service/#choosing-your-own-ip-address and https://kubernetes.io/docs/concepts/services-networking/service/#headless-services on the implications of setting clusterIP. |
||
extraLabels object (keys:string, values:string) |
Additional labels to add to the Service object metadata. If the same label is present on Gateway.spec.infrastructure.labels, the Gateway takes precedence. |
||
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the Service object metadata. If the same annotation is present on Gateway.spec.infrastructure.annotations, the Gateway takes precedence. |
||
ports Port array |
Additional configuration for the service ports. The actual port numbers are specified in the Gateway resource. |
||
externalTrafficPolicy string |
ExternalTrafficPolicy defines the external traffic policy for the service. Valid values are Cluster and Local. Default value is Cluster. |
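For illustration, a Service configuration that pins a NodePort for one Gateway port could be sketched as follows. The parent key `service`, the label, and the port numbers are assumptions; the field names come from the Service and Port tables.

```yaml
# Hypothetical parent key; check the type that embeds Service.
service:
  type: NodePort
  externalTrafficPolicy: Local   # default is Cluster
  extraLabels:
    team: platform               # hypothetical label
  ports:
  - port: 8080                   # must match a port on the Gateway
    nodePort: 30080              # omit to let the Kubernetes API server assign one
```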
ServiceAccount
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
extraLabels object (keys:string, values:string) |
Additional labels to add to the ServiceAccount object metadata. | ||
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the ServiceAccount object metadata. If the same annotation is present on Gateway.spec.infrastructure.annotations, the Gateway takes precedence. |
SlowStart
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
window Duration |
Represents the size of slow start window. If set, the newly created host remains in slow start mode starting from its creation time for the duration of slow start window. |
||
aggression string |
This parameter controls the speed of traffic increase over the slow start window. Defaults to 1.0, so that an endpoint gets a linearly increasing amount of traffic. When increasing the value for this parameter, the speed of traffic ramp-up increases non-linearly. The value of the aggression parameter should be greater than 0.0. By tuning the parameter, it is possible to achieve a polynomial or exponential shape of the ramp-up curve. During the slow start window, the effective weight of an endpoint is scaled with the time factor and aggression: new_weight = weight * max(min_weight_percent, time_factor ^ (1 / aggression)), where time_factor = (time_since_start_seconds / slow_start_time_seconds). As time progresses, more and more traffic is sent to the endpoint that is in the slow start window. Once the host exits slow start, time_factor and aggression no longer affect its weight. |
||
minWeightPercent integer |
Minimum weight percentage of an endpoint during slow start. | Maximum: 100 Minimum: 0 |
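The sketch below shows the three fields together, with the formula from the `aggression` description worked through in comments. The parent key `slowStart` is an assumption, and the worked numbers assume that `minWeightPercent: 10` acts as a floor of 0.10 in the formula.

```yaml
# Hypothetical parent key; check the type that embeds SlowStart.
slowStart:
  window: 30s
  aggression: "1.0"      # string-typed field; 1.0 gives a linear ramp-up
  minWeightPercent: 10
# Halfway through the window, time_factor = 15s / 30s = 0.5:
#   aggression 1.0 -> 0.5 ^ (1 / 1.0) = 0.5    -> new_weight = weight * max(0.10, 0.5)
#   aggression 2.0 -> 0.5 ^ (1 / 2.0) ≈ 0.707  -> faster ramp-up early in the window
```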
SourceIP
Appears in:
StaticBackend
StaticBackend references a static list of hosts.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
hosts Host array |
Hosts is a list of hosts to use for the backend. | MinItems: 1 |
|
appProtocol AppProtocol |
AppProtocol is the application protocol to use when communicating with the backend. | Enum: [http2 grpc grpc-web kubernetes.io/h2c kubernetes.io/ws] |
StatsConfig
Configuration for the stats server.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
enabled boolean |
Whether to expose metrics annotations and ports for scraping metrics. | ||
routePrefixRewrite string |
The Envoy stats endpoint to which the metrics are written | ||
enableStatsRoute boolean |
Enables an additional route to the stats cluster defaulting to /stats | ||
statsRoutePrefixRewrite string |
The Envoy stats endpoint with general metrics for the additional stats route |
StatusCodeFilter
Underlying type: ComparisonFilter
StatusCodeFilter filters based on HTTP status code. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#envoy-v3-api-msg-config-accesslog-v3-statuscodefilter
Appears in:
StringMatcher
Specifies the way to match a string.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
exact string |
The input string must match exactly the string specified here. Example: abc matches the value abc |
||
prefix string |
The input string must have the prefix specified here. Note: empty prefix is not allowed, please use regex instead. Example: abc matches the value abc.xyz |
||
suffix string |
The input string must have the suffix specified here. Note: empty suffix is not allowed, please use regex instead. Example: abc matches the value xyz.abc |
||
contains string |
The input string must contain the substring specified here. Example: abc matches the value xyz.abc.def |
||
safeRegex string |
The input string must match the Google RE2 regular expression specified here. See https://github.com/google/re2/wiki/Syntax for the syntax. |
||
ignoreCase boolean |
If true, indicates the exact/prefix/suffix/contains matching should be case insensitive. This has no effect on the regex match. For example, the matcher data will match both input string Data and data if this option is set to true. |
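A few StringMatcher values, shown as a YAML list purely for illustration. Whether matchers appear in a list depends on the field that uses this type; the patterns are made-up examples.

```yaml
- exact: abc                # matches only "abc"
- prefix: abc               # matches "abc.xyz"
- suffix: abc               # matches "xyz.abc"
- contains: data
  ignoreCase: true          # matches "Data" and "data"
- safeRegex: "^v[0-9]+$"    # RE2 syntax; ignoreCase has no effect here
```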
TCPKeepalive
See Envoy documentation for more details.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
keepAliveProbes integer |
Maximum number of keep-alive probes to send before dropping the connection. | Minimum: 0 |
|
keepAliveTime Duration |
The number of seconds a connection needs to be idle before keep-alive probes start being sent. | ||
keepAliveInterval Duration |
The number of seconds between keep-alive probes. |
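A minimal sketch with illustrative values. The parent key `tcpKeepalive` is an assumption about where this type is embedded; the field names come from the table above.

```yaml
# Hypothetical parent key; check the type that embeds TCPKeepalive.
tcpKeepalive:
  keepAliveTime: 30s        # idle time before the first probe is sent
  keepAliveInterval: 5s     # time between probes
  keepAliveProbes: 3        # drop the connection after 3 unanswered probes
```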
TLS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
secretRef LocalObjectReference |
Reference to the TLS secret containing the certificate, key, and optionally the root CA. | ||
files TLSFiles |
File paths to certificates local to the proxy. | ||
wellKnownCACertificates WellKnownCACertificatesType |
WellKnownCACertificates specifies whether to use a well-known set of CA certificates for validating the backend’s certificate chain. Currently, only the system certificate pool is supported via SDS. |
||
insecureSkipVerify boolean |
InsecureSkipVerify originates TLS but skips verification of the backend’s certificate. WARNING: This is an insecure option that should only be used if the risks are understood. |
||
sni string |
The SNI domains that should be considered for TLS connection | MinLength: 1 |
|
verifySubjectAltNames string array |
Verify that the Subject Alternative Name in the peer certificate is one of the specified values. Note that a root_ca must be provided if this option is used. |
||
parameters TLSParameters |
General TLS parameters. See the envoy docs for more information on the meaning of these values. |
||
alpnProtocols string array |
Set the Application-Layer Protocol Negotiation (ALPN) protocols. If empty, defaults to [“h2”, “http/1.1”]. |
||
allowRenegotiation boolean |
Allow TLS renegotiation. The default value is false. TLS renegotiation is considered insecure and shouldn’t be used unless absolutely necessary. |
||
simpleTLS boolean |
If the TLS config has the tls cert and key provided, kgateway uses it to perform mTLS by default. Set simpleTLS to true to disable mTLS in favor of server-only TLS (one-way TLS), even if kgateway has the client cert. If unset, defaults to false. |
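One way these fields might be combined for server-only TLS to a backend is sketched below. The parent key `tls`, the secret name, and the host names are assumptions. The TLS versions are quoted so that YAML reads them as strings rather than numbers.

```yaml
# Hypothetical parent key; check the type that embeds TLS.
tls:
  secretRef:
    name: backend-tls           # hypothetical secret; should include the root CA
  sni: backend.example.com
  verifySubjectAltNames:        # requires a root CA to be provided
  - backend.example.com
  parameters:
    minVersion: "1.2"
    maxVersion: "1.3"
  alpnProtocols:
  - h2
  - http/1.1
  simpleTLS: true               # one-way TLS even if a client cert is present
```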
TLSFiles
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
tlsCertificate string |
MinLength: 1 |
||
tlsKey string |
MinLength: 1 |
||
rootCA string |
MinLength: 1 |
TLSParameters
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
minVersion TLSVersion |
Minimum TLS version. | Enum: [AUTO 1.0 1.1 1.2 1.3] |
|
maxVersion TLSVersion |
Maximum TLS version. | Enum: [AUTO 1.0 1.1 1.2 1.3] |
|
cipherSuites string array |
|||
ecdhCurves string array |
TLSVersion
Underlying type: string
TLSVersion defines the TLS version.
Validation:
- Enum: [AUTO 1.0 1.1 1.2 1.3]
Appears in:
| Field | Description |
|---|---|
AUTO |
|
1.0 |
|
1.1 |
|
1.2 |
|
1.3 |
Timeouts
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
request Duration |
Request specifies a timeout for an individual request from the gateway to a backend. This spans between the point at which the entire downstream request (i.e. end-of-stream) has been processed and when the backend response has been completely processed. A value of 0 effectively disables the timeout. It is specified as a sequence of decimal numbers, each with optional fraction and a unit suffix, such as “1s” or “500ms”. |
||
streamIdle Duration |
StreamIdle specifies a timeout for a request’s idle streams. A value of 0 effectively disables the timeout. |
TokenBucket
TokenBucket defines the configuration for a token bucket rate-limiting mechanism. It controls the rate at which tokens are generated and consumed for a specific operation.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
maxTokens integer |
MaxTokens specifies the maximum number of tokens that the bucket can hold. This value must be greater than or equal to 1. It determines the burst capacity of the rate limiter. |
Minimum: 1 |
|
tokensPerFill integer |
TokensPerFill specifies the number of tokens added to the bucket during each fill interval. If not specified, it defaults to 1. This controls the steady-state rate of token generation. |
1 | Minimum: 1 |
fillInterval Duration |
FillInterval defines the time duration between consecutive token fills. This value must be a valid duration string (e.g., “1s”, “500ms”). It determines the frequency of token replenishment. |
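For example, the following values allow a burst of up to 10 requests and a steady-state rate of 5 tokens per second. Only the TokenBucket fields are shown; the numbers are illustrative.

```yaml
maxTokens: 10       # burst capacity
tokensPerFill: 5    # tokens added at each fill (default is 1)
fillInterval: 1s    # time between fills
```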
Tracing
Tracing represents Envoy’s top-level tracer. Ref: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto#extensions-filters-network-http-connection-manager-v3-httpconnectionmanager-tracing
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
provider TracingProvider |
Provider defines the upstream to which envoy sends traces | MaxProperties: 1 MinProperties: 1 |
|
clientSampling integer |
Target percentage of requests managed by this HTTP connection manager that will be force traced if the x-client-trace-id header is set. Defaults to 100% | Maximum: 100 Minimum: 0 |
|
randomSampling integer |
Target percentage of requests managed by this HTTP connection manager that will be randomly selected for trace generation, if not requested by the client or not forced. Defaults to 100% | Maximum: 100 Minimum: 0 |
|
overallSampling integer |
Target percentage of requests managed by this HTTP connection manager that will be traced after all other sampling checks have been applied (client-directed, force tracing, random sampling). Defaults to 100% | Maximum: 100 Minimum: 0 |
|
verbose boolean |
Whether to annotate spans with additional data. If true, spans will include logs for stream events. Defaults to false | ||
maxPathTagLength integer |
Maximum length of the request path to extract and include in the HttpUrl tag. Used to truncate lengthy request paths to meet the needs of a tracing backend. Default: 256 | Minimum: 1 |
|
attributes CustomAttribute array |
A list of attributes with a unique name to create attributes for the active span. | MaxProperties: 2 MinProperties: 1 |
|
spawnUpstreamSpan boolean |
Create a separate tracing span for each upstream request if true. Defaults to false. See the Envoy docs for more info. |
TracingProtocol
Underlying type: string
Appears in:
| Field | Description |
|---|---|
HTTP |
|
GRPC |
TracingProvider
TracingProvider defines the list of providers for tracing
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
openTelemetry OpenTelemetryTracingConfig |
Tracing contains various settings for Envoy’s OTel tracer. |
TrafficPolicy
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
TrafficPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec TrafficPolicySpec |
|||
status PolicyStatus |
TrafficPolicySpec
TrafficPolicySpec defines the desired state of a traffic policy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
targetRefs LocalPolicyTargetReferenceWithSectionName array |
TargetRefs specifies the target resources by reference to attach the policy to. | MaxItems: 16 MinItems: 1 |
|
targetSelectors LocalPolicyTargetSelectorWithSectionName array |
TargetSelectors specifies the target selectors to select resources to attach the policy to. | ||
transformation TransformationPolicy |
Transformation is used to mutate and transform requests and responses before forwarding them to the destination. |
||
extProc ExtProcPolicy |
ExtProc specifies the external processing configuration for the policy. | ||
extAuth ExtAuthPolicy |
ExtAuth specifies the external authentication configuration for the policy. This controls what external server to send requests to for authentication. |
||
rateLimit RateLimit |
RateLimit specifies the rate limiting configuration for the policy. This controls the rate at which requests are allowed to be processed. |
||
cors CorsPolicy |
Cors specifies the CORS configuration for the policy. | ||
csrf CSRFPolicy |
Csrf specifies the Cross-Site Request Forgery (CSRF) policy for this traffic policy. | ||
headerModifiers HeaderModifiers |
HeaderModifiers defines the policy to modify request and response headers. | ||
autoHostRewrite boolean |
AutoHostRewrite rewrites the Host header to the DNS name of the selected upstream. NOTE: This field is only honoured for HTTPRoute targets. NOTE: If autoHostRewrite is set on a route that also has a URLRewrite filter configured to override the hostname, the hostname value will be used and autoHostRewrite will be ignored. |
||
buffer Buffer |
Buffer can be used to set the maximum request size that will be buffered. Requests exceeding this size will return a 413 response. |
||
timeouts Timeouts |
Timeouts defines the timeouts for requests. It is applicable to HTTPRoutes and ignored for other targeted kinds. |
||
retry Retry |
Retry defines the policy for retrying requests. It is applicable to HTTPRoutes, Gateway listeners and XListenerSets, and ignored for other targeted kinds. |
||
rbac Authorization |
RBAC specifies the role-based access control configuration for the policy. This defines the rules for authorization based on roles and permissions. RBAC policies applied at different attachment points in the configuration hierarchy are not cumulative, and only the most specific policy is enforced. This means an RBAC policy attached to a route will override any RBAC policies applied to the gateway or listener. |
||
jwt JWTAuthentication |
JWT specifies the JWT authentication configuration for the policy. This defines the JWT providers and their configurations. |
Transform
Transform defines the operations to be performed by the transformation. These operations may include changing the actual request/response but may also cause side effects. Side effects may include setting info that can be used in future steps (e.g. dynamic metadata) and can cause envoy to buffer.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
set HeaderTransformation array |
Set is a list of headers and the value they should be set to. | MaxItems: 16 |
|
add HeaderTransformation array |
Add is a list of headers to add to the request and the value each should be set to. If a header with that name already exists, the value is appended as an extra entry. |
MaxItems: 16 |
|
remove string array |
Remove is a list of header names to remove from the request/response. | MaxItems: 16 |
|
body BodyTransformation |
Body controls how to parse the body and, if needed, how to set it. If empty, the body will not be buffered. |
TransformationPolicy
TransformationPolicy config is used to modify envoy behavior at a route level. These modifications can be performed on the request and response paths.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
request Transform |
Request is used to modify the request path. | ||
response Transform |
Response is used to modify the response path. |
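As a rough illustration of how the `request` and `response` fields combine with the `set`, `add`, and `remove` lists of Transform, the fragment below sketches a TransformationPolicy value. The `name` and `value` keys assume the shape of HeaderTransformation, which is not defined in this part of the reference, and all header names and values are hypothetical.

```yaml
# Sketch only: header names and values are made up for illustration.
# The name/value keys assume the HeaderTransformation shape.
request:
  set:
  - name: x-request-source
    value: gateway
  remove:
  - x-internal-debug
response:
  add:
  - name: x-served-by
    value: kgateway
```

Because `body` is left unset here, the body would not be buffered, per the Transform description.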
UpgradeConfig
UpgradeConfig represents configuration for HTTP upgrades.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
enabledUpgrades string array |
List of upgrade types to enable (e.g. “websocket”, “CONNECT”, etc.) | MinItems: 1 |
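A minimal sketch of an UpgradeConfig value that enables WebSocket upgrades is shown below. Where this block is nested depends on the parent resource, which is not shown in this section.

```yaml
# Sketch only: at least one entry is required (MinItems: 1).
enabledUpgrades:
- websocket
```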
VertexAIConfig
VertexAIConfig defines settings for the Vertex AI LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. |
||
projectId string |
The ID of the Google Cloud Project that you use for Vertex AI. | MinLength: 1 |
|
region string |
The location of the Google Cloud Project that you use for Vertex AI. | MinLength: 1 |
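The three fields above might be filled in as in the following fragment. The project ID, region, and model name are placeholder values, and the key under which this block sits in the provider configuration is not shown here.

```yaml
# Sketch only: all values are placeholders.
model: gemini-1.5-flash      # optional; if omitted, the model name comes from the request
projectId: my-gcp-project    # required, non-empty
region: us-central1          # required, non-empty
```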
Webhook
Webhook configures a webhook to which requests or responses are forwarded for prompt guarding.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendObjectReference |
backendRef references the webhook server to reach. Supported types: Service and Backend. |
||
forwardHeaderMatches HTTPHeaderMatch array |
ForwardHeaderMatches defines a list of HTTP header matches that will be used to select the headers to forward to the webhook. Request headers are used when forwarding requests and response headers are used when forwarding responses. By default, no headers are forwarded. |
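The fragment below sketches a Webhook value that targets a Service and forwards a single header. It assumes the standard Gateway API shapes for BackendObjectReference (`name`, `port`) and HTTPHeaderMatch (`type`, `name`, `value`). The Service name, port, and header are hypothetical.

```yaml
# Sketch only: Service name, port, and header are hypothetical.
backendRef:
  name: prompt-guard-webhook   # kind defaults to Service in the Gateway API
  port: 8000
forwardHeaderMatches:
- type: Exact
  name: x-tenant-id
  value: team-a
```

Without `forwardHeaderMatches`, no headers would be forwarded to the webhook, per the field description above.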
XRateLimitHeadersStandard
Underlying type: string
XRateLimitHeadersStandard controls how XRateLimit headers will be emitted.
Appears in:
| Field | Description |
|---|---|
Off |
XRateLimitHeaderOff disables emitting of XRateLimit headers. |
DraftVersion03 |
XRateLimitHeaderDraftV03 outputs headers as described in draft RFC version 03. |