API reference
Packages
gateway.kgateway.dev/v1alpha1
Resource Types
- Backend
- BackendConfigPolicy
- DirectResponse
- GatewayExtension
- GatewayParameters
- HTTPListenerPolicy
- TrafficPolicy
AIBackend
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
llm LLMProvider | The LLM configures the AI gateway to use a single LLM provider backend. | | |
multipool MultiPoolConfig | The MultiPool configures the backends for multiple hosts or models from the same provider in one Backend resource. | | |
AIPolicy
AIPolicy config is used to configure the behavior of the LLM provider on the level of individual routes. These route settings, such as prompt enrichment, retrieval augmented generation (RAG), and semantic caching, are applicable only for routes that send requests to an LLM provider backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
promptEnrichment AIPromptEnrichment | Enrich requests sent to the LLM provider by appending and prepending system prompts. This can be configured only for LLM providers that use the CHAT or CHAT_STREAMING API route type. | | |
promptGuard AIPromptGuard | Set up prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response. | | |
defaults FieldDefault array | Provide defaults to merge with user input fields. Defaults do not override the user input fields unless you explicitly set override to true. | | |
routeType RouteType | The type of route to the LLM provider API. Currently, CHAT and CHAT_STREAMING are supported. | CHAT | Enum: [CHAT CHAT_STREAMING] |
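As a rough sketch of how these fields combine, the fragment below sets routeType together with one entry in defaults. It assumes the AIPolicy sits under the ai key of a policy spec, as in the AIPromptEnrichment example in this reference; the temperature value is illustrative only.

```yaml
# Sketch only: an AIPolicy fragment, assumed to sit under the `ai` key of a policy spec.
ai:
  # Use the streaming chat route instead of the default CHAT route type.
  routeType: CHAT_STREAMING
  defaults:
  # Applied only when the client request omits `temperature`,
  # because `override` is left at its default of false.
  - field: "temperature"
    value: "0.5"
```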
AIPromptEnrichment
AIPromptEnrichment defines the config to enrich requests sent to the LLM provider by appending and prepending system prompts. This can be configured only for LLM providers that use the CHAT or CHAT_STREAMING API type.
Prompt enrichment allows you to add additional context to the prompt before sending it to the model. Unlike RAG or other dynamic context methods, prompt enrichment is static and is applied to every request.
Note: Some providers, including Anthropic, do not support SYSTEM role messages, and instead have a dedicated system field in the input JSON. In this case, use the defaults setting to set the system field.
The following example prepends a system prompt of "Answer all questions in French." and appends "Describe the painting as if you were a famous art critic from the 17th century." to each request that is sent to the openai HTTPRoute.
  name: openai-opt
  namespace: kgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai
  ai:
    promptEnrichment:
      prepend:
      - role: SYSTEM
        content: "Answer all questions in French."
      append:
      - role: USER
        content: "Describe the painting as if you were a famous art critic from the 17th century."
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
prepend Message array | A list of messages to be prepended to the prompt sent by the client. | | |
append Message array | A list of messages to be appended to the prompt sent by the client. | | |
AIPromptGuard
AIPromptGuard configures prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response.
This example rejects any request prompts that contain the string “credit card”, and masks any credit card numbers in the response.
promptGuard:
  request:
    customResponse:
      message: "Rejected due to inappropriate content"
    regex:
      action: REJECT
      matches:
      - pattern: "credit card"
        name: "CC"
  response:
    regex:
      builtins:
      - CREDIT_CARD
      action: MASK
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
request PromptguardRequest | Prompt guards to apply to requests sent by the client. | | |
response PromptguardResponse | Prompt guards to apply to responses returned by the LLM provider. | | |
AWSLambdaPayloadTransformMode
Underlying type: string
AWSLambdaPayloadTransformMode defines the transformation mode for the payload in the request before it is sent to the AWS Lambda function.
Validation:
- Enum: [None Envoy]
Appears in:
Field | Description |
---|---|
None | AWSLambdaPayloadTransformNone indicates that the payload will not be transformed using Envoy’s built-in transformation before it is sent to the Lambda function. Note: Transformation policies configured on the route will still apply. |
Envoy | AWSLambdaPayloadTransformEnvoy indicates that the payload will be transformed using Envoy’s built-in transformation. Refer to https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/aws_lambda_filter#configuration-as-a-listener-filter for more details on how Envoy transforms the payload. |
AccessLog
AccessLog represents the top-level access log configuration.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
fileSink FileSink | Output access logs to a local file. | | |
grpcService GrpcService | Send access logs to a gRPC service. | | |
filter AccessLogFilter | Filter configuration for access logs. | | MaxProperties: 1 MinProperties: 1 |
AccessLogFilter
AccessLogFilter represents the top-level filter structure. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-accesslogfilter
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
andFilter FilterType array | Performs a logical “and” operation on the result of each individual filter. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-andfilter | | MaxProperties: 1 MinItems: 2 MinProperties: 1 |
orFilter FilterType array | Performs a logical “or” operation on the result of each individual filter. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-orfilter | | MaxProperties: 1 MinItems: 2 MinProperties: 1 |
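A minimal sketch of one AccessLog entry that writes to a file and combines two filters with andFilter. The file path, format string, and CEL expression are illustrative assumptions, and the resource that this entry is attached to is defined outside this section.

```yaml
# Sketch only: a single AccessLog entry with a file sink and a combined filter.
fileSink:
  path: /dev/stdout            # hypothetical path
  stringFormat: "%REQ(:METHOD)% %REQ(:PATH)% %RESPONSE_CODE%\n"
filter:
  # Exactly one filter field may be set at this level (MinProperties/MaxProperties: 1).
  andFilter:                   # needs at least two entries (MinItems: 2)
  - notHealthCheckFilter: true
  - celFilter:
      match: "response.code >= 400"
```

With this shape, a log line would be emitted only for requests that are not health checks and that end with a 4xx or 5xx response code.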
Action
Underlying type: string
Action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default.
Appears in:
Field | Description |
---|---|
MASK | Mask the matched data in the request. |
REJECT | Reject the request if the regex matches content in the request. |
AgentGateway
Configuration of the AgentGateway integration
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabled boolean | Whether to enable the extension. | | |
logLevel string | Log level for the agentgateway. Defaults to info. Levels include “trace”, “debug”, “info”, “error”, “warn”. See: https://docs.rs/tracing/latest/tracing/struct.Level.html | | |
AiExtension
Configuration for the AI extension.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabled boolean | Whether to enable the extension. | | |
image Image | The extension’s container image. See https://kubernetes.io/docs/concepts/containers/images for details. | | |
securityContext SecurityContext | The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. | | |
resources ResourceRequirements | The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. | | |
env EnvVar array | The extension’s container environment variables. | | |
ports ContainerPort array | The extension’s container ports. | | |
stats AiExtensionStats | Additional stats config for AI Extension. This config can be useful for adding custom labels to the request metrics. Example: stats: customLabels: - name: “subject” metadataNamespace: “envoy.filters.http.jwt_authn” metadataKey: “principal:sub” - name: “issuer” metadataNamespace: “envoy.filters.http.jwt_authn” metadataKey: “principal:iss” | | |
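The custom-label example from the stats description reads more easily in block form. The nesting below is inferred from the field names in the description (customLabels as a list of name, metadataNamespace, and metadataKey entries), so treat it as a sketch rather than a verified manifest.

```yaml
# Sketch only: the stats example from the table, laid out as block YAML.
stats:
  customLabels:
  # Each entry adds one label to the request metrics, read from filter metadata.
  - name: "subject"
    metadataNamespace: "envoy.filters.http.jwt_authn"
    metadataKey: "principal:sub"
  - name: "issuer"
    metadataNamespace: "envoy.filters.http.jwt_authn"
    metadataKey: "principal:iss"
```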
AiExtensionStats
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
customLabels CustomLabel array | Set of custom labels to be added to the request metrics. These will be added on each request which goes through the AI Extension. | | |
AnthropicConfig
AnthropicConfig settings for the Anthropic LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken | The authorization token that the AI gateway uses to access the Anthropic API. This token is automatically sent in the x-api-key header of the request. | | |
apiVersion string | Optional: A version header to pass to the Anthropic API. For more information, see the Anthropic API versioning docs. | | |
model string | Optional: Override the model name. If unset, the model name is taken from the request. This setting can be useful when testing model failover scenarios. | | |
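For orientation, a Backend that uses these settings might look roughly like the sketch below. Several parts are assumptions that this section does not confirm: the provider nesting under llm, the SecretRef form of authToken, and all names and values (the Secret, the version string, and the model).

```yaml
# Sketch only: a Backend that routes to Anthropic.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  name: anthropic                      # hypothetical name
  namespace: kgateway-system
spec:
  type: AI
  ai:
    llm:
      # Assumption: provider configs are nested under `provider` in LLMProvider.
      provider:
        anthropic:
          # Assumption: SingleAuthToken can reference a Kubernetes Secret in this form.
          authToken:
            kind: SecretRef
            secretRef:
              name: anthropic-secret   # hypothetical Secret
          apiVersion: "2023-06-01"     # illustrative version header value
          model: "claude-3-opus-20240229"  # illustrative override of the request's model
```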
AppProtocol
Underlying type: string
AppProtocol defines the application protocol to use when communicating with the backend.
Validation:
- Enum: [http2 grpc grpc-web kubernetes.io/h2c kubernetes.io/ws]
Appears in:
Field | Description |
---|---|
http2 | AppProtocolHttp2 is the http2 app protocol. |
grpc | AppProtocolGrpc is the grpc app protocol. |
grpc-web | AppProtocolGrpcWeb is the grpc-web app protocol. |
kubernetes.io/h2c | AppProtocolKubernetesH2C is the kubernetes.io/h2c app protocol. |
kubernetes.io/ws | AppProtocolKubernetesWs is the kubernetes.io/ws app protocol. |
AuthHeaderOverride
AuthHeaderOverride allows customization of the default Authorization header sent to the LLM Provider. The default header is Authorization: Bearer <token>. HeaderName can change the Authorization header name and Prefix can change the Bearer prefix.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
prefix string | | | |
headerName string | | | |
AwsAuth
AwsAuth specifies the authentication method to use for the backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type AwsAuthType | Type specifies the authentication method to use for the backend. | | Enum: [Secret] |
secretRef LocalObjectReference | SecretRef references a Kubernetes Secret containing the AWS credentials. The Secret must have keys “accessKey”, “secretKey”, and optionally “sessionToken”. | | |
AwsAuthType
Underlying type: string
AwsAuthType specifies the authentication method to use for the backend.
Appears in:
Field | Description |
---|---|
Secret | AwsAuthTypeSecret uses credentials stored in a Kubernetes Secret. |
AwsBackend
AwsBackend is the AWS backend configuration.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
accountId string | AccountId is the AWS account ID to use for the backend. | | MaxLength: 12 MinLength: 1 Pattern: ^[0-9]{12}$ |
auth AwsAuth | Auth specifies an explicit AWS authentication method for the backend. When omitted, the following credential providers are tried in order, stopping when one of them returns an access key ID and a secret access key (the session token is optional): 1. Environment variables: when the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN are set. 2. AssumeRoleWithWebIdentity API call: when the environment variables AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN are set. 3. EKS Pod Identity: when the environment variable AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE is set. See the Envoy docs for more info: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/aws_request_signing_filter#credentials | | |
lambda AwsLambda | Lambda configures the AWS Lambda service. | | |
region string | Region is the AWS region to use for the backend. Defaults to us-east-1 if not specified. | us-east-1 | MaxLength: 63 MinLength: 1 Pattern: ^[a-z0-9-]+$ |
AwsLambda
AwsLambda configures the AWS Lambda service.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
endpointURL string | EndpointURL is the URL or domain for the Lambda service. This is primarily useful for testing and development purposes. When omitted, the default Lambda hostname will be used. | | MaxLength: 2048 Pattern: ^https?://[-a-zA-Z0-9@:%.+~#?&/=]+$ |
functionName string | FunctionName is the name of the Lambda function to invoke. | | Pattern: ^[A-Za-z0-9-_]{1,140}$ |
invocationMode string | InvocationMode defines how to invoke the Lambda function. Defaults to Sync. | Sync | Enum: [Sync Async] |
qualifier string | Qualifier is the alias or version for the Lambda function. Valid values include a numeric version (e.g. “1”), an alias name (alphanumeric plus “-” or “_”), or the special literal “$LATEST”. | | Pattern: ^(\$LATEST\|[0-9]+\|[A-Za-z0-9-_]{1,128})$ |
payloadTransformMode AWSLambdaPayloadTransformMode | PayloadTransformMode specifies the transformation mode applied to the payload before it is sent to the Lambda function. Defaults to Envoy. | Envoy | Enum: [None Envoy] |
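Putting the AwsBackend, AwsAuth, and AwsLambda fields together, a Lambda-backed Backend could be sketched as follows. The resource name, account ID, region, Secret name, and function name are placeholders chosen for illustration.

```yaml
# Sketch only: a Backend that invokes an AWS Lambda function.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  name: lambda-echo              # hypothetical name
  namespace: kgateway-system
spec:
  type: AWS
  aws:
    accountId: "000000000000"    # placeholder; must be exactly 12 digits
    region: us-west-2            # omit to fall back to us-east-1
    auth:
      type: Secret
      secretRef:
        name: aws-creds          # hypothetical Secret with accessKey and secretKey keys
    lambda:
      functionName: echo         # hypothetical function
      qualifier: "$LATEST"
      invocationMode: Async      # default is Sync
      payloadTransformMode: None # default is Envoy
```

Leaving out the auth block would instead rely on the credential provider chain described for the auth field.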
AzureOpenAIConfig
AzureOpenAIConfig settings for the Azure OpenAI LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken | The authorization token that the AI gateway uses to access the Azure OpenAI API. This token is automatically sent in the api-key header of the request. | | |
endpoint string | The endpoint for the Azure OpenAI API to use, such as my-endpoint.openai.azure.com. If the scheme is included, it is stripped. | | MinLength: 1 |
deploymentName string | The name of the Azure OpenAI model deployment to use. For more information, see the Azure OpenAI model docs. | | MinLength: 1 |
apiVersion string | The version of the Azure OpenAI API to use. For more information, see the Azure OpenAI API version reference. | | MinLength: 1 |
Backend
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | gateway.kgateway.dev/v1alpha1 | | |
kind string | Backend | | |
kind string | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | | |
apiVersion string | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | | |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
spec BackendSpec | | | |
status BackendStatus | | | |
BackendConfigPolicy
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | gateway.kgateway.dev/v1alpha1 | | |
kind string | BackendConfigPolicy | | |
kind string | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | | |
apiVersion string | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | | |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
spec BackendConfigPolicySpec | | | |
status PolicyStatus | | | |
BackendConfigPolicySpec
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
targetRefs LocalPolicyTargetReference array | | | MaxItems: 16 MinItems: 1 |
targetSelectors LocalPolicyTargetSelector array | TargetSelectors specifies the target selectors to select resources to attach the policy to. | | |
connectTimeout Duration | The timeout for new network connections to hosts in the cluster. | | |
perConnectionBufferLimitBytes integer | Soft limit on the size of the cluster’s connection read and write buffers. If unspecified, an implementation-defined default is applied (1MiB). | | |
tcpKeepalive TCPKeepalive | Configure OS-level TCP keepalive checks. | | |
commonHttpProtocolOptions CommonHttpProtocolOptions | Additional options when handling HTTP requests upstream, applicable to both HTTP1 and HTTP2 requests. | | |
http1ProtocolOptions Http1ProtocolOptions | Additional options when handling HTTP1 requests upstream. | | |
tls TLS | TLS contains the options necessary to configure a backend to use TLS origination. See Envoy documentation for more details. | | |
loadBalancer LoadBalancer | LoadBalancer contains the options necessary to configure the load balancer. | | |
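The sketch below shows how a few of these settings might be combined in one BackendConfigPolicy. It assumes the policy targets a Backend resource through targetRefs with group, kind, and name fields (the shape used in the AIPromptEnrichment example); the names and values are illustrative.

```yaml
# Sketch only: connection tuning for one upstream.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: BackendConfigPolicy
metadata:
  name: upstream-tuning            # hypothetical name
  namespace: kgateway-system
spec:
  targetRefs:
  # Assumption: a Backend in the same namespace is a valid target.
  - group: gateway.kgateway.dev
    kind: Backend
    name: my-backend               # hypothetical Backend
  connectTimeout: 5s
  perConnectionBufferLimitBytes: 1048576   # 1 MiB
  commonHttpProtocolOptions:
    idleTimeout: 10m               # close connections with no active requests for 10 minutes
    maxHeadersCount: 100
    maxRequestsPerConnection: 1000 # 0 or unset means unlimited
```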
BackendSpec
BackendSpec defines the desired state of Backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type BackendType | Type indicates the type of the backend to be used. | | Enum: [AI AWS Static DynamicForwardProxy] |
ai AIBackend | AI is the AI backend configuration. | | MaxProperties: 1 MinProperties: 1 |
aws AwsBackend | Aws is the AWS backend configuration. | | |
static StaticBackend | Static is the static backend configuration. | | |
dynamicForwardProxy DynamicForwardProxyBackend | DynamicForwardProxy is the dynamic forward proxy backend configuration. | | |
BackendStatus
BackendStatus defines the observed state of Backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
conditions Condition array | Conditions is the list of conditions for the backend. | | MaxItems: 8 |
BackendType
Underlying type: string
BackendType indicates the type of the backend.
Appears in:
Field | Description |
---|---|
AI | BackendTypeAI is the type for AI backends. |
AWS | BackendTypeAWS is the type for AWS backends. |
Static | BackendTypeStatic is the type for static backends. |
DynamicForwardProxy | BackendTypeDynamicForwardProxy is the type for dynamic forward proxy backends. |
BodyParseBehavior
Underlying type: string
BodyParseBehavior defines how the body should be parsed. If set to AsJson and the body is not JSON, the filter does not perform the transformation.
Validation:
- Enum: [AsString AsJson]
Appears in:
Field | Description |
---|---|
AsString | BodyParseBehaviorAsString will parse the body as a string. |
AsJson | BodyParseBehaviorAsJSON will parse the body as a JSON object. |
BodyTransformation
BodyTransformation controls how the body should be parsed and transformed.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
parseAs BodyParseBehavior | ParseAs defines what auto formatting should be applied to the body. This can make interacting with keys within a JSON body much easier if AsJson is selected. | AsString | Enum: [AsString AsJson] |
value InjaTemplate | Value is the template to apply to generate the output value for the body. | | |
BufferSettings
BufferSettings configures how the request body should be buffered.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
maxRequestBytes integer | MaxRequestBytes sets the maximum size of a message body to buffer. Requests exceeding this size will receive HTTP 413 and not be sent to the authorization service. | | Minimum: 1 |
allowPartialMessage boolean | AllowPartialMessage determines if partial messages should be allowed. When true, requests will be sent to the authorization service even if they exceed maxRequestBytes. When unset, the default behavior is false. | | |
packAsBytes boolean | PackAsBytes determines if the body should be sent as raw bytes. When true, the body is sent as raw bytes in the raw_body field. When false, the body is sent as UTF-8 string in the body field. When unset, the default behavior is false. | | |
BuiltIn
Underlying type: string
BuiltIn regex patterns for specific types of strings in prompts. For example, if you specify CREDIT_CARD, any credit card numbers in the request or response are matched.
Validation:
- Enum: [SSN CREDIT_CARD PHONE_NUMBER EMAIL]
Appears in:
Field | Description |
---|---|
SSN | Default regex matching for Social Security numbers. |
CREDIT_CARD | Default regex matching for credit card numbers. |
PHONE_NUMBER | Default regex matching for phone numbers. |
EMAIL | Default regex matching for email addresses. |
CELFilter
CELFilter filters requests based on Common Expression Language (CEL).
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
match string | The CEL expression to evaluate. Access logs are emitted only when the CEL expression evaluates to true. See: https://www.envoyproxy.io/docs/envoy/v1.33.0/xds/type/v3/cel.proto.html#common-expression-language-cel-proto | | |
CSRFPolicy
CSRFPolicy can be used to set the percentage of requests for which the CSRF filter is enabled, to enable a shadow-only mode in which policies are evaluated and tracked but not enforced, and to add source origins that are allowed in addition to the destination origin.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
percentageEnabled integer | Specifies the percentage of requests for which the CSRF filter is enabled. | | Maximum: 100 Minimum: 0 |
percentageShadowed integer | Specifies that CSRF policies will be evaluated and tracked, but not enforced. | | Maximum: 100 Minimum: 0 |
additionalOrigins StringMatcher array | Specifies additional source origins that will be allowed in addition to the destination origin. | | MaxItems: 16 |
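As a hedged illustration, the fragment below enforces CSRF checks on all requests and allows one extra origin. Two things are assumed rather than taken from this section: that the policy sits under a csrf key in a policy spec, and that StringMatcher offers a suffix match; the domain is a placeholder.

```yaml
# Sketch only: a CSRFPolicy fragment, assumed to sit under `csrf` in a policy spec.
csrf:
  # Enforce the CSRF filter on every request.
  percentageEnabled: 100
  additionalOrigins:
  # Assumption: StringMatcher supports `suffix`; this would allow any origin ending in example.com.
  - suffix: example.com
```

Replacing percentageEnabled with percentageShadowed would evaluate and track the same checks without rejecting requests.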
CommonHttpProtocolOptions
CommonHttpProtocolOptions are options that are applicable to both HTTP1 and HTTP2 requests. See Envoy documentation for more details.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
idleTimeout Duration | The idle timeout for connections. The idle timeout is defined as the period in which there are no active requests. When the idle timeout is reached the connection will be closed. If the connection is an HTTP/2 downstream connection a drain sequence will occur prior to closing the connection. Note that request based timeouts mean that HTTP/2 PINGs will not keep the connection alive. If not specified, this defaults to 1 hour. To disable idle timeouts explicitly set this to 0. Disabling this timeout has a high likelihood of yielding connection leaks due to lost TCP FIN packets, etc. | | |
maxHeadersCount integer | Specifies the maximum number of headers that the connection will accept. If not specified, the default of 100 is used. Requests that exceed this limit will receive a 431 response for HTTP/1.x and cause a stream reset for HTTP/2. | | |
maxStreamDuration Duration | Total duration to keep alive an HTTP request/response stream. If the time limit is reached the stream will be reset independent of any other timeouts. If not specified, this value is not set. | | |
maxRequestsPerConnection integer | Maximum requests for a single upstream connection. If set to 0 or unspecified, defaults to unlimited. | | |
ComparisonFilter
Underlying type: struct{Op Op `json:"op,omitempty"`; Value uint32 `json:"value,omitempty"`}
ComparisonFilter represents a filter based on a comparison. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-comparisonfilter
Appears in:
CorsPolicy
Appears in:
CustomLabel
Appears in:
CustomResponse
CustomResponse configures a response to return to the client if request content is matched against a regex pattern and the action is REJECT.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
message string | A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. | The request was rejected due to inappropriate content | |
statusCode integer | The status code to return to the client. Defaults to 403. | 403 | Maximum: 599 Minimum: 200 |
DirectResponse
DirectResponse contains configuration for defining direct response routes.
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | gateway.kgateway.dev/v1alpha1 | | |
kind string | DirectResponse | | |
kind string | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | | |
apiVersion string | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | | |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
spec DirectResponseSpec | | | |
status DirectResponseStatus | | | |
DirectResponseSpec
DirectResponseSpec describes the desired state of a DirectResponse.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
status integer | StatusCode defines the HTTP status code to return for this route. | | Maximum: 599 Minimum: 200 |
body string | Body defines the content to be returned in the HTTP response body. The maximum length of the body is restricted to prevent excessively large responses. | | MaxLength: 4096 |
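A complete DirectResponse resource built from the fields above might look like this minimal sketch. The name, status code, and body text are illustrative, and how a route refers to the resource is outside the scope of this section.

```yaml
# Sketch only: a fixed 503 response with a short body.
apiVersion: gateway.kgateway.dev/v1alpha1
kind: DirectResponse
metadata:
  name: maintenance              # hypothetical name
  namespace: kgateway-system
spec:
  status: 503                    # must be between 200 and 599
  body: "Service temporarily unavailable"   # at most 4096 characters
```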
DirectResponseStatus
DirectResponseStatus defines the observed state of a DirectResponse.
Appears in:
DurationFilter
Underlying type: ComparisonFilter
DurationFilter filters based on request duration. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-durationfilter
Appears in:
DynamicForwardProxyBackend
DynamicForwardProxyBackend is the dynamic forward proxy backend configuration.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enableTls boolean | EnableTls enables TLS. When true, the backend will be configured to use TLS. System CA will be used for validation. The hostname will be used for SNI and auto SAN validation. | | |
EnvoyBootstrap
Configuration for the Envoy proxy instance that is provisioned from a Kubernetes Gateway.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
logLevel string | Envoy log level. Options include “trace”, “debug”, “info”, “warn”, “error”, “critical” and “off”. Defaults to “info”. See https://www.envoyproxy.io/docs/envoy/latest/start/quick-start/run-envoy#debugging-envoy for more information. | | |
componentLogLevels object (keys:string, values:string) | Envoy log levels for specific components. The keys are component names and the values are one of “trace”, “debug”, “info”, “warn”, “error”, “critical”, or “off”, e.g. componentLogLevels: upstream: debug connection: trace. These will be converted to the --component-log-level Envoy argument value. See https://www.envoyproxy.io/docs/envoy/latest/start/quick-start/run-envoy#debugging-envoy for more information. Note: the keys and values cannot be empty, but they are not otherwise validated. | | |
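Laid out as block YAML, the componentLogLevels example from the table reads as follows. The fragment is shown under the bootstrap field of EnvoyContainer; the surrounding resource is omitted, so treat the nesting above bootstrap as unspecified here.

```yaml
# Sketch only: Envoy bootstrap logging settings.
bootstrap:
  logLevel: info                 # global Envoy log level
  componentLogLevels:
    # Keys are Envoy component names; values are log levels.
    upstream: debug
    connection: trace
```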
EnvoyContainer
Configuration for the container running Envoy.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
bootstrap EnvoyBootstrap | Initial envoy configuration. | | |
image Image | The envoy container image. See https://kubernetes.io/docs/concepts/containers/images for details. Default values, which may be overridden individually: registry: quay.io/solo-io repository: gloo-envoy-wrapper (OSS) / gloo-ee-envoy-wrapper (EE) tag: pullPolicy: IfNotPresent | | |
securityContext SecurityContext | The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. | | |
resources ResourceRequirements | The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. | | |
ExtAuthEnabled
Underlying type: string
ExtAuthEnabled determines the enabled state of the ExtAuth filter.
Validation:
- Enum: [DisableAll]
Appears in:
Field | Description |
---|---|
DisableAll | ExtAuthDisableAll disables all instances of the ExtAuth filter for this route. This is to enable a global disable such as for a health check route. |
ExtAuthPolicy
ExtAuthPolicy configures external authentication for a route. This policy will determine the ext auth server to use and how to talk to it. Note that most of these fields are passed along as is to Envoy. For more details on particular fields please see the Envoy ExtAuth documentation. https://raw.githubusercontent.com/envoyproxy/envoy/f910f4abea24904aff04ec33a00147184ea7cffa/api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extensionRef LocalObjectReference | ExtensionRef references the GatewayExtension that should be used for authentication. | | |
enablement ExtAuthEnabled | Enablement determines the enabled state of the ExtAuth filter. When set to “DisableAll”, the filter is disabled for this route. When empty, the filter is enabled as long as it is not disabled by another policy. | | Enum: [DisableAll] |
withRequestBody BufferSettings | WithRequestBody allows the request body to be buffered and sent to the authorization service. Warning: buffering has implications for streaming and therefore performance. | | |
contextExtensions object (keys:string, values:string) | Additional context for the authorization service. | | |
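The two fragments below sketch the common cases: enabling external auth with request-body buffering, and disabling it for one route. They assume the ExtAuthPolicy sits under an extAuth key in a policy spec, which this section does not confirm; the extension name and the context key are hypothetical.

```yaml
# Sketch only: enable ext auth and forward up to 8 KiB of the request body.
extAuth:
  extensionRef:
    name: my-ext-auth            # hypothetical extension resource
  withRequestBody:
    maxRequestBytes: 8192        # larger bodies are rejected with HTTP 413
    allowPartialMessage: false
    packAsBytes: false           # send the body as a UTF-8 string
  contextExtensions:
    tenant: "acme"               # hypothetical key/value passed to the auth service
---
# Sketch only: turn ext auth off for a route, for example a health check.
extAuth:
  enablement: DisableAll
```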
ExtAuthProvider
ExtAuthProvider defines the configuration for an ExtAuth provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
grpcService ExtGrpcService | GrpcService is the gRPC service that will handle the authentication. | | |
ExtGrpcService
ExtGrpcService defines the GRPC service that will handle the processing.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
backendRef BackendRef | BackendRef references the backend gRPC service. | | |
authority string | Authority is the authority header to use for the gRPC service. | | |
ExtProcPolicy
ExtProcPolicy defines the configuration for the Envoy External Processing filter.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extensionRef LocalObjectReference | ExtensionRef references the GatewayExtension that should be used for external processing. | | |
processingMode ProcessingMode | ProcessingMode defines how the filter should interact with the request/response streams. | | |
ExtProcProvider
ExtProcProvider defines the configuration for an ExtProc provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
grpcService ExtGrpcService | GrpcService is the gRPC service that will handle the processing. | | |
FieldDefault
FieldDefault provides default values for specific fields in the JSON request body sent to the LLM provider. These defaults are merged with the user-provided request to ensure missing fields are populated.
User input fields here refer to the fields in the JSON request body that a client sends when making a request to the LLM provider.
Defaults set here do not override those user-provided values unless you explicitly set override to true.
Example: Setting a default system field for Anthropic, which does not support system role messages:
defaults:
- field: "system"
  value: "answer all questions in French"
Example: Setting a default temperature and overriding max_tokens:
defaults:
- field: "temperature"
  value: "0.5"
- field: "max_tokens"
  value: "100"
  override: true
Example: Overriding a custom list field:
defaults:
- field: "custom_list"
  value: "[a,b,c]"
Note: The field values correspond to keys in the JSON request body, not fields in this CRD.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
field string | The name of the field. | | MinLength: 1 |
value string | The field default value, which can be any JSON Data Type. | | MinLength: 1 |
override boolean | Whether to override the field’s value if it already exists. Defaults to false. | false | |
FileSink
FileSink represents the file sink configuration for access logs.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
path string | The file path to which the file access logging service writes logs. | | |
stringFormat string | The format string that Envoy uses to format the log lines. See https://www.envoyproxy.io/docs/envoy/v1.33.0/configuration/observability/access_log/usage#format-strings | | |
jsonFormat RawExtension | The format object that Envoy uses to emit the logs in a structured way. See https://www.envoyproxy.io/docs/envoy/v1.33.0/configuration/observability/access_log/usage#format-dictionaries | | |
FilterType
FilterType represents the type of filter to apply (only one of these should be set). Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#envoy-v3-api-msg-config-accesslog-v3-accesslogfilter
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
statusCodeFilter StatusCodeFilter | | | |
durationFilter DurationFilter | | | |
notHealthCheckFilter boolean | Filters for requests that are not health check requests. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-nothealthcheckfilter | | |
traceableFilter boolean | Filters for requests that are traceable. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-traceablefilter | | |
headerFilter HeaderFilter | | | |
responseFlagFilter ResponseFlagFilter | | | |
grpcStatusFilter GrpcStatusFilter | | | |
celFilter CELFilter | | | |
GatewayExtension
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
GatewayExtension |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata . |
||
spec GatewayExtensionSpec |
|||
status GatewayExtensionStatus |
GatewayExtensionSpec
GatewayExtensionSpec defines the desired state of GatewayExtension.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type GatewayExtensionType |
Type indicates the type of the GatewayExtension to be used. | Enum: [ExtAuth ExtProc RateLimit Extended] |
|
extAuth ExtAuthProvider |
ExtAuth configuration for ExtAuth extension type. | ||
extProc ExtProcProvider |
ExtProc configuration for ExtProc extension type. | ||
rateLimit RateLimitProvider |
RateLimit configuration for RateLimit extension type. |
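As a rough sketch of how these fields fit together, the following GatewayExtension selects the ExtProc type and points it at a gRPC service. The resource name, Service name, and port are hypothetical placeholders, and the backendRef fields are assumed to follow the Kubernetes Gateway API BackendRef shape.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: GatewayExtension
metadata:
  name: example-extproc              # hypothetical name
  namespace: kgateway-system
spec:
  type: ExtProc                      # selects the extension type
  extProc:                           # configuration block for the ExtProc type
    grpcService:
      backendRef:
        name: example-extproc-svc    # hypothetical Kubernetes Service
        port: 9000                   # hypothetical port
```

The extAuth and rateLimit blocks are used the same way when type is set to ExtAuth or RateLimit.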
GatewayExtensionStatus
GatewayExtensionStatus defines the observed state of GatewayExtension.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
conditions Condition array |
Conditions is the list of conditions for the GatewayExtension. | MaxItems: 8 |
GatewayExtensionType
Underlying type: string
GatewayExtensionType indicates the type of the GatewayExtension.
Appears in:
Field | Description |
---|---|
ExtAuth | GatewayExtensionTypeExtAuth is the type for ExtAuth extensions. |
ExtProc | GatewayExtensionTypeExtProc is the type for ExtProc extensions. |
RateLimit | GatewayExtensionTypeRateLimit is the type for RateLimit extensions. |
GatewayParameters
A GatewayParameters contains configuration that is used to dynamically provision kgateway’s data plane (Envoy proxy instance), based on a Kubernetes Gateway.
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
GatewayParameters |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata . |
||
spec GatewayParametersSpec |
|||
status GatewayParametersStatus |
GatewayParametersSpec
A GatewayParametersSpec describes the type of environment/platform in which the proxy will be provisioned.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
kube KubernetesProxyConfig |
The proxy will be deployed on Kubernetes. | ||
selfManaged SelfManagedGateway |
The proxy will be self-managed and not auto-provisioned. |
GatewayParametersStatus
The current conditions of the GatewayParameters. This is not currently implemented.
Appears in:
GeminiConfig
GeminiConfig settings for the Gemini LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken |
The authorization token that the AI gateway uses to access the Gemini API. This token is automatically sent in the key query parameter of the request. |
||
model string |
The Gemini model to use. For more information, see the Gemini models docs. |
||
apiVersion string |
The version of the Gemini API to use. For more information, see the Gemini API version docs. |
GracefulShutdownSpec
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabled boolean | Enable a grace period before shutdown so that in-flight requests can finish while Envoy fails its health checks, for example to notify external load balancers. Note: This setting has no effect unless you define health checks through the health check filter. | | |
sleepTimeSeconds integer | Time (in seconds) for the preStop hook to wait before allowing Envoy to terminate. | | |
GrpcService
GrpcService represents the gRPC service configuration for access logs.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
logName string | Name of the log stream. | | |
backendRef BackendRef | The backend gRPC service. Can be any type of supported backend (Kubernetes Service, kgateway Backend, etc.). | | |
additionalRequestHeadersToLog string array | Additional request headers to log in the access log. | | |
additionalResponseHeadersToLog string array | Additional response headers to log in the access log. | | |
additionalResponseTrailersToLog string array | Additional response trailers to log in the access log. | | |
GrpcStatusFilter
GrpcStatusFilter filters gRPC requests based on their response status. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#enum-config-accesslog-v3-grpcstatusfilter-status
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
statuses GrpcStatus array | | | Enum: [OK CANCELED UNKNOWN INVALID_ARGUMENT DEADLINE_EXCEEDED NOT_FOUND ALREADY_EXISTS PERMISSION_DENIED RESOURCE_EXHAUSTED FAILED_PRECONDITION ABORTED OUT_OF_RANGE UNIMPLEMENTED INTERNAL UNAVAILABLE DATA_LOSS UNAUTHENTICATED] MinItems: 1 |
exclude boolean | | | |
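A minimal sketch of a gRPC status filter follows, shown as a fragment where a FilterType is expected. It assumes that exclude mirrors the Envoy filter of the same name, where true inverts the match so that the listed statuses are filtered out rather than kept.

```yaml
grpcStatusFilter:
  statuses:              # at least one entry is required (MinItems: 1)
  - UNAVAILABLE
  - DEADLINE_EXCEEDED
  exclude: false         # assumed: false matches only the listed statuses
```

Because FilterType allows exactly one property, no other filter can be set alongside grpcStatusFilter in the same block.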
HTTPListenerPolicy
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
HTTPListenerPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata . |
||
spec HTTPListenerPolicySpec |
|||
status PolicyStatus |
HTTPListenerPolicySpec
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
targetRefs LocalPolicyTargetReference array |
TargetRefs specifies the target resources by reference to attach the policy to. | MaxItems: 16 MinItems: 1 |
|
targetSelectors LocalPolicyTargetSelector array |
TargetSelectors specifies the target selectors to select resources to attach the policy to. | ||
accessLog AccessLog array |
AccessLoggingConfig contains various settings for Envoy’s access logging service. See here for more information: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto |
||
upgradeConfig UpgradeConfig |
UpgradeConfig contains configuration for HTTP upgrades like WebSocket. See here for more information: https://www.envoyproxy.io/docs/envoy/v1.34.1/intro/arch_overview/http/upgrades.html |
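The following sketch shows one way an HTTPListenerPolicy might attach to a Gateway and write JSON access logs to stdout. The policy and Gateway names are hypothetical, and the fileSink key is an assumption about the AccessLog type, which is defined elsewhere in this reference. The format operators are standard Envoy access log operators.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: HTTPListenerPolicy
metadata:
  name: example-access-logs          # hypothetical name
  namespace: kgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: example-gateway            # hypothetical Gateway in the same namespace
  accessLog:
  - fileSink:                        # assumed key for the FileSink type
      path: /dev/stdout
      jsonFormat:                    # structured output as a format dictionary
        start_time: "%START_TIME%"
        method: "%REQ(:METHOD)%"
        response_code: "%RESPONSE_CODE%"
```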
HeaderFilter
HeaderFilter filters requests based on headers. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-headerfilter
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
header HTTPHeaderMatch |
HeaderFormat
Underlying type: string
Appears in:
Field | Description |
---|---|
ProperCaseHeaderKeyFormat |
|
PreserveCaseHeaderKeyFormat |
HeaderName
Underlying type: string
Appears in:
HeaderTransformation
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
name HeaderName |
Name is the name of the header to interact with. | ||
value InjaTemplate |
Value is the template to apply to generate the output value for the header. |
Host
Host defines a static backend host.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
host string |
Host is the host name to use for the backend. | MinLength: 1 |
|
port PortNumber |
Port is the port to use for the backend. | ||
insecureSkipVerify boolean | InsecureSkipVerify allows skipping SSL validation for custom hosts. | | |
Http1ProtocolOptions
See Envoy documentation for more details.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enableTrailers boolean | Enables trailers for HTTP/1. By default, the HTTP/1 codec drops proxied trailers. Note: Trailers must also be enabled at the gateway level for this option to take effect. | | |
headerFormat HeaderFormat | The format of the header key. | | Enum: [ProperCaseHeaderKeyFormat PreserveCaseHeaderKeyFormat] |
overrideStreamErrorOnInvalidHttpMessage boolean | Allows invalid HTTP messaging. When this option is false, Envoy terminates HTTP/1.1 connections upon receiving an invalid HTTP message. When this option is true, Envoy leaves the HTTP/1.1 connection open where possible. | | |
Image
A container image. See https://kubernetes.io/docs/concepts/containers/images for details.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
registry string |
The image registry. | ||
repository string |
The image repository (name). | ||
tag string |
The image tag. | ||
digest string |
The hash digest of the image, e.g. sha256:12345... |
||
pullPolicy PullPolicy |
The image pull policy for the container. See https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy for details. |
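For illustration, an Image block might be filled in as follows. The registry, repository, and tag values are hypothetical placeholders.

```yaml
image:
  registry: registry.example.com     # hypothetical registry
  repository: example/envoy          # hypothetical repository (name)
  tag: "1.2.3"                       # quoted so YAML keeps it a string
  pullPolicy: IfNotPresent           # standard Kubernetes pull policy value
```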
InjaTemplate
Underlying type: string
Appears in:
IstioContainer
Configuration for the container running the istio-proxy.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
image Image |
The envoy container image. See https://kubernetes.io/docs/concepts/containers/images for details. |
||
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
logLevel string | Log level for istio-proxy. Options include “info”, “debug”, “warning”, and “error”. Defaults to “warning”. | | |
istioDiscoveryAddress string |
The address of the istio discovery service. Defaults to “istiod.istio-system.svc:15012”. | ||
istioMetaMeshId string |
The mesh id of the istio mesh. Defaults to “cluster.local”. | ||
istioMetaClusterId string |
The cluster id of the istio cluster. Defaults to “Kubernetes”. |
IstioIntegration
Configuration for the Istio integration settings used by kgateway’s data plane (Envoy proxy instance).
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
istioProxyContainer IstioContainer |
Configuration for the container running istio-proxy. Note that if Istio integration is not enabled, the istio container will not be injected into the gateway proxy deployment. |
||
customSidecars Container array | Override the default Istio sidecar in gateway-proxy with a custom container. | | |
KubernetesProxyConfig
Configuration for the set of Kubernetes resources that will be provisioned for a given Gateway.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
deployment ProxyDeployment |
Use a Kubernetes deployment as the proxy workload type. Currently, this is the only supported workload type. |
||
envoyContainer EnvoyContainer |
Configuration for the container running Envoy. | ||
sdsContainer SdsContainer |
Configuration for the container running the Secret Discovery Service (SDS). | ||
podTemplate Pod |
Configuration for the pods that will be created. | ||
service Service |
Configuration for the Kubernetes Service that exposes the Envoy proxy over the network. |
||
serviceAccount ServiceAccount |
Configuration for the Kubernetes ServiceAccount used by the Envoy pod. | ||
istio IstioIntegration |
Configuration for the Istio integration. | ||
stats StatsConfig |
Configuration for the stats server. | ||
aiExtension AiExtension |
Configuration for the AI extension. | ||
agentGateway AgentGateway | Configure the AgentGateway integration. | | |
floatingUserId boolean | Used to unset the runAsUser values in security contexts. | | |
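To show how the Kubernetes proxy settings nest, here is a minimal GatewayParameters sketch that sets the replica count, adds a pod label, and enables graceful shutdown. The resource name, label, and numeric values are hypothetical.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: GatewayParameters
metadata:
  name: example-gwp                  # hypothetical name
  namespace: kgateway-system
spec:
  kube:
    deployment:
      replicas: 2
    podTemplate:
      extraLabels:
        team: platform               # hypothetical label
      gracefulShutdown:
        enabled: true
        sleepTimeSeconds: 10         # preStop wait before Envoy terminates
      terminationGracePeriodSeconds: 30
```

In Kubernetes, the preStop hook runs inside the termination grace period, so terminationGracePeriodSeconds is chosen here to be larger than sleepTimeSeconds.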
LLMProvider
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
provider SupportedLLMProvider |
The LLM provider type to configure. | MaxProperties: 1 MinProperties: 1 |
|
hostOverride Host |
Send requests to a custom host and port, such as to proxy the request, or to use a different backend that is API-compliant with the Backend version. |
||
pathOverride PathOverride | Overrides the default API path for the LLM provider. Allows routing requests to a custom API endpoint path. | | MinProperties: 1 |
authHeaderOverride AuthHeaderOverride | Customizes the Authorization header sent to the LLM provider. Allows changing the header name and/or the prefix (e.g., “Bearer”). Note: Not all LLM providers use the Authorization header and prefix. For example, OpenAI uses the header “Authorization” and the prefix “Bearer”, but Azure OpenAI uses the header “api-key” and no prefix. | | |
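A hedged sketch of an LLMProvider with host and path overrides follows. The openai key under provider is an assumption about the SupportedLLMProvider type, which is defined elsewhere in this reference, and the host, port, and path values are hypothetical. The authToken is left out because its structure is not described in this section.

```yaml
llm:
  provider:                          # exactly one provider may be set
    openai:                          # assumed key for the OpenAI provider
      model: gpt-4o-mini
      # authToken omitted; see SingleAuthToken
  hostOverride:
    host: llm-proxy.example.internal # hypothetical proxy host
    port: 8443                       # hypothetical port
  pathOverride:
    fullPath: /v1/chat/completions   # replaces the provider's default API path
```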
LoadBalancer
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
healthyPanicThreshold integer |
HealthyPanicThreshold configures envoy’s panic threshold percentage between 0-100. Once the number of non-healthy hosts reaches this percentage, envoy disregards health information. See Envoy documentation. |
Maximum: 100 Minimum: 0 |
|
updateMergeWindow Duration | Batches updates to endpoint health, weight, and metadata that happen during a time window. This helps lower CPU usage when the endpoint change rate is high. Defaults to 1 second. Set to 0 to disable batching and apply changes immediately. | | |
leastRequest LoadBalancerLeastRequestConfig |
LeastRequest configures the least request load balancer type. | ||
roundRobin LoadBalancerRoundRobinConfig |
RoundRobin configures the round robin load balancer type. | ||
ringHash LoadBalancerRingHashConfig |
RingHash configures the ring hash load balancer type. | ||
maglev LoadBalancerMaglevConfig |
Maglev configures the maglev load balancer type. | ||
random LoadBalancerRandomConfig |
Random configures the random load balancer type. | ||
localityType LocalityType |
LocalityType specifies the locality config type to use. See https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/load_balancing_policies/common/v3/common.proto#envoy-v3-api-msg-extensions-load-balancing-policies-common-v3-localitylbconfig |
Enum: [WeightedLb] |
|
useHostnameForHashing boolean |
UseHostnameForHashing specifies whether to use the hostname instead of the resolved IP address for hashing. Defaults to false. |
||
closeConnectionsOnHostSetChange boolean | If set to true, the load balancer drains connections when the host set changes. Ring Hash or Maglev can be used to ensure that clients with the same key are routed to the same upstream host. Disruptions can cause new connections with the same key as existing connections to be routed to different hosts. Enabling this feature causes the load balancer to drain existing connections when the host set changes, ensuring that new connections with the same key are consistently routed to the same host. Connections are not closed immediately; they are allowed to drain before being closed. | | |
LoadBalancerLeastRequestConfig
LoadBalancerLeastRequestConfig configures the least request load balancer type.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
choiceCount integer |
How many choices to take into account. Defaults to 2. |
||
slowStart SlowStart |
SlowStart configures the slow start configuration for the load balancer. |
LoadBalancerMaglevConfig
Appears in:
LoadBalancerRandomConfig
Appears in:
LoadBalancerRingHashConfig
LoadBalancerRingHashConfig configures the ring hash load balancer type.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
minimumRingSize integer |
MinimumRingSize is the minimum size of the ring. | ||
maximumRingSize integer |
MaximumRingSize is the maximum size of the ring. |
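As an illustration, a LoadBalancer block that selects ring hash could look like the following. It is shown without its parent key, which is defined elsewhere in this reference, and all numeric values are hypothetical.

```yaml
healthyPanicThreshold: 50            # percentage between 0 and 100
updateMergeWindow: 2s                # batch endpoint updates over two seconds
ringHash:
  minimumRingSize: 1024
  maximumRingSize: 8192
useHostnameForHashing: true          # hash on hostname, not the resolved IP
closeConnectionsOnHostSetChange: true
```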
LoadBalancerRoundRobinConfig
LoadBalancerRoundRobinConfig configures the round robin load balancer type.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
slowStart SlowStart |
SlowStart configures the slow start configuration for the load balancer. |
LocalPolicyTargetReference
Select the object to attach the policy by Group, Kind, and Name. The object must be in the same namespace as the policy. You can target only one object at a time.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io . |
||
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
||
name ObjectName |
The name of the target resource. |
LocalPolicyTargetReferenceWithSectionName
Select the object to attach the policy by Group, Kind, Name and SectionName. The object must be in the same namespace as the policy. You can target only one object at a time.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io . |
||
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
||
name ObjectName |
The name of the target resource. | ||
sectionName SectionName |
The section name of the target resource. | MaxLength: 253 MinLength: 1 Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$ |
LocalPolicyTargetSelector
Select the object to attach the policy by Group, Kind, and its labels. The object must be in the same namespace as the policy and match the specified labels.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io . |
||
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
||
matchLabels object (keys:string, values:string) |
Label selector to select the target resource. |
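The two attachment styles can be compared side by side. In this sketch, the first fragment targets a single HTTPRoute by name and the second selects every HTTPRoute that carries a given label. The route name and label are hypothetical.

```yaml
# Attach by name (LocalPolicyTargetReference)
targetRefs:
- group: gateway.networking.k8s.io
  kind: HTTPRoute
  name: example-route                # hypothetical route name
---
# Attach by label (LocalPolicyTargetSelector)
targetSelectors:
- group: gateway.networking.k8s.io
  kind: HTTPRoute
  matchLabels:
    app: example                     # hypothetical label
```

In both cases the target must be in the same namespace as the policy.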
LocalRateLimitPolicy
LocalRateLimitPolicy represents a policy for local rate limiting. It defines the configuration for rate limiting using a token bucket mechanism.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
tokenBucket TokenBucket |
TokenBucket represents the configuration for a token bucket local rate-limiting mechanism. It defines the parameters for controlling the rate at which requests are allowed. |
LocalityType
Underlying type: string
Appears in:
Field | Description |
---|---|
WeightedLb | Locality weighted load balancing enables weighting assignments across different zones and geographical locations by using explicit weights. This field is required to enable locality weighted load balancing. See https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/locality_weight#locality-weighted-load-balancing |
Message
An entry for a message to prepend or append to each prompt.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
role string |
Role of the message. The available roles depend on the backend LLM provider model, such as SYSTEM or USER in the OpenAI API. |
||
content string |
String content of the message. |
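A short sketch of Message entries used for prompt enrichment follows. The prepend and append keys are assumptions about the AIPromptEnrichment type, which is defined elsewhere in this reference, and the message text is illustrative.

```yaml
promptEnrichment:
  prepend:                           # assumed key: messages added before the prompt
  - role: SYSTEM
    content: "Answer all questions in French."
  append:                            # assumed key: messages added after the prompt
  - role: USER
    content: "Keep the answer under 100 words."
```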
Moderation
Moderation configures an external moderation model endpoint. This endpoint evaluates request prompt data against predefined content rules to determine if the content adheres to those rules.
Any requests routed through the AI Gateway are processed by the specified moderation model. If the model identifies the content as harmful based on its rules, the request is automatically rejected.
You can configure a moderation endpoint either as a standalone prompt guard setting or alongside other request and response guard settings.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
openAIModeration OpenAIConfig |
Pass prompt data through an external moderation model endpoint, which compares the request prompt input to predefined content rules. Configure an OpenAI moderation endpoint. |
MultiPoolConfig
MultiPoolConfig configures the backends for multiple hosts or models from the same provider in one Backend resource. This method can be useful for creating one logical endpoint that is backed by multiple hosts or models.
In the priorities section, the order of pool entries defines the priority of the backend endpoints.
The pool entries can either define a list of backends or a single backend.
Note: Only two levels of nesting are permitted. Any nested entries after the second level are ignored.
```yaml
multi:
  priorities:
    - pool:
        - azureOpenai:
            deploymentName: gpt-4o-mini
            apiVersion: 2024-02-15-preview
            endpoint: ai-gateway.openai.azure.com
            authToken:
              secretRef:
                name: azure-secret
                namespace: kgateway-system
    - pool:
        - azureOpenai:
            deploymentName: gpt-4o-mini-2
            apiVersion: 2024-02-15-preview
            endpoint: ai-gateway-2.openai.azure.com
            authToken:
              secretRef:
                name: azure-secret-2
                namespace: kgateway-system
```
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
priorities Priority array |
The priority list of backend pools. Each entry represents a set of LLM provider backends. The order defines the priority of the backend endpoints. |
MaxItems: 20 MinItems: 1 |
OpenAIConfig
OpenAIConfig settings for the OpenAI LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken | The authorization token that the AI gateway uses to access the OpenAI API. This token is automatically sent in the Authorization header of the request and prefixed with Bearer. | | |
model string | Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. This setting can be useful when setting up model failover within the same LLM provider. | | |
Parameters
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
tlsMinVersion TLSVersion |
Minimum TLS version. | Enum: [AUTO 1.0 1.1 1.2 1.3] |
|
tlsMaxVersion TLSVersion |
Maximum TLS version. | Enum: [AUTO 1.0 1.1 1.2 1.3] |
|
cipherSuites string array |
|||
ecdhCurves string array |
PathOverride
PathOverride configures the AI gateway to use a custom path for LLM provider chat-completion API requests. It allows overriding the default API path with a custom one. This is useful when you need to route requests to a different API endpoint while maintaining compatibility with the original provider’s API structure.
Validation:
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
fullPath string |
FullPath specifies the custom API path to use for the LLM provider requests. This path will replace the default API path for the provider. |
Pod
Configuration for a Kubernetes Pod template.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extraLabels object (keys:string, values:string) |
Additional labels to add to the Pod object metadata. | ||
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the Pod object metadata. | ||
securityContext PodSecurityContext |
The pod security context. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#podsecuritycontext-v1-core for details. |
||
imagePullSecrets LocalObjectReference array |
An optional list of references to secrets in the same namespace to use for pulling any of the images used by this Pod spec. See https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod for details. |
||
nodeSelector object (keys:string, values:string) |
A selector which must be true for the pod to fit on a node. See https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ for details. |
||
affinity Affinity |
If specified, the pod’s scheduling constraints. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#affinity-v1-core for details. |
||
tolerations Toleration array |
If specified, the pod’s tolerations. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#toleration-v1-core for details. |
||
gracefulShutdown GracefulShutdownSpec |
If specified, the pod’s graceful shutdown spec. | ||
terminationGracePeriodSeconds integer |
If specified, the pod’s termination grace period in seconds. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#pod-v1-core for details |
||
readinessProbe Probe |
If specified, the pod’s readiness probe. Periodic probe of container service readiness. Container will be removed from service endpoints if the probe fails. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#probe-v1-core for details. |
||
livenessProbe Probe | If specified, the pod’s liveness probe. Periodic probe of container liveness. Container will be restarted if the probe fails. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#probe-v1-core for details. | | |
PolicyAncestorStatus
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
ancestorRef ParentReference |
AncestorRef corresponds with a ParentRef in the spec that this PolicyAncestorStatus struct describes the status of. |
||
controllerName string |
ControllerName is a domain/path string that indicates the name of the controller that wrote this status. This corresponds with the controllerName field on GatewayClass. Example: “example.net/gateway-controller”. The format of this field is DOMAIN “/” PATH, where DOMAIN and PATH are valid Kubernetes names (https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names). Controllers MUST populate this field when writing status. Controllers should ensure that entries to status populated with their ControllerName are cleaned up when they are no longer necessary. |
||
conditions Condition array |
Conditions describes the status of the Policy with respect to the given Ancestor. | MaxItems: 8 MinItems: 1 |
Port
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
port integer |
The port number to match on the Gateway | ||
nodePort integer |
The NodePort to be used for the service. If not specified, a random port will be assigned by the Kubernetes API server. |
Priority
Priority configures the priority of the backend endpoints.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
pool LLMProvider array |
A list of LLM provider backends within a single endpoint pool entry. | MaxItems: 20 MinItems: 1 |
ProcessingMode
ProcessingMode defines how the filter should interact with the request/response streams.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
requestHeaderMode string | RequestHeaderMode determines how to handle the request headers. | SEND | Enum: [DEFAULT SEND SKIP] |
responseHeaderMode string | ResponseHeaderMode determines how to handle the response headers. | SEND | Enum: [DEFAULT SEND SKIP] |
requestBodyMode string | RequestBodyMode determines how to handle the request body. | NONE | Enum: [NONE STREAMED BUFFERED BUFFERED_PARTIAL FULL_DUPLEX_STREAMED] |
responseBodyMode string | ResponseBodyMode determines how to handle the response body. | NONE | Enum: [NONE STREAMED BUFFERED BUFFERED_PARTIAL FULL_DUPLEX_STREAMED] |
requestTrailerMode string | RequestTrailerMode determines how to handle the request trailers. | SKIP | Enum: [DEFAULT SEND SKIP] |
responseTrailerMode string | ResponseTrailerMode determines how to handle the response trailers. | SKIP | Enum: [DEFAULT SEND SKIP] |
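As a sketch, an ExtProcPolicy that sends request headers and a buffered request body to the external processor, but skips the response, might be written as follows. The fragment is shown without its parent key, and the extension name is hypothetical.

```yaml
extensionRef:
  name: example-extproc              # hypothetical GatewayExtension of type ExtProc
processingMode:
  requestHeaderMode: SEND
  responseHeaderMode: SKIP
  requestBodyMode: BUFFERED          # processor receives the whole request body
  responseBodyMode: NONE
  requestTrailerMode: SKIP
  responseTrailerMode: SKIP
```

Any mode that is left unset falls back to the default listed in the table above.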
PromptguardRequest
PromptguardRequest defines the prompt guards to apply to requests sent by the client. Multiple prompt guard configurations can be set, and they will be executed in the following order: webhook → regex → moderation for requests, where each step can reject the request and stop further processing.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
customResponse CustomResponse |
A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. |
||
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook to forward requests to for prompt guarding. | ||
moderation Moderation |
Pass prompt data through an external moderation model endpoint, which compares the request prompt input to predefined content rules. |
PromptguardResponse
PromptguardResponse defines the prompt guards to apply to responses returned by the LLM provider. Both webhook and regex can be set. They are executed in the following order: webhook → regex, where each step can reject the request and stop further processing.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook that receives forwarded responses for prompt guarding. |
ProxyDeployment
Configuration for the Proxy deployment in Kubernetes.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
replicas integer |
The number of desired pods. Defaults to 1. |
Publisher
Underlying type: string
Publisher configures the type of publisher model to use for VertexAI. Currently, only Google is supported.
Appears in:
Field | Description |
---|---|
GOOGLE |
RateLimit
RateLimit defines a rate limiting policy.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
local LocalRateLimitPolicy |
Local defines a local rate limiting policy. | ||
global RateLimitPolicy |
Global defines a global rate limiting policy using an external service. |
RateLimitDescriptor
RateLimitDescriptor defines a descriptor for rate limiting. A descriptor is a group of entries that form a single rate limit rule.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
entries RateLimitDescriptorEntry array |
Entries are the individual components that make up this descriptor. When translated to Envoy, these entries combine to form a single descriptor. |
MinItems: 1 |
RateLimitDescriptorEntry
RateLimitDescriptorEntry defines a single entry in a rate limit descriptor. Only one entry type may be specified.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type RateLimitDescriptorEntryType |
Type specifies what kind of rate limit descriptor entry this is. | Enum: [Generic Header RemoteAddress Path] |
|
generic RateLimitDescriptorEntryGeneric |
Generic contains the configuration for a generic key-value descriptor entry. This field must be specified when Type is Generic. |
||
header string |
Header specifies a request header to extract the descriptor value from. This field must be specified when Type is Header. |
RateLimitDescriptorEntryType
Underlying type: string
RateLimitDescriptorEntryType defines the type of a rate limit descriptor entry.
Validation:
- Enum: [Generic Header RemoteAddress Path]
Appears in:
Field | Description |
---|---|
Generic |
RateLimitDescriptorEntryTypeGeneric represents a generic key-value descriptor entry. |
Header |
RateLimitDescriptorEntryTypeHeader represents a descriptor entry that extracts its value from a request header. |
RemoteAddress |
RateLimitDescriptorEntryTypeRemoteAddress represents a descriptor entry that uses the client’s IP address as its value. |
Path |
RateLimitDescriptorEntryTypePath represents a descriptor entry that uses the request path as its value. |
RateLimitPolicy
RateLimitPolicy defines a global rate limiting policy using an external service.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
descriptors RateLimitDescriptor array |
Descriptors define the dimensions for rate limiting. These values are passed to the rate limit service which applies configured limits based on them. Each descriptor represents a single rate limit rule with one or more entries. |
MinItems: 1 |
|
extensionRef LocalObjectReference |
ExtensionRef references a GatewayExtension that provides the global rate limit service. |
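As a minimal sketch, a global rate limit inside a TrafficPolicy `rateLimit` block might look like the following fragment. The GatewayExtension name `rate-limit-ext`, the header `x-user-id`, and the `key` and `value` fields of RateLimitDescriptorEntryGeneric are assumptions made for illustration and are not documented in this section.

```yaml
rateLimit:
  global:
    descriptors:
    # First rule: one descriptor built from the client IP plus a request header.
    - entries:
      - type: RemoteAddress
      - type: Header
        header: x-user-id        # hypothetical header name
    # Second rule: a static key-value descriptor.
    - entries:
      - type: Generic
        generic:
          key: service           # assumed field name
          value: api             # assumed field name
    extensionRef:
      name: rate-limit-ext       # hypothetical GatewayExtension name
```

Each item under `descriptors` is one rate limit rule, and the entries inside an item are combined into a single descriptor that is sent to the rate limit service.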
RateLimitProvider
RateLimitProvider defines the configuration for a RateLimit service provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
grpcService ExtGrpcService |
GrpcService is the gRPC service that will handle the rate limiting. | ||
domain string |
Domain identifies a rate limiting configuration for the rate limit service. All rate limit requests must specify a domain, which enables the configuration to be per application without fear of overlap (e.g., “api”, “web”, “admin”). |
||
failOpen boolean |
FailOpen determines if requests are limited when the rate limit service is unavailable. When true, requests are not limited if the rate limit service is unavailable. |
false | |
timeout Duration |
Timeout for requests to the rate limit service. | 20ms |
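The following fragment sketches a RateLimitProvider configuration using the fields from the table above. The enclosing `rateLimit` key, the `backendRef` shape of ExtGrpcService, and the service name and port are assumptions for illustration only.

```yaml
rateLimit:                 # assumed parent key on the GatewayExtension spec
  grpcService:
    backendRef:            # assumed ExtGrpcService field
      name: ratelimit      # hypothetical Service name
      port: 8081           # hypothetical port
  domain: api              # must match the domain configured in the rate limit service
  failOpen: false          # requests are limited (not waved through) if the service is unavailable
  timeout: 20ms            # same as the documented default
```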
Regex
Regex configures the regular expression (regex) matching for prompt guards and data masking.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
matches RegexMatch array |
A list of regex patterns to match against the request or response. Matches and built-ins are additive. |
||
builtins BuiltIn array |
A list of built-in regex patterns to match against the request or response. Matches and built-ins are additive. |
Enum: [SSN CREDIT_CARD PHONE_NUMBER EMAIL] |
|
action Action |
The action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default. Defaults to MASK. |
MASK |
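A short sketch of a Regex block that combines custom matches with built-in patterns. The pattern and its name are hypothetical. Because matches and built-ins are additive, a prompt that hits either list triggers the action.

```yaml
regex:
  matches:
  - name: internal-ticket-id          # optional, hypothetical name used for debugging
    pattern: "TICKET-[0-9]{6}"        # hypothetical pattern
  builtins:
  - CREDIT_CARD
  - SSN
  action: MASK                        # the documented default
```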
RegexMatch
RegexMatch configures the regular expression (regex) matching for prompt guards and data masking.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
pattern string |
The regex pattern to match against the request or response. | ||
name string |
An optional name for this match, which can be used for debugging purposes. |
ResponseFlagFilter
ResponseFlagFilter filters based on response flags. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-responseflagfilter
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
flags string array |
MinItems: 1 |
RouteType
Underlying type: string
RouteType is the type of route to the LLM provider API.
Appears in:
Field | Description |
---|---|
CHAT |
The LLM generates the full response before responding to a client. |
CHAT_STREAMING |
Stream responses to a client, which allows the LLM to stream out tokens as they are generated. |
SdsBootstrap
Configuration for the SDS instance that is provisioned from a Kubernetes Gateway.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
logLevel string |
Log level for SDS. Options include “info”, “debug”, “warn”, “error”, “panic” and “fatal”. Default level is “info”. |
SdsContainer
Configuration for the container running Gloo SDS.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
image Image |
The SDS container image. See https://kubernetes.io/docs/concepts/containers/images for details. |
||
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
bootstrap SdsBootstrap |
Initial SDS container configuration. |
SelfManagedGateway
Appears in:
Service
Configuration for a Kubernetes Service.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type ServiceType |
The Kubernetes Service type. | Enum: [ClusterIP NodePort LoadBalancer ExternalName] |
|
clusterIP string |
The manually specified IP address of the service, if a randomly assigned IP is not desired. See https://kubernetes.io/docs/concepts/services-networking/service/#choosing-your-own-ip-address and https://kubernetes.io/docs/concepts/services-networking/service/#headless-services on the implications of setting clusterIP. |
||
extraLabels object (keys:string, values:string) |
Additional labels to add to the Service object metadata. | ||
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the Service object metadata. | ||
ports Port array |
Additional configuration for the service ports. The actual port numbers are specified in the Gateway resource. |
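For illustration, a Service block using the fields above could look like this fragment. The enclosing `service` key and the label and annotation values are assumptions.

```yaml
service:                                   # assumed parent key
  type: LoadBalancer
  extraLabels:
    team: platform                         # hypothetical label
  extraAnnotations:
    example.com/owner: gateway-admins      # hypothetical annotation
```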
ServiceAccount
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extraLabels object (keys:string, values:string) |
Additional labels to add to the ServiceAccount object metadata. | ||
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the ServiceAccount object metadata. |
SingleAuthToken
SingleAuthToken configures the authorization token that the AI gateway uses to access the LLM provider API. This token is automatically sent in a request header, depending on the LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
kind SingleAuthTokenKind |
Kind specifies which type of authorization token is being used. Must be one of: “Inline”, “SecretRef”, “Passthrough”. |
Enum: [Inline SecretRef Passthrough] |
|
inline string |
Provide the token directly in the configuration for the Backend. This option is the least secure. Only use this option for quick tests such as trying out AI Gateway. |
||
secretRef LocalObjectReference |
Store the API key in a Kubernetes secret in the same namespace as the Backend. Then, refer to the secret in the Backend configuration. This option is more secure than an inline token, because the API key is encoded and you can restrict access to secrets through RBAC rules. You might use this option in proofs of concept, controlled development and staging environments, or well-controlled prod environments that use secrets. |
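As a sketch, the two non-inline token kinds might be configured as follows. The `authToken` parent key matches the field name used by the provider configs in this reference, and the secret name is hypothetical.

```yaml
# Option 1: read the token from a Kubernetes secret in the Backend's namespace.
authToken:
  kind: SecretRef
  secretRef:
    name: llm-provider-secret   # hypothetical secret name
---
# Option 2: reuse the token that the client already sent in the request.
authToken:
  kind: Passthrough
```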
SingleAuthTokenKind
Underlying type: string
Appears in:
Field | Description |
---|---|
Inline |
Inline provides the token directly in the configuration for the Backend. |
SecretRef |
SecretRef provides the token by referencing a Kubernetes secret in the same namespace as the Backend. |
Passthrough |
Passthrough passes through the existing token. This token can either come directly from the client, or be generated by an OIDC flow early in the request lifecycle. This option is useful for backends that have federated identity set up and can reuse the token from the client. Currently, this token must exist in the Authorization header. |
SlowStart
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
window Duration |
Represents the size of the slow start window. If set, a newly created host remains in slow start mode from its creation time for the duration of the slow start window. |
||
aggression string |
This parameter controls the speed of traffic increase over the slow start window. Defaults to 1.0, so that the endpoint gets a linearly increasing amount of traffic. Increasing the value of this parameter makes the traffic ramp-up non-linear. The value of the aggression parameter should be greater than 0.0. By tuning the parameter, it is possible to achieve a polynomial or exponential ramp-up curve. During the slow start window, the effective weight of an endpoint is scaled by the time factor and aggression: new_weight = weight * max(min_weight_percent, time_factor ^ (1 / aggression)), where time_factor = (time_since_start_seconds / slow_start_time_seconds). As time progresses, more and more traffic is sent to the endpoint that is in the slow start window. Once the host exits slow start, time_factor and aggression no longer affect its weight. |
||
minWeightPercent integer |
Minimum weight percentage of an endpoint during slow start. |
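The weight formula above can be made concrete with a small worked example. The `slowStart` parent key and the chosen values are assumptions, and the arithmetic assumes that a `minWeightPercent` of 10 enters the formula as 0.10.

```yaml
slowStart:                 # assumed parent key
  window: 60s
  aggression: "2.0"        # string-typed field, so the value is quoted
  minWeightPercent: 10
# Worked example for an endpoint 15s after its creation:
#   time_factor                    = 15 / 60            = 0.25
#   time_factor ^ (1 / aggression) = 0.25 ^ (1 / 2.0)   = 0.5
#   new_weight                     = weight * max(0.10, 0.5) = weight * 0.5
# With aggression "1.0", the same endpoint would get weight * 0.25.
```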
StaticBackend
StaticBackend references a static list of hosts.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
hosts Host array |
Hosts is a list of hosts to use for the backend. | MinItems: 1 |
|
appProtocol AppProtocol |
AppProtocol is the application protocol to use when communicating with the backend. | Enum: [http2 grpc grpc-web kubernetes.io/h2c kubernetes.io/ws] Optional |
StatsConfig
Configuration for the stats server.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabled boolean |
Whether to expose metrics annotations and ports for scraping metrics. | ||
routePrefixRewrite string |
The Envoy stats endpoint to which the metrics are written | ||
enableStatsRoute boolean |
Enables an additional route to the stats cluster defaulting to /stats | ||
statsRoutePrefixRewrite string |
The Envoy stats endpoint with general metrics for the additional stats route |
StatusCodeFilter
Underlying type: ComparisonFilter
StatusCodeFilter filters based on HTTP status code. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#envoy-v3-api-msg-config-accesslog-v3-statuscodefilter
Appears in:
StringMatcher
Specifies the way to match a string.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
exact string |
The input string must match exactly the string specified here. Example: abc matches the value abc |
||
prefix string |
The input string must have the prefix specified here. Note: An empty prefix is not allowed. Use regex instead. Example: abc matches the value abc.xyz |
||
suffix string |
The input string must have the suffix specified here. Note: An empty suffix is not allowed. Use regex instead. Example: abc matches the value xyz.abc |
||
contains string |
The input string must contain the substring specified here. Example: abc matches the value xyz.abc.def |
||
safeRegex string |
The input string must match the Google RE2 regular expression specified here. See https://github.com/google/re2/wiki/Syntax for the syntax. |
||
ignoreCase boolean |
If true, indicates the exact/prefix/suffix/contains matching should be case insensitive. This has no effect on the regex match. For example, the matcher data will match both input string Data and data if this option is set to true. |
false |
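To illustrate the matcher fields, the following sketch lists three independent StringMatcher objects. The values are hypothetical. Each object sets a single match type.

```yaml
# Matches only the exact value "abc".
- exact: abc
# Matches "abc.xyz" and, because ignoreCase is true, also "ABC.xyz".
- prefix: abc
  ignoreCase: true
# RE2 expression; ignoreCase has no effect on regex matching.
- safeRegex: "^abc-[0-9]+$"
```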
SupportedLLMProvider
SupportedLLMProvider configures the AI gateway to use a single LLM provider backend.
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
openai OpenAIConfig |
|||
azureopenai AzureOpenAIConfig |
|||
anthropic AnthropicConfig |
|||
gemini GeminiConfig |
|||
vertexai VertexAIConfig |
TCPKeepalive
See Envoy documentation for more details.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
keepAliveProbes integer |
Maximum number of keep-alive probes to send before dropping the connection. | ||
keepAliveTime Duration |
The number of seconds a connection needs to be idle before keep-alive probes start being sent. | ||
keepAliveInterval Duration |
The number of seconds between keep-alive probes. |
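A minimal sketch of a TCPKeepalive block. The `tcpKeepalive` parent key and the values are assumptions for illustration.

```yaml
tcpKeepalive:              # assumed parent key
  keepAliveProbes: 3       # drop the connection after 3 unanswered probes
  keepAliveTime: 30s       # idle time before the first probe is sent
  keepAliveInterval: 5s    # time between probes
```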
TLS
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
secretRef LocalObjectReference |
Reference to the TLS secret containing the certificate, key, and optionally the root CA. | ||
tlsFiles TLSFiles |
File paths to certificates local to the proxy. | ||
sni string |
The SNI domains that should be considered for TLS connection | ||
verifySubjectAltName string array |
Verify that the Subject Alternative Name in the peer certificate is one of the specified values. Note that a root_ca must be provided if this option is used. |
||
parameters Parameters |
General TLS parameters. See the Envoy docs for more information on the meaning of these values. |
||
alpnProtocols string array |
Set the Application-Layer Protocol Negotiation (ALPN) protocols. If empty, defaults to [“h2”, “http/1.1”]. |
||
allowRenegotiation boolean |
Allow TLS renegotiation. The default value is false. TLS renegotiation is considered insecure and shouldn’t be used unless absolutely necessary. |
||
oneWayTLS boolean |
If the TLS config has the ca.crt (root CA) provided, kgateway uses it to perform mTLS by default. Set oneWayTLS to true to disable mTLS in favor of server-only TLS (one-way TLS), even if kgateway has the root CA. If unset, defaults to false. |
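The following fragment sketches a TLS block that verifies the peer certificate but does not present a client certificate. The `tls` parent key, the secret name, and the host name are assumptions. The referenced secret would need to include the root CA for `verifySubjectAltName` to work.

```yaml
tls:                               # assumed parent key
  secretRef:
    name: backend-tls              # hypothetical secret with certificate, key, and root CA
  sni: backend.example.com         # hypothetical host name
  verifySubjectAltName:
  - backend.example.com
  alpnProtocols:
  - h2
  - http/1.1
  oneWayTLS: true                  # server-only TLS even though a root CA is present
```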
TLSFiles
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
tlsCertificate string |
|||
tlsKey string |
|||
rootCA string |
TLSVersion
Underlying type: string
TLSVersion defines the TLS version.
Validation:
- Enum: [AUTO 1.0 1.1 1.2 1.3]
Appears in:
Field | Description |
---|---|
AUTO |
|
1.0 |
|
1.1 |
|
1.2 |
|
1.3 |
TokenBucket
TokenBucket defines the configuration for a token bucket rate-limiting mechanism. It controls the rate at which tokens are generated and consumed for a specific operation.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
maxTokens integer |
MaxTokens specifies the maximum number of tokens that the bucket can hold. This value must be greater than or equal to 1. It determines the burst capacity of the rate limiter. |
Minimum: 1 |
|
tokensPerFill integer |
TokensPerFill specifies the number of tokens added to the bucket during each fill interval. If not specified, it defaults to 1. This controls the steady-state rate of token generation. |
1 | |
fillInterval Duration |
FillInterval defines the time duration between consecutive token fills. This value must be a valid duration string (e.g., “1s”, “500ms”). It determines the frequency of token replenishment. |
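As a sketch of how the three fields interact, consider the following fragment. The `tokenBucket` parent key and the values are assumptions.

```yaml
tokenBucket:          # assumed parent key
  maxTokens: 10       # burst capacity: up to 10 tokens can be consumed at once
  tokensPerFill: 2    # tokens added at each fill
  fillInterval: 1s    # one fill per second
# Steady-state rate: tokensPerFill / fillInterval = 2 tokens per second.
# After the bucket is emptied, it takes 5 fills (5s) to reach maxTokens again.
```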
TrafficPolicy
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
TrafficPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec TrafficPolicySpec |
|||
status PolicyStatus |
TrafficPolicySpec
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
targetRefs LocalPolicyTargetReferenceWithSectionName array |
TargetRefs specifies the target resources by reference to attach the policy to. | MaxItems: 16 MinItems: 1 |
|
targetSelectors LocalPolicyTargetSelector array |
TargetSelectors specifies the target selectors to select resources to attach the policy to. | ||
ai AIPolicy |
AI is used to configure AI-specific settings for the policy. | ||
transformation TransformationPolicy |
Transformation is used to mutate and transform requests and responses before forwarding them to the destination. |
||
extProc ExtProcPolicy |
ExtProc specifies the external processing configuration for the policy. | ||
extAuth ExtAuthPolicy |
ExtAuth specifies the external authentication configuration for the policy. This controls what external server to send requests to for authentication. |
||
rateLimit RateLimit |
RateLimit specifies the rate limiting configuration for the policy. This controls the rate at which requests are allowed to be processed. |
||
cors CorsPolicy |
Cors specifies the CORS configuration for the policy. | ||
csrf CSRFPolicy |
Csrf specifies the Cross-Site Request Forgery (CSRF) policy for this traffic policy. |
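A minimal sketch of a complete TrafficPolicy that attaches to a route and sets an AI route type. The resource names and namespace are hypothetical, and the `group`, `kind`, and `name` fields of the target reference are assumed to follow the usual Gateway API policy attachment shape.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: chat-streaming             # hypothetical name
  namespace: default               # hypothetical namespace
spec:
  targetRefs:
  - group: gateway.networking.k8s.io   # assumed target reference fields
    kind: HTTPRoute
    name: llm-route                    # hypothetical HTTPRoute name
  ai:
    routeType: CHAT_STREAMING
```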
Transform
Transform defines the operations to be performed by the transformation. These operations may include changing the actual request/response but may also cause side effects. Side effects may include setting info that can be used in future steps (e.g. dynamic metadata) and can cause Envoy to buffer.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
set HeaderTransformation array |
Set is a list of headers and the value they should be set to. | MaxItems: 16 |
|
add HeaderTransformation array |
Add is a list of headers to add to the request and what that value should be set to. If there is already a header with these values then append the value as an extra entry. |
MaxItems: 16 |
|
remove string array |
Remove is a list of header names to remove from the request/response. | MaxItems: 16 |
|
body BodyTransformation |
Body controls both how to parse the body and, if needed, how to set it. If empty, body will not be buffered. |
TransformationPolicy
TransformationPolicy config is used to modify Envoy behavior at a route level. These modifications can be performed on the request and response paths.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
request Transform |
Request is used to modify the request path. | ||
response Transform |
Response is used to modify the response path. |
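For illustration, a `transformation` block in a TrafficPolicy spec might combine header operations on both paths as in the following sketch. The header names and values are hypothetical, and the `name` and `value` fields of HeaderTransformation are assumptions because that type is not documented in this section.

```yaml
transformation:
  request:
    set:
    - name: x-request-source       # assumed field names: name, value
      value: gateway
    remove:
    - x-internal-debug             # hypothetical header to strip
  response:
    add:
    - name: x-served-by
      value: kgateway
```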
UpgradeConfig
UpgradeConfig represents configuration for HTTP upgrades.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabledUpgrades string array |
List of upgrade types to enable (e.g. “websocket”, “CONNECT”, etc.) | MinItems: 1 |
VertexAIConfig
VertexAIConfig settings for the Vertex AI LLM provider.
To find the values for the project ID, project location, and publisher, you can check the fields of an API request, such as https://{LOCATION}-aiplatform.googleapis.com/{VERSION}/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/{PROVIDER}/<model-path>.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken |
The authorization token that the AI gateway uses to access the Vertex AI API. This token is automatically sent in the key header of the request. |
||
model string |
The Vertex AI model to use. For more information, see the Vertex AI model docs. |
MinLength: 1 |
|
apiVersion string |
The version of the Vertex AI API to use. For more information, see the Vertex AI API reference. |
MinLength: 1 |
|
projectId string |
The ID of the Google Cloud Project that you use for Vertex AI. | MinLength: 1 |
|
location string |
The location of the Google Cloud Project that you use for Vertex AI. | MinLength: 1 |
|
modelPath string |
Optional: The model path to route to. Defaults to the Gemini model path, generateContent. |
||
publisher Publisher |
The type of publisher model to use. Currently, only Google is supported. | Enum: [GOOGLE] |
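A sketch of a VertexAIConfig block that uses the fields above. The model name, API version, project ID, location, and secret name are hypothetical placeholder values.

```yaml
vertexai:
  authToken:
    kind: SecretRef
    secretRef:
      name: vertex-ai-secret       # hypothetical secret name
  model: gemini-1.5-flash-001      # hypothetical model name
  apiVersion: v1                   # hypothetical API version
  projectId: my-gcp-project        # hypothetical project ID
  location: us-central1            # hypothetical location
  publisher: GOOGLE
```

These values map onto the request URL pattern shown earlier: `location` fills `{LOCATION}`, `apiVersion` fills `{VERSION}`, and `projectId` fills `{PROJECT_ID}`.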
Webhook
Webhook configures a webhook that receives forwarded requests or responses for prompt guarding.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
host Host |
Host to send the traffic to. Note: TLS is not currently supported for webhook. |
||
forwardHeaders HTTPHeaderMatch array |
ForwardHeaders define headers to forward with the request to the webhook. |