API reference
Packages
gateway.kgateway.dev/v1alpha1
Resource Types
AIBackend
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
llm LLMProvider | The LLM configures the AI gateway to use a single LLM provider backend. | | |
multipool MultiPoolConfig | The MultiPool configures the backends for multiple hosts or models from the same provider in one Backend resource. | | |
AIPolicy
AIPolicy config is used to configure the behavior of the LLM provider on the level of individual routes. These route settings, such as prompt enrichment, retrieval augmented generation (RAG), and semantic caching, are applicable only for routes that send requests to an LLM provider backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
promptEnrichment AIPromptEnrichment | Enrich requests sent to the LLM provider by appending and prepending system prompts. This can be configured only for LLM providers that use the CHAT or CHAT_STREAMING API route type. | | |
promptGuard AIPromptGuard | Set up prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response. | | |
defaults FieldDefault array | Provide defaults to merge with user input fields. Defaults do not override the user input fields, unless you explicitly set override to true. | | |
routeType RouteType | The type of route to the LLM provider API. Currently, CHAT and CHAT_STREAMING are supported. | CHAT | Enum: [CHAT CHAT_STREAMING] |
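
As a rough sketch of how `routeType` and `defaults` might sit next to each other, the fragment below reuses the `spec.targetRefs` and `spec.ai` layout from the AIPromptEnrichment example in this reference. The route name `openai` and the chosen values are illustrative assumptions. The enclosing resource's `apiVersion`, `kind`, and `metadata` are omitted because this section does not show them.

```yaml
# Illustrative fragment only; names and values are assumptions.
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai                 # hypothetical HTTPRoute name
  ai:
    routeType: CHAT_STREAMING    # one of the two documented enum values
    defaults:
    - field: "temperature"       # key in the JSON request body
      value: "0.5"
```

Because `override` is not set on the default, a `temperature` value supplied by the client would be kept.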
AIPromptEnrichment
AIPromptEnrichment defines the config to enrich requests sent to the LLM provider by appending and prepending system prompts.
This can be configured only for LLM providers that use the `CHAT` or `CHAT_STREAMING` API type.
Prompt enrichment allows you to add additional context to the prompt before sending it to the model. Unlike RAG or other dynamic context methods, prompt enrichment is static and is applied to every request.
Note: Some providers, including Anthropic, do not support SYSTEM role messages, and instead have a dedicated `system` field in the input JSON. In this case, use the `defaults` setting to set the `system` field.
The following example prepends a system prompt of `Answer all questions in French.` and appends `Describe the painting as if you were a famous art critic from the 17th century.` to each request that is sent to the `openai` HTTPRoute.

```yaml
name: openai-opt
namespace: kgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai
  ai:
    promptEnrichment:
      prepend:
      - role: SYSTEM
        content: "Answer all questions in French."
      append:
      - role: USER
        content: "Describe the painting as if you were a famous art critic from the 17th century."
```

Appears in:
Field | Description | Default | Validation |
---|---|---|---|
prepend Message array | A list of messages to be prepended to the prompt sent by the client. | | |
append Message array | A list of messages to be appended to the prompt sent by the client. | | |
AIPromptGuard
AIPromptGuard configures prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response.
This example rejects any request prompts that contain the string “credit card”, and masks any credit card numbers in the response.

```yaml
promptGuard:
  request:
    customResponse:
      message: "Rejected due to inappropriate content"
    regex:
      action: REJECT
      matches:
      - pattern: "credit card"
        name: "CC"
  response:
    regex:
      builtins:
      - CREDIT_CARD
      action: MASK
```

Appears in:
Field | Description | Default | Validation |
---|---|---|---|
request PromptguardRequest | Prompt guards to apply to requests sent by the client. | | |
response PromptguardResponse | Prompt guards to apply to responses returned by the LLM provider. | | |
AccessLog
AccessLog represents the top-level access log configuration.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
fileSink FileSink | Output access logs to a local file. | | |
grpcService GrpcService | Send access logs to a gRPC service. | | |
filter AccessLogFilter | Filter configuration for access logs. | | MaxProperties: 1 MinProperties: 1 |
AccessLogFilter
AccessLogFilter represents the top-level filter structure. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-accesslogfilter
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
andFilter FilterType array | Performs a logical “and” operation on the result of each individual filter. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-andfilter | | MaxProperties: 1 MinItems: 2 MinProperties: 1 |
orFilter FilterType array | Performs a logical “or” operation on the result of each individual filter. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-orfilter | | MaxProperties: 1 MinItems: 2 MinProperties: 1 |
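
To make the AccessLog and AccessLogFilter constraints more concrete, here is a minimal sketch of an AccessLog object that writes to a file and only logs non-health-check requests with an error status. The file path, the Envoy format string, and the CEL expression are assumptions chosen for illustration. The parent resource that holds this object is not shown in this section.

```yaml
# Illustrative AccessLog fragment; path, format string, and CEL expression are assumptions.
fileSink:
  path: /dev/stdout
  stringFormat: "[%START_TIME%] %REQ(:METHOD)% %RESPONSE_CODE%\n"
filter:
  andFilter:                     # MinItems: 2, so at least two FilterType entries
  - notHealthCheckFilter: true   # each entry sets exactly one property
  - celFilter:
      match: "response.code >= 400"
```

The `filter` object sets only `andFilter`, which satisfies the MaxProperties: 1 and MinProperties: 1 validation.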
Action
Underlying type: string
Action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default.
Appears in:
Field | Description |
---|---|
MASK | Mask the matched data in the request. |
REJECT | Reject the request if the regex matches content in the request. |
AiExtension
Configuration for the AI extension.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabled boolean | Whether to enable the extension. | | Optional |
image Image | The extension’s container image. See https://kubernetes.io/docs/concepts/containers/images for details. | | Optional |
securityContext SecurityContext | The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. | | Optional |
resources ResourceRequirements | The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. | | Optional |
env EnvVar array | The extension’s container environment variables. | | Optional |
ports ContainerPort array | The extension’s container ports. | | Optional |
stats AiExtensionStats | Additional stats config for AI Extension. This config can be useful for adding custom labels to the request metrics. | | |

Example `stats` configuration:

```yaml
stats:
  customLabels:
    - name: "subject"
      metadataNamespace: "envoy.filters.http.jwt_authn"
      metadataKey: "principal:sub"
    - name: "issuer"
      metadataNamespace: "envoy.filters.http.jwt_authn"
      metadataKey: "principal:iss"
```

AiExtensionStats
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
customLabels CustomLabel array | Set of custom labels to be added to the request metrics. These labels are added on each request that goes through the AI Extension. | | |
AnthropicConfig
AnthropicConfig settings for the Anthropic LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken | The authorization token that the AI gateway uses to access the Anthropic API. This token is automatically sent in the x-api-key header of the request. | | Required |
apiVersion string | Optional: A version header to pass to the Anthropic API. For more information, see the Anthropic API versioning docs. | | |
model string | Optional: Override the model name. If unset, the model name is taken from the request. This setting can be useful when testing model failover scenarios. | | |
AwsAuth
AwsAuth specifies the authentication method to use for the backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type AwsAuthType | Type specifies the authentication method to use for the backend. | | Enum: [Secret] Required |
secretRef LocalObjectReference | SecretRef references a Kubernetes Secret containing the AWS credentials. The Secret must have keys “accessKey”, “secretKey”, and optionally “sessionToken”. | | Optional |
AwsAuthType
Underlying type: string
AwsAuthType specifies the authentication method to use for the backend.
Appears in:
Field | Description |
---|---|
Secret | AwsAuthTypeSecret uses credentials stored in a Kubernetes Secret. |
AwsBackend
AwsBackend is the AWS backend configuration.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
accountId string | AccountId is the AWS account ID to use for the backend. | | MaxLength: 12 MinLength: 1 Pattern: `^[0-9]{12}$` Required |
auth AwsAuth | Auth specifies an explicit AWS authentication method for the backend. When omitted, the authentication method is inferred from the environment (e.g. instance metadata, EKS Pod Identity, environment variables, etc.). This may not work in all environments, so it is recommended to specify an authentication method. See the Envoy docs for more info: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/aws_request_signing_filter#credentials | | Optional |
lambda AwsLambda | Lambda configures the AWS Lambda service. | | Optional |
region string | Region is the AWS region to use for the backend. Defaults to us-east-1 if not specified. | us-east-1 | MaxLength: 63 MinLength: 1 Optional Pattern: `^[a-z0-9-]+$` |
AwsLambda
AwsLambda configures the AWS lambda service.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
endpointURL string | EndpointURL is the URL or domain for the Lambda service. This is primarily useful for testing and development purposes. When omitted, the default Lambda hostname is used. | | MaxLength: 2048 Optional Pattern: `^https?://[-a-zA-Z0-9@:%.+~#?&/=]+$` |
functionName string | FunctionName is the name of the Lambda function to invoke. | | Pattern: `^[A-Za-z0-9-_]{1,140}$` Required |
invocationMode string | InvocationMode defines how to invoke the Lambda function. Defaults to Sync. | Sync | Enum: [Sync Async] Optional |
qualifier string | Qualifier is the alias or version for the Lambda function. Valid values include a numeric version (e.g. “1”), an alias name (alphanumeric plus “-” or “_”), or the special literal “$LATEST”. | | Optional Pattern: `^(\$LATEST\|[0-9]+\|[A-Za-z0-9-_]{1,128})$` |
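
The AwsBackend, AwsAuth, and AwsLambda fields above can be combined into a single Backend resource. The following is a minimal sketch, assuming hypothetical resource names (`lambda-backend`, `aws-creds`, `hello-world`) and a placeholder account ID. Only fields documented in this reference are used.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  name: lambda-backend            # hypothetical name
  namespace: kgateway-system      # namespace borrowed from the AIPromptEnrichment example
spec:
  type: AWS
  aws:
    accountId: "123456789012"     # placeholder; must be exactly 12 digits
    region: us-west-2             # defaults to us-east-1 when omitted
    auth:
      type: Secret
      secretRef:
        name: aws-creds           # hypothetical Secret with accessKey and secretKey keys
    lambda:
      functionName: hello-world   # hypothetical function name
      invocationMode: Async       # Sync is the default
      qualifier: "$LATEST"
```

Quoting `accountId` keeps YAML from parsing it as an integer, which matters because the field is a string validated against a pattern.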
AzureOpenAIConfig
AzureOpenAIConfig settings for the Azure OpenAI LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken | The authorization token that the AI gateway uses to access the Azure OpenAI API. This token is automatically sent in the api-key header of the request. | | Required |
endpoint string | The endpoint for the Azure OpenAI API to use, such as my-endpoint.openai.azure.com. If the scheme is included, it is stripped. | | MinLength: 1 Required |
deploymentName string | The name of the Azure OpenAI model deployment to use. For more information, see the Azure OpenAI model docs. | | MinLength: 1 Required |
apiVersion string | The version of the Azure OpenAI API to use. For more information, see the Azure OpenAI API version reference. | | MinLength: 1 Required |
Backend
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | gateway.kgateway.dev/v1alpha1. APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | | |
kind string | Backend. Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | | |
metadata ObjectMeta | Refer to the Kubernetes API documentation for the fields of metadata. | | |
spec BackendSpec | | | |
status BackendStatus | | | |
BackendSpec
BackendSpec defines the desired state of Backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type BackendType | Type indicates the type of the backend to be used. | | Enum: [AI AWS Static] Required |
ai AIBackend | AI is the AI backend configuration. | | MaxProperties: 1 MinProperties: 1 |
aws AwsBackend | Aws is the AWS backend configuration. | | |
static StaticBackend | Static is the static backend configuration. | | |
BackendStatus
BackendStatus defines the observed state of Backend.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
conditions Condition array | Conditions is the list of conditions for the backend. | | MaxItems: 8 |
BackendType
Underlying type: string
BackendType indicates the type of the backend.
Appears in:
Field | Description |
---|---|
AI | BackendTypeAI is the type for AI backends. |
AWS | BackendTypeAWS is the type for AWS backends. |
Static | BackendTypeStatic is the type for static backends. |
BodyParseBehavior
Underlying type: string
BodyParseBehavior defines how the body should be parsed. If set to AsJson and the body is not JSON, the filter does not perform the transformation.
Validation:
- Enum: [AsString AsJson]
Appears in:
Field | Description |
---|---|
AsString | BodyParseBehaviorAsString will parse the body as a string. |
AsJson | BodyParseBehaviorAsJSON will parse the body as a JSON object. |
BodyTransformation
BodyTransformation controls how the body should be parsed and transformed.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
parseAs BodyParseBehavior | ParseAs defines what auto formatting should be applied to the body. Selecting AsJson can make it much easier to interact with keys within a JSON body. | AsString | Enum: [AsString AsJson] |
value InjaTemplate | Value is the template to apply to generate the output value for the body. | | |
BufferSettings
BufferSettings configures how the request body should be buffered.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
maxRequestBytes integer | MaxRequestBytes sets the maximum size of a message body to buffer. Requests exceeding this size will receive HTTP 413 and not be sent to the authorization service. | | Minimum: 1 Required |
allowPartialMessage boolean | AllowPartialMessage determines if partial messages should be allowed. When true, requests will be sent to the authorization service even if they exceed maxRequestBytes. When unset, the default behavior is false. | | |
packAsBytes boolean | PackAsBytes determines if the body should be sent as raw bytes. When true, the body is sent as raw bytes in the raw_body field. When false, the body is sent as a UTF-8 string in the body field. When unset, the default behavior is false. | | |
BuiltIn
Underlying type: string
BuiltIn regex patterns for specific types of strings in prompts.
For example, if you specify `CREDIT_CARD`, any credit card numbers in the request or response are matched.
Validation:
- Enum: [SSN CREDIT_CARD PHONE_NUMBER EMAIL]
Appears in:
Field | Description |
---|---|
SSN | Default regex matching for Social Security numbers. |
CREDIT_CARD | Default regex matching for credit card numbers. |
PHONE_NUMBER | Default regex matching for phone numbers. |
EMAIL | Default regex matching for email addresses. |
CELFilter
CELFilter filters requests based on Common Expression Language (CEL).
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
match string | The CEL expression to evaluate. Access logs are emitted only when the CEL expression evaluates to true. See: https://www.envoyproxy.io/docs/envoy/v1.33.0/xds/type/v3/cel.proto.html#common-expression-language-cel-proto | | |
ComparisonFilter
Underlying type: struct{Op Op `json:"op,omitempty"`; Value uint32 `json:"value,omitempty"`}
ComparisonFilter represents a filter based on a comparison. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-comparisonfilter
Appears in:
CustomLabel
Appears in:
CustomResponse
CustomResponse configures a response to return to the client if request content is matched against a regex pattern and the action is `REJECT`.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
message string | A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. | The request was rejected due to inappropriate content | |
statusCode integer | The status code to return to the client. Defaults to 403. | 403 | Maximum: 599 Minimum: 200 |
DirectResponse
DirectResponse contains configuration for defining direct response routes.
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | gateway.kgateway.dev/v1alpha1. APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | | |
kind string | DirectResponse. Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | | |
metadata ObjectMeta | Refer to the Kubernetes API documentation for the fields of metadata. | | |
spec DirectResponseSpec | | | |
status DirectResponseStatus | | | |
DirectResponseSpec
DirectResponseSpec describes the desired state of a DirectResponse.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
status integer | StatusCode defines the HTTP status code to return for this route. | | Maximum: 599 Minimum: 200 Required |
body string | Body defines the content to be returned in the HTTP response body. The maximum length of the body is restricted to prevent excessively large responses. | | MaxLength: 4096 Optional |
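
A DirectResponse resource built from the two spec fields above might look like the following minimal sketch. The resource name and the body text are assumptions made for illustration.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: DirectResponse
metadata:
  name: maintenance-response     # hypothetical name
spec:
  status: 503                    # required, must be between 200 and 599
  body: "Service temporarily unavailable."   # optional, at most 4096 characters
```
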
DirectResponseStatus
DirectResponseStatus defines the observed state of a DirectResponse.
Appears in:
DurationFilter
Underlying type: ComparisonFilter
DurationFilter filters based on request duration. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-durationfilter
Appears in:
EnvoyBootstrap
Configuration for the Envoy proxy instance that is provisioned from a Kubernetes Gateway.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
logLevel string | Envoy log level. Options include “trace”, “debug”, “info”, “warn”, “error”, “critical” and “off”. Defaults to “info”. See https://www.envoyproxy.io/docs/envoy/latest/start/quick-start/run-envoy#debugging-envoy for more information. | | Optional |
componentLogLevels object (keys:string, values:string) | Envoy log levels for specific components. The keys are component names and the values are one of “trace”, “debug”, “info”, “warn”, “error”, “critical”, or “off”, for example `componentLogLevels: {upstream: debug, connection: trace}`. These are converted to the value of the `--component-log-level` Envoy argument. See https://www.envoyproxy.io/docs/envoy/latest/start/quick-start/run-envoy#debugging-envoy for more information. Note: the keys and values cannot be empty, but they are not otherwise validated. | | Optional |
EnvoyContainer
Configuration for the container running Envoy.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
bootstrap EnvoyBootstrap | Initial Envoy configuration. | | Optional |
image Image | The Envoy container image. See https://kubernetes.io/docs/concepts/containers/images for details. Default values, which may be overridden individually: registry: quay.io/solo-io repository: gloo-envoy-wrapper (OSS) / gloo-ee-envoy-wrapper (EE) tag: pullPolicy: IfNotPresent | | Optional |
securityContext SecurityContext | The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. | | Optional |
resources ResourceRequirements | The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. | | Optional |
ExtAuthEnabled
Underlying type: string
ExtAuthEnabled determines the enabled state of the ExtAuth filter.
Validation:
- Enum: [DisableAll]
Appears in:
Field | Description |
---|---|
DisableAll | ExtAuthDisableAll disables all instances of the ExtAuth filter for this route. This allows the filter to be disabled globally for a route, such as a health check route. |
ExtAuthPolicy
ExtAuthPolicy configures external authentication for a route. This policy will determine the ext auth server to use and how to talk to it. Note that most of these fields are passed along as is to Envoy. For more details on particular fields please see the Envoy ExtAuth documentation. https://raw.githubusercontent.com/envoyproxy/envoy/f910f4abea24904aff04ec33a00147184ea7cffa/api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extensionRef LocalObjectReference | ExtensionRef references the GatewayExtension that should be used for authentication. | | |
enablement ExtAuthEnabled | Enablement determines the enabled state of the ExtAuth filter. When set to “DisableAll”, the filter is disabled for this route. When empty, the filter is enabled as long as it is not disabled by another policy. | | Enum: [DisableAll] |
failureModeAllow boolean | FailureModeAllow determines the behavior on authorization service errors. When true, requests will be allowed even if the authorization service fails or returns HTTP 5xx errors. When unset, the default behavior is false. | | |
withRequestBody BufferSettings | WithRequestBody allows the request body to be buffered and sent to the authorization service. Warning: buffering has implications for streaming and therefore for performance. | | |
clearRouteCache boolean | ClearRouteCache allows the authorization service to affect routing decisions. When unset, the default behavior is false. | | |
metadataContextNamespaces string array | MetadataContextNamespaces specifies metadata namespaces to pass to the authorization service. Defaults to allowing JWT info if JWT processing is configured. | [jwt] | |
includePeerCertificate boolean | IncludePeerCertificate determines if the client’s X.509 certificate should be sent to the authorization service. When true, the certificate will be included if available. When unset, the default behavior is false. | | |
includeTLSSession boolean | IncludeTLSSession determines if TLS session details should be sent to the authorization service. When true, the SNI name from TLSClientHello will be included if available. When unset, the default behavior is false. | | |
emitFilterStateStats boolean | EmitFilterStateStats determines if per-stream stats should be emitted for access logging. When true and using Envoy gRPC, emits latency, bytes sent/received, and upstream info. When true and not using Envoy gRPC, emits only latency. Stats are only added if a check request is made to the ext_authz service. When unset, the default behavior is false. | | |
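
The fragment below sketches how a few ExtAuthPolicy fields and the nested BufferSettings might be filled in. It shows only the fields of the ExtAuthPolicy object itself, because the parent key that holds it is not documented in this section. The extension name `my-ext-auth` and the buffer size are assumptions.

```yaml
# Fields of an ExtAuthPolicy object; the parent key is not shown in this section.
extensionRef:
  name: my-ext-auth              # hypothetical extension name
failureModeAllow: false          # reject requests if the authorization service fails
withRequestBody:
  maxRequestBytes: 8192          # required, minimum 1; larger bodies receive HTTP 413
  allowPartialMessage: false
  packAsBytes: false
metadataContextNamespaces:
- jwt
```

For a route that should bypass authorization entirely, such as a health check route, the documented alternative is to set `enablement: DisableAll`.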
ExtAuthProvider
ExtAuthProvider defines the configuration for an ExtAuth provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
grpcService ExtGrpcService | GrpcService is the gRPC service that will handle the authentication. | | Required |
ExtGrpcService
ExtGrpcService defines the GRPC service that will handle the processing.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
backendRef BackendRef | BackendRef references the backend gRPC service. | | Required |
authority string | Authority is the authority header to use for the gRPC service. | | |
ExtProcPolicy
ExtProcPolicy defines the configuration for the Envoy External Processing filter.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extensionRef LocalObjectReference | ExtensionRef references the GatewayExtension that should be used for external processing. | | Required |
processingMode ProcessingMode | ProcessingMode defines how the filter should interact with the request/response streams. | | |
failureModeAllow boolean | FailureModeAllow defines the behavior of the filter when the external processing fails. Defaults to false. | | |
ExtProcProvider
ExtProcProvider defines the configuration for an ExtProc provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
grpcService ExtGrpcService | GrpcService is the gRPC service that will handle the processing. | | Required |
FieldDefault
FieldDefault provides default values for specific fields in the JSON request body sent to the LLM provider. These defaults are merged with the user-provided request to ensure missing fields are populated.
User input fields here refer to the fields in the JSON request body that a client sends when making a request to the LLM provider.
Defaults set here do not override those user-provided values unless you explicitly set `override` to `true`.

Example: Setting a default system field for Anthropic, which does not support system role messages:

```yaml
defaults:
  - field: "system"
    value: "answer all questions in French"
```

Example: Setting a default temperature and overriding `max_tokens`:

```yaml
defaults:
  - field: "temperature"
    value: "0.5"
  - field: "max_tokens"
    value: "100"
    override: true
```

Example: Overriding a custom list field:

```yaml
defaults:
  - field: "custom_list"
    value: "[a,b,c]"
```

Note: The `field` values correspond to keys in the JSON request body, not fields in this CRD.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
field string | The name of the field. | | MinLength: 1 Required |
value string | The field default value, which can be any JSON data type. | | MinLength: 1 Required |
override boolean | Whether to override the field’s value if it already exists. Defaults to false. | false | |
FileSink
FileSink represents the file sink configuration for access logs.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
path string | The file path to which access logs are written. | | Required |
stringFormat string | The format string that Envoy uses to format the log lines. See https://www.envoyproxy.io/docs/envoy/v1.33.0/configuration/observability/access_log/usage#format-strings | | |
jsonFormat RawExtension | The format object that Envoy uses to emit the logs in a structured way. See https://www.envoyproxy.io/docs/envoy/v1.33.0/configuration/observability/access_log/usage#format-dictionaries | | |
FilterType
FilterType represents the type of filter to apply (only one of these should be set). Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#envoy-v3-api-msg-config-accesslog-v3-accesslogfilter
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
statusCodeFilter StatusCodeFilter | | | |
durationFilter DurationFilter | | | |
notHealthCheckFilter boolean | Filters for requests that are not health check requests. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-nothealthcheckfilter | | |
traceableFilter boolean | Filters for requests that are traceable. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-traceablefilter | | |
headerFilter HeaderFilter | | | |
responseFlagFilter ResponseFlagFilter | | | |
grpcStatusFilter GrpcStatusFilter | | | |
celFilter CELFilter | | | |
GatewayExtension
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | gateway.kgateway.dev/v1alpha1. APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | | |
kind string | GatewayExtension. Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | | |
metadata ObjectMeta | Refer to the Kubernetes API documentation for the fields of metadata. | | |
spec GatewayExtensionSpec | | | |
status GatewayExtensionStatus | | | |
GatewayExtensionSpec
GatewayExtensionSpec defines the desired state of GatewayExtension.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type GatewayExtensionType | Type indicates the type of the GatewayExtension to be used. | | Enum: [ExtAuth ExtProc Extended] Required |
extAuth ExtAuthProvider | ExtAuth configuration for ExtAuth extension type. | | |
extProc ExtProcProvider | ExtProc configuration for ExtProc extension type. | | |
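
Putting GatewayExtensionSpec, ExtAuthProvider, and ExtGrpcService together, a GatewayExtension of type ExtAuth might be written as in the sketch below. The resource name, the referenced service name, and the port are assumptions. The `backendRef` shape (`name`, `port`) assumes the standard Gateway API BackendRef type, which this section references but does not define.

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: GatewayExtension
metadata:
  name: my-ext-auth              # hypothetical name
spec:
  type: ExtAuth
  extAuth:
    grpcService:                 # required for an ExtAuthProvider
      backendRef:
        name: ext-authz          # hypothetical Service name
        port: 9000               # hypothetical port
```

An ExtProc extension would follow the same pattern, with `type: ExtProc` and an `extProc.grpcService` block instead.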
GatewayExtensionStatus
GatewayExtensionStatus defines the observed state of GatewayExtension.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
conditions Condition array | Conditions is the list of conditions for the GatewayExtension. | | MaxItems: 8 |
GatewayExtensionType
Underlying type: string
GatewayExtensionType indicates the type of the GatewayExtension.
Appears in:
Field | Description |
---|---|
ExtAuth | GatewayExtensionTypeExtAuth is the type for ExtAuth extensions. |
ExtProc | GatewayExtensionTypeExtProc is the type for ExtProc extensions. |
GatewayParameters
A GatewayParameters contains configuration that is used to dynamically provision kgateway’s data plane (Envoy proxy instance), based on a Kubernetes Gateway.
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | gateway.kgateway.dev/v1alpha1. APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | | |
kind string | GatewayParameters. Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | | |
metadata ObjectMeta | Refer to the Kubernetes API documentation for the fields of metadata. | | |
spec GatewayParametersSpec | | | |
status GatewayParametersStatus | | | |
GatewayParametersSpec
A GatewayParametersSpec describes the type of environment/platform in which the proxy will be provisioned.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
kube KubernetesProxyConfig | The proxy will be deployed on Kubernetes. | | Optional |
selfManaged SelfManagedGateway | The proxy will be self-managed and not auto-provisioned. | | Optional |
GatewayParametersStatus
The current conditions of the GatewayParameters. This is not currently implemented.
Appears in:
GeminiConfig
GeminiConfig settings for the Gemini LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken |
The authorization token that the AI gateway uses to access the Gemini API. This token is automatically sent in the key query parameter of the request. |
Required |
|
model string |
The Gemini model to use. For more information, see the Gemini models docs. |
Required |
|
apiVersion string |
The version of the Gemini API to use. For more information, see the Gemini API version docs. |
Required |
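A minimal sketch of a Gemini provider entry, using only the fields listed above. The secret name, model, and API version are placeholder values; check the Gemini docs for the values that apply to your account.

```yaml
# Fragment of an LLMProvider; all values are placeholders.
provider:
  gemini:
    authToken:
      kind: SecretRef
      secretRef:
        name: gemini-secret    # hypothetical Secret in the Backend's namespace
    model: gemini-1.5-flash    # placeholder model name
    apiVersion: v1beta         # placeholder API version
```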
GracefulShutdownSpec
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabled boolean |
Enable a grace period before shutdown so that in-flight requests can finish while Envoy fails its health checks, for example to notify external load balancers. NOTE: This setting has no effect unless health checks are defined via the health check filter. | Optional |
|
sleepTimeSeconds integer |
Time (in seconds) for the preStop hook to wait before allowing Envoy to terminate | Optional |
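For illustration, the graceful shutdown settings could be combined with the pod's termination grace period as sketched below. The numbers are arbitrary examples. Because Kubernetes counts the preStop hook against the termination grace period, the sketch keeps terminationGracePeriodSeconds larger than sleepTimeSeconds.

```yaml
# Fragment of spec.kube in a GatewayParameters resource; values are illustrative.
podTemplate:
  gracefulShutdown:
    enabled: true
    sleepTimeSeconds: 10               # preStop hook waits this long
  terminationGracePeriodSeconds: 30    # longer than the preStop wait
```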
GrpcService
GrpcService represents the gRPC service configuration for access logs.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
logName string |
The name of the log stream. | Required |
|
backendRef BackendRef |
The backend gRPC service. Can be any type of supported backend (Kubernetes Service, kgateway Backend, etc.). | Required |
|
additionalRequestHeadersToLog string array |
Additional request headers to log in the access log | ||
additionalResponseHeadersToLog string array |
Additional response headers to log in the access log | ||
additionalResponseTrailersToLog string array |
Additional response trailers to log in the access log |
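A GrpcService entry might be filled in as follows. This is only a sketch: the log name, Service name, port, and header names are hypothetical, and the parent field that holds the GrpcService inside an AccessLog entry is not shown because it is not described in this section.

```yaml
# GrpcService fragment; every value is a placeholder.
logName: example-access-log        # hypothetical log stream name
backendRef:
  name: access-log-collector       # hypothetical Kubernetes Service
  port: 8083                       # hypothetical port
additionalRequestHeadersToLog:
- x-request-id
additionalResponseHeadersToLog:
- content-type
```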
GrpcStatusFilter
GrpcStatusFilter filters gRPC requests based on their response status. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#enum-config-accesslog-v3-grpcstatusfilter-status
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
statuses GrpcStatus array |
Enum: [OK CANCELED UNKNOWN INVALID_ARGUMENT DEADLINE_EXCEEDED NOT_FOUND ALREADY_EXISTS PERMISSION_DENIED RESOURCE_EXHAUSTED FAILED_PRECONDITION ABORTED OUT_OF_RANGE UNIMPLEMENTED INTERNAL UNAVAILABLE DATA_LOSS UNAUTHENTICATED] MinItems: 1 |
||
exclude boolean |
HTTPListenerPolicy
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
HTTPListenerPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to the Kubernetes API documentation for the fields of metadata. |
||
spec HTTPListenerPolicySpec |
|||
status SimpleStatus |
HTTPListenerPolicySpec
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
targetRefs LocalPolicyTargetReference array |
MaxItems: 16 MinItems: 1 |
||
accessLog AccessLog array |
AccessLoggingConfig contains various settings for Envoy’s access logging service. See here for more information: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto |
HeaderFilter
HeaderFilter filters requests based on headers. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-headerfilter
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
header HTTPHeaderMatch |
Required |
HeaderName
Underlying type: string
Appears in:
HeaderTransformation
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
name HeaderName |
Name is the name of the header to interact with. | ||
value InjaTemplate |
Value is the template to apply to generate the output value for the header. |
Host
Host defines a static backend host.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
host string |
Host is the host name to use for the backend. | MinLength: 1 |
|
port PortNumber |
Port is the port to use for the backend. | Required |
Image
A container image. See https://kubernetes.io/docs/concepts/containers/images for details.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
registry string |
The image registry. | Optional |
|
repository string |
The image repository (name). | Optional |
|
tag string |
The image tag. | Optional |
|
digest string |
The hash digest of the image, e.g. sha256:12345... |
Optional |
|
pullPolicy PullPolicy |
The image pull policy for the container. See https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy for details. |
Optional |
InjaTemplate
Underlying type: string
Appears in:
IstioContainer
Configuration for the container running the istio-proxy.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
image Image |
The istio-proxy container image. See https://kubernetes.io/docs/concepts/containers/images for details. |
Optional |
|
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
Optional |
|
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
Optional |
|
logLevel string |
Log level for istio-proxy. Options include “info”, “debug”, “warning”, and “error”. Defaults to “warning”. |
Optional |
|
istioDiscoveryAddress string |
The address of the istio discovery service. Defaults to “istiod.istio-system.svc:15012”. | Optional |
|
istioMetaMeshId string |
The mesh id of the istio mesh. Defaults to “cluster.local”. | Optional |
|
istioMetaClusterId string |
The cluster id of the istio cluster. Defaults to “Kubernetes”. | Optional |
IstioIntegration
Configuration for the Istio integration settings used by kgateway’s data plane (Envoy proxy instance).
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
istioProxyContainer IstioContainer |
Configuration for the container running istio-proxy. Note that if Istio integration is not enabled, the istio container will not be injected into the gateway proxy deployment. |
Optional |
|
customSidecars Container array |
Override the default Istio sidecar in gateway-proxy with a custom container. |
Optional |
KubernetesProxyConfig
Configuration for the set of Kubernetes resources that will be provisioned for a given Gateway.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
deployment ProxyDeployment |
Use a Kubernetes deployment as the proxy workload type. Currently, this is the only supported workload type. |
Optional |
|
envoyContainer EnvoyContainer |
Configuration for the container running Envoy. | Optional |
|
sdsContainer SdsContainer |
Configuration for the container running the Secret Discovery Service (SDS). | Optional |
|
podTemplate Pod |
Configuration for the pods that will be created. | Optional |
|
service Service |
Configuration for the Kubernetes Service that exposes the Envoy proxy over the network. |
Optional |
|
serviceAccount ServiceAccount |
Configuration for the Kubernetes ServiceAccount used by the Envoy pod. | Optional |
|
istio IstioIntegration |
Configuration for the Istio integration. | Optional |
|
stats StatsConfig |
Configuration for the stats server. | Optional |
|
aiExtension AiExtension |
Configuration for the AI extension. | Optional |
|
floatingUserId boolean |
Used to unset the runAsUser values in security contexts. |
LLMProvider
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
provider SupportedLLMProvider |
The LLM provider type to configure. | MaxProperties: 1 MinProperties: 1 |
|
hostOverride Host |
Send requests to a custom host and port, such as to proxy the request, or to use a different backend that is API-compliant with the Backend version. |
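The following sketch shows how provider and hostOverride could sit side by side under the llm field of an AIBackend. The secret name, model, host name, and port are assumptions made for the example.

```yaml
# AIBackend fragment; values are placeholders.
llm:
  provider:
    openai:
      authToken:
        kind: SecretRef
        secretRef:
          name: openai-secret      # hypothetical Secret
      model: gpt-4o-mini
  hostOverride:
    host: llm-proxy.example.com    # hypothetical OpenAI-compatible endpoint
    port: 443
```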
LocalPolicyTargetReference
Select the object to attach the policy to. The object must be in the same namespace as the policy. You can target only one object at a time.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
group Group |
The API group of the target resource. For Kubernetes Gateway API resources, the group is gateway.networking.k8s.io. |
MaxLength: 253 Pattern: ^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$ |
|
kind Kind |
The API kind of the target resource, such as Gateway or HTTPRoute. |
MaxLength: 63 MinLength: 1 Pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$ |
|
name ObjectName |
The name of the target resource. | MaxLength: 253 MinLength: 1 |
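For example, a policy that attaches to an HTTPRoute in its own namespace could carry a targetRefs entry like this one. The route name is hypothetical.

```yaml
# targetRefs fragment of a policy spec.
targetRefs:
- group: gateway.networking.k8s.io
  kind: HTTPRoute
  name: example-route    # hypothetical HTTPRoute in the same namespace as the policy
```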
LocalRateLimitPolicy
LocalRateLimitPolicy represents a policy for local rate limiting. It defines the configuration for rate limiting using a token bucket mechanism.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
tokenBucket TokenBucket |
TokenBucket represents the configuration for a token bucket local rate-limiting mechanism. It defines the parameters for controlling the rate at which requests are allowed. |
Message
An entry for a message to prepend or append to each prompt.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
role string |
Role of the message. The available roles depend on the backend LLM provider model, such as SYSTEM or USER in the OpenAI API. |
||
content string |
String content of the message. |
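A list of Message entries might look like the sketch below. The roles follow the OpenAI-style examples named above and the content strings are invented; the enrichment field that holds the list is not shown here.

```yaml
# List of Message entries; content is illustrative.
- role: SYSTEM
  content: Answer all questions in French.
- role: USER
  content: Keep each answer under three sentences.
```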
Moderation
Moderation configures an external moderation model endpoint. This endpoint evaluates request prompt data against predefined content rules to determine if the content adheres to those rules.
Any requests routed through the AI Gateway are processed by the specified moderation model. If the model identifies the content as harmful based on its rules, the request is automatically rejected.
You can configure a moderation endpoint either as a standalone prompt guard setting or alongside other request and response guard settings.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
openAIModeration OpenAIConfig |
Pass prompt data through an external moderation model endpoint, which compares the request prompt input to predefined content rules. Configure an OpenAI moderation endpoint. |
MultiPoolConfig
MultiPoolConfig configures the backends for multiple hosts or models from the same provider in one Backend resource. This method can be useful for creating one logical endpoint that is backed by multiple hosts or models.
In the priorities section, the order of pool entries defines the priority of the backend endpoints. The pool entries can either define a list of backends or a single backend.
Note: Only two levels of nesting are permitted. Any nested entries after the second level are ignored.
```yaml
multi:
  priorities:
  - pool:
    - azureOpenai:
        deploymentName: gpt-4o-mini
        apiVersion: 2024-02-15-preview
        endpoint: ai-gateway.openai.azure.com
        authToken:
          secretRef:
            name: azure-secret
            namespace: kgateway-system
  - pool:
    - azureOpenai:
        deploymentName: gpt-4o-mini-2
        apiVersion: 2024-02-15-preview
        endpoint: ai-gateway-2.openai.azure.com
        authToken:
          secretRef:
            name: azure-secret-2
            namespace: kgateway-system
```
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
priorities Priority array |
The priority list of backend pools. Each entry represents a set of LLM provider backends. The order defines the priority of the backend endpoints. |
MaxItems: 20 MinItems: 1 Required |
OpenAIConfig
OpenAIConfig settings for the OpenAI LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken |
The authorization token that the AI gateway uses to access the OpenAI API. This token is automatically sent in the Authorization header of the request and prefixed with Bearer. |
Required |
|
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. This setting can be useful when setting up model failover within the same LLM provider. |
Pod
Configuration for a Kubernetes Pod template.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extraLabels object (keys:string, values:string) |
Additional labels to add to the Pod object metadata. | Optional |
|
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the Pod object metadata. | Optional |
|
securityContext PodSecurityContext |
The pod security context. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#podsecuritycontext-v1-core for details. |
Optional |
|
imagePullSecrets LocalObjectReference array |
An optional list of references to secrets in the same namespace to use for pulling any of the images used by this Pod spec. See https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod for details. |
Optional |
|
nodeSelector object (keys:string, values:string) |
A selector which must be true for the pod to fit on a node. See https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ for details. |
Optional |
|
affinity Affinity |
If specified, the pod’s scheduling constraints. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#affinity-v1-core for details. |
Optional |
|
tolerations Toleration array |
If specified, the pod’s tolerations. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#toleration-v1-core for details. |
Optional |
|
gracefulShutdown GracefulShutdownSpec |
If specified, the pod’s graceful shutdown spec. | Optional |
|
terminationGracePeriodSeconds integer |
If specified, the pod’s termination grace period in seconds. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#pod-v1-core for details |
Optional |
|
readinessProbe Probe |
If specified, the pod’s readiness probe. Periodic probe of container service readiness. Container will be removed from service endpoints if the probe fails. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#probe-v1-core for details. |
Optional |
|
livenessProbe Probe |
If specified, the pod’s liveness probe. Periodic probe of container liveness. Container will be restarted if the probe fails. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#probe-v1-core for details. |
Optional |
PolicyAncestorStatus
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
ancestorRef ParentReference |
AncestorRef corresponds with a ParentRef in the spec that this PolicyAncestorStatus struct describes the status of. |
||
controllerName string |
ControllerName is a domain/path string that indicates the name of the controller that wrote this status. This corresponds with the controllerName field on GatewayClass. Example: “example.net/gateway-controller”. The format of this field is DOMAIN “/” PATH, where DOMAIN and PATH are valid Kubernetes names (https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names). Controllers MUST populate this field when writing status. Controllers should ensure that entries to status populated with their ControllerName are cleaned up when they are no longer necessary. |
||
conditions Condition array |
Conditions describes the status of the Policy with respect to the given Ancestor. | MaxItems: 8 MinItems: 1 |
Priority
Priority configures the priority of the backend endpoints.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
pool LLMProvider array |
A list of LLM provider backends within a single endpoint pool entry. | MaxItems: 20 MinItems: 1 |
ProcessingMode
ProcessingMode defines how the filter should interact with the request/response streams
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
requestHeaderMode string |
RequestHeaderMode determines how to handle the request headers | SEND | Enum: [DEFAULT SEND SKIP] |
responseHeaderMode string |
ResponseHeaderMode determines how to handle the response headers | SEND | Enum: [DEFAULT SEND SKIP] |
requestBodyMode string |
RequestBodyMode determines how to handle the request body | NONE | Enum: [NONE STREAMED BUFFERED BUFFERED_PARTIAL FULL_DUPLEX_STREAMED] |
responseBodyMode string |
ResponseBodyMode determines how to handle the response body | NONE | Enum: [NONE STREAMED BUFFERED BUFFERED_PARTIAL FULL_DUPLEX_STREAMED] |
requestTrailerMode string |
RequestTrailerMode determines how to handle the request trailers | SKIP | Enum: [DEFAULT SEND SKIP] |
responseTrailerMode string |
ResponseTrailerMode determines how to handle the response trailers | SKIP | Enum: [DEFAULT SEND SKIP] |
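As an illustration, a ProcessingMode that sends request headers and a buffered request body to the external processor, while skipping everything on the response side, could be written as follows. The choice of modes is arbitrary; the parent field that holds this object is not shown.

```yaml
# ProcessingMode fragment; each value comes from the enums listed above.
requestHeaderMode: SEND
responseHeaderMode: SKIP
requestBodyMode: BUFFERED
responseBodyMode: NONE
requestTrailerMode: SKIP
responseTrailerMode: SKIP
```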
PromptguardRequest
PromptguardRequest defines the prompt guards to apply to requests sent by the client. Multiple prompt guard configurations can be set, and they will be executed in the following order: webhook → regex → moderation for requests, where each step can reject the request and stop further processing.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
customResponse CustomResponse |
A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. |
||
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook to forward requests to for prompt guarding. | ||
moderation Moderation |
Pass prompt data through an external moderation model endpoint, which compares the request prompt input to predefined content rules. |
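A request guard that combines a webhook with OpenAI moderation might be sketched like this. Per the ordering described above, the webhook would run before the moderation check. The host, port, secret name, and moderation model are placeholder values.

```yaml
# PromptguardRequest fragment; values are placeholders.
webhook:
  host:
    host: guard.example.svc.cluster.local   # hypothetical webhook service
    port: 8000
moderation:
  openAIModeration:
    model: omni-moderation-latest           # placeholder moderation model
    authToken:
      kind: SecretRef
      secretRef:
        name: openai-secret                 # hypothetical Secret
```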
PromptguardResponse
PromptguardResponse defines the prompt guards to apply to responses returned by the LLM provider. Both webhook and regex can be set; they are executed in the following order: webhook → regex, where each step can reject the request and stop further processing.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook to forward responses to for prompt guarding. |
ProxyDeployment
Configuration for the Proxy deployment in Kubernetes.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
replicas integer |
The number of desired pods. Defaults to 1. | Optional |
Publisher
Underlying type: string
Publisher configures the type of publisher model to use for VertexAI. Currently, only Google is supported.
Appears in:
Field | Description |
---|---|
GOOGLE |
RateLimit
RateLimit defines a rate limiting policy.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
local LocalRateLimitPolicy |
Local defines a local rate limiting policy. |
Regex
Regex configures the regular expression (regex) matching for prompt guards and data masking.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
matches RegexMatch array |
A list of regex patterns to match against the request or response. Matches and built-ins are additive. |
||
builtins BuiltIn array |
A list of built-in regex patterns to match against the request or response. Matches and built-ins are additive. |
Enum: [SSN CREDIT_CARD PHONE_NUMBER EMAIL] |
|
action Action |
The action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default. Defaults to MASK. |
MASK |
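The sketch below combines a custom pattern with two built-ins, which are additive as noted above. The match name and pattern are made up for the example.

```yaml
# Regex fragment, as it might appear under a prompt guard.
regex:
  matches:
  - name: internal-ticket-id      # optional, for debugging
    pattern: "TICKET-[0-9]{4,}"   # hypothetical pattern
  builtins:
  - CREDIT_CARD
  - SSN
  action: MASK                    # the documented default
```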
RegexMatch
RegexMatch configures the regular expression (regex) matching for prompt guards and data masking.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
pattern string |
The regex pattern to match against the request or response. | ||
name string |
An optional name for this match, which can be used for debugging purposes. |
ResponseFlagFilter
ResponseFlagFilter filters based on response flags. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#config-accesslog-v3-responseflagfilter
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
flags string array |
MinItems: 1 |
RouteType
Underlying type: string
RouteType is the type of route to the LLM provider API.
Appears in:
Field | Description |
---|---|
CHAT |
The LLM generates the full response before responding to a client. |
CHAT_STREAMING |
Stream responses to a client, which allows the LLM to stream out tokens as they are generated. |
SdsBootstrap
Configuration for the SDS instance that is provisioned from a Kubernetes Gateway.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
logLevel string |
Log level for SDS. Options include “info”, “debug”, “warn”, “error”, “panic” and “fatal”. Default level is “info”. |
Optional |
SdsContainer
Configuration for the container running Gloo SDS.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
image Image |
The SDS container image. See https://kubernetes.io/docs/concepts/containers/images for details. |
Optional |
|
securityContext SecurityContext |
The security context for this container. See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#securitycontext-v1-core for details. |
Optional |
|
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
Optional |
|
bootstrap SdsBootstrap |
Initial SDS container configuration. | Optional |
SelfManagedGateway
Appears in:
Service
Configuration for a Kubernetes Service.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
type ServiceType |
The Kubernetes Service type. | Optional |
|
clusterIP string |
The manually specified IP address of the service, if a randomly assigned IP is not desired. See https://kubernetes.io/docs/concepts/services-networking/service/#choosing-your-own-ip-address and https://kubernetes.io/docs/concepts/services-networking/service/#headless-services on the implications of setting clusterIP. |
Optional |
|
extraLabels object (keys:string, values:string) |
Additional labels to add to the Service object metadata. | Optional |
|
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the Service object metadata. | Optional |
ServiceAccount
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
extraLabels object (keys:string, values:string) |
Additional labels to add to the ServiceAccount object metadata. | Optional |
|
extraAnnotations object (keys:string, values:string) |
Additional annotations to add to the ServiceAccount object metadata. | Optional |
SimpleStatus
SimpleStatus defines the observed state of the policy.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
conditions Condition array |
Conditions is the list of conditions for the policy. | MaxItems: 8 |
SingleAuthToken
SingleAuthToken configures the authorization token that the AI gateway uses to access the LLM provider API. This token is automatically sent in a request header, depending on the LLM provider.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
kind SingleAuthTokenKind |
Kind specifies which type of authorization token is being used. Must be one of: “Inline”, “SecretRef”, “Passthrough”. |
Enum: [Inline SecretRef Passthrough] |
|
inline string |
Provide the token directly in the configuration for the Backend. This option is the least secure. Only use this option for quick tests such as trying out AI Gateway. |
||
secretRef LocalObjectReference |
Store the API key in a Kubernetes secret in the same namespace as the Backend. Then, refer to the secret in the Backend configuration. This option is more secure than an inline token, because the API key is encoded and you can restrict access to secrets through RBAC rules. You might use this option in proofs of concept, controlled development and staging environments, or well-controlled prod environments that use secrets. |
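The three token kinds could be written as shown in this sketch, where each YAML document is an alternative rather than a combined configuration. The token string and secret name are placeholders.

```yaml
# Alternative 1: inline token (quick tests only).
authToken:
  kind: Inline
  inline: "<your-api-key>"      # placeholder, never commit a real key
---
# Alternative 2: token stored in a Secret in the Backend's namespace.
authToken:
  kind: SecretRef
  secretRef:
    name: llm-secret            # hypothetical Secret name
---
# Alternative 3: reuse the token that arrives with the client request.
authToken:
  kind: Passthrough
```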
SingleAuthTokenKind
Underlying type: string
Appears in:
Field | Description |
---|---|
Inline |
Inline provides the token directly in the configuration for the Backend. |
SecretRef |
SecretRef reads the token from a Kubernetes secret in the same namespace as the Backend. |
Passthrough |
Passthrough the existing token. This token can either come directly from the client, or be generated by an OIDC flow early in the request lifecycle. This option is useful for backends which have federated identity setup and can re-use the token from the client. Currently, this token must exist in the Authorization header. |
StaticBackend
StaticBackend references a static list of hosts.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
hosts Host array |
Hosts is a list of hosts to use for the backend. | MinItems: 1 |
StatsConfig
Configuration for the stats server.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
enabled boolean |
Whether to expose metrics annotations and ports for scraping metrics. | Optional |
|
routePrefixRewrite string |
The Envoy stats endpoint to which the metrics are written | Optional |
|
enableStatsRoute boolean |
Enables an additional route to the stats cluster defaulting to /stats | Optional |
|
statsRoutePrefixRewrite string |
The Envoy stats endpoint with general metrics for the additional stats route | Optional |
StatusCodeFilter
Underlying type: ComparisonFilter
StatusCodeFilter filters based on HTTP status code. Based on: https://www.envoyproxy.io/docs/envoy/v1.33.0/api-v3/config/accesslog/v3/accesslog.proto#envoy-v3-api-msg-config-accesslog-v3-statuscodefilter
Appears in:
SupportedLLMProvider
SupportedLLMProvider configures the AI gateway to use a single LLM provider backend.
Validation:
- MaxProperties: 1
- MinProperties: 1
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
openai OpenAIConfig |
|||
azureopenai AzureOpenAIConfig |
|||
anthropic AnthropicConfig |
|||
gemini GeminiConfig |
|||
vertexai VertexAIConfig |
TokenBucket
TokenBucket defines the configuration for a token bucket rate-limiting mechanism. It controls the rate at which tokens are generated and consumed for a specific operation.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
maxTokens integer |
MaxTokens specifies the maximum number of tokens that the bucket can hold. This value must be greater than or equal to 1. It determines the burst capacity of the rate limiter. |
Minimum: 1 |
|
tokensPerFill integer |
TokensPerFill specifies the number of tokens added to the bucket during each fill interval. If not specified, it defaults to 1. This controls the steady-state rate of token generation. |
1 | |
fillInterval string |
FillInterval defines the time duration between consecutive token fills. This value must be a valid duration string (e.g., “1s”, “500ms”). It determines the frequency of token replenishment. |
Format: duration |
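To make the three parameters concrete, the sketch below allows bursts of up to 10 tokens and refills 5 tokens every second. If each request consumes one token, which is an assumption of this example, that works out to a steady-state rate of about 5 requests per second.

```yaml
# rateLimit fragment of a TrafficPolicy spec; numbers are illustrative.
rateLimit:
  local:
    tokenBucket:
      maxTokens: 10       # burst capacity
      tokensPerFill: 5    # tokens added per interval
      fillInterval: 1s    # refill period
```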
TrafficPolicy
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string |
gateway.kgateway.dev/v1alpha1 |
||
kind string |
TrafficPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to the Kubernetes API documentation for the fields of metadata. |
||
spec TrafficPolicySpec |
|||
status SimpleStatus |
TrafficPolicySpec
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
targetRefs LocalPolicyTargetReference array |
MaxItems: 16 MinItems: 1 |
||
ai AIPolicy |
AI is used to configure AI-based policies for the policy. | ||
transformation TransformationPolicy |
Transformation is used to mutate and transform requests and responses before forwarding them to the destination. |
||
extProc ExtProcPolicy |
ExtProc specifies the external processing configuration for the policy. | ||
extAuth ExtAuthPolicy |
ExtAuth specifies the external authentication configuration for the policy. This controls what external server to send requests to for authentication. |
||
rateLimit RateLimit |
RateLimit specifies the rate limiting configuration for the policy. This controls the rate at which requests are allowed to be processed. |
Transform
Transform defines the operations to be performed by the transformation. These operations may include changing the actual request/response but may also cause side effects. Side effects may include setting info that can be used in future steps (e.g. dynamic metadata) and can cause Envoy to buffer.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
set HeaderTransformation array |
Set is a list of headers and the value they should be set to. | MaxItems: 16 |
|
add HeaderTransformation array |
Add is a list of headers to add to the request and what that value should be set to. If there is already a header with these values then append the value as an extra entry. |
MaxItems: 16 |
|
remove string array |
Remove is a list of header names to remove from the request/response. | MaxItems: 16 |
|
body BodyTransformation |
Body controls how to parse the body and, if needed, how to set it. If empty, the body is not buffered. |
TransformationPolicy
TransformationPolicy config is used to modify Envoy behavior at a route level. These modifications can be performed on the request and response paths.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
request Transform |
Request is used to modify the request path. | ||
response Transform |
Response is used to modify the response path. |
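A simple header-only transformation might look like the following sketch. The header names are invented, and the values are plain string literals to keep the example independent of any particular template function.

```yaml
# transformation fragment of a TrafficPolicy spec; header names are hypothetical.
transformation:
  request:
    set:
    - name: x-example-source
      value: "kgateway"
    remove:
    - x-internal-debug
  response:
    add:
    - name: x-example-handled
      value: "true"
```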
VertexAIConfig
VertexAIConfig settings for the Vertex AI LLM provider.
To find the values for the project ID, project location, and publisher, you can check the fields of an API request, such as https://{LOCATION}-aiplatform.googleapis.com/{VERSION}/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/{PROVIDER}/<model-path>.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
authToken SingleAuthToken |
The authorization token that the AI gateway uses to access the Vertex AI API. This token is automatically sent in the key header of the request. |
Required |
|
model string |
The Vertex AI model to use. For more information, see the Vertex AI model docs. |
MinLength: 1 Required |
|
apiVersion string |
The version of the Vertex AI API to use. For more information, see the Vertex AI API reference. |
MinLength: 1 Required |
|
projectId string |
The ID of the Google Cloud Project that you use for the Vertex AI. | MinLength: 1 Required |
|
location string |
The location of the Google Cloud Project that you use for the Vertex AI. | MinLength: 1 Required |
|
modelPath string |
Optional: The model path to route to. Defaults to the Gemini model path, generateContent. |
||
publisher Publisher |
The type of publisher model to use. Currently, only Google is supported. | Enum: [GOOGLE] |
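Putting the required fields together, a Vertex AI provider entry could be sketched as follows. The secret name, model, API version, project ID, and location are placeholders that map onto the URL segments shown earlier in this section.

```yaml
# Fragment of an LLMProvider; all values are placeholders.
provider:
  vertexai:
    authToken:
      kind: SecretRef
      secretRef:
        name: vertex-secret         # hypothetical Secret
    model: gemini-1.5-flash-001     # placeholder model
    apiVersion: v1                  # {VERSION} in the request URL
    projectId: my-gcp-project       # {PROJECT_ID}, hypothetical
    location: us-central1           # {LOCATION}, placeholder
    publisher: GOOGLE               # the only documented enum value
```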
Webhook
Webhook configures a webhook to forward requests or responses to for prompt guarding.
Appears in:
Field | Description | Default | Validation |
---|---|---|---|
host Host |
Host to send the traffic to. | Required |
|
forwardHeaders HTTPHeaderMatch array |
ForwardHeaders define headers to forward with the request to the webhook. |