API reference
Review the API reference documentation for kgateway with the agentgateway data plane.
Looking for the Envoy data plane APIs instead? See the kgateway with Envoy API docs.
Packages
agentgateway.dev/v1alpha1
Resource Types
AIBackend
AIBackend specifies the AI backend configuration
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
provider LLMProvider |
provider specifies configuration for how to reach the configured LLM provider. | ||
groups PriorityGroup array |
groups specifies a list of groups in priority order, where each group defines a set of LLM providers. The priority determines which backend endpoints are chosen. Note: provider names must be unique across all providers in all priority groups. Backend policies may target a specific provider by name using targetRefs[].sectionName. Example configuration with two priority groups:

    groups:
    - providers:
      - azureopenai:
          deploymentName: gpt-4o-mini
          apiVersion: 2024-02-15-preview
          endpoint: ai-gateway.openai.azure.com
    - providers:
      - azureopenai:
          deploymentName: gpt-4o-mini-2
          apiVersion: 2024-02-15-preview
          endpoint: ai-gateway-2.openai.azure.com
        policies:
          auth:
            secretRef:
              name: azure-secret
|
MaxItems: 32 MinItems: 1 |
AIPromptEnrichment
AIPromptEnrichment defines the config to enrich requests sent to the LLM provider by appending and prepending system prompts.
Prompt enrichment allows you to add additional context to the prompt before sending it to the model. Unlike RAG or other dynamic context methods, prompt enrichment is static and is applied to every request.
Note: Some providers, including Anthropic, do not support SYSTEM role messages, and instead have a dedicated
system field in the input JSON. In this case, use the defaults setting to set the system field.
The following example prepends a system prompt of "Answer all questions in French." and appends "Describe the painting as if you were a famous art critic from the 17th century." to each request that is sent to the openai HTTPRoute.

      name: openai-opt
      namespace: kgateway-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: openai
      ai:
        promptEnrichment:
          prepend:
          - role: SYSTEM
            content: "Answer all questions in French."
          append:
          - role: USER
            content: "Describe the painting as if you were a famous art critic from the 17th century."

Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
prepend Message array |
A list of messages to be prepended to the prompt sent by the client. | ||
append Message array |
A list of messages to be appended to the prompt sent by the client. |
AIPromptGuard
AIPromptGuard configures prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response.
This example rejects any request prompts that contain the string "credit card", and masks any credit card numbers in the response.

    promptGuard:
      request:
      - response:
          message: "Rejected due to inappropriate content"
        regex:
          action: REJECT
          matches:
          - pattern: "credit card"
            name: "CC"
      response:
      - regex:
          builtins:
          - CREDIT_CARD
          action: MASK

Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
request PromptguardRequest array |
Prompt guards to apply to requests sent by the client. | MaxItems: 8 MinItems: 1 |
|
response PromptguardResponse array |
Prompt guards to apply to responses returned by the LLM provider. | MaxItems: 8 MinItems: 1 |
APIKeyAuthentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mode APIKeyAuthenticationMode |
Validation mode for api key authentication. | Strict | Enum: [Strict Optional] |
secretRef LocalObjectReference |
secretRef references a Kubernetes secret storing a set of API Keys. If there are many keys, 'secretSelector' can be used instead. Each entry in the Secret represents one API Key. The key is an arbitrary identifier. The value can either be:
* A string, representing the API Key.
* A JSON object, with two fields, key and metadata. key contains the API Key. metadata contains arbitrary JSON metadata associated with the key, which may be used by other policies. For example, you may write an authorization policy allow apiKey.group == 'sales'.

Example:

    apiVersion: v1
    kind: Secret
    metadata:
      name: api-key
    stringData:
      client1: |
        {
          "key": "k-123",
          "metadata": {
            "group": "sales",
            "created_at": "2024-10-01T12:00:00Z"
          }
        }
      client2: "k-456"
|
||
secretSelector SecretSelector |
secretSelector selects multiple secrets containing API Keys. If the same key is defined in multiple secrets, the behavior is undefined. Each entry in the Secret represents one API Key. The key is an arbitrary identifier. The value can either be a string, representing the API Key, or a JSON object with two fields, key and metadata. key contains the API Key. metadata contains arbitrary JSON metadata associated with the key, which may be used by other policies. For example, you may write an authorization policy allow apiKey.group == 'sales'. The Secret format is the same as in the secretRef example. |
APIKeyAuthenticationMode
Underlying type: string
Validation:
- Enum: [Strict Optional]
Appears in:
| Field | Description |
|---|---|
Strict |
A valid API Key must be present. This is the default option. |
Optional |
If an API Key exists, validate it. Warning: this allows requests without an API Key! |
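
As a rough sketch of how the APIKeyAuthentication fields combine, the fragment below selects Optional mode and points at the api-key Secret from the secretRef example. Only mode and secretRef come from this reference. The parent field that holds this block inside a policy is not documented in this section, so it is deliberately left out.

```yaml
# Fragment of an APIKeyAuthentication block. The enclosing policy field is
# omitted because this section does not document it.
mode: Optional      # requests without an API key are still allowed through
secretRef:
  name: api-key     # Secret name reused from the secretRef example
```

With mode left unset, the documented default of Strict applies and requests without a valid key are rejected.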
AWSGuardrailConfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
identifier string |
GuardrailIdentifier is the identifier of the Guardrail policy to use for the backend. | ||
version string |
GuardrailVersion is the version of the Guardrail policy to use for the backend. |
AccessLog
accessLogs specifies how per-request access logs are emitted.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
filter CELExpression |
filter specifies a CEL expression that is used to filter logs. A log will only be emitted if the expression evaluates to ’true’. |
||
attributes LogTracingAttributes |
attributes specifies customizations to the key-value pairs that are logged |
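
A minimal sketch of a log filter follows. It assumes that the request.path attribute, which this reference uses in the ext auth path example, is also available to access log filters. The /healthz path is a hypothetical value chosen for illustration.

```yaml
# Fragment of an AccessLog block. The enclosing field is omitted.
# Assumption: request.path can be read from a log filter expression.
# A log line is emitted only when the expression evaluates to true,
# so this drops logs for the hypothetical health check path.
filter: 'request.path != "/healthz"'
```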
Action
Underlying type: string
Action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default.
Validation:
- Enum: [Mask Reject]
Appears in:
| Field | Description |
|---|---|
Mask |
Mask the matched data in the request. |
Reject |
Reject the request if the regex matches content in the request. |
AgentExtAuthGRPC
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
contextExtensions object (keys:string, values:string) |
contextExtensions specifies additional arbitrary key-value pairs to send to the authorization server in the context_extensions field. |
MaxProperties: 64 |
|
requestMetadata object (keys:string, values:CELExpression) |
requestMetadata specifies metadata to be sent to the authorization server. This maps to the metadata_context.filter_metadata field of the request, and allows dynamic CEL expressions. If unset, the envoy.filters.http.jwt_authn key is set by default when the JWT policy is also used, for compatibility. |
MaxProperties: 64 |
AgentExtAuthHTTP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
path CELExpression |
path specifies the path to send to the authorization server. If unset, this defaults to the original request path. This is a CEL expression, which allows customizing the path based on the incoming request. For example, to add a prefix: path: '"/prefix/" + request.path'. |
||
redirect CELExpression |
redirect defines an optional expression to determine a path to redirect to on authorization failure. This is useful to redirect to a sign-in page. |
||
allowedRequestHeaders string array |
allowedRequestHeaders specifies what additional headers from the client request will be sent to the authorization server. If unset, the following headers are sent by default: Authorization. |
MaxItems: 64 |
|
addRequestHeaders object (keys:string, values:CELExpression) |
addRequestHeaders specifies what additional headers to add to the request to the authorization server. While allowedRequestHeaders just passes the original headers through, addRequestHeaders allows defining custom headers based on CEL Expressions. |
MaxProperties: 64 |
|
allowedResponseHeaders string array |
allowedResponseHeaders specifies what headers from the authorization response will be copied into the request to the backend. |
MaxItems: 64 |
|
responseMetadata object (keys:string, values:CELExpression) |
responseMetadata specifies what metadata fields should be constructed from the authorization response. These will be included under the extauthz variable in future CEL expressions. Setting this is useful to do things like logging usernames, without needing to include them as headers to the backend (as allowedResponseHeaders would). |
MaxProperties: 64 |
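
The AgentExtAuthHTTP fields can be combined roughly as sketched below. The field names come from the table above. The header names, the /auth prefix, and the use of request.path inside addRequestHeaders are assumptions made for illustration.

```yaml
# Fragment of an AgentExtAuthHTTP block. The enclosing ext auth field is omitted.
path: '"/auth" + request.path'    # CEL: prefix the original request path
allowedRequestHeaders:            # client headers forwarded to the authorization server
- Authorization
- Cookie
addRequestHeaders:                # extra headers computed from CEL expressions
  x-original-path: request.path   # assumption: request.path is usable here
allowedResponseHeaders:           # copied from the authorization response to the backend request
- x-user-id                       # hypothetical header set by the authorization server
```

Because setting allowedRequestHeaders is described as replacing the default list, Authorization is listed explicitly here.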
AgentgatewayBackend
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
agentgateway.dev/v1alpha1 |
||
kind string |
AgentgatewayBackend |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec AgentgatewayBackendSpec |
spec defines the desired state of AgentgatewayBackend. | ||
status AgentgatewayBackendStatus |
status defines the current state of AgentgatewayBackend. |
AgentgatewayBackendSpec
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
static StaticBackend |
static represents a static hostname. | ||
ai AIBackend |
ai represents a LLM backend. | ||
mcp MCPBackend |
mcp represents an MCP backend | ||
dynamicForwardProxy DynamicForwardProxyBackend |
dynamicForwardProxy configures the proxy to dynamically send requests to the destination based on the incoming request HTTP host header, or TLS SNI for TLS traffic. Note: this Backend type enables users to trigger the proxy to send requests to arbitrary destinations. Proper access controls must be put in place when using this backend type. |
||
policies BackendFull |
policies controls policies for communicating with this backend. Policies may also be set in AgentgatewayPolicy; policies are merged on a field-level basis, with policies on the Backend (this field) taking precedence. |
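
For orientation, a complete AgentgatewayBackend manifest might look like the sketch below. It wraps the priority group example from AIBackend in the documented apiVersion and kind. The resource name is hypothetical, and the per-provider name key is an assumption inferred from the note that provider names must be unique.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: azure-llm                 # hypothetical resource name
  namespace: kgateway-system
spec:
  ai:
    groups:
    # First priority group
    - providers:
      - name: primary             # assumption: providers carry a unique name
        azureopenai:
          deploymentName: gpt-4o-mini
          apiVersion: 2024-02-15-preview
          endpoint: ai-gateway.openai.azure.com
    # Second priority group
    - providers:
      - name: fallback            # assumption, see above
        azureopenai:
          deploymentName: gpt-4o-mini-2
          apiVersion: 2024-02-15-preview
          endpoint: ai-gateway-2.openai.azure.com
        policies:
          auth:
            secretRef:
              name: azure-secret
```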
AgentgatewayBackendStatus
AgentgatewayBackendStatus defines the observed state of AgentgatewayBackend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
conditions Condition array |
Conditions is the list of conditions for the backend. | MaxItems: 8 |
AgentgatewayParameters
AgentgatewayParameters is the configuration used to dynamically provision the agentgateway data plane. Labels and annotations that apply to all resources may be specified at a higher level; see https://gateway-api.sigs.k8s.io/reference/spec/#gatewayinfrastructure
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
agentgateway.dev/v1alpha1 |
||
kind string |
AgentgatewayParameters |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec AgentgatewayParametersSpec |
spec defines the desired state of AgentgatewayParameters. | ||
status AgentgatewayParametersStatus |
status defines the current state of AgentgatewayParameters. |
AgentgatewayParametersConfigs
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
logging AgentgatewayParametersLogging |
logging configuration for Agentgateway. By default, all logs are set to “info” level. | ||
rawConfig JSON |
rawConfig provides an opaque mechanism to configure the agentgateway config file (the agentgateway binary has a '-f' option to specify a config file, and this is that file). This will be merged with configuration derived from typed fields like AgentgatewayParametersLogging.Format, and those typed fields will take precedence. Example:

    rawConfig:
      binds:
      - port: 3000
        listeners:
        - routes:
          - policies:
              cors:
                allowOrigins:
                - "*"
                allowHeaders:
                - mcp-protocol-version
                - content-type
                - cache-control
            backends:
            - mcp:
                targets:
                - name: everything
                  stdio:
                    cmd: npx
                    args: ["@modelcontextprotocol/server-everything"]
|
Type: object |
|
image Image |
The agentgateway container image. See https://kubernetes.io/docs/concepts/containers/images for details. Default values, which may be overridden individually: registry: cr.agentgateway.dev repository: agentgateway tag: pullPolicy: <omitted, relying on Kubernetes defaults which depend on the tag> |
||
env EnvVar array |
The container environment variables. These override any existing values. If you want to delete an environment variable entirely, use $patch: delete with AgentgatewayParametersOverlays instead. Note that variable expansion does apply but is highly discouraged: to set dependent environment variables, you can use $(VAR_NAME). $$(VAR_NAME) avoids expansion and results in a literal $(VAR_NAME). |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
shutdown ShutdownSpec |
Shutdown delay configuration. How graceful planned or unplanned data plane changes happen is in tension with how quickly rollouts of the data plane complete. How long a data plane pod must wait for shutdown to be perfectly graceful depends on how you have configured your Gateways. |
AgentgatewayParametersLogging
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
level string |
Logging level in standard RUST_LOG syntax, e.g. ‘info’, the default, or by module, comma-separated. E.g., “rmcp=warn,hickory_server::server::server_future=off,typespec_client_core::http::policies::logging=warn” |
||
format AgentgatewayParametersLoggingFormat |
Enum: [json text] |
AgentgatewayParametersLoggingFormat
Underlying type: string
The default logging format is text.
Validation:
- Enum: [json text]
Appears in:
| Field | Description |
|---|---|
json |
|
text |
AgentgatewayParametersObjectMetadata
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
labels object (keys:string, values:string) |
Map of string keys and values that can be used to organize and categorize (scope and select) objects. May match selectors of replication controllers and services. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels |
||
annotations object (keys:string, values:string) |
Annotations is an unstructured key value map stored with a resource that may be set by external tools to store and retrieve arbitrary metadata. They are not queryable and should be preserved when modifying objects. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations |
AgentgatewayParametersOverlays
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
deployment KubernetesResourceOverlay |
deployment allows specifying overrides for the generated Deployment resource. | ||
service KubernetesResourceOverlay |
service allows specifying overrides for the generated Service resource. | ||
serviceAccount KubernetesResourceOverlay |
serviceAccount allows specifying overrides for the generated ServiceAccount resource. |
AgentgatewayParametersSpec
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
logging AgentgatewayParametersLogging |
logging configuration for Agentgateway. By default, all logs are set to “info” level. | ||
rawConfig JSON |
rawConfig provides an opaque mechanism to configure the agentgateway config file (the agentgateway binary has a '-f' option to specify a config file, and this is that file). This will be merged with configuration derived from typed fields like AgentgatewayParametersLogging.Format, and those typed fields will take precedence. Example:

    rawConfig:
      binds:
      - port: 3000
        listeners:
        - routes:
          - policies:
              cors:
                allowOrigins:
                - "*"
                allowHeaders:
                - mcp-protocol-version
                - content-type
                - cache-control
            backends:
            - mcp:
                targets:
                - name: everything
                  stdio:
                    cmd: npx
                    args: ["@modelcontextprotocol/server-everything"]
|
Type: object |
|
image Image |
The agentgateway container image. See https://kubernetes.io/docs/concepts/containers/images for details. Default values, which may be overridden individually: registry: cr.agentgateway.dev repository: agentgateway tag: pullPolicy: <omitted, relying on Kubernetes defaults which depend on the tag> |
||
env EnvVar array |
The container environment variables. These override any existing values. If you want to delete an environment variable entirely, use $patch: delete with AgentgatewayParametersOverlays instead. Note that variable expansion does apply but is highly discouraged: to set dependent environment variables, you can use $(VAR_NAME). $$(VAR_NAME) avoids expansion and results in a literal $(VAR_NAME). |
||
resources ResourceRequirements |
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details. |
||
shutdown ShutdownSpec |
Shutdown delay configuration. How graceful planned or unplanned data plane changes happen is in tension with how quickly rollouts of the data plane complete. How long a data plane pod must wait for shutdown to be perfectly graceful depends on how you have configured your Gateways. |
||
deployment KubernetesResourceOverlay |
deployment allows specifying overrides for the generated Deployment resource. | ||
service KubernetesResourceOverlay |
service allows specifying overrides for the generated Service resource. | ||
serviceAccount KubernetesResourceOverlay |
serviceAccount allows specifying overrides for the generated ServiceAccount resource. |
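
The typed AgentgatewayParametersSpec fields can be set together as in the sketch below. The field names and the image defaults come from the tables above. The resource name, the environment variable, and the resource requests are hypothetical values.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayParameters
metadata:
  name: agentgateway-params       # hypothetical resource name
  namespace: kgateway-system
spec:
  logging:
    level: info,rmcp=warn         # RUST_LOG syntax: global level, then per-module override
    format: json                  # one of json or text
  image:
    registry: cr.agentgateway.dev # documented default, shown for clarity
    repository: agentgateway      # documented default, shown for clarity
  env:
  - name: EXAMPLE_FLAG            # hypothetical variable
    value: "true"
  resources:
    requests:
      cpu: 100m                   # hypothetical sizing
      memory: 128Mi
```

Values given in these typed fields take precedence over anything supplied through rawConfig.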
AgentgatewayParametersStatus
The current conditions of the AgentgatewayParameters. This is not currently implemented.
Appears in:
AgentgatewayPolicy
| Field | Description | Default | Validation |
|---|---|---|---|
apiVersion string |
agentgateway.dev/v1alpha1 |
||
kind string |
AgentgatewayPolicy |
||
kind string |
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds |
||
apiVersion string |
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources |
||
metadata ObjectMeta |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec AgentgatewayPolicySpec |
spec defines the desired state of AgentgatewayPolicy. | ||
status PolicyStatus |
status defines the current state of AgentgatewayPolicy. |
AgentgatewayPolicySpec
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
targetRefs LocalPolicyTargetReferenceWithSectionName array |
targetRefs specifies the target resources by reference to attach the policy to. | MaxItems: 16 MinItems: 1 |
|
targetSelectors LocalPolicyTargetSelectorWithSectionName array |
targetSelectors specifies the target selectors to select resources to attach the policy to. | MaxItems: 16 MinItems: 1 |
|
frontend Frontend |
frontend defines settings for how to handle incoming traffic. A frontend policy can only target a Gateway. Listener and ListenerSet are not valid targets. When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. For example, if policy A sets 'tcp' and 'tls', and policy B sets 'tls', the effective policy would be 'tcp' from policy A, and 'tls' from policy B. |
||
traffic Traffic |
traffic defines settings for how to process traffic. A traffic policy can target a Gateway (optionally, with a sectionName indicating the listener), ListenerSet, or Route (optionally, with a sectionName indicating the route rule). When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. Precedence is given to more precise policies: Gateway < Listener < Route < Route Rule. For example, if policy A sets 'timeouts' and 'retries', and policy B sets 'retries', the effective policy would be 'timeouts' from policy A, and 'retries' from policy B. |
||
backend BackendFull |
backend defines settings for how to connect to destination backends. A backend policy can target a Gateway (optionally, with a sectionName indicating the listener), ListenerSet, Route (optionally, with a sectionName indicating the route rule), or a Service/Backend (optionally, with a sectionName indicating the port (for Service) or sub-backend (for Backend)). Note that a backend policy applies when connecting to a specific destination backend. Targeting a higher level resource, like Gateway, is just a way to easily apply a policy to a group of backends. When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. Precedence is given to more precise policies: Gateway < Listener < Route < Route Rule < Backend/Service. For example, if a Gateway policy sets 'tcp' and 'tls', and a Backend policy sets 'tls', the effective policy would be 'tcp' from the Gateway, and 'tls' from the Backend. |
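
The field-level merge described for backend policies can be illustrated with two policies, sketched below. The resource and hostname values are hypothetical, and the group used to target an AgentgatewayBackend is an assumption. The closing comment reflects one reading of "field-level, not a deep merge" and is not taken from the implementation.

```yaml
# Policy A: attached to a Gateway, sets both tcp and tls.
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: gateway-defaults          # hypothetical
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: my-gateway              # hypothetical
  backend:
    tcp:
      connectTimeout: 5s
    tls:
      sni: default.example.com
      alpnProtocols:
      - h2
---
# Policy B: attached to a single backend, sets only tls.
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: llm-tls                   # hypothetical
spec:
  targetRefs:
  - group: agentgateway.dev       # assumption: group of the AgentgatewayBackend kind
    kind: AgentgatewayBackend
    name: azure-llm               # hypothetical
  backend:
    tls:
      sni: llm.example.com
# Expected effective policy for requests to azure-llm:
#   tcp comes from policy A (connectTimeout: 5s)
#   tls comes from policy B as a whole (sni: llm.example.com);
#   alpnProtocols from policy A is not carried over, since tls is not deep merged.
```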
AnthropicConfig
AnthropicConfig settings for the Anthropic LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. |
AttributeAdd
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name string |
|||
expression CELExpression |
AwsAuth
AwsAuth specifies the authentication method to use for the backend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
secretRef LocalObjectReference |
SecretRef references a Kubernetes Secret containing the AWS credentials. The Secret must have keys “accessKey”, “secretKey”, and optionally “sessionToken”. |
AzureOpenAIConfig
AzureOpenAIConfig settings for the Azure OpenAI LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
endpoint string |
The endpoint for the Azure OpenAI API to use, such as my-endpoint.openai.azure.com. If the scheme is included, it is stripped. |
MinLength: 1 |
|
deploymentName string |
The name of the Azure OpenAI model deployment to use. For more information, see the Azure OpenAI model docs. This is required if ApiVersion is not ‘v1’. For v1, the model can be set in the request. |
MinLength: 1 |
|
apiVersion string |
The version of the Azure OpenAI API to use. For more information, see the Azure OpenAI API version reference. If unset, defaults to “v1” |
BackendAI
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
prompt AIPromptEnrichment |
Enrich requests sent to the LLM provider by appending and prepending system prompts. This can be configured only for LLM providers that use the CHAT or CHAT_STREAMING API route type. |
Optional | |
promptGuard AIPromptGuard |
promptGuard enables adding guardrails to LLM requests and responses. | Optional | |
defaults FieldDefault array |
Provide defaults to merge with user input fields. If the field is already set, the field in the request is used. | Optional MinItems: 1 MaxItems: 64 |
|
overrides FieldDefault array |
Provide overrides to merge with user input fields. If the field is already set, the field will be overwritten. | Optional MinItems: 1 MaxItems: 64 |
|
modelAliases object (keys:string, values:string) |
ModelAliases maps friendly model names to actual provider model names. Example: {"fast": "gpt-3.5-turbo", "smart": "gpt-4-turbo"} Note: This field is only applicable when using the agentgateway data plane. | Optional MinItems: 1 MaxItems: 64 |
|
promptCaching PromptCachingConfig |
promptCaching enables automatic prompt caching for supported providers (AWS Bedrock). Reduces API costs by caching static content like system prompts and tool definitions. Only applicable for Bedrock Claude 3+ and Nova models. | Optional | |
routes object (keys:string, values:RouteType) |
routes defines how to identify the type of traffic to handle. The keys are URL path suffixes matched using ends-with comparison (e.g., “/v1/chat/completions”). The special “*” wildcard matches any path. If not specified, all traffic defaults to “completions” type. | Optional |
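
A small sketch of the modelAliases and routes fields follows. The alias pairs are the ones from the modelAliases description. The route value completions is the default type named above, and its exact spelling as a RouteType value is an assumption.

```yaml
# Fragment of a backend policy, using the ai field of BackendFull.
ai:
  modelAliases:
    fast: gpt-3.5-turbo           # requests for "fast" are sent as gpt-3.5-turbo
    smart: gpt-4-turbo
  routes:
    # Keys are matched against the end of the URL path.
    "/v1/chat/completions": completions   # assumption: RouteType spelled as in the text
```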
BackendAuth
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
key string |
key provides an inline key to use as the value of the Authorization header. This option is the least secure; usage of a Secret is preferred. | Optional MaxLength: 2048 |
|
secretRef LocalObjectReference |
secretRef references a Kubernetes secret storing the key to use as the authorization value. This must be stored in the 'Authorization' key. | Optional MaxLength: 2048 |
|
passthrough BackendAuthPassthrough |
passthrough passes through an existing token that has been sent by the client and validated. Other policies, like JWT and API Key authentication, will strip the original client credentials. Passthrough backend authentication causes the original token to be added back into the request. If there are no client authentication policies on the request, the original token would be unchanged, so this would have no effect. | Optional | |
aws AwsAuth |
aws specifies an explicit AWS authentication method for the backend. When omitted, we will try to use the default AWS SDK authentication methods. | Optional |
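
The preferred secretRef form of BackendAuth can be sketched as a Secret plus a policy fragment. The Authorization key name is the one required by the secretRef description. The Secret name and the placeholder value are hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: llm-api-key               # hypothetical Secret name
stringData:
  Authorization: "<api-key>"      # placeholder; the key name must be Authorization
---
# Fragment of a backend policy that reads the key from the Secret above.
auth:
  secretRef:
    name: llm-api-key
```

For AWS backends, the aws.secretRef variant instead expects the keys accessKey and secretKey, plus an optional sessionToken.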
BackendAuthPassthrough
Appears in:
BackendFull
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
ai BackendAI |
ai specifies settings for AI workloads. This is only applicable when connecting to a Backend of type ‘ai’. | ||
mcp BackendMCP |
mcp specifies settings for MCP workloads. This is only applicable when connecting to a Backend of type ‘mcp’. |
BackendHTTP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
version HTTPVersion |
version specifies the HTTP protocol version to use when connecting to the backend. If not specified, the version is automatically determined:
* Service types can specify it with 'appProtocol' on the Service port.
* If traffic is identified as gRPC, HTTP2 is used.
* If the incoming traffic was plaintext HTTP, the original protocol will be used.
* If the incoming traffic was HTTPS, HTTP1 will be used. This is because most clients will transparently upgrade HTTPS traffic to HTTP2, even if the backend doesn't support it.
| Optional | |
requestTimeout Duration |
requestTimeout specifies the deadline for receiving a response from the backend. | Optional |
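
To pin the protocol instead of relying on automatic detection, a BackendHTTP block could look like the sketch below. The value HTTP2 follows the spelling used in the version description, and treating it as a valid HTTPVersion value is an assumption. The timeout is a hypothetical value.

```yaml
# Fragment of a backend policy.
http:
  version: HTTP2        # assumption: enum value spelled as in the description
  requestTimeout: 30s   # deadline for receiving a response from the backend
```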
BackendMCP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
authorization Authorization |
authorization defines MCPBackend level authorization. Unlike authorization at the HTTP level, which will reject unauthorized requests with a 403 error, this policy works at the MCPBackend level. List operations, such as list_tools, will have each item evaluated. Items that do not meet the rule will be filtered. Get or call operations, such as call_tool, will evaluate the specific item and reject requests that do not meet the rule. | Optional | |
authentication MCPAuthentication |
authentication defines MCPBackend specific authentication rules. | Optional |
BackendSimple
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
tcp BackendTCP |
tcp defines settings for managing TCP connections to the backend. | Optional | |
tls BackendTLS |
tls defines settings for managing TLS connections to the backend. If this field is set, TLS will be initiated to the backend; the system trusted CA certificates will be used to validate the server, and the SNI will automatically be set based on the destination. | Optional | |
http BackendHTTP |
http defines settings for managing HTTP requests to the backend. | Optional | |
auth BackendAuth |
auth defines settings for managing authentication to the backend | Optional |
BackendTCP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
keepalive Keepalive |
keepalive defines settings for enabling TCP keepalives on the connection. | Optional | |
connectTimeout Duration |
connectTimeout defines the deadline for establishing a connection to the destination. | Optional XValidation |
BackendTLS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mtlsCertificateRef LocalObjectReference array |
mtlsCertificateRef enables mutual TLS to the backend, using the specified key (tls.key) and cert (tls.crt) from the referenced Secret. An optional 'ca.cert' field, if present, will be used to verify the server certificate. If caCertificateRefs is also specified, the caCertificateRefs field takes priority. If unspecified, no client certificate will be used. | Optional MaxItems: 1 |
|
caCertificateRefs LocalObjectReference array |
caCertificateRefs defines the CA certificate ConfigMap to use to verify the server certificate. If unset, the system’s trusted certificates are used. | Optional MaxItems: 1 |
|
insecureSkipVerify InsecureTLSMode |
insecureSkipVerify originates TLS but skips verification of the backend’s certificate. WARNING: This is an insecure option that should only be used if the risks are understood. There are two modes: * All disables all TLS verification * Hostname verifies the CA certificate is trusted, but ignores any mismatch of hostname/SANs. Note that this method is still insecure; prefer setting verifySubjectAltNames to customize the valid hostnames if possible. | Optional | |
sni SNI |
sni specifies the Server Name Indication (SNI) to be used in the TLS handshake. If unset, the SNI is automatically set based on the destination hostname. | Optional | |
verifySubjectAltNames ShortString array |
verifySubjectAltNames specifies the Subject Alternative Names (SAN) to verify in the server certificate. If not present, the destination hostname is automatically used. | Optional MinItems: 1 MaxItems: 16 |
|
alpnProtocols []TinyString |
alpnProtocols sets the Application Level Protocol Negotiation (ALPN) value to use in the TLS handshake. If not present, defaults to [“h2”, “http/1.1”]. | Optional MinItems: 1 MaxItems: 16 |
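As a rough illustration of how the BackendTLS fields combine, the fragment below verifies the backend against a custom CA and pins the expected SANs. This is a sketch only: the parent key `tls`, the ConfigMap name `backend-ca`, and the hostname `api.example.com` are assumptions for illustration and do not come from this reference.

```yaml
# Hypothetical BackendTLS fragment; the parent key `tls` is an assumption.
tls:
  caCertificateRefs:
  - name: backend-ca            # hypothetical ConfigMap holding the CA certificate
  sni: api.example.com          # hypothetical; if unset, derived from the destination hostname
  verifySubjectAltNames:
  - api.example.com             # SANs accepted in the server certificate
  alpnProtocols:                # same values as the documented default
  - h2
  - http/1.1
```

Setting verifySubjectAltNames this way is the documented alternative to the insecure `Hostname` mode of insecureSkipVerify.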
BackendWithAI
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
ai BackendAI |
ai specifies settings for AI workloads. This is only applicable when connecting to a Backend of type ‘ai’. |
BackendWithMCP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mcp BackendMCP |
mcp specifies settings for MCP workloads. This is only applicable when connecting to a Backend of type ‘mcp’. |
BasicAuthentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mode BasicAuthenticationMode |
validation mode for basic authentication. | Strict | Enum: [Strict Optional] |
realm string |
realm specifies the ‘realm’ to return in the WWW-Authenticate header for failed authentication requests. If unset, “Restricted” will be used. |
||
users string array |
users provides an inline list of username/password pairs that will be accepted. Each entry represents one line of the htpasswd format: https://httpd.apache.org/docs/2.4/programs/htpasswd.html. Note: passwords should be the hash of the password, not the raw password. Use the htpasswd or similar commands to generate a hash. MD5, bcrypt, crypt, and SHA-1 are supported. Example: users: - “user1:$apr1$ivPt0D4C$DmRhnewfHRSrb3DQC.WHC.” - “user2:$2y$05$r3J4d3VepzFkedkd/q1vI.pBYIpSqjfN0qOARV3ScUHysatnS0cL2” |
MaxItems: 256 MinItems: 1 |
|
secretRef LocalObjectReference |
secretRef references a Kubernetes secret storing the .htaccess file. The Secret must have a key named ‘.htaccess’, and should contain the complete .htaccess file. Note: passwords should be the hash of the password, not the raw password. Use the htpasswd or similar commands to generate a hash. MD5, bcrypt, crypt, and SHA-1 are supported. Example: apiVersion: v1 kind: Secret metadata: name: basic-auth stringData: .htaccess: | alice:$apr1$3zSE0Abt$IuETi4l5yO87MuOrbSE4V. bob:$apr1$Ukb5LgRD$EPY2lIfY.A54jzLELNIId/ |
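A minimal sketch of a BasicAuthentication block that combines the fields above. It assumes the `basic-auth` Secret from the secretRef example exists in the same namespace; the realm string is hypothetical, and where this block nests inside a policy is not shown in this section.

```yaml
# Hypothetical BasicAuthentication fragment (parent key not shown in this section).
mode: Strict                 # the default: a valid username and password must be present
realm: "Internal tools"      # hypothetical; "Restricted" is used if unset
secretRef:
  name: basic-auth           # Secret with a '.htaccess' key, as in the example above
```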
BasicAuthenticationMode
Underlying type: string
Validation:
- Enum: [Strict Optional]
Appears in:
| Field | Description |
|---|---|
Strict |
A valid username and password must be present. This is the default option. |
Optional |
If a username and password exist, validate them. Warning: this allows requests without a username! |
BedrockConfig
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
region string |
Region is the AWS region to use for the backend. Defaults to us-east-1 if not specified. |
us-east-1 | MaxLength: 63 MinLength: 1 Pattern: ^[a-z0-9-]+$ |
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. |
||
guardrail AWSGuardrailConfig |
Guardrail configures the Guardrail policy to use for the backend. See https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html If not specified, the AWS Guardrail policy will not be used. |
BuiltIn
Underlying type: string
BuiltIn regex patterns for specific types of strings in prompts.
For example, if you specify CreditCard, any credit card numbers
in the request or response are matched.
Validation:
- Enum: [Ssn CreditCard PhoneNumber Email CaSin]
Appears in:
| Field | Description |
|---|---|
Ssn |
Default regex matching for Social Security numbers. |
CreditCard |
Default regex matching for credit card numbers. |
PhoneNumber |
Default regex matching for phone numbers. |
Email |
Default regex matching for email addresses. |
CaSin |
Default regex matching for Canadian Social Insurance Numbers. |
CORS
Appears in:
CSRF
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
additionalOrigins string array |
additionalOrigins specifies additional source origins that will be allowed in addition to the destination origin. The origin consists of a scheme and a host, with an optional port, and takes the form <scheme>://<host>(:<port>). |
MaxItems: 16 MinItems: 1 |
CustomResponse
CustomResponse configures a response to return to the client if request content
is matched against a regex pattern and the action is REJECT.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
message string |
A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. |
The request was rejected due to inappropriate content | |
statusCode integer |
The status code to return to the client. Defaults to 403. | 403 | Maximum: 599 Minimum: 200 |
DirectResponse
DirectResponse defines the policy to send a direct response to the client.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
status integer |
StatusCode defines the HTTP status code to return for this route. | Maximum: 599 Minimum: 200 |
|
body string |
Body defines the content to be returned in the HTTP response body. The maximum length of the body is restricted to prevent excessively large responses. If this field is omitted, no body is included in the response. |
MaxLength: 4096 MinLength: 1 |
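For illustration, a DirectResponse using both fields might look like the fragment below. The values are arbitrary examples within the documented limits (status between 200 and 599, body between 1 and 4096 characters), and the parent key under which the block nests is not shown in this section.

```yaml
# Hypothetical DirectResponse fragment (parent key not shown in this section).
status: 503                                  # any value from 200 to 599
body: "Service temporarily unavailable"      # omit to send no body
```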
DynamicForwardProxyBackend
Appears in:
ExtAuth
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendObjectReference |
backendRef references the External Authorization server to reach. Supported types: Service and Backend. |
||
grpc AgentExtAuthGRPC |
grpc specifies that the gRPC External Authorization protocol should be used. |
||
http AgentExtAuthHTTP |
http specifies that the HTTP protocol should be used for connecting to the authorization server. The authorization server must return a 200 status code, otherwise the request is considered an authorization failure. |
||
forwardBody ExtAuthBody |
forwardBody configures whether to include the HTTP body in the request. If enabled, the request body will be buffered. |
ExtAuthBody
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
maxSize integer |
maxSize specifies the maximum body size, in bytes, that will be buffered and sent to the authorization server. If the body is larger than maxSize, the request is rejected with a response. |
Minimum: 1 |
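The ExtAuth and ExtAuthBody fields can be sketched together as follows. The Service name and port are hypothetical, and `grpc: {}` assumes that an empty object selects the gRPC protocol with default settings; the fields of AgentExtAuthGRPC are not listed in this section, so treat that line as an assumption.

```yaml
# Hypothetical ExtAuth fragment (parent key not shown in this section).
backendRef:
  name: ext-authz        # hypothetical Service running the authorization server
  port: 9000             # hypothetical port
grpc: {}                 # assumption: empty object selects gRPC with defaults
forwardBody:
  maxSize: 8192          # forward at most 8192 bytes; enabling this buffers the request body
```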
ExtProc
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendObjectReference |
backendRef references the External Processor server to reach. Supported types: Service and Backend. |
FieldDefault
FieldDefault provides default values for specific fields in the JSON request body sent to the LLM provider. These defaults are merged with the user-provided request to ensure missing fields are populated.
User input fields here refer to the fields in the JSON request body that a client sends when making a request to the LLM provider.
Defaults set here do not override those user-provided values unless you explicitly set override to true.
Example: Setting a default system field for Anthropic, which does not support system role messages:
defaults:
- field: "system"
  value: "answer all questions in French"
Example: Setting a default temperature and overriding max_tokens:
defaults:
- field: "temperature"
  value: "0.5"
- field: "max_tokens"
  value: "100"
  override: true
Example: Setting custom list fields:
defaults:
- field: "custom_integer_list"
  value: [1,2,3]
overrides:
- field: "custom_string_list"
  value: ["one","two","three"]
Note: The field values correspond to keys in the JSON request body, not fields in this CRD.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
field string |
The name of the field. | MinLength: 1 |
|
value JSON |
The field default value, which can be any JSON Data Type. |
Frontend
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
tcp FrontendTCP |
tcp defines settings for managing incoming TCP connections. | ||
tls FrontendTLS |
tls defines settings for managing incoming TLS connections. | ||
http FrontendHTTP |
http defines settings for managing incoming HTTP requests. | ||
accessLog AccessLog |
AccessLoggingConfig contains access logging configuration | ||
tracing Tracing |
Tracing contains various settings for OpenTelemetry tracer. TODO: not currently implemented |
FrontendHTTP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
maxBufferSize integer |
maxBufferSize defines the maximum size HTTP body that will be buffered into memory. Bodies will only be buffered for policies which require buffering. If unset, this defaults to 2mb. |
Minimum: 1 |
|
http1MaxHeaders integer |
http1MaxHeaders defines the maximum number of headers that are allowed in HTTP/1.1 requests. If unset, this defaults to 100. |
Maximum: 4096 Minimum: 1 |
|
http1IdleTimeout Duration |
http1IdleTimeout defines the timeout before an unused connection is closed. If unset, this defaults to 10 minutes. |
||
http2WindowSize integer |
http2WindowSize indicates the initial window size for stream-level flow control for received data. | Minimum: 1 |
|
http2ConnectionWindowSize integer |
http2ConnectionWindowSize indicates the initial window size for connection-level flow control for received data. | Minimum: 1 |
|
http2FrameSize integer |
http2FrameSize sets the maximum frame size to use. If unset, this defaults to 16kb |
Maximum: 1677215 Minimum: 16384 |
|
http2KeepaliveInterval Duration |
|||
http2KeepaliveTimeout Duration |
FrontendTCP
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
keepalive Keepalive |
keepalive defines settings for enabling TCP keepalives on the connection. |
FrontendTLS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
handshakeTimeout Duration |
handshakeTimeout specifies the deadline for a TLS handshake to complete. If unset, this defaults to 15s. |
||
alpnProtocols string |
alpnProtocols sets the Application Level Protocol Negotiation (ALPN) value to use in the TLS handshake. If not present, defaults to [“h2”, “http/1.1”]. |
MaxItems: 16 MinItems: 1 |
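To show how the Frontend, FrontendHTTP, FrontendTCP, and FrontendTLS types nest, here is a sketch of a Frontend block. The values are arbitrary examples chosen inside the documented validation ranges, the duration strings assume the same format as the documented defaults (for example 15s), and maxBufferSize is assumed to be in bytes. The parent key for the whole block is not shown in this section.

```yaml
# Hypothetical Frontend fragment; all values are illustrative.
tcp:
  keepalive:
    retries: 5               # probes before dropping the connection (1 to 64)
    time: 60s                # idle time before probes start
    interval: 30s            # time between probes
tls:
  handshakeTimeout: 10s      # documented default is 15s
http:
  maxBufferSize: 4194304     # assumed to be bytes
  http1MaxHeaders: 200       # documented default is 100
  http1IdleTimeout: 5m       # documented default is 10 minutes
  http2FrameSize: 32768      # must be at least 16384
```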
GeminiConfig
GeminiConfig settings for the Gemini LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
Optional: Override the model name, such as gemini-2.5-pro. If unset, the model name is taken from the request. |
GlobalRateLimit
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
backendRef BackendObjectReference |
backendRef references the Rate Limit server to reach. Supported types: Service and Backend. |
||
domain string |
domain specifies the domain under which this limit should apply. This is an arbitrary string that enables a rate limit server to distinguish between different applications. |
||
descriptors RateLimitDescriptor array |
Descriptors define the dimensions for rate limiting. These values are passed to the rate limit service which applies configured limits based on them. Each descriptor represents a single rate limit rule with one or more entries. |
MaxItems: 16 MinItems: 1 |
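As a sketch of a GlobalRateLimit that limits by client IP, the fragment below uses the `source.address` CEL expression mentioned in the RateLimitDescriptorEntry description. The Service name, port, domain, and descriptor name are hypothetical, and the matching limits must be configured separately on the rate limit server.

```yaml
# Hypothetical GlobalRateLimit fragment (nests under `global` in RateLimits).
backendRef:
  name: rate-limit-server    # hypothetical Service for the rate limit server
  port: 8081                 # hypothetical port
domain: my-app               # hypothetical; arbitrary string understood by the server
descriptors:
- entries:
  - name: client_ip          # hypothetical descriptor name
    expression: source.address
  unit: Requests             # the default cost function
```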
HTTPVersion
Underlying type: string
Appears in:
| Field | Description |
|---|---|
HTTP1 |
|
HTTP2 |
HeaderName
Underlying type: string
An HTTP Header Name.
Validation:
- MaxLength: 256
- MinLength: 1
- Pattern:
^:?[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$
Appears in:
HeaderTransformation
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name HeaderName |
the name of the header to add. | MaxLength: 256 MinLength: 1 Pattern: ^:?[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$ |
|
value CELExpression |
value is the CEL expression to apply to generate the output value for the header. |
HostnameRewrite
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mode HostnameRewriteMode |
mode sets the hostname rewrite mode. The following may be specified: * Auto: automatically set the Host header based on the destination. * None: do not rewrite the Host header. The original Host header will be passed through. This setting defaults to Auto when connecting to hostname-based Backend types, and None otherwise (for Service or IP-based Backends). |
HostnameRewriteMode
Underlying type: string
Appears in:
| Field | Description |
|---|---|
Auto |
|
None |
Image
A container image. See https://kubernetes.io/docs/concepts/containers/images for details.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
registry string |
The image registry. | ||
repository string |
The image repository (name). | ||
tag string |
The image tag. | ||
digest string |
The hash digest of the image, e.g. sha256:12345... |
||
pullPolicy PullPolicy |
The image pull policy for the container. See https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy for details. |
InsecureTLSMode
Underlying type: string
Appears in:
| Field | Description |
|---|---|
All |
InsecureTLSModeInsecure disables all TLS verification |
Hostname |
InsecureTLSModeHostname enables verifying the CA certificate, but disables verification of the hostname/SAN. Note this is still, generally, very “insecure” as the name suggests. |
JWKS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
remote RemoteJWKS |
remote specifies how to reach the JSON Web Key Set from a remote address. | ||
inline string |
inline specifies an inline JSON Web Key Set used to validate the signature of the JWT. | MaxLength: 65536 MinLength: 2 |
JWTAuthentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
mode JWTAuthenticationMode |
validation mode for JWT authentication. | Strict | Enum: [Strict Optional Permissive] |
providers JWTProvider array |
MaxItems: 64 MinItems: 1 |
JWTAuthenticationMode
Underlying type: string
Validation:
- Enum: [Strict Optional Permissive]
Appears in:
| Field | Description |
|---|---|
Strict |
A valid token, issued by a configured issuer, must be present. This is the default option. |
Optional |
If a token exists, validate it. Warning: this allows requests without a JWT token! |
Permissive |
Requests are never rejected. This is useful for usage of claims in later steps (authorization, logging, etc). Warning: this allows requests without a JWT token! |
JWTProvider
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
issuer string |
issuer identifies the IdP that issued the JWT. This corresponds to the ‘iss’ claim (https://tools.ietf.org/html/rfc7519#section-4.1.1). | ||
audiences string array |
audiences specifies the list of audiences that are allowed access. This corresponds to the ‘aud’ claim (https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.3). If unset, any audience is allowed. |
MaxItems: 64 MinItems: 1 |
|
jwks JWKS |
jwks defines the JSON Web Key Set used to validate the signature of the JWT. |
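A sketch of how JWTAuthentication, JWTProvider, JWKS, and RemoteJWKS fit together. The issuer URL, audience, and backend name are hypothetical, and the parent key for the block is not shown in this section. The `jwksPath` value is the common path named in the RemoteJWKS description.

```yaml
# Hypothetical JWTAuthentication fragment (parent key not shown in this section).
mode: Strict                           # reject requests without a valid token
providers:
- issuer: https://idp.example.com/     # hypothetical; must match the 'iss' claim
  audiences:
  - my-api                             # hypothetical; omit to allow any audience
  jwks:
    remote:
      jwksPath: .well-known/jwks.json
      cacheDuration: 5m
      backendRef:
        name: idp                      # hypothetical Service or Backend for the IdP
        port: 443
```

An inline key set could be supplied through `jwks.inline` instead of `jwks.remote`.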
Keepalive
TCP Keepalive settings
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
retries integer |
retries specifies the maximum number of keep-alive probes to send before dropping the connection. If unset, this defaults to 9. |
Maximum: 64 Minimum: 1 |
|
time Duration |
time specifies the number of seconds a connection needs to be idle before keep-alive probes start being sent. If unset, this defaults to 180s. |
||
interval Duration |
interval specifies the number of seconds between keep-alive probes. If unset, this defaults to 180s. |
KubernetesResourceOverlay
KubernetesResourceOverlay provides a mechanism to customize generated Kubernetes resources using Strategic Merge Patch semantics.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
metadata AgentgatewayParametersObjectMetadata |
Refer to Kubernetes API documentation for fields of metadata. |
||
spec JSON |
Spec provides an opaque mechanism to configure the resource Spec. This field accepts a complete or partial Kubernetes resource spec (e.g., PodSpec, ServiceSpec) and will be merged with the generated configuration using Strategic Merge Patch semantics. The patch is applied after all other fields are applied. If you merge-patch the same resource from AgentgatewayParameters on the GatewayClass and also from AgentgatewayParameters on the Gateway, then the GatewayClass merge-patch happens first. Strategic Merge Patch & Deletion Guide: This merge strategy allows you to override individual fields, merge lists, or delete items without needing to provide the entire resource definition. 1. Replacing Values (Scalars): Simple fields (strings, integers, booleans) in your config will overwrite the generated defaults. 2. Merging Lists (Append/Merge): Lists with “merge keys” (like containers, which merge on name, or tolerations, which merge on key) will append your items to the generated list, or update existing items if keys match. 3. Deleting List Items ($patch: delete): To remove an item from a generated list (e.g., removing a default sidecar), you must use the special $patch: delete directive. Example: spec: containers: - name: agentgateway # Delete the securityContext using $patch: delete securityContext: $patch: delete 4. Deleting/Clearing Map Fields (null): To remove a map field or a scalar entirely, set its value to null. Example: spec: template: spec: nodeSelector: null # Removes default nodeSelector 5. Replacing Lists Entirely ($patch: replace): If you want to strictly define a list and ignore all generated defaults, use $patch: replace. Example: service: spec: ports: - $patch: replace - name: http port: 80 targetPort: 8080 protocol: TCP - name: https port: 443 targetPort: 8443 protocol: TCP |
Type: object |
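The three inline snippets from the spec description are easier to read when laid out as YAML. The nesting below is reproduced from the description as written; where each fragment sits inside the full AgentgatewayParameters resource is not shown in this section, so treat the top-level keys as illustrative.

```yaml
# 3. Delete with $patch: delete
spec:
  containers:
  - name: agentgateway
    # Delete the securityContext using $patch: delete
    securityContext:
      $patch: delete
---
# 4. Clear a map field or scalar by setting it to null
spec:
  template:
    spec:
      nodeSelector: null   # Removes default nodeSelector
---
# 5. Replace a list entirely with $patch: replace
service:
  spec:
    ports:
    - $patch: replace
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
    - name: https
      port: 443
      targetPort: 8443
      protocol: TCP
```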
LLMProvider
LLMProvider specifies the target large language model provider that the backend should route requests to.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
openai OpenAIConfig |
OpenAI provider | ||
azureopenai AzureOpenAIConfig |
Azure OpenAI provider | ||
anthropic AnthropicConfig |
Anthropic provider | ||
gemini GeminiConfig |
Gemini provider | ||
vertexai VertexAIConfig |
Vertex AI provider | ||
bedrock BedrockConfig |
Bedrock provider | ||
host string |
Host specifies the hostname to send the requests to. If not specified, the default hostname for the provider is used. |
||
port integer |
Port specifies the port to send the requests to. | Maximum: 65535 Minimum: 1 |
|
path string |
Path specifies the URL path to use for the LLM provider API requests. This is useful when you need to route requests to a different API endpoint while maintaining compatibility with the original provider’s API structure. If not specified, the default path for the provider is used. |
LocalRateLimit
Policy for local rate limiting. Local rate limits are handled locally on a per-proxy basis, without co-ordination between instances of the proxy.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
requests integer |
requests specifies the number of HTTP requests per unit of time that are allowed. Requests exceeding this limit will fail with a 429 error. |
Minimum: 1 |
|
tokens integer |
tokens specifies the number of LLM tokens per unit of time that are allowed. Requests exceeding this limit will fail with a 429 error. Both input and output tokens are counted. However, token counts are not known until the request completes. As a result, token-based rate limits will apply to future requests only. |
Minimum: 1 |
|
unit LocalRateLimitUnit |
unit specifies the unit of time over which requests are limited. | Enum: [Seconds Minutes Hours] |
|
burst integer |
burst specifies an allowance of requests above the request-per-unit that should be allowed within a short period of time. |
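A sketch of two local limits, one counted in requests and one in LLM tokens. The `local` key comes from the RateLimits type, which holds an array of LocalRateLimit entries; the numbers are arbitrary examples.

```yaml
# Hypothetical LocalRateLimit entries under the `local` key of RateLimits.
local:
- requests: 100      # 100 HTTP requests per minute, enforced per proxy instance
  unit: Minutes
  burst: 20          # short-term allowance above the per-minute rate
- tokens: 50000      # 50,000 LLM tokens (input plus output) per hour
  unit: Hours
```

Because token counts are known only after a request completes, the token limit affects later requests rather than the one that crosses the threshold.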
LocalRateLimitUnit
Underlying type: string
Appears in:
| Field | Description |
|---|---|
Seconds |
|
Minutes |
|
Hours |
LogTracingAttributes
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
remove string array |
remove lists the default fields that should be removed. For example, “http.method”. | MaxItems: 32 MinItems: 1 |
|
add AttributeAdd array |
add specifies additional key-value pairs to be added to each entry. The value is a CEL expression. If the CEL expression fails to evaluate, the pair will be excluded. |
MinItems: 1 |
MCPAuthentication
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
resourceMetadata object (keys:string, values:JSON) |
ResourceMetadata defines the metadata to use for MCP resources. | ||
provider McpIDP |
McpIDP specifies the identity provider to use for authentication | Enum: [Auth0 Keycloak] |
|
issuer string |
Issuer identifies the IdP that issued the JWT. This corresponds to the ‘iss’ claim (https://tools.ietf.org/html/rfc7519#section-4.1.1). | ||
audiences string array |
audiences specifies the list of audiences that are allowed access. This corresponds to the ‘aud’ claim (https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.3). If unset, any audience is allowed. |
MaxItems: 64 MinItems: 1 |
|
jwks RemoteJWKS |
jwks defines the remote JSON Web Key used to validate the signature of the JWT. | ||
mode JWTAuthenticationMode |
validation mode for JWT authentication. | Enum: [Strict Optional Permissive] |
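For illustration, an MCPAuthentication block for a Keycloak realm might look like the fragment below. The issuer, audience, and backend name are hypothetical, the `jwksPath` value follows the usual Keycloak certificate endpoint layout for a realm named `demo`, and the parent key for the block is not shown in this section.

```yaml
# Hypothetical MCPAuthentication fragment (parent key not shown in this section).
provider: Keycloak
issuer: https://keycloak.example.com/realms/demo        # hypothetical; matches the 'iss' claim
audiences:
- mcp-clients                                           # hypothetical audience
jwks:
  jwksPath: realms/demo/protocol/openid-connect/certs   # assumed Keycloak realm layout
  backendRef:
    name: keycloak                                      # hypothetical Service or Backend
    port: 8080
mode: Strict
```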
MCPBackend
MCPBackend configures mcp backends
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
targets McpTargetSelector array |
Targets is a list of MCPBackend targets to use for this backend. Policies targeting MCPBackend targets must use targetRefs[].sectionName to select the target by name. |
MaxItems: 32 MinItems: 1 |
|
sessionRouting SessionRouting |
SessionRouting configures MCP session behavior for requests. Defaults to Stateful if not set. |
Enum: [Stateful Stateless] |
MCPProtocol
Underlying type: string
MCPProtocol defines the protocol to use for the MCPBackend target
Validation:
- Enum: [StreamableHTTP SSE]
Appears in:
| Field | Description |
|---|---|
StreamableHTTP |
MCPProtocolStreamableHTTP specifies Streamable HTTP must be used as the protocol |
SSE |
MCPProtocolSSE specifies Server-Sent Events (SSE) must be used as the protocol |
McpIDP
Underlying type: string
Appears in:
| Field | Description |
|---|---|
Auth0 |
|
Keycloak |
McpSelector
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
namespaces LabelSelector |
namespaces is the label selector for the namespaces from which Services are selected. If unset, only the namespace of the AgentgatewayBackend is searched. |
||
services LabelSelector |
services is the label selector for which Services should be selected. |
McpTarget
McpTarget defines a single MCPBackend target configuration.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
host string |
Host is the hostname or IP address of the MCPBackend target. | ||
port integer |
Port is the port number of the MCPBackend target. | Maximum: 65535 Minimum: 1 |
|
path string |
Path is the URL path of the MCPBackend target endpoint. Defaults to “/sse” for SSE protocol or “/mcp” for StreamableHTTP protocol if not specified. |
||
protocol MCPProtocol |
Protocol is the protocol to use for the connection to the MCPBackend target. | Enum: [StreamableHTTP SSE] |
|
policies BackendWithMCP |
policies controls policies for communicating with this backend. Policies may also be set in AgentgatewayPolicy, or in the top level AgentgatewayBackend. Policies are merged on a field-level basis, with order: AgentgatewayPolicy < AgentgatewayBackend < AgentgatewayBackend MCP (this field). |
McpTargetSelector
McpTargetSelector defines the MCPBackend target to use for this backend.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name SectionName |
Name of the MCPBackend target. | ||
selector McpSelector |
selector is the label selector used to select Services. If policies are needed on a per-service basis, AgentgatewayPolicy can target the desired Service. |
||
static McpTarget |
static configures a static MCP destination. When connecting to in-cluster Services, it is recommended to use ‘selector’ instead. |
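The MCPBackend, McpTargetSelector, McpTarget, and McpSelector types can be sketched together as one static target and one selector-based target. Target names, the hostname, and the label are hypothetical; `matchLabels` is the standard Kubernetes LabelSelector field. The parent key for the block is not shown in this section.

```yaml
# Hypothetical MCPBackend fragment (parent key not shown in this section).
targets:
- name: external-tools          # policies select this target via targetRefs[].sectionName
  static:
    host: mcp.example.com       # hypothetical host
    port: 8080
    path: /mcp                  # documented default for StreamableHTTP
    protocol: StreamableHTTP
- name: in-cluster-tools
  selector:
    services:
      matchLabels:
        app: mcp-server         # hypothetical label on in-cluster Services
sessionRouting: Stateful        # the default
```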
Message
An entry for a message to prepend or append to each prompt.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
role string |
Role of the message. The available roles depend on the backend LLM provider model, such as SYSTEM or USER in the OpenAI API. |
||
content string |
String content of the message. |
NamedLLMProvider
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name SectionName |
Name of the provider. Policies can target this provider by name. | ||
policies BackendWithAI |
policies controls policies for communicating with this backend. Policies may also be set in AgentgatewayPolicy, or in the top level AgentgatewayBackend. Policies are merged on a field-level basis, with order: AgentgatewayPolicy < AgentgatewayBackend < AgentgatewayBackend LLM provider (this field). |
||
openai OpenAIConfig |
OpenAI provider | ||
azureopenai AzureOpenAIConfig |
Azure OpenAI provider | ||
anthropic AnthropicConfig |
Anthropic provider | ||
gemini GeminiConfig |
Gemini provider | ||
vertexai VertexAIConfig |
Vertex AI provider | ||
bedrock BedrockConfig |
Bedrock provider | ||
host string |
Host specifies the hostname to send the requests to. If not specified, the default hostname for the provider is used. |
||
port integer |
Port specifies the port to send the requests to. | Maximum: 65535 Minimum: 1 |
|
path string |
Path specifies the URL path to use for the LLM provider API requests. This is useful when you need to route requests to a different API endpoint while maintaining compatibility with the original provider’s API structure. If not specified, the default path for the provider is used. |
OpenAIConfig
OpenAIConfig settings for the OpenAI LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. |
OpenAIModeration
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
model string |
model specifies the moderation model to use. For example, omni-moderation. |
||
policies BackendSimple |
policies controls policies for communicating with OpenAI. |
PolicyPhase
Underlying type: string
Validation:
- Enum: [PreRouting PostRouting]
Appears in:
| Field | Description |
|---|---|
PreRouting |
|
PostRouting |
PriorityGroup
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
providers NamedLLMProvider array |
providers specifies a list of LLM providers within this group. Each provider is treated equally in terms of priority, with automatic weighting based on health. |
MaxItems: 32 MinItems: 1 |
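A sketch of two priority groups with named providers, using only the `name`, `openai`, and `gemini` fields documented above. The provider names are hypothetical; they must be unique across all groups so that policies can target them by name.

```yaml
# Hypothetical `groups` fragment for an AI backend; provider names are illustrative.
groups:
- providers:                  # first group: highest priority
  - name: primary-openai
    openai:
      model: gpt-4o-mini      # overrides the model name from the request
- providers:                  # second group: used as a lower-priority fallback
  - name: fallback-gemini
    gemini:
      model: gemini-2.5-pro
```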
PromptCachingConfig
PromptCachingConfig configures automatic prompt caching for supported LLM providers. Currently only AWS Bedrock supports this feature (Claude 3+ and Nova models).
When enabled, the gateway automatically inserts cache points at strategic locations to reduce API costs. Bedrock charges lower rates for cached tokens (90% discount).
Example:
promptCaching:
  cacheSystem: true    # Cache system prompts
  cacheMessages: true  # Cache conversation history
  cacheTools: false    # Don't cache tool definitions
  minTokens: 1024      # Only cache if ≥1024 tokens
Cost savings example:
- Without caching: 10,000 tokens × $3/MTok = $0.03
- With caching (90% cached): 1,000 × $3/MTok + 9,000 × $0.30/MTok = $0.0057 (81% savings)
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
cacheSystem boolean |
CacheSystem enables caching for system prompts. Inserts a cache point after all system messages. |
true | |
cacheMessages boolean |
CacheMessages enables caching for conversation messages. Caches all messages in the conversation for cost savings. |
true | |
cacheTools boolean |
CacheTools enables caching for tool definitions. Inserts a cache point after all tool specifications. |
false | |
minTokens integer |
MinTokens specifies the minimum estimated token count before caching is enabled. Uses a rough heuristic (word count × 1.3) to estimate tokens. Bedrock requires at least 1,024 tokens for caching to be effective. |
1024 | Minimum: 0 |
PromptguardRequest
PromptguardRequest defines the prompt guards to apply to requests sent by the client.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
response CustomResponse |
A custom response message to return to the client. If not specified, defaults to “The request was rejected due to inappropriate content”. |
||
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook to forward requests to for prompt guarding. | ||
openAIModeration OpenAIModeration |
openAIModeration passes prompt data through the OpenAI Moderations endpoint. See https://platform.openai.com/docs/api-reference/moderations for more information. |
PromptguardResponse
PromptguardResponse configures the response that the prompt guard applies to responses returned by the LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
response CustomResponse |
A custom response message to return to the client. If not specified, defaults to “The response was rejected due to inappropriate content”. |
||
regex Regex |
Regular expression (regex) matching for prompt guards and data masking. | ||
webhook Webhook |
Configure a webhook to forward responses to for prompt guarding. |
RateLimitDescriptor
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
entries RateLimitDescriptorEntry array |
entries are the individual components that make up this descriptor. | MaxItems: 16 MinItems: 1 |
|
unit RateLimitUnit |
unit defines what to use as the cost function. If unspecified, Requests is used. | Enum: [Requests Tokens] |
RateLimitDescriptorEntry
A descriptor entry defines a single entry in a rate limit descriptor.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
name TinyString |
name specifies the name of the descriptor. | Required | |
expression CELExpression |
expression is a Common Expression Language (CEL) expression that defines the value for the descriptor. For example, to rate limit based on the Client IP: source.address. See https://agentgateway.dev/docs/reference/cel/ for more info. |
Required |
RateLimits
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
local LocalRateLimit array |
Local defines a local rate limiting policy. | MaxItems: 16 MinItems: 1 |
|
global GlobalRateLimit |
Global defines a global rate limiting policy using an external service. |
Regex
Regex configures the regular expression (regex) matching for prompt guards and data masking.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
matches string array |
A list of regex patterns to match against the request or response. Matches and built-ins are additive. |
||
builtins BuiltIn array |
A list of built-in regex patterns to match against the request or response. Matches and built-ins are additive. |
Enum: [Ssn CreditCard PhoneNumber Email CaSin] |
|
action Action |
The action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default. Defaults to Mask. |
Mask | Enum: [Mask Reject] |
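As a sketch of how Regex combines with CustomResponse inside a PromptguardRequest, the fragment below rejects requests that contain credit card numbers, email addresses, or a custom pattern. The custom pattern and the message text are hypothetical, and the parent key for the block is not shown in this section.

```yaml
# Hypothetical PromptguardRequest fragment (parent key not shown in this section).
response:
  message: "Request blocked: sensitive data detected"   # hypothetical message
  statusCode: 403                                        # the documented default
regex:
  matches:
  - "internal-project-[0-9]+"   # hypothetical custom pattern
  builtins:                     # additive with `matches`
  - CreditCard
  - Email
  action: Reject                # default is Mask; applies to request matches only
```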
RemoteJWKS
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
jwksPath string |
Path to IdP jwks endpoint, relative to the root, commonly “.well-known/jwks.json”. | Required MinLength: 1 MaxLength: 2000 |
|
cacheDuration Duration |
5m | Required Optional MinLength: 1 MaxLength: 2000 XValidation |
|
backendRef BackendObjectReference |
backendRef references the remote JWKS server to reach. Supported types are Service and (static) Backend. An AgentgatewayPolicy containing backend tls config can then be attached to the service/backend in order to set tls options for a connection to the remote jwks source. | 5m | Required Optional XValidation |
Retry
Retry defines the retry policy.
Appears in:
RouteType
Underlying type: string
RouteType specifies how the AI gateway should process incoming requests based on the URL path and the API format expected.
Validation:
- Enum: [Completions Messages Models Passthrough Responses AnthropicTokenCount Embeddings]
Appears in:
| Field | Description |
|---|---|
| Completions | RouteTypeCompletions processes OpenAI /v1/chat/completions format requests. |
| Messages | RouteTypeMessages processes Anthropic /v1/messages format requests. |
| Models | RouteTypeModels handles the /v1/models endpoint (returns available models). |
| Passthrough | RouteTypePassthrough sends requests to upstream as-is without LLM processing. |
| Responses | RouteTypeResponses processes OpenAI /v1/responses format requests. |
| AnthropicTokenCount | RouteTypeAnthropicTokenCount processes Anthropic /v1/messages/count_tokens format requests. |
| Embeddings | RouteTypeEmbeddings processes OpenAI /v1/embeddings format requests. |
SecretSelector
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| matchLabels object (keys:string, values:string) | Label selector to select the target resource. | | |
SessionRouting
Underlying type: string
Validation:
- Enum: [Stateful Stateless]
Appears in:
| Field | Description |
|---|---|
| Stateful | Stateful mode creates an MCP session (via mcp-session-id) and internally ensures requests for that session are routed to a consistent backend replica. |
| Stateless | |
ShutdownSpec
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| min integer | Minimum time (in seconds) to wait before allowing Agentgateway to terminate. Refer to the CONNECTION_MIN_TERMINATION_DEADLINE environment variable for details. | | Maximum: 31536000 Minimum: 0 |
| max integer | Maximum time (in seconds) to wait before allowing Agentgateway to terminate. Refer to the TERMINATION_GRACE_PERIOD_SECONDS environment variable for details. | | Maximum: 31536000 Minimum: 0 |
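A minimal sketch of a ShutdownSpec fragment follows. The enclosing `shutdown` key and the chosen values are assumptions for illustration; both fields are in seconds and must fall between 0 and 31536000.

```yaml
# Hypothetical fragment: the parent key `shutdown` is assumed.
shutdown:
  min: 10   # wait at least 10 seconds before terminating
  max: 60   # wait at most 60 seconds before terminating
```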
StaticBackend
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| host string | host to connect to. | | |
| port integer | port to connect to. | | Maximum: 65535 Minimum: 1 |
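For example, a StaticBackend fragment might look like the following sketch. The enclosing `static` key and the host value are assumptions; `port` must be between 1 and 65535.

```yaml
# Hypothetical fragment: the parent key `static` and the host name are assumed.
static:
  host: example.internal
  port: 8080
```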
Timeouts
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| request Duration | request specifies a timeout for an individual request from the gateway to a backend. This covers the time from when the request first starts being sent from the gateway to when the full response has been received from the backend. | | |
Tracing
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| backendRef BackendObjectReference | backendRef references the OTLP server to reach. Supported types: Service and Backend. | | |
| protocol TracingProtocol | protocol specifies the OTLP protocol variant to use. | HTTP | Enum: [HTTP GRPC] |
| attributes LogTracingAttributes | attributes specifies customizations to the key-value pairs that are included in the trace. | | |
| randomSampling CELExpression | randomSampling is an expression to determine the amount of random sampling. Random sampling will initiate a new trace span if the incoming request does not have a trace initiated already. This should evaluate to a float between 0.0 and 1.0, or a boolean (true/false). If unspecified, random sampling is disabled. | | |
| clientSampling CELExpression | clientSampling is an expression to determine the amount of client sampling. Client sampling determines whether to initiate a new trace span if the incoming request does have a trace already. This should evaluate to a float between 0.0 and 1.0, or a boolean (true/false). If unspecified, client sampling is 100% enabled. | | |
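The sampling fields are CEL expressions, so they are written as strings that evaluate to a float or a boolean. The sketch below shows one plausible combination. The enclosing `tracing` key, the Service name, and the port are assumptions. The `name` and `port` fields are assumed to follow the Gateway API BackendObjectReference shape.

```yaml
# Hypothetical fragment: the parent key `tracing` and the Service name/port are assumed.
tracing:
  backendRef:
    name: otel-collector   # hypothetical Service exposing an OTLP endpoint
    port: 4317
  protocol: GRPC           # documented default is HTTP
  # CEL expression evaluating to a float: sample about 10% of requests without an existing trace.
  randomSampling: "0.1"
  # CEL expression evaluating to a boolean: always continue traces started by the client.
  clientSampling: "true"
```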
TracingProtocol
Underlying type: string
Appears in:
| Field | Description |
|---|---|
| HTTP | |
| GRPC | |
Traffic
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| phase PolicyPhase | The phase to apply the traffic policy to. If the phase is PreRouting, the targetRef must be a Gateway or a Listener. PreRouting is typically used only when a policy needs to influence the routing decision. Even when using PostRouting mode, the policy can target the Gateway/Listener. This is a helper for applying the policy to all routes under that Gateway/Listener, and follows the merging logic described above. Note: PreRouting and PostRouting rules do not merge together. These are independent execution phases. That is, all PreRouting rules will merge and execute, then all PostRouting rules will merge and execute. If unset, this defaults to PostRouting. | | Enum: [PreRouting PostRouting] |
| transformation Transformation | transformation is used to mutate and transform requests and responses before forwarding them to the destination. | | |
| extProc ExtProc | extProc specifies the external processing configuration for the policy. | | |
| extAuth ExtAuth | extAuth specifies the external authentication configuration for the policy. This controls what external server to send requests to for authentication. | | |
| rateLimit RateLimits | rateLimit specifies the rate limiting configuration for the policy. This controls the rate at which requests are allowed to be processed. | | |
| cors CORS | cors specifies the CORS configuration for the policy. | | |
| csrf CSRF | csrf specifies the Cross-Site Request Forgery (CSRF) policy for this traffic policy. The CSRF policy has the following behavior: safe methods (GET, HEAD, OPTIONS) are automatically allowed; requests without Sec-Fetch-Site or Origin headers are assumed to be same-origin or non-browser requests and are allowed; otherwise, the Sec-Fetch-Site header is checked, with a fallback to comparing the Origin header to the Host header. | | |
| headerModifiers HeaderModifiers | headerModifiers defines the policy to modify request and response headers. | | |
| hostRewrite HostnameRewrite | hostRewrite specifies how to rewrite the Host header for requests. If the HTTPRoute urlRewrite filter already specifies a host rewrite, this setting is ignored. | | Enum: [Auto None] |
| timeouts Timeouts | timeouts defines the timeouts for requests. It is applicable to HTTPRoutes and ignored for other targeted kinds. | | |
| retry Retry | retry defines the policy for retrying requests. | | |
| authorization Authorization | authorization specifies the access rules based on roles and permissions. If multiple authorization rules are applied across different policies (at the same or different attachment points), all rules are merged. | | |
| jwtAuthentication JWTAuthentication | jwtAuthentication authenticates users based on JWT tokens. | | |
| basicAuthentication BasicAuthentication | basicAuthentication authenticates users based on the “Basic” authentication scheme (RFC 7617), where a username and password are encoded in the request. | | |
| apiKeyAuthentication APIKeyAuthentication | apiKeyAuthentication authenticates users based on a configured API Key. | | |
| directResponse DirectResponse | directResponse configures the policy to send a direct response to the client. | | |
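To make the phase and timeout fields concrete, here is a minimal sketch of a Traffic fragment. The enclosing `traffic` key and the 30s value are assumptions; the field names and the phase enum come from the tables in this reference.

```yaml
# Hypothetical fragment: the parent key `traffic` is assumed.
traffic:
  # PostRouting is the documented default; PreRouting requires a Gateway or Listener target.
  phase: PostRouting
  timeouts:
    # Per-request timeout from the gateway to the backend; applies to HTTPRoutes only.
    request: 30s
```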
Transform
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| set HeaderTransformation array | set is a list of headers and the value they should be set to. | | MaxItems: 16 MinItems: 1 |
| add HeaderTransformation array | add is a list of headers to add to the request and what that value should be set to. If there is already a header with these values then append the value as an extra entry. | | MaxItems: 16 MinItems: 1 |
| remove HeaderName array | Remove is a list of header names to remove from the request/response. | | MaxItems: 16 MaxLength: 256 MinItems: 1 MinLength: 1 Pattern: ^:?[A-Za-z0-9!#$%&'*+\-.^_\x60\|~]+$ |
| body CELExpression | body controls manipulation of the HTTP body. | | |
Transformation
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| request Transform | request is used to modify the request path. | | |
| response Transform | response is used to modify the response path. | | |
VertexAIConfig
VertexAIConfig settings for the Vertex AI LLM provider.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| model string | Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request. | | |
| projectId string | The ID of the Google Cloud Project that you use for Vertex AI. | | MinLength: 1 |
| region string | The location of the Google Cloud Project that you use for Vertex AI. | | MinLength: 1 |
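A minimal sketch of a VertexAIConfig fragment is shown below. The enclosing `vertexai` key, the project ID, and the region value are assumptions for illustration. `model` is omitted, so the model name would be taken from the request.

```yaml
# Hypothetical fragment: the parent key `vertexai` and both values are assumed.
vertexai:
  projectId: my-gcp-project   # hypothetical Google Cloud project ID
  region: us-central1         # hypothetical project location
```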
Webhook
Webhook configures a webhook to forward requests or responses to for prompt guarding.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| backendRef BackendObjectReference | backendRef references the webhook server to reach. Supported types: Service and Backend. | | |
| forwardHeaderMatches HTTPHeaderMatch array | ForwardHeaderMatches defines a list of HTTP header matches that will be used to select the headers to forward to the webhook. Request headers are used when forwarding requests and response headers are used when forwarding responses. By default, no headers are forwarded. | | |