Observe traffic
Review LLM-specific metrics and logs.
Before you begin
Complete an LLM guide, such as one of the LLM provider-specific guides. These guides send a request to the LLM and receive a response, which you can use to verify the metrics and logs in the following sections.
View LLM metrics
You can access the agentgateway metrics endpoint to view LLM-specific metrics, such as the number of tokens that a request or response consumed.
- Port-forward the agentgateway proxy on port 15020.
kubectl port-forward deployment/agentgateway -n kgateway-system 15020
- Open the agentgateway metrics endpoint.
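For example, assuming that the metrics are exposed on the /metrics path of the port that you forwarded (the path might differ in your setup):
curl http://localhost:15020/metrics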
- Look for the agentgateway_gen_ai_client_token_usage metric. This metric is a histogram and includes important information about the request and the response from the LLM, such as the following labels:
  - gen_ai_token_type: Whether this metric is about a request (input) or a response (output).
  - gen_ai_operation_name: The name of the operation that was performed.
  - gen_ai_system: The LLM provider that was used for the request/response.
  - gen_ai_request_model: The model that was used for the request.
  - gen_ai_response_model: The model that was used for the response.
For more information, see the Semantic conventions for generative AI metrics in the OpenTelemetry docs.
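Because the metric is a histogram, it is exposed as the standard Prometheus _bucket, _sum, and _count series. The following sketch shows what the _sum series might look like for the token counts from the example request later in this guide; the exact label sets and values shown here are illustrative and depend on your provider and request:
agentgateway_gen_ai_client_token_usage_sum{gen_ai_operation_name="chat",gen_ai_system="openai",gen_ai_token_type="input"} 39
agentgateway_gen_ai_client_token_usage_sum{gen_ai_operation_name="chat",gen_ai_system="openai",gen_ai_token_type="output",gen_ai_response_model="gpt-3.5-turbo-0125"} 181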
View logs
Agentgateway automatically logs information to stdout. When you run agentgateway on your local machine, you can view a log entry for each request in your CLI output. When agentgateway runs in a cluster, view the logs of the agentgateway pod:
kubectl logs <agentgateway-pod> -n kgateway-system
Example for a successful request to the OpenAI LLM:
2025-09-12T18:23:54.661414Z info request gateway=kgateway-system/agentgateway listener=http
route=kgateway-system/openai endpoint=api.openai.com:443 src.addr=10.0.9.76:38655
http.method=POST http.host=a1cff4bd974a34d8b882b2fa01d357f0-119963959.us-east-2.elb.amazonaws.com
http.path=/openai http.version=HTTP/1.1 http.status=200 llm.provider=openai
llm.request.model= llm.request.tokens=39 llm.response.model=gpt-3.5-turbo-0125
llm.response.tokens=181 duration=3804ms
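To stream log entries as new requests arrive, you can follow the logs instead:
kubectl logs -f deploy/agentgateway -n kgateway-system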