Observe traffic

Review LLM-specific metrics and logs.

Before you begin

Complete an LLM guide, such as one of the LLM provider-specific guides. In that guide, you send a request to the LLM and receive a response. You can use that request and response to verify the metrics and logs on this page.

View LLM metrics

You can access the agentgateway metrics endpoint to view LLM-specific metrics, such as the number of tokens that were used by a request or a response.

  1. Port-forward the agentgateway proxy on port 15020.
    kubectl port-forward deployment/agentgateway -n kgateway-system 15020  
  2. Open the agentgateway metrics endpoint.
  3. Look for the agentgateway_gen_ai_client_token_usage metric. This metric is a histogram, and its labels describe the request to and the response from the LLM, such as:
    • gen_ai_token_type: Whether this metric is about a request (input) or response (output).
    • gen_ai_operation_name: The name of the operation that was performed.
    • gen_ai_system: The LLM provider that was used for the request/response.
    • gen_ai_request_model: The model that was used for the request.
    • gen_ai_response_model: The model that was used for the response.

For more information, see the Semantic conventions for generative AI metrics in the OpenTelemetry docs.
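
To make the histogram easier to read, the following minimal sketch totals token usage per gen_ai_token_type from text copied out of the metrics endpoint. It assumes that the endpoint returns the Prometheus text exposition format, in which a histogram is exposed as _bucket, _sum, and _count series. The function name token_totals and the sample lines are hypothetical illustrations, not actual agentgateway output.

```python
import re
from collections import defaultdict

METRIC = "agentgateway_gen_ai_client_token_usage"

# One sample line: name{label="value",...} value [optional timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)(?:\s+\d+)?$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def token_totals(metrics_text):
    """Sum the histogram's _sum series, grouped by gen_ai_token_type."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match or match.group("name") != METRIC + "_sum":
            continue  # only the _sum series carries the token totals
        labels = dict(LABEL_RE.findall(match.group("labels")))
        token_type = labels.get("gen_ai_token_type", "unknown")
        totals[token_type] += float(match.group("value"))
    return dict(totals)


# Hypothetical sample, shaped like Prometheus histogram output.
SAMPLE_METRICS = """\
# TYPE agentgateway_gen_ai_client_token_usage histogram
agentgateway_gen_ai_client_token_usage_bucket{gen_ai_token_type="input",gen_ai_system="openai",le="+Inf"} 1
agentgateway_gen_ai_client_token_usage_sum{gen_ai_token_type="input",gen_ai_system="openai"} 39
agentgateway_gen_ai_client_token_usage_count{gen_ai_token_type="input",gen_ai_system="openai"} 1
agentgateway_gen_ai_client_token_usage_sum{gen_ai_token_type="output",gen_ai_system="openai"} 181
agentgateway_gen_ai_client_token_usage_count{gen_ai_token_type="output",gen_ai_system="openai"} 1
"""

if __name__ == "__main__":
    print(token_totals(SAMPLE_METRICS))
```

Summing only the _sum series avoids double counting, because the _bucket series are cumulative counts of observations, not token totals.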

View logs

Agentgateway automatically logs information to stdout. When you run agentgateway on your local machine, you can view a log entry for each request that is sent to agentgateway in your CLI output.

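To view the logs of the agentgateway proxy in your cluster, replace <agentgateway-pod> with the name of the proxy pod: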

kubectl logs <agentgateway-pod> -n kgateway-system

Example log entry for a successful request to the OpenAI LLM:

2025-09-12T18:23:54.661414Z	info	request gateway=kgateway-system/agentgateway listener=http 
route=kgateway-system/openai endpoint=api.openai.com:443 src.addr=10.0.9.76:38655 
http.method=POST http.host=a1cff4bd974a34d8b882b2fa01d357f0-119963959.us-east-2.elb.amazonaws.com
http.path=/openai http.version=HTTP/1.1 http.status=200 llm.provider=openai
llm.request.model= llm.request.tokens=39 llm.response.model=gpt-3.5-turbo-0125
llm.response.tokens=181 duration=3804ms
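
The key=value layout of the log entry lends itself to simple parsing. The following minimal sketch splits an entry into a dictionary so that fields such as llm.request.tokens can be read programmatically. It assumes that values never contain whitespace, which holds for the example entry but is not a documented guarantee. The function name parse_log_entry is hypothetical.

```python
def parse_log_entry(entry):
    """Split a log entry into a dict of its key=value fields.

    Tokens without "=" (timestamp, level, message) are skipped.
    Assumes values contain no whitespace, as in the example entry.
    """
    fields = {}
    for token in entry.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# The example entry from this page, joined into a single string.
LOG_ENTRY = (
    "2025-09-12T18:23:54.661414Z info request "
    "gateway=kgateway-system/agentgateway listener=http "
    "route=kgateway-system/openai endpoint=api.openai.com:443 "
    "src.addr=10.0.9.76:38655 http.method=POST "
    "http.host=a1cff4bd974a34d8b882b2fa01d357f0-119963959.us-east-2.elb.amazonaws.com "
    "http.path=/openai http.version=HTTP/1.1 http.status=200 "
    "llm.provider=openai llm.request.model= llm.request.tokens=39 "
    "llm.response.model=gpt-3.5-turbo-0125 llm.response.tokens=181 "
    "duration=3804ms"
)

if __name__ == "__main__":
    fields = parse_log_entry(LOG_ENTRY)
    total = int(fields["llm.request.tokens"]) + int(fields["llm.response.tokens"])
    print(fields["llm.provider"], fields["llm.response.model"], total)
```

For the example entry, the request and response token counts add up to 220 tokens (39 + 181). The empty llm.request.model field is preserved as an empty string.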