Prompt enrichment
Effectively manage system and user prompts to improve LLM outputs.
About prompt enrichment
Prompts are basic building blocks for guiding LLMs to produce relevant and accurate responses. By effectively managing both system prompts, which set initial guidelines, and user prompts, which provide specific context, you can significantly enhance the quality and coherence of the model’s outputs.
System prompts include initialization instructions, behavior guidelines, and background information. You use system prompts to set the foundation for the model’s behavior. For example, you might instruct your LLM to respond to users with a polite tone, or according to specific organizational policies, with a system prompt such as “You are a helpful customer service assistant. Always be polite, and conclude conversations by asking customers to rate their experience.”
User prompts encompass direct queries, sequential inputs, and task-oriented instructions. They ensure that the model responds accurately to specific user needs. This includes all interactions that end users have with your LLM, such as “Summarize this article in 3 key points” or “What kind of dinner can I make with these ingredients?”.
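For reference, the following is a minimal sketch of an OpenAI-style chat completions request body that combines the example system and user prompts above. The model name is illustrative; use whichever model your provider serves.

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful customer service assistant. Always be polite, and conclude conversations by asking customers to rate their experience."
    },
    {
      "role": "user",
      "content": "Summarize this article in 3 key points"
    }
  ]
}
```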
Note that system and user prompts are not mutually exclusive, and can be combined in a single request to an LLM. For example, in the following steps, the prompt “Parse the unstructured text into CSV format: Seattle, Los Angeles, and Chicago are cities in North America. London, Paris, and Berlin are cities in Europe.” contains both system prompt and user prompt components.
Before you begin
- Set up AI Gateway.
- Authenticate to the LLM.
- Get the address of the gateway.

  If your gateway is exposed with a load balancer, get the external address and save it in an environment variable.

  ```sh
  export INGRESS_GW_ADDRESS=$(kubectl get svc -n kgateway-system ai-gateway -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
  echo $INGRESS_GW_ADDRESS
  ```

  Alternatively, port-forward the gateway deployment to your local machine.

  ```sh
  kubectl port-forward deployment/ai-gateway -n kgateway-system 8080:8080
  ```

  The examples in this guide send requests to `$INGRESS_GW_ADDRESS:8080`. If you use port-forwarding instead, replace `$INGRESS_GW_ADDRESS:8080` with `localhost:8080`.
Refactor LLM prompts
In the following example, you refactor system and user prompts that turn unstructured text into valid CSV format.
- Send a request to the AI API with the following prompt:

  ```
  Parse the unstructured text into CSV format: Seattle, Los Angeles, and Chicago are cities in North America. London, Paris, and Berlin are cities in Europe.
  ```

  Note that in this prompt, the system prompt is not separated from the user prompt.

  ```sh
  curl "$INGRESS_GW_ADDRESS:8080/openai" -H content-type:application/json -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Parse the unstructured text into CSV format: Seattle, Los Angeles, and Chicago are cities in North America. London, Paris, and Berlin are cities in Europe."
      }
    ]
  }' | jq -r '.choices[].message.content'
  ```
curl "localhost:8080/openai" -H content-type:application/json -d '{ "model": "gpt-3.5-turbo", "messages": [ { "role": "user", "content": "Parse the unstructured text into CSV format: Seattle, Los Angeles, and Chicago are cities in North America. London, Paris, and Berlin are cities in Europe." } ] }' | jq -r '.choices[].message.content'
  Verify that the request succeeds and that you get back a structured CSV response.

  ```
  City,Continent
  Seattle,North America
  Los Angeles,North America
  Chicago,North America
  London,Europe
  Paris,Europe
  Berlin,Europe
  ```
- Refactor the request to improve readability and management of the prompt. In the following example, the instructions are separated from the unstructured text. The instructions are added as a system prompt, and the unstructured text is added as a user prompt.

  ```sh
  curl "$INGRESS_GW_ADDRESS:8080/openai" -H content-type:application/json -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "Parse the unstructured text into CSV format."
      },
      {
        "role": "user",
        "content": "Seattle, Los Angeles, and Chicago are cities in North America. London, Paris, and Berlin are cities in Europe."
      }
    ]
  }' | jq -r '.choices[].message.content'
  ```
curl "localhost:8080/openai" -H content-type:application/json -d '{ "model": "gpt-3.5-turbo", "messages": [ { "role": "system", "content": "Parse the unstructured text into CSV format." }, { "role": "user", "content": "Seattle, Los Angeles, and Chicago are cities in North America. London, Paris, and Berlin are cities in Europe." } ] }' | jq -r '.choices[].message.content'
  Verify that you get back the same output as in the previous step.

  ```
  City, Continent
  Seattle, North America
  Los Angeles, North America
  Chicago, North America
  London, Europe
  Paris, Europe
  Berlin, Europe
  ```
Append or prepend prompts
Use a TrafficPolicy resource to enrich prompts by appending or prepending system and user prompts to each request. This way, you can centrally manage common prompts without repeating them in every request.
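For orientation, the following sketch shows the general shape of a prompt enrichment configuration. The `prepend` block matches the example in the next step; the `append` block and its USER role are an assumption based on the resource's support for appending prompts, and the exact field names can differ by kgateway version.

```yaml
# Illustrative sketch only. "prepend" mirrors the example below;
# "append" and its contents are assumed by analogy.
ai:
  promptEnrichment:
    prepend:                    # messages added before the request's messages
    - role: SYSTEM
      content: "Parse the unstructured text into CSV format."
    append:                     # messages added after the request's messages
    - role: USER
      content: "Respond with CSV only, no explanations."
```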
- Create a TrafficPolicy resource to enrich your prompts and configure additional settings. The following example prepends the system prompt “Parse the unstructured text into CSV format.” to each request that is sent to the `openai` HTTPRoute.

  ```sh
  kubectl apply -f- <<EOF
  apiVersion: gateway.kgateway.dev/v1alpha1
  kind: TrafficPolicy
  metadata:
    name: openai-opt
    namespace: kgateway-system
    labels:
      app: ai-kgateway
  spec:
    targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: openai
    ai:
      promptEnrichment:
        prepend:
        - role: SYSTEM
          content: "Parse the unstructured text into CSV format."
  EOF
  ```
- Send a request without a system prompt. Although the system prompt instructions are missing from the request, the unstructured text in the user prompt is still transformed into structured CSV format. This is because the system prompt from the TrafficPolicy resource is automatically prepended to the request before it is sent to the LLM provider, as illustrated in the sketch after the example output.

  ```sh
  curl "$INGRESS_GW_ADDRESS:8080/openai" -H content-type:application/json -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "The recipe called for eggs, flour and sugar. The price was $5, $3, and $2."
      }
    ]
  }' | jq -r '.choices[].message.content'
  ```
curl "localhost:8080/openai" -H content-type:application/json -d '{ "model": "gpt-3.5-turbo", "messages": [ { "role": "user", "content": "The recipe called for eggs, flour and sugar. The price was $5, $3, and $2." } ] }' | jq -r '.choices[].message.content'
  Example output:

  ```
  Item, Price
  Eggs, $5
  Flour, $3
  Sugar, $2
  ```
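To illustrate what the enrichment does, the request body that the LLM provider receives looks roughly like the following. The exact wire format is an assumption; the key point is that the SYSTEM message from the TrafficPolicy is placed before the message that the user sent.

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "Parse the unstructured text into CSV format."
    },
    {
      "role": "user",
      "content": "The recipe called for eggs, flour and sugar. The price was $5, $3, and $2."
    }
  ]
}
```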
Overwrite settings on the route level
To overwrite a setting that you added to a TrafficPolicy resource, you simply include that setting in your request.
- Send a request to the AI API and include a custom system prompt that instructs the API to transform unstructured text into JSON format.

  ```sh
  curl "$INGRESS_GW_ADDRESS:8080/openai" -H content-type:application/json -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "Parse the unstructured content and give back a JSON format"
      },
      {
        "role": "user",
        "content": "The recipe called for eggs, flour and sugar. The price was $5, $3, and $2."
      }
    ]
  }' | jq -r '.choices[].message.content'
  ```
curl "localhost:8080/openai" -H content-type:application/json -d '{ "model": "gpt-3.5-turbo", "messages": [ { "role": "system", "content": "Parse the unstructured content and give back a JSON format" }, { "role": "user", "content": "The recipe called for eggs, flour and sugar. The price was $5, $3, and $2." } ] }' | jq -r '.choices[].message.content'
  Example output:

  ```json
  {
    "recipe": [
      {
        "ingredient": "eggs",
        "price": "$5"
      },
      {
        "ingredient": "flour",
        "price": "$3"
      },
      {
        "ingredient": "sugar",
        "price": "$2"
      }
    ]
  }
  ```
- Send another request. This time, you do not include a system prompt. Because the default setting in the TrafficPolicy resource is applied, the unstructured text is returned in CSV format.

  ```sh
  curl "$INGRESS_GW_ADDRESS:8080/openai" -H content-type:application/json -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "The recipe called for eggs, flour and sugar. The price was $5, $3, and $2."
      }
    ]
  }' | jq -r '.choices[].message.content'
  ```
curl "localhost:8080/openai" -H content-type:application/json -d '{ "model": "gpt-3.5-turbo", "messages": [ { "role": "user", "content": "The recipe called for eggs, flour and sugar. The price was $5, $3, and $2." } ] }' | jq -r '.choices[].message.content'
  Example output:

  ```
  Item, Price
  Eggs, $5
  Flour, $3
  Sugar, $2
  ```
Cleanup
You can remove the resources that you created in this guide.

```sh
kubectl delete TrafficPolicy -n kgateway-system -l app=ai-kgateway
```
Next
Explore how to set up prompt guards to block unwanted requests and mask sensitive data.