Middleware

Middleware is the primary extension point in Auwgent. It lets you intercept every stage of the agent’s execution pipeline — before and after runs, before and after individual model calls, and on every intent the model emits. Every language binding exposes the same hooks with the same behaviour.

The agentic loop

Before understanding middleware hooks it helps to understand what Auwgent actually does when you call agent.run(). It does not make a single LLM call and return. It runs a loop.

agent.run() → [LLM call] → tool result → [LLM call] → response

A single agent.run() can make multiple LLM calls — one for the initial response, and more each time the model calls a tool and needs to see the result before deciding what to do next. Understanding this loop is the key to understanding which hook to use for any given task.

Hooks

Auwgent provides seven middleware hooks.

Hook	Fires	Receives
`onRunStart`	Once when `agent.run()` is called	Full session state
`onRunComplete`	Once when `agent.run()` fully exits	Final session state
`onLLMStart`	Every time the LLM is called within the loop	The prompt string for this call
`onLLMEnd`	Only when the LLM produces a terminal response	The terminal response value
`onIntent`	Every time the model emits an intent	Intent name, value, and context
`onIntentPartial`	During streaming partial updates	Intent name, partial value, and context
`onError`	When runtime or tool execution errors occur	Error, session, and context

Function shapes

Below are the middleware function shapes exposed by the SDKs.

TypeScript
Python

type Middleware = {
  name: string
  target?: string | string[]

  onRunStart?: (session, ctx) => SessionState | Promise<SessionState>
  onLLMStart?: (prompt, ctx) => void | string | Promise<void | string>
  onIntent?: (name, value, ctx) => IntentControl | Promise<IntentControl>
  onIntentPartial?: (name, value, ctx) => void | Promise<void>
  onLLMEnd?: (response, ctx) => void | Promise<void>
  onRunComplete?: (finalSession, ctx) => void | Promise<void>
  onError?: (error, session, ctx) => boolean | void | Promise<boolean | void>
}

class Middleware(Protocol):
    name: ClassVar[str]
    target: ClassVar[Optional[Union[str, List[str]]]]

    async def onRunStart(self, session, ctx) -> dict: ...
    async def onLLMStart(self, prompt: str, ctx) -> Optional[str]: ...
    async def onIntent(self, name: str, value, ctx) -> Optional[dict]: ...
    async def onIntentPartial(self, name: str, value, ctx) -> None: ...
    async def onLLMEnd(self, response, ctx) -> None: ...
    async def onRunComplete(self, finalSession, ctx) -> None: ...
    async def onError(self, error: Exception, session, ctx) -> bool: ...

Run hooks vs LLM hooks

This is the most important distinction in the middleware system.

Run hooks — onRunStart and onRunComplete — wrap the entire agentic loop. They fire once per agent.run() call regardless of how many LLM calls happen inside it.

LLM hooks — onLLMStart and onLLMEnd — wrap individual model calls within the loop.

In a run where the model calls one tool before responding:

onRunStart            ← fires once

  onLLMStart          ← turn 1: model decides to call a tool
  (tool executes)
  onLLMStart          ← turn 2: model sees tool result and responds
  onLLMEnd            ← terminal response emitted

onRunComplete         ← fires once

onLLMStart fires twice — once before each model call. onLLMEnd fires once — only when the model produces something terminal. Intermediate calls that result in tool invocations do not trigger onLLMEnd.

	`onRunStart`	`onLLMStart`
Fires	Once per run	Once per LLM call
Receives	Full session history	Current prompt string
Can modify	Entire session	Prompt being sent
Use for	Loading sessions, history management	RAG injection, prompt enrichment

	`onRunComplete`	`onLLMEnd`
Fires	Once per run	Only on terminal responses
Receives	Final session state	Terminal response value
Use for	Saving sessions, run analytics	Response auditing, post-processing

The MiddlewareContext

Every hook receives a MiddlewareContext as its last argument. This object carries everything you need to understand and influence the current execution.

`activeAgent`

The name of the agent or helper currently executing. Starts as the root agent name and changes to a helper’s name when that helper is running. Use this to scope logic in a global middleware without needing separate middleware instances.

TypeScript
Python

const middleware: Middleware = {
    name: "scoped-logic",
    onIntent: (name, value, ctx) => {
        if (ctx.activeAgent === "BillingHelper") {
            // only runs when BillingHelper is active
        }
    }
}

class Middlware(AuwgentMiddleware):
    name= "scoped-logic"

  async def onIntent(self, name: str, value, ctx):
      if ctx['activeAgent'] == 'BillingHelper':
         pass  # only runs when BillingHelper is active

`rootAgent`

The name of the top-level agent that was originally called. Never changes during a run, even as helpers come and go. Use this when you need to know who owns the session regardless of delegations.

`systemPrompt`

The fully evaluated system prompt for the activeAgent at this point in the run. Available in all hooks. Useful for auditing what was actually sent to the model, or confirming that prompt composition resolved as expected.

`rawBlock`

Only available in onIntent. The raw, unparsed output the model produced for this intent — exactly as it came out of the LLM, before Auwgent parsed it into a structured object.

Two main uses: strict auditing where you need to log the exact model output rather than the parsed interpretation, and custom parsing where you need to extract something the default parser does not expose.

`setContext(data)`

Replaces the agent’s runtime context with the object you pass. This is what your ctx.name, ctx.is_vip, and other context values are resolved from at prompt evaluation time.

setContext is runtime injection, not schema declaration. The schema/shape is declared in DSL using context { ... }, while setContext(...) supplies values for that schema during execution.

The most common use is inside onRunStart — load user-specific data from your application and inject it before the prompts evaluate.

TypeScript
Python

const contextMiddleware: Middleware = {
    name: "context-injection",
    onRunStart: async (session, ctx) => {
        const user = await db.users.get(userId)
        ctx.setContext({ name: user.name, is_vip: user.plan === "premium" })
        return session
    }
}

class ContextMiddleware(AuwgentMiddleware):
    name = "context-injection"

    async def onRunStart(self, session, ctx):
        user = await db.users.get(user_id)
        ctx["set_context"]({"name": user.name, "is_vip": user.plan == "premium"})
        return session

`embed` and `embedBatch`

Utilities for vector embedding available on the context. Covered in detail in the Embeddings chapter.

The shared blackboard

MiddlewareContext is an open object. You can attach any property to it and it will be available across every hook in that run — including hooks in other middleware plugins. It lives for exactly the duration of one agent.run() call.

This makes it a natural place to share state between hooks without reaching for external variables.

TypeScript
Python

const observabilityMiddleware: Middleware = {
    name: "observability",
    onRunStart: async (session, ctx) => {
        ctx.traceId = crypto.randomUUID()
        ctx.startTime = Date.now()
        return session
    },
    onRunComplete: async (session, ctx) => {
        logger.log({
            traceId: ctx.traceId,
            duration: Date.now() - ctx.startTime
        })
    }
}

import time
import uuid

class ObservabilityMiddleware(AuwgentMiddleware):
  name = "observability"

  async def onRunStart(self, session, ctx):
      ctx["trace_id"] = str(uuid.uuid4())
      ctx["start_time"] = time.time()
      return session

  async def onRunComplete(self, finalSession, ctx):
      duration = time.time() - ctx.get("start_time", 0)
      logger.info({"trace_id": ctx.get("trace_id"), "duration": duration})

Scoping middleware to a specific helper

The target field on a middleware definition scopes it to a specific agent or helper. When set, every hook in that middleware only fires when activeAgent matches the target name. As a bonus, the type system narrows ctx.activeAgent to that specific name inside every hook — no guard checks needed.

TypeScript
Python

const researcherMiddleware: Middleware<IR, never, any, any, "Researcher"> = {
    name: "researcher-middleware",
    target: "Researcher",
    onLLMStart: (prompt, ctx) => {
        // ctx.activeAgent is typed as "Researcher" here
    }
}

class ResearcherMiddleware(AuwgentMiddleware):
    name = "researcher-middleware"
    target = "Researcher"  # only fires when Researcher helper is active

    async def onLLMStart(self, prompt: str, ctx):
        # ctx["activeAgent"] is guaranteed to be "Researcher" here
        pass

Putting it together

A complete middleware setup combining session persistence, context injection, and observability:

TypeScript
Python

import { auwgent, AuwgentConfig, Middleware } from "./generated/main.agent.types"

const sessionMiddleware: Middleware = {
    name: "session-persistence",
    onRunStart: async (session, ctx) => {
        const user = await db.users.get(userId)
        ctx.setContext({ name: user.name, is_vip: user.plan === "premium" })
        const saved = await db.sessions.get(userId)
        return saved || session
    },
    onRunComplete: async (session, ctx) => {
        await db.sessions.save(userId, session)
    }
}

const observabilityMiddleware: Middleware = {
    name: "observability",
    onRunStart: async (session, ctx) => {
        ctx.traceId = crypto.randomUUID()
        ctx.startTime = Date.now()
        return session
    },
    onRunComplete: async (session, ctx) => {
        logger.log({
            traceId: ctx.traceId,
            duration: Date.now() - ctx.startTime
        })
    }
}

const config: AuwgentConfig = {
    apiKeys: {
        geminiApiKey: "YOUR_API_KEY"
    },
    middleware: [sessionMiddleware, observabilityMiddleware]
}

const agent = auwgent(config)

await agent.run("Hello")

import time, uuid
from auwgent import auwgent


class SessionMiddleware:
name = "session-persistence"

async def onRunStart(self, session, ctx):
     user = await db.users.get(user_id)
     ctx["set_context"]({"name": user.name, "is_vip": user.plan == "premium"})
     saved = await db.sessions.get(user_id)
     return saved or session

async def onRunComplete(self, finalSession, ctx):
    await db.sessions.save(user_id, finalSession)

class ObservabilityMiddleware:
 name = "observability"

 async def onRunStart(self, session, ctx):
     ctx["trace_id"] = str(uuid.uuid4())
     ctx["start_time"] = time.time()
     return session

 async def onRunComplete(self, finalSession: dict, ctx: dict):
     duration = time.time() - ctx.get("start_time", 0)
     logger.info({"trace_id": ctx.get("trace_id"), "duration": duration})

agent = auwgent({
  "apiKeys": {
      "geminiApiKey": "YOUR_API_KEY"
  },
  "middleware": [SessionMiddleware, ObservabilityMiddleware]
})

await agent.run("Hello")

Next steps

With middleware covered you have the full picture of how Auwgent’s execution pipeline works. The next topic builds directly on middleware context to explore how embeddings and vector search fit into the system.

→ See Embedding to learn how to use embed and embedBatch inside your middleware.