Integrations and observability

After the workflow shape is clear, the next questions are which external surfaces should live inside the agent loop and how you will inspect what actually happened at runtime.

Choose what lives in the SDK

Need	Start with	Why
Give an agent access to public, remotely hosted MCP tools	Hosted MCP tools in the SDK	The model can call the remote MCP server through the hosted surface
Connect local or private MCP servers from your runtime	SDK-managed MCP servers over stdio or streamable HTTP	Your runtime owns the connection, approvals, and network boundaries
Debug prompts, tools, handoffs, or approvals	Built-in tracing	Traces show the end-to-end record before you formalize evals

Tool capability semantics still live in Using tools. This page focuses on the SDK-specific MCP wiring and observability loop.

MCP

Use hosted MCP tools when the remote server should run through the model surface.

Attach a hosted MCP server

import { Agent, hostedMcpTool } from "@openai/agents";

const agent = new Agent({
  name: "MCP assistant",
  instructions: "Use the MCP tools to answer questions.",
  tools: [
    hostedMcpTool({
      serverLabel: "gitmcp",
      serverUrl: "https://gitmcp.io/openai/codex",
    }),
  ],
});

from agents import Agent, HostedMCPTool

agent = Agent(
    name="MCP assistant",
    instructions="Use the MCP tools to answer questions.",
    tools=[
        HostedMCPTool(
            tool_config={
                "type": "mcp",
                "server_label": "gitmcp",
                "server_url": "https://gitmcp.io/openai/codex",
                "require_approval": "never",
            }
        )
    ],
)

Use local transports when your application should connect to the MCP server directly.

Connect a local MCP server

import { Agent, MCPServerStdio, run } from "@openai/agents";

const server = new MCPServerStdio({
  name: "Filesystem MCP Server",
  fullCommand: "npx -y @modelcontextprotocol/server-filesystem ./sample_files",
});

await server.connect();

try {
  const agent = new Agent({
    name: "Filesystem assistant",
    instructions: "Read files with the MCP tools before answering.",
    mcpServers: [server],
  });

  const result = await run(agent, "Read the files and list them.");
  console.log(result.finalOutput);
} finally {
  await server.close();
}

import asyncio

from agents import Agent, Runner
from agents.mcp import MCPServerStdio


async def main() -> None:
    async with MCPServerStdio(
        name="Filesystem MCP Server",
        params={
            "command": "npx",
            "args": [
                "-y",
                "@modelcontextprotocol/server-filesystem",
                "./sample_files",
            ],
        },
    ) as server:
        agent = Agent(
            name="Filesystem assistant",
            instructions="Read files with the MCP tools before answering.",
            mcp_servers=[server],
        )
        result = await Runner.run(agent, "Read the files and list them.")
        print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())

The practical split is:

Use hosted MCP for public remote servers that fit the platform trust model.
Use local or private MCP when your runtime should own connectivity, filtering, or approvals.

For the platform-wide concept, trust model, and product support story, keep MCP and Connectors as the canonical reference.

Tracing

Tracing is built into the Agents SDK and is enabled by default in the normal server-side SDK path. Every run can emit a structured record of model calls, tool calls, handoffs, guardrails, and custom spans, which you can inspect in the Traces dashboard.

The default trace usually gives you:

the overall run or workflow
each model call
tool calls and their outputs
handoffs and guardrails
any custom spans you wrap around the workflow

If you need less tracing, use the SDK-level or per-run tracing controls rather than removing all observability from the workflow.

Wrap multiple runs in one trace

import { Agent, run, withTrace } from "@openai/agents";

const agent = new Agent({
  name: "Joke generator",
  instructions: "Tell funny jokes.",
});

await withTrace("Joke workflow", async () => {
  const first = await run(agent, "Tell me a joke");
  const second = await run(agent, `Rate this joke: ${first.finalOutput}`);
  console.log(first.finalOutput);
  console.log(second.finalOutput);
});

import asyncio

from agents import Agent, Runner, trace

agent = Agent(
    name="Joke generator",
    instructions="Tell funny jokes.",
)


async def main() -> None:
    with trace("Joke workflow"):
        first = await Runner.run(agent, "Tell me a joke")
        second = await Runner.run(
            agent,
            f"Rate this joke: {first.final_output}",
        )
        print(first.final_output)
        print(second.final_output)


if __name__ == "__main__":
    asyncio.run(main())

Use traces for two jobs:

Debug one workflow run and understand what happened.
Feed higher-signal examples into agent workflow evaluation once you are ready to score behavior systematically.

Next steps

Once the external surfaces are wired in, continue with the guide that covers capability design, review boundaries, or evaluation.

See how hosted tools, function tools, and agents-as-tools fit beside MCP.
Add approval or validation boundaries around sensitive capabilities.
Move from one-off traces into repeatable grading once behavior stabilizes.

guides/agents/integrations-observability.md +229 −0 created

# Integrations and observability

After the workflow shape is clear, the next questions are which external surfaces should live inside the agent loop and how you will inspect what actually happened at runtime.

## Choose what lives in the SDK

| Need                                                      | Start with                                            | Why                                                                 |
| --------------------------------------------------------- | ----------------------------------------------------- | ------------------------------------------------------------------- |
| Give an agent access to public, remotely hosted MCP tools | Hosted MCP tools in the SDK                           | The model can call the remote MCP server through the hosted surface |
| Connect local or private MCP servers from your runtime    | SDK-managed MCP servers over stdio or streamable HTTP | Your runtime owns the connection, approvals, and network boundaries |
| Debug prompts, tools, handoffs, or approvals              | Built-in tracing                                      | Traces show the end-to-end record before you formalize evals        |

Tool capability semantics still live in [Using tools](https://developers.openai.com/api/docs/guides/tools). This page focuses on the SDK-specific MCP wiring and observability loop.

## MCP

Use hosted MCP tools when the remote server should run through the model surface.

Attach a hosted MCP server

```typescript
import { Agent, hostedMcpTool } from "@openai/agents";

const agent = new Agent({
  name: "MCP assistant",
  instructions: "Use the MCP tools to answer questions.",
  tools: [
    hostedMcpTool({
      serverLabel: "gitmcp",
      serverUrl: "https://gitmcp.io/openai/codex",
    }),
  ],
});
```

```python
from agents import Agent, HostedMCPTool

agent = Agent(
    name="MCP assistant",
    instructions="Use the MCP tools to answer questions.",
    tools=[
        HostedMCPTool(
            tool_config={
                "type": "mcp",
                "server_label": "gitmcp",
                "server_url": "https://gitmcp.io/openai/codex",
                "require_approval": "never",
            }
        )
    ],
)
```
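The Python example above passes a plain `tool_config` dictionary, so a typo in a value only surfaces at request time. A minimal sketch of a validating builder follows. The helper name `build_hosted_mcp_config` is hypothetical and not part of the SDK, the HTTPS check is a local policy choice, and treating `"always"` as the other string approval mode is an assumption to verify against the MCP and Connectors reference.

```python
from urllib.parse import urlparse

# Assumption: "never" comes from the example above; "always" is assumed to be
# the other string mode. Verify both against the platform reference.
APPROVAL_MODES = {"always", "never"}


def build_hosted_mcp_config(
    server_label: str,
    server_url: str,
    require_approval: str = "always",
) -> dict[str, str]:
    """Return a tool_config dict shaped like the hosted MCP example (hypothetical helper)."""
    if not server_label:
        raise ValueError("server_label must be a non-empty string")
    parsed = urlparse(server_url)
    # Local policy in this sketch: only accept https URLs with a host.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"server_url must be an https URL, got {server_url!r}")
    if require_approval not in APPROVAL_MODES:
        raise ValueError(f"require_approval must be one of {sorted(APPROVAL_MODES)}")
    return {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
        "require_approval": require_approval,
    }


config = build_hosted_mcp_config(
    "gitmcp", "https://gitmcp.io/openai/codex", require_approval="never"
)
```

The resulting dictionary can be passed as `tool_config` to `HostedMCPTool`. Defaulting to the stricter mode is a design choice in this sketch: callers have to opt in to skipping approvals.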

Use local transports when your application should connect to the MCP server directly.

Connect a local MCP server

```typescript
import { Agent, MCPServerStdio, run } from "@openai/agents";

const server = new MCPServerStdio({
  name: "Filesystem MCP Server",
  fullCommand: "npx -y @modelcontextprotocol/server-filesystem ./sample_files",
});

await server.connect();

try {
  const agent = new Agent({
    name: "Filesystem assistant",
    instructions: "Read files with the MCP tools before answering.",
    mcpServers: [server],
  });

  const result = await run(agent, "Read the files and list them.");
  console.log(result.finalOutput);
} finally {
  await server.close();
}
```

```python
import asyncio

from agents import Agent, Runner
from agents.mcp import MCPServerStdio


async def main() -> None:
    async with MCPServerStdio(
        name="Filesystem MCP Server",
        params={
            "command": "npx",
            "args": [
                "-y",
                "@modelcontextprotocol/server-filesystem",
                "./sample_files",
            ],
        },
    ) as server:
        agent = Agent(
            name="Filesystem assistant",
            instructions="Read files with the MCP tools before answering.",
            mcp_servers=[server],
        )
        result = await Runner.run(agent, "Read the files and list them.")
        print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```
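The two examples above describe the same subprocess differently: the TypeScript version passes one `fullCommand` string, while the Python version passes `command` and `args` separately. If you keep a single command string in configuration, a sketch like the following can derive the Python `params` shape from it. `stdio_params_from_command` is a hypothetical helper, not an SDK function, and it assumes POSIX-style quoting.

```python
import shlex


def stdio_params_from_command(full_command: str) -> dict[str, object]:
    """Split one shell-style command string into command + args (hypothetical helper)."""
    # shlex.split honors quotes, so paths containing spaces survive as one argument.
    parts = shlex.split(full_command)
    if not parts:
        raise ValueError("full_command must contain at least an executable name")
    return {"command": parts[0], "args": parts[1:]}


params = stdio_params_from_command(
    "npx -y @modelcontextprotocol/server-filesystem ./sample_files"
)
print(params)
```

The resulting dictionary has the same `command` and `args` keys as the `params` argument in the Python example.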

The practical split is:

- Use **hosted MCP** for public remote servers that fit the platform trust model.
- Use **local or private MCP** when your runtime should own connectivity, filtering, or approvals.

For the platform-wide concept, trust model, and product support story, keep [MCP and Connectors](https://developers.openai.com/api/docs/guides/tools-connectors-mcp) as the canonical reference.

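The practical split above can be written down as a small decision function, which is sometimes useful in a configuration review or a lint step. This is only a sketch of the rule of thumb on this page; `McpSurface` and `choose_mcp_surface` are hypothetical names, and the two boolean inputs are a simplification of the trust-model questions.

```python
from enum import Enum


class McpSurface(Enum):
    HOSTED = "hosted MCP tool"
    LOCAL = "SDK-managed MCP server (stdio or streamable HTTP)"


def choose_mcp_surface(*, publicly_reachable: bool, runtime_owns_controls: bool) -> McpSurface:
    """Encode the guide's rule of thumb (hypothetical helper)."""
    # The guide reserves the hosted surface for public remote servers, and sends
    # anything that needs runtime-owned connectivity, filtering, or approvals
    # to the local or private path.
    if not publicly_reachable or runtime_owns_controls:
        return McpSurface.LOCAL
    return McpSurface.HOSTED


print(choose_mcp_surface(publicly_reachable=True, runtime_owns_controls=False).value)
```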

## Tracing

Tracing is built into the Agents SDK and is enabled by default in the normal server-side SDK path. Every run can emit a structured record of model calls, tool calls, handoffs, guardrails, and custom spans, which you can inspect in the [Traces dashboard](https://platform.openai.com/traces).

The default trace usually gives you:

- the overall run or workflow
- each model call
- tool calls and their outputs
- handoffs and guardrails
- any custom spans you wrap around the workflow

If you need less tracing, use the SDK-level or per-run tracing controls rather than removing all observability from the workflow.

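One way to scope an SDK-level switch is to apply it to a single worker process instead of the whole deployment. The sketch below assumes the SDK honors an `OPENAI_AGENTS_DISABLE_TRACING` environment variable; verify the name against the SDK tracing reference before relying on it. `env_with_tracing` is a hypothetical helper.

```python
import os

# Assumption: the SDK reads this variable at startup to turn tracing off.
TRACING_SWITCH = "OPENAI_AGENTS_DISABLE_TRACING"


def env_with_tracing(disabled: bool) -> dict[str, str]:
    """Copy the current environment and set or clear the tracing switch."""
    env = dict(os.environ)
    if disabled:
        env[TRACING_SWITCH] = "1"
    else:
        # Remove any inherited value so the SDK default (tracing on) applies.
        env.pop(TRACING_SWITCH, None)
    return env


# Environment for one noisy batch worker; other processes keep tracing enabled.
worker_env = env_with_tracing(disabled=True)
```

The returned mapping can be passed as `env=` to `subprocess.run`, so the parent process and every other worker keep their traces.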

Wrap multiple runs in one trace

```typescript
import { Agent, run, withTrace } from "@openai/agents";

const agent = new Agent({
  name: "Joke generator",
  instructions: "Tell funny jokes.",
});

await withTrace("Joke workflow", async () => {
  const first = await run(agent, "Tell me a joke");
  const second = await run(agent, `Rate this joke: ${first.finalOutput}`);
  console.log(first.finalOutput);
  console.log(second.finalOutput);
});
```

```python
import asyncio

from agents import Agent, Runner, trace

agent = Agent(
    name="Joke generator",
    instructions="Tell funny jokes.",
)


async def main() -> None:
    with trace("Joke workflow"):
        first = await Runner.run(agent, "Tell me a joke")
        second = await Runner.run(
            agent,
            f"Rate this joke: {first.final_output}",
        )
        print(first.final_output)
        print(second.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```
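To picture what wrapping two runs in one trace produces, it can help to model a trace as a tree of spans. The following is a toy, standard-library illustration of that shape, not the SDK's tracing implementation; `ToyTrace`, `Span`, and the span kinds used here are hypothetical names.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Span:
    kind: str  # e.g. "trace", "run", "model_call"
    name: str
    duration_s: float = 0.0
    children: list["Span"] = field(default_factory=list)


class ToyTrace:
    """Collects nested spans under a single root, like one workflow-level trace."""

    def __init__(self, workflow_name: str) -> None:
        self.root = Span(kind="trace", name=workflow_name)
        self._stack = [self.root]

    @contextmanager
    def span(self, kind: str, name: str) -> Iterator[Span]:
        node = Span(kind=kind, name=name)
        self._stack[-1].children.append(node)  # attach to the currently open span
        self._stack.append(node)
        start = time.perf_counter()
        try:
            yield node
        finally:
            node.duration_s = time.perf_counter() - start
            self._stack.pop()


def outline(span: Span, depth: int = 0) -> list[str]:
    """Render the span tree as indented lines."""
    lines = [f"{'  ' * depth}{span.kind}: {span.name}"]
    for child in span.children:
        lines.extend(outline(child, depth + 1))
    return lines


toy = ToyTrace("Joke workflow")
with toy.span("run", "Tell me a joke"):
    with toy.span("model_call", "Joke generator"):
        pass
with toy.span("run", "Rate this joke"):
    with toy.span("model_call", "Joke generator"):
        pass

print("\n".join(outline(toy.root)))
```

Both runs appear as siblings under one root, which is the point of grouping them: the joke and its rating can be inspected together.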

Use traces for two jobs:

- Debug one workflow run and understand what happened.
- Feed higher-signal examples into [agent workflow evaluation](https://developers.openai.com/api/docs/guides/agent-evals) once you are ready to score behavior systematically.

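What counts as a "higher-signal" example depends on the workflow. As a rough sketch, suppose you have already exported a flattened summary of each trace. The `TraceSummary` shape, its fields, and the thresholds below are all hypothetical and do not correspond to an SDK or dashboard export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceSummary:
    """Hypothetical flattened view of one trace; not an SDK type."""

    trace_id: str
    tool_errors: int
    handoffs: int
    guardrail_tripped: bool


def is_high_signal(summary: TraceSummary) -> bool:
    # Illustrative heuristic: failures, guardrail trips, and multi-hop handoffs
    # are the runs most likely to teach an eval something.
    return summary.tool_errors > 0 or summary.guardrail_tripped or summary.handoffs >= 2


summaries = [
    TraceSummary("trace_a", tool_errors=0, handoffs=0, guardrail_tripped=False),
    TraceSummary("trace_b", tool_errors=1, handoffs=0, guardrail_tripped=False),
    TraceSummary("trace_c", tool_errors=0, handoffs=3, guardrail_tripped=False),
    TraceSummary("trace_d", tool_errors=0, handoffs=1, guardrail_tripped=True),
]

eval_candidates = [s.trace_id for s in summaries if is_high_signal(s)]
print(eval_candidates)
```

Starting from an explicit filter like this makes the selection criteria reviewable before any grading is built on top of them.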

## Next steps

Once the external surfaces are wired in, continue with the guide that covers capability design, review boundaries, or evaluation.

<div class="not-prose mt-4 grid gap-3">
  <a
    href="/api/docs/guides/tools#usage-in-the-agents-sdk"
    class="block no-underline hover:no-underline"
  >
    <span slot="icon">
    </span>
    See how hosted tools, function tools, and agents-as-tools fit beside MCP.
  </a>
  <a
    href="/api/docs/guides/agents/guardrails-approvals"
    class="block no-underline hover:no-underline"
  >
    <span slot="icon">
    </span>
    Add approval or validation boundaries around sensitive capabilities.
  </a>
  <a
    href="/api/docs/guides/agent-evals"
    class="block no-underline hover:no-underline"
  >
    <span slot="icon">
    </span>
    Move from one-off traces into repeatable grading once behavior stabilizes.
  </a>
</div>