
bridge() solver for integrating with external agent frameworks #1181

Draft · wants to merge 38 commits into main

Conversation

jjallaire (Collaborator)

This PR introduces the ability to integrate an external agent that has no Inspect dependencies by converting it to a Solver. The only requirements are that the agent use the standard OpenAI API and that it consume and produce dict values as described below. While the agent function calls the standard OpenAI API, these calls are intercepted by Inspect and sent to the requisite Inspect model provider.

Protocol

Here is the type contract for bridged solvers (you don't need to use or import these types in your agent; your dicts just need to conform to the protocol):

from typing import Any, NotRequired, TypedDict  # NotRequired: Python 3.11+ (or typing_extensions)

from openai.types.chat import ChatCompletionMessageParam

class SampleDict(TypedDict):
    model: str
    sample_id: str
    epoch: int
    messages: list[ChatCompletionMessageParam]
    metadata: dict[str, Any]
    target: list[str]

class ScoreDict(TypedDict):
    value: (
        str
        | int
        | float
        | bool
        | list[str | int | float | bool]
        | dict[str, str | int | float | bool | None]
    )
    answer: NotRequired[str]
    explanation: NotRequired[str]
    metadata: NotRequired[dict[str, Any]]

class ResultDict(TypedDict):
    output: str
    messages: NotRequired[list[ChatCompletionMessageParam]]
    scores: NotRequired[dict[str, ScoreDict]]

async def agent(sample: SampleDict) -> ResultDict: 
    ...

The agent function must be async and should accept and return dict values as per the type declarations above. You aren't required to use these types exactly (they merely document the requirements) so long as you consume and produce dict values that match their declarations.

Returning messages is not required but is highly recommended so that people running the agent can see the full message history in the Inspect log viewer.

Returning scores is entirely optional (most agents will in fact rely on Inspect-native scorers; this is here as an escape hatch for agents that want to do their own scoring).
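
For illustration, here is a minimal sketch of an agent that returns all three fields (the agent logic is elided and the literal values are placeholders):

from typing import Any

async def self_scoring_agent(sample: dict[str, Any]) -> dict[str, Any]:
    # ... run the agent, producing an assistant reply ...
    messages = sample["messages"] + [{"role": "assistant", "content": "hello"}]
    return {
        "output": "hello",
        # recommended: full message history for the log viewer
        "messages": messages,
        # optional: agent-computed scores (escape hatch)
        "scores": {
            "correct": {
                "value": True,
                "answer": "hello",
                "explanation": "Output matched the target.",
            }
        },
    }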

Example

Here is the simplest possible agent definition:

from typing import Any

from openai import AsyncOpenAI

async def my_agent(sample: dict[str, Any]) -> dict[str, Any]:
    client = AsyncOpenAI()
    completion = await client.chat.completions.create(
        messages=sample["messages"],
        model=sample["model"]
    )
    return {
        "output": completion.choices[0].message.content
    }

Note that you should always pass the "model" along to OpenAI exactly as passed in the sample. While you are calling the standard OpenAI API, these calls are intercepted by Inspect and sent to the requisite Inspect model provider.

Here is how you can use the bridge() function to run this agent as a solver:

from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import bridge

from agents import my_agent

@task
def hello():
    return Task(
        dataset=[Sample(input="Please print the word 'hello'?", target="hello")],
        solver=bridge(my_agent),
        scorer=includes(),
    )
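
Assuming the task above is saved in a file like hello.py, it then runs like any other Inspect task (the model name here is illustrative):

inspect eval hello.py --model openai/gpt-4o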

"""

Inspect Features

The bridge() function enables you to create an agent with zero Inspect dependencies and still get the benefit of most of Inspect's infrastructure.

Limits

Sample-level token and time limits are enforced as normal with bridged solvers. Message limits are enforced when calling the main model being evaluated (the number of messages sent to the model is counted and compared against the limit).
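
For example, limits are specified on the task as usual. A minimal sketch, assuming the message_limit and token_limit parameters of Task (values are illustrative):

from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import bridge

from agents import my_agent

@task
def hello_limited():
    return Task(
        dataset=[Sample(input="Please print the word 'hello'?", target="hello")],
        solver=bridge(my_agent),
        scorer=includes(),
        message_limit=10,     # cap on messages sent to the main model
        token_limit=50_000,   # cap on total tokens for the sample
    )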

Observability

Agents incorporated using the bridge() function still benefit from most of Inspect's core observability features: all model calls go through the Inspect model interface, so they appear in the transcript as normal. If you return messages in your result, the messages are also populated for the log viewer. Standard Python logger calls also continue to be routed into the Inspect sample log.

If you want to take advantage of additional observability features, you can optionally import the Inspect transcript() function and use it as normal.
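
A minimal sketch, assuming transcript() is imported from inspect_ai.log:

from typing import Any

from inspect_ai.log import transcript

async def my_agent(sample: dict[str, Any]) -> dict[str, Any]:
    # record a custom event in the sample transcript
    transcript().info("agent: starting generation loop")
    ...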

Sandboxes

If you need to execute arbitrary model-generated code, you can use the Inspect sandbox() functions directly. If you need your agent to run both inside and outside of Inspect, you can abstract code execution into an interface and only call sandbox() when running inside the Inspect agent wrapper.
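
Here is a minimal sketch of that pattern, assuming sandbox() from inspect_ai.util and a task configured with a sandbox (e.g. sandbox="docker"); the CodeExecutor protocol and both executor classes are hypothetical names, not part of Inspect:

from typing import Protocol

class CodeExecutor(Protocol):
    async def exec(self, code: str) -> str: ...

class SandboxExecutor:
    """Executor used when running inside Inspect."""
    async def exec(self, code: str) -> str:
        from inspect_ai.util import sandbox
        result = await sandbox().exec(["python", "-c", code])
        return result.stdout

class LocalExecutor:
    """Executor used when running outside Inspect (no sandboxing!)."""
    async def exec(self, code: str) -> str:
        import asyncio, sys
        proc = await asyncio.create_subprocess_exec(
            sys.executable, "-c", code,
            stdout=asyncio.subprocess.PIPE,
        )
        stdout, _ = await proc.communicate()
        return stdout.decode()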

@jjallaire jjallaire requested a review from dragonstyle January 23, 2025 17:09
@jjallaire jjallaire marked this pull request as draft January 23, 2025 17:09
@jjallaire jjallaire changed the title bridge() function for converting agents with no inspect dependencies into solvers bridge() solver for integrating with external agent frameworks Jan 23, 2025