The agent Step

The agent step is a core component in Bosun tasks, allowing you to leverage AI agents to perform complex, intelligent actions on your codebase. An agent step defines a specific task for an AI agent, providing it with instructions, a role, and constraints to guide its behavior.

Key properties of an agent step include:

  • extends: Specifies a base agent type. For example, extends: Coding indicates that the agent has general programming abilities. If omitted, the agent defaults to extends: Default.
  • instructions: A detailed prompt or set of directives for the agent, outlining what it needs to achieve.
  • role: Defines the persona or expertise the agent should adopt, influencing its approach and output style.
  • constraints: Specific rules, limitations, or requirements the agent must adhere to during execution. These are crucial for ensuring the agent's actions align with your project's standards.
  • toolboxes: A list of capabilities or categories of tools the agent has access to (e.g., repository_read for reading files, dangerous for actions like creating PRs).
  • tools: Defines specific tools the agent can use, often with extends to inherit functionality from a base tool and with to provide parameters.
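To see how these properties fit together, here is a minimal sketch of an agent step. The step name, instructions, and constraint text are illustrative, not from a real manifest:

```yaml
# Illustrative sketch of an agent step using the properties above.
# The step name and prompt text are hypothetical.
- name: Summarize open TODOs
  agent:
    extends: Default           # optional; Default is assumed when omitted
    role: "You are a concise technical writer."
    instructions: "Collect the TODO comments in the repository and summarize them."
    constraints:
      - "Do not modify any files"
    toolboxes:
      - repository_read        # read-only access to the repository
```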

Agent Types

Bosun provides three agent types you can extend from:

  • Default Agent: This is the base agent type. It is used for general orchestration and tool usage. If you do not specify an extends property, your agent will default to this type.
  • Coding Agent: This agent type is specialized for code-related tasks. When using extends: Coding, the agent automatically commits code changes it makes and uses an initial repository context to get started quickly. It also has the repository_read, repository_write, research, and dangerous toolboxes activated by default, providing it with broad capabilities for code analysis, modification, and external research.
  • pull_request Agent: Purpose-built for preparing pull requests. It understands how to inspect pending changes, gathers context automatically, and calls the pull request tooling for you.

Example from task

Here's an agent step from an example task, showing how an agent is configured to perform a code refactoring:

  - name: Refactor legacy code
    agent:
      extends: Coding
      instructions: >
        Refactor the identified legacy code sections to use modern JavaScript syntax.
        Ensure backward compatibility is maintained where necessary.
      role: "You are a senior frontend architect."
      constraints:
        - "The project uses React and TypeScript"
        - "You must use functional components and hooks"
        - "Do not introduce any breaking changes to the public API"
        - "Add comments for complex logic"
        - "Ensure all changes pass existing tests"
        - "Do not make any unrelated changes"

Another example shows an agent creating a pull request using the dedicated agent:

  - name: Create a draft pull request
    agent:
      extends: pull_request
      instructions: "Create a pull request based on the git diff"
      tools:
        - name: update_pull_request
          extends: create_or_update_pull_request
          with:
            draft: true

Here, the agent uses the pre-configured pull request tooling, overriding the base create_or_update_pull_request tool with draft: true so the PR is opened as a draft.

Agent Outcomes

Agents signal how their run concluded using structured payloads. When an agent determines that it has completed its work, it calls the stop tool. By default the tool expects an output string, but you can attach a custom stop_schema in your manifest to require richer JSON (for example, a checklist of applied migrations). If the agent cannot satisfy the task, it calls the task_failed tool. That tool follows the schema from fail_schema, and Bosun copies the resulting object into both the step output and the shared errors.<step>[i].reason entry so later steps can branch on the data without parsing prose. See Custom Schemas for the predefined primitives and authoring tips.

Both outcomes render directly in the run view: the stop or fail card shows the structured payload, whether the agent marked the event as retryable, and links to any tool calls that informed the decision.

steps:
  - name: Harden secrets
    agent:
      extends: Coding
      instructions: "Audit secrets usage and patch the code."
      stop_schema:
        type: object
        required: [summary, files]
        properties:
          summary:
            type: string
          files:
            type: array
            items:
              type: string
      fail_schema:
        type: object
        required: [summary, retryable]
        properties:
          summary:
            type: string
          retryable:
            type: boolean

If the agent fails with { "summary": "token rotation requires infra access", "retryable": false }, the UI mirrors that JSON and errors.harden_secrets[0].reason.retryable is ready for templating in the next step.
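A later step can template that error data directly. This is a hypothetical follow-up step, assuming a run step can interpolate the errors collection the same way the examples below template outputs:

```yaml
# Hypothetical follow-up step; assumes errors.<step> data is templatable in run steps.
- name: Report failure
  run: >
    echo "Harden secrets failed: {{ errors.harden_secrets[0].reason.summary }}"
```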

Inspecting rendered instructions

Every agent run begins with a system message that shows the fully rendered instructions the agent received after templating. You no longer need to reconstruct how inputs, outputs, or variables were interpolated—the run view pins the exact text so reviewers can verify context before reading the transcript. This is especially helpful for manifests that stitch together for_each inputs or build long instruction lists from previous steps.

steps:
  - id: lint_errors
    run: ./scripts/list-lint-errors.sh

  - name: Fix each lint error with an agent
    for_each:
      from: '{{ outputs.lint_errors | split(pat="\n") | filter(value) | json_encode() }}'
    agent:
      extends: Coding
      instructions: |
        Apply automatic lint fixes to {{ for_each.value }}.
        Focus solely on the specified file: {{ for_each.value }}
      role: "You are a meticulous code formatter and linter expert."
      constraints:
        - "Only apply fixes that do not change code logic."
    continue_on_error: true

Successful iterations display their rendered role, instructions, and constraints in the run view so you can verify that each agent really focused on the intended file path. If an iteration fails to render—for example, because one entry in the list was empty—Bosun halts that iteration before launching the agent. The failure is recorded under both the step output and errors.<step>, and the rest of the loop continues because continue_on_error is enabled. Downstream steps can summarize the failures with errors.fix_each | json_encode(pretty=true) while still consuming the outputs produced by the successful agents.
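Such a downstream summary step might look like the following sketch. It assumes the loop step above is addressable as fix_each and that run steps accept templated expressions:

```yaml
# Hypothetical summary step; assumes the loop step is addressable as fix_each.
- name: Summarize lint failures
  run: >
    echo '{{ errors.fix_each | json_encode(pretty=true) }}'
```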

Error handling

By default an agent failure stops the task. Set continue_on_error: true to let the workflow proceed—the runtime records the failure (including the agent-provided reason when available) in both the step output and the shared errors collection so follow-up steps can react. See Error handling for patterns that leverage this data.
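As a minimal sketch, continue_on_error sits at the step level alongside the agent definition (the step name and instructions here are hypothetical):

```yaml
# Minimal sketch: let the task continue past a failure in this agent step.
- name: Optional cleanup
  agent:
    extends: Coding
    instructions: "Remove unused imports across the repository."
  continue_on_error: true
```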