Agentic Tools and Toolboxes

Bosun agents interact with your codebase and external services through specialized tools organized into toolboxes. Each toolbox gates a set of capabilities, giving you fine-grained control over what an agent may do.

Toolboxes

Toolboxes are collections of related tools that an agent can be granted access to. By specifying which toolboxes an agent can use, you control its operational scope and capabilities.

Here are the currently available toolboxes:

  • repository_read: Grants the agent permission to read files and directories within the connected repository.

  • repository_write: Grants the agent permission to modify, create, or delete files and directories within the connected repository.

  • dangerous: Provides access to powerful capabilities, including:

    • Shell access: Execute arbitrary shell commands.
    • Git access: Perform Git operations (e.g., git add, git commit).
    Warning: The dangerous toolbox lets the agent run any shell command inside the repository workspace. Only enable it for steps that truly need full system access. Coding agents have it enabled by default.

  • research: Equips the agent with research capabilities, including:

    • URL fetching: Access and read content from specified web URLs.
    • Web search: Perform searches on the internet to gather information.
    • GitHub search: Search for repositories, code, and issues on GitHub.
  • code_docs: Lets the agent index and query dependency documentation (public registries or your private workspace) via the code_docs tool. Enable it when you need API usage answers that are not already present in the run context.
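As a sketch of scoping an agent, the manifest below grants only read access and research tools to a step that should never modify files. The step id and instructions are hypothetical. Whether an explicit toolboxes list replaces or adds to the defaults inherited from Coding (which has dangerous enabled by default) is an assumption to confirm against your Bosun version.

```yaml
# Hypothetical step: investigate an issue without write or shell access.
steps:
  - id: triage-issue            # illustrative id, not taken from the Bosun docs
    agent:
      extends: Coding
      toolboxes:
        - repository_read       # read files and directories
        - research              # URL fetching, web search, GitHub search
      instructions: |
        Read the relevant source files and search for known upstream issues.
        Report findings only; do not attempt to edit files.

starts_with: triage-issue
ends_with: triage-issue
edges: []
```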

Ad-hoc documentation retrieval (code_docs)

The code_docs toolbox exposes Bosun’s ad-hoc documentation engine so agents can answer API-level questions about any package referenced in the run. When the tool is invoked, Bosun:

  1. Resolves the package through the appropriate registry or the active workspace, then (for private packages) checks out the canonical repository path before indexing.
  2. Harvests documentation sources: README.* files; docs/, doc/, or site/ directories; curated registry links; and any generated outputs (cargo doc, TypeDoc, pydoc, Storybook, or other scripted commands defined by the package).
  3. Builds an org-scoped vector index and runs the query using the latest documents.

Because this happens on demand, every tool call captures the package’s current documentation—even for internal libraries that only exist within your repository or registry.

Enabling the toolbox in manifests

Add code_docs wherever an agent may need to answer SDK usage questions:

steps:
  - id: investigate-sdk-regression
    agent:
      extends: Coding
      toolboxes:
        - repository_read
        - code_docs
      instructions: |
        Only call `code_docs` when you need a function signature or usage example
        that does not already exist in the task context. Provide the full package
        name and a specific question ("How do I use FooClient.upload?"), then
        summarize the answer with citations.

starts_with: investigate-sdk-regression
ends_with: investigate-sdk-regression
edges: []

The tool requires a language argument, so the agent must specify which registry to consult (rust, javascript, python, etc.). Write precise guidance in the agent's instructions block so the model knows when and how to call the tool.
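One way to make the language requirement explicit is to spell it out in the instructions. The fragment below is a minimal sketch: the step id and the package named in the prompt are hypothetical, and the only tool argument it relies on is the language argument described above.

```yaml
# Hypothetical step that tells the agent how to fill in the language argument.
steps:
  - id: check-upload-api        # illustrative id
    agent:
      extends: Coding
      toolboxes:
        - repository_read
        - code_docs
      instructions: |
        Before calling `code_docs`, determine which ecosystem the package
        belongs to and pass it as the language argument (for example `rust`,
        `javascript`, or `python`).
        Ask one specific question per call, such as
        "What arguments does FooClient.upload accept?".

starts_with: check-upload-api
ends_with: check-upload-api
edges: []
```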

Public vs. private documentation

  • Public packages share an index across organizations. Repeated queries become faster because results are cached globally.
  • Private packages are scoped to your organization.
  • Bosun automatically checks out private packages that live inside node_modules (or similar vendor paths) so it can run generators outside the dependency tree and keep your workspace clean.

Repository Context Files

When you connect a repository, Bosun automatically injects guidance from AGENTS.md files into every agent. The resolver looks for .bosun/AGENTS.md first and falls back to the repository-root AGENTS.md if the scoped file is missing. No manifest changes are required—any agent step that targets the repository receives these instructions before its own prompt content, keeping organization-wide guardrails consistent across runs.

Render context access

Bosun now exposes repository metadata directly inside the render context so templates can reference repository information without hard-coding paths. The repositories object is available anywhere templating runs (agent instructions, run commands, for_each inputs, etc.). Each entry inside repositories is keyed by the repository’s project_name and currently includes:

  • name: the project name, matching the key you use inside repositories.
  • path: the executor’s absolute path to that repository checkout.

This makes it easy to generate instructions or commands that depend on the primary checkout or on secondary repositories included in the task.
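A minimal sketch of referencing the repositories object from agent instructions follows. It assumes a repository whose project_name is `backend` (a hypothetical name) and that the template engine supports dotted access to keyed entries. Verify both against your own setup.

```yaml
# Hypothetical step that reads repository metadata from the render context.
steps:
  - id: describe-checkout       # illustrative id
    agent:
      extends: Coding
      toolboxes:
        - repository_read
      instructions: |
        The {{ repositories.backend.name }} repository is checked out at
        {{ repositories.backend.path }}.
        Use absolute paths under that directory when reading files.

starts_with: describe-checkout
ends_with: describe-checkout
edges: []
```

Because the path comes from the render context, the same manifest keeps working if the executor places the checkout somewhere else.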

Conversation context and audit summaries

Bosun’s default coding agents ship with Context Compression, an experimental CAT-inspired summarizer that rewrites long conversations into durable audit trails. Instead of steering the next tool call, these summaries act as a second-order record of everything the agent tried, why it made key decisions, and which artifacts it produced. That makes multi-hour tasks easier to review later or to resume in a follow-up run without replaying every log message.

What the summarizer captures

Context Compression v2 runs automatically after a few completions or whenever the session nears the model’s context window. It:

  • Replays the recent user/agent exchange and bundles it into a structured markdown layout (Task semantics, Decision log, Audit trail, Outcome summaries, Relevant files, Open issues).
  • Expands every tool call into an explicit audit entry with the tool name, key arguments, result type (text, fail, stop), and any artifacts or files referenced.
  • Includes the latest git diff snippet so reviewers can connect the prose summary to exact file changes.
  • Threads the previous summary into the prompt as read-only context so the new summary consolidates everything learned so far without drifting.
  • Appends a short reminder to tie off unfinished work (“If there are open issues… do your best to complete your task”) so people picking up the run know whether anything remains.

Because audit summaries live directly in the session history, Bosun can collapse dozens of chat turns while still preserving a searchable log of decisions, tool executions, and file edits. The summaries are also emitted as dedicated summary entries in the session log, so teammates scanning a run can expand just the durable checkpoints instead of scrolling through the whole transcript.

Why it improves audit trails

  • Traceability: Every live diff, shell run, plan update, or custom tool call shows up in the audit log with parameters and outcomes, making it easy to answer “what changed and why?” without parsing raw tool output.
  • Durable context: When a run spans multiple days (or when a second agent takes over), the summarizer supplies the minimal facts needed to continue safely—goal, constraints, outstanding issues, and the files touched so far.
  • Noise reduction: Bosun strips forward-looking suggestions from these summaries to avoid steering the next turn. Only completed actions, decisions, errors, and artifacts make it through, so the audit trail stays factual.

Working with CAT summaries in manifests

No manifest changes are required to enable the summarizer, but you can lean on its structure in downstream steps:

steps:
  - id: summarize-audit
    agent:
      extends: Coding
      instructions: |
        Read the latest conversation summary and draft a changelog note for humans.
        Focus on the audit trail entries for tooling decisions and any open issues.

starts_with: summarize-audit
ends_with: summarize-audit
edges: []

Because the summaries live in the session history, any agent can render {{ history | json_encode() }} (or fetch the dedicated summary messages from the Sessions API) to build higher-level reports. This is useful for compliance exports, changelog automation, or dashboards that need concise status snapshots without ingesting the entire transcript.
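The history expression above could be embedded in a reporting step roughly like this. The step id and the report format are hypothetical, and the sketch assumes history is available when the instructions are rendered.

```yaml
# Hypothetical reporting step that embeds the session history as JSON.
steps:
  - id: compliance-export       # illustrative id
    agent:
      extends: Coding
      instructions: |
        The session history, including summary entries, follows as JSON:

        {{ history | json_encode() }}

        List each decision and open issue recorded in the summaries as one
        line of a status report. Do not add recommendations.

starts_with: compliance-export
ends_with: compliance-export
edges: []
```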

Experimental feature

Context Compression v2 is rolled out as an experimental default. Expect the schema and cadence to evolve—Bosun treats new updates as the latest behavior, so you can rely on the structured layout above even as the implementation iterates.