Error handling in Bosun tasks

Bosun's fluyt-2 runtime surfaces step failures immediately, but you control whether those failures should stop the task. Every step honours the continue_on_error flag so you can keep the workflow moving while still recording detailed diagnostics.

Default behaviour

If a step errors and continue_on_error is omitted (or set to false), execution stops at that point. Typical failure modes include:

  • agent: the agent calls the fail tool, instruction rendering fails, or the model session errors.
  • run: the shell command exits with a non-zero status or the executor encounters an error.
  • prompt / structured_prompt: templating fails, the model call errors, or (for structured prompts) the response cannot be validated against the schema.
  • for_each: any iteration fails while continue_on_error is disabled; in-flight work is cancelled and the error bubbles out.
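
As a minimal sketch of the default behaviour, assuming the same step syntax used in the example later on this page (the step ids and the npm test command are hypothetical), the second step below never runs if the first one exits with a non-zero status:

```yaml
steps:
  # continue_on_error is omitted, so it behaves as false:
  # a non-zero exit from this command stops the task here.
  - id: run_tests          # hypothetical step id
    run: npm test          # hypothetical command

  # Only reached when run_tests succeeds.
  - id: summarize          # hypothetical step id
    prompt: |
      Summarize the test results.
```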

Continuing after failures

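Set continue_on_error: true on any step to log the failure and continue with the rest of the task. In the following example, a failed formatting run does not block the notify step, which reports the recorded errors: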

steps:
  - id: format_sources
    run: npm run fmt
    continue_on_error: true

  - id: notify
    prompt: |
      Formatting left {{ errors.format_sources | length }} issues.
      Details:
      {{ errors.format_sources | json_encode(pretty=true) }}

What gets recorded

When a step continues after an error:

  • The step output becomes an object describing the failure (kind, message, and additional fields such as reason, output, or source).
  • The same object is appended to the errors collection in the template context. Access it either by step index (errors.0) or by id (errors.format_sources). Each entry is an array because a for_each step can emit multiple failures.
  • for_each adds extra metadata for each failing iteration, including the index and the rendered input.
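
To make the shape concrete, the errors collection after a failed format_sources step might look roughly like the following when written out as YAML. This is an illustrative sketch only: the field names kind, message, and output come from the description above, but every value shown is hypothetical, and the exact set of fields depends on the step type.

```yaml
# Hypothetical view of the template context; all values are illustrative.
errors:
  format_sources:      # keyed by step id
    # An array, even though this step produced a single failure.
    - kind: run        # assumed value; the real kind string may differ
      message: command exited with a non-zero status   # assumed wording
      output: "..."    # captured command output, elided here
```

Because each entry is an array, filters such as length and first apply to errors.format_sources directly.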

You can use this structured data to branch on specific failure types, produce human-readable summaries, or feed the details into a follow-up agent.

Designing resilient workflows

  • Enable continue_on_error on steps where a failure should not block the workflow, then add follow-up steps that inspect errors.<step> to decide on remediation.
  • Combine templating helpers like json_encode, length, or first to present concise summaries to humans or agents.
  • For for_each, collect the failures and spin up a targeted task (or rerun the loop) with the inputs that still need work.
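
One way to apply these suggestions is a follow-up prompt step that condenses the recorded failures before handing them to a human or an agent. The sketch below assumes that filters can be chained in a single expression, which this page does not confirm; the step id summarize_failures is hypothetical, and format_sources stands for any earlier step that ran with continue_on_error: true.

```yaml
steps:
  - id: summarize_failures   # hypothetical step id
    prompt: |
      {{ errors.format_sources | length }} failure(s) were recorded for format_sources.
      First failure:
      {{ errors.format_sources | first | json_encode(pretty=true) }}
      Suggest a remediation for this failure.
```

Passing only the first failure keeps the prompt short; swap first for the full json_encode output when the follow-up step needs every entry.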