THE ORCHESTRATOR (#600)

* feat(background-agent): add ConcurrencyManager for model-based limits

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* fix(background-agent): set default concurrency to 5

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(background-agent): support 0 as unlimited concurrency

Setting concurrency to 0 means unlimited (Infinity).
Works for defaultConcurrency, providerConcurrency, and modelConcurrency.

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): use auto flag for session resumption after compaction

- executor.ts: Added `auto: true` to summarize body, removed subsequent prompt_async call
- preemptive-compaction/index.ts: Added `auto: true` to summarize body, removed subsequent promptAsync call
- executor.test.ts: Updated test expectation to include `auto: true`

Instead of sending 'Continue' prompt after compaction, use SessionCompaction's `auto: true` feature which auto-resumes the session.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(agents): update sisyphus orchestrator

Update Sisyphus agent orchestrator with latest changes.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(features): update background agent manager

Update background agent manager with latest configuration changes.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(features): update init-deep template

Update initialization template with latest configuration.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(hooks): update hook constants and configuration

Update hook constants and configuration across agent-usage-reminder, keyword-detector, and claude-code-hooks.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(tools): remove background-task tool

Remove background-task tool module completely:
- src/tools/background-task/constants.ts
- src/tools/background-task/index.ts
- src/tools/background-task/tools.ts
- src/tools/background-task/types.ts

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(tools): update tool exports and main plugin entry

Update tool index exports and main plugin entry point after background-task tool removal.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(auth): update constants to match CLIProxyAPI (50min buffer, 2 endpoints)

- Changed ANTIGRAVITY_TOKEN_REFRESH_BUFFER_MS from 60,000ms (1min) to 3,000,000ms (50min)
- Removed autopush endpoint from ANTIGRAVITY_ENDPOINT_FALLBACKS (now 2 endpoints: daily → prod)
- Added comprehensive test suite with 6 tests covering all updated constants
- Updated comments to reflect CLIProxyAPI parity

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(auth): remove PKCE to match CLIProxyAPI

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(auth): implement port 51121 with OS fallback

Add port fallback logic to OAuth callback server:
- Try port 51121 (ANTIGRAVITY_CALLBACK_PORT) first
- Fallback to OS-assigned port on EADDRINUSE
- Add redirectUri property to CallbackServerHandle
- Return actual bound port in handle.port

Add comprehensive port handling tests (5 new tests):
- Should prefer port 51121
- Should return actual bound port
- Should fallback when port occupied
- Should cleanup and release port on close
- Should provide redirect URI with actual port

All 16 tests passing (11 existing + 5 new).

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* test(auth): add token expiry tests for 50-min buffer

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(agents): add Prometheus system prompt and planner methodology

Add prometheus-prompt.ts with comprehensive planner agent system prompt.
Update plan-prompt.ts with streamlined Prometheus workflow including:
- Context gathering via explore/librarian agents
- Metis integration for AI slop guardrails
- Structured plan output format

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(agents): add Metis plan consultant agent

Add Metis agent for pre-planning analysis that identifies:
- Hidden requirements and implicit constraints
- AI failure points and common mistakes
- Clarifying questions before planning begins

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(agents): add Momus plan reviewer agent

Add Momus agent for rigorous plan review against:
- Clarity and verifiability standards
- Completeness checks
- AI slop detection

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(agents): add Sisyphus-Junior focused executor agent

Add Sisyphus-Junior agent for focused task execution:
- Same discipline as Sisyphus, no delegation capability
- Used for category-based task spawning via sisyphus_task tool

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(agents): add orchestrator-sisyphus agent

Add orchestrator-sisyphus agent for complex workflow orchestration:
- Manages multi-agent workflows
- Coordinates between specialized agents
- Handles start-work command execution

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(skill-loader): add skill-content resolver for agent skills

Add resolveMultipleSkills() for resolving skill content to prepend to agent prompts.
Includes test coverage for resolution logic.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(agents): add category and skills support to buildAgent

Extend buildAgent() to support:
- category: inherit model/temperature from DEFAULT_CATEGORIES
- skills: prepend resolved skill content to agent prompt

Includes comprehensive test coverage for new functionality.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(agents): register new agents in index and types

- Export Metis, Momus, orchestrator-sisyphus in builtinAgents
- Add new agent names to BuiltinAgentName type
- Update AGENTS.md documentation with new agents

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(features): add boulder-state persistence

Add boulder-state feature for persisting workflow state:
- storage.ts: File I/O operations for state persistence
- types.ts: State interfaces
- Includes test coverage

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(skills): add frontend-ui-ux builtin skill

Add frontend-ui-ux skill for designer-turned-developer UI work:
- SKILL.md with comprehensive design principles
- skills.ts updated with skill template

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(tools): add sisyphus_task tool for category-based delegation

Add sisyphus_task tool supporting:
- Category-based task delegation (visual, business-logic, etc.)
- Direct agent targeting
- Background execution with resume capability
- DEFAULT_CATEGORIES configuration

Includes test coverage.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(background-agent): add resume capability and model field

- Add resume() method for continuing existing agent sessions
- Add model field to BackgroundTask and LaunchInput types
- Update launch() to pass model to session.prompt()
- Comprehensive test coverage for resume functionality

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): add task-resume-info hook

Add hook for injecting task resume information into tool outputs.
Enables seamless continuation of background agent sessions.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): add prometheus-md-only write restriction hook

Add hook that restricts Prometheus planner to writing only .md files
in the .sisyphus/ directory. Prevents planners from implementing.
Includes test coverage.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): add start-work hook for Sisyphus workflow

Add hook for detecting /start-work command and triggering
orchestrator-sisyphus agent for plan execution.
Includes test coverage.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): add sisyphus-orchestrator hook

Add hook for orchestrating Sisyphus agent workflows:
- Coordinates task execution between agents
- Manages workflow state persistence
- Handles agent handoffs

Includes comprehensive test coverage.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): export new hooks in index

Export new hooks:
- createPrometheusMdOnlyHook
- createTaskResumeInfoHook
- createStartWorkHook
- createSisyphusOrchestratorHook

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(todo-enforcer): add skipAgents option and improve permission check

- Add skipAgents option to skip continuation for specified agents
- Default skip: Prometheus (Planner)
- Improve tool permission check to handle 'allow'/'deny' string values
- Add agent name detection from session messages

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(config): add categories, new agents and hooks to schema

Update Zod schema with:
- CategoryConfigSchema for task delegation categories
- CategoriesConfigSchema for user category overrides
- New agents: Metis (Plan Consultant)
- New hooks: prometheus-md-only, start-work, sisyphus-orchestrator
- New commands: start-work
- Agent category and skills fields

Includes schema test coverage.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(commands): add start-work command

Add /start-work command for executing Prometheus plans:
- start-work.ts: Command template for orchestrator-sisyphus
- commands.ts: Register command with agent binding
- types.ts: Add command name to type union

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(migration): add backup creation and category migration

- Create timestamped backup before migration writes
- Add migrateAgentConfigToCategory() for model→category migration
- Add shouldDeleteAgentConfig() for cleanup when matching defaults
- Add Prometheus and Metis to agent name map
- Comprehensive test coverage for new functionality

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(config-handler): add Sisyphus-Junior and orchestrator support

- Add Sisyphus-Junior agent creation
- Add orchestrator-sisyphus tool restrictions
- Rename Planner-Sisyphus to Prometheus (Planner)
- Use PROMETHEUS_SYSTEM_PROMPT and PROMETHEUS_PERMISSION

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(cli): add categories config for Antigravity auth

Add category model overrides for Gemini Antigravity authentication:
- visual: gemini-3-pro-high
- artistry: gemini-3-pro-high
- writing: gemini-3-pro-high

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(sisyphus): update to use sisyphus_task and add resume docs

- Update example code from background_task to sisyphus_task
- Add 'Resume Previous Agent' documentation section
- Remove model name from Oracle section heading
- Disable call_omo_agent tool for Sisyphus

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor: update tool references from background_task to sisyphus_task

Update all references across:
- agent-usage-reminder: Update AGENT_TOOLS and REMINDER_MESSAGE
- claude-code-hooks: Update comment
- call-omo-agent: Update constants and tool restrictions
- init-deep template: Update example code
- tools/index.ts: Export sisyphus_task, remove background_task

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hook-message-injector): add ToolPermission type support

Add ToolPermission type union: boolean | 'allow' | 'deny' | 'ask'
Update StoredMessage and related interfaces for new permission format.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(main): wire up new tools, hooks and agents

Wire up in main plugin entry:
- Import and create sisyphus_task tool
- Import and wire taskResumeInfo, startWork, sisyphusOrchestrator hooks
- Update tool restrictions from background_task to sisyphus_task
- Pass userCategories to createSisyphusTask

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* docs: update documentation for Prometheus and new features

Update documentation across all language versions:
- Rename Planner-Sisyphus to Prometheus (Planner)
- Add Metis (Plan Consultant) agent documentation
- Add Categories section with usage examples
- Add sisyphus_task tool documentation
- Update AGENTS.md with new structure and complexity hotspots
- Update src/tools/AGENTS.md with sisyphus_task category

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* build: regenerate schema.json with new types

Update JSON schema with:
- New agents: Prometheus (Planner), Metis (Plan Consultant)
- New hooks: prometheus-md-only, start-work, sisyphus-orchestrator
- New commands: start-work
- New skills: frontend-ui-ux
- CategoryConfigSchema for task delegation
- Agent category and skills fields

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* skill

* feat: add toast notifications for task execution

- Display toast when background task starts in BackgroundManager
- Display toast when sisyphus_task sync task starts
- Wire up prometheus-md-only hook initialization in main plugin

This provides user feedback in OpenCode TUI where task TUI is not visible.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): add read-only warning injection for Prometheus task delegation

When Prometheus (Planner) spawns subagents via task tools (sisyphus_task, task, call_omo_agent), a system directive is injected into the prompt to ensure subagents understand they are in a planning consultation context and must NOT modify files.

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(hooks): add mandatory hands-on verification enforcement for orchestrated tasks

- sisyphus-orchestrator: Add verification reminder with tool matrix (playwright/interactive_bash/curl)

- start-work: Inject detailed verification workflow with deliverable-specific guidance

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode) assistance

* docs(agents): clarify oracle and metis agent descriptions emphasizing read-only consultation roles

- Oracle: high-IQ reasoning specialist for debugging and architecture (read-only)
- Metis: updated description to align with oracle's consultation-only model
- Updated AGENTS.md with clarified agent responsibilities

* docs(orchestrator): emphasize oracle as read-only consultation agent

- Updated orchestrator-sisyphus agent descriptions
- Updated sisyphus-prompt-builder to highlight oracle's read-only consultation role
- Clarified that oracle provides high-IQ reasoning without write operations

* docs(refactor,root): update oracle consultation model in feature templates and root docs

- Updated refactor command template to emphasize oracle's read-only role
- Updated root AGENTS.md with oracle agent description emphasizing high-IQ debugging and architecture consultation
- Clarified oracle as non-write agent for design and debugging support

* feat(features): add TaskToastManager for consolidated task notifications

- Create task-toast-manager feature with singleton pattern

- Show running task list (newest first) when new task starts

- Track queued tasks status from ConcurrencyManager

- Integrate with BackgroundManager and sisyphus-task tool

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode) assistance

* feat(hooks): add resume session_id to verification reminders for orchestrator subagent work

When subagent work fails verification, show exact sisyphus_task(resume="...")
command with session_id for immediate retry. Consolidates verification workflow
across boulder and standalone modes.

* refactor(hooks): remove duplicate verification enforcement from start-work hook

Verification reminders are now centralized in sisyphus-orchestrator hook,
eliminating redundant code in start-work. The orchestrator hook handles all
verification messaging across both boulder and standalone modes.

* test(hooks): update prometheus-md-only test assertions and formatting

Updated test structure and assertions to match current output format.
Improved test clarity while maintaining complete coverage of markdown
validation and write restriction behavior.

* orchestrator

* feat(skills): add git-master skill for atomic commits and history management

- Add comprehensive git-master skill for commit, rebase, and history operations
- Implements atomic commit strategy with multi-file splitting rules
- Includes style detection, branch analysis, and history search capabilities
- Provides three modes: COMMIT, REBASE, HISTORY_SEARCH

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* docs(agents): add pre-delegation planning section to Sisyphus prompt

- Add SISYPHUS_PRE_DELEGATION_PLANNING section with mandatory declaration rules
- Implements 3-step decision tree: Identify → Select → Declare
- Forces explicit category/agent/skill declaration before every sisyphus_task call
- Includes mandatory 4-part format: Category/Agent, Reason, Skills, Expected Outcome
- Provides examples (CORRECT vs WRONG) and enforcement rules
- Follows prompt engineering best practices: Clear, CoT, Structured, Examples

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(tools): rename agent parameter to subagent_type in sisyphus_task

- Update parameter name from 'agent' to 'subagent_type' for consistency with call_omo_agent
- Update all references and error messages
- Remove deprecated 'agent' field from SisyphusTaskArgs interface
- Update git-master skill documentation to reflect parameter name change

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(agents): change orchestrator-sisyphus default model to claude-sonnet-4-5

- Update orchestrator-sisyphus model from opus-4-5 to sonnet-4-5 for better cost efficiency
- Keep Prometheus using opus-4-5 for planning tasks

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* refactor(config): make Prometheus model independent from plan agent config

- Prometheus no longer inherits model from plan agent configuration
- Fallback chain: session default model -> claude-opus-4-5
- Removes coupling between Prometheus and legacy plan agent settings

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* fix(momus): allow system directives in input validation

System directives (XML tags like <system-reminder>) are automatically
injected and should be ignored during input validation. Only reject
when there's actual user text besides the file path.

🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(prometheus): enhance high accuracy mode with mandatory Momus loop

When user requests high accuracy:
- Momus review loop is now mandatory until 'OKAY'
- No excuses allowed - must fix ALL issues
- No maximum retry limit - keep looping until approved
- Added clear explanation of what 'OKAY' means

🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(prometheus): enhance reference section with detailed guidance

References now include:
- Pattern references (existing code to follow)
- API/Type references (contracts to implement)
- Test references (testing patterns)
- Documentation references (specs and requirements)
- External references (libraries and frameworks)
- Explanation of WHY each reference matters

The executor has no interview context - references are their only guide.

🤖 Generated with assistance of [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)

* feat(git-master): add configurable commit footer and co-author options

Add git_master config with commit_footer and include_co_authored_by flags.
Users can disable Sisyphus attribution in commits via oh-my-opencode.json.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* feat(hooks): add single-task directive and system-reminder tags to orchestrator

Inject SINGLE_TASK_DIRECTIVE when orchestrator calls sisyphus_task to enforce
atomic task delegation. Wrap verification reminders in <system-reminder> tags
for better LLM attention.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

* refactor: use ContextCollector for hook injection and remove unused background tools

Split changes:
- Replace injectHookMessage with ContextCollector.register() pattern for improved hook content injection
- Remove unused background task tools infrastructure (createBackgroundOutput, createBackgroundCancel)

🤖 Generated with assistance of OhMyOpenCode (https://github.com/code-yeongyu/oh-my-opencode)

* chore(context-injector): add debug logging for context injection tracing

Add DEBUG log statements to trace context injection flow:
- Log message transform hook invocations
- Log sessionID extraction from message info
- Log hasPending checks for context collector
- Log hook content registration to contextCollector

🤖 Generated with [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode) assistance

* fix(context-injector): prepend to user message instead of separate synthetic message

- Change from creating separate synthetic user message to prepending context
  directly to last user message's text part
- Separate synthetic messages were ignored by model (treated as previous turn)
- Prepending to clone ensures: UI shows original, model receives prepended content
- Update tests to reflect new behavior

* feat(prometheus): enforce mandatory todo registration on plan generation trigger

* fix(sisyphus-task): add proper error handling for sync mode and implement BackgroundManager.resume()

- Add try-catch for session.prompt() in sync mode with detailed error messages
- Sort assistant messages by time to get the most recent response
- Add 'No assistant response found' error handling
- Implement BackgroundManager.resume() method for task resumption
- Fix ConcurrencyManager type mismatch (model → concurrencyKey)

* docs(sisyphus-task): clarify resume usage with session_id and add when-to-use guidance

- Fix terminology: 'Task ID' → 'Session ID' in resume parameter docs
- Add clear 'WHEN TO USE resume' section with concrete scenarios
- Add example usage pattern in Sisyphus agent prompt
- Emphasize token savings and context preservation benefits

* fix(agents): block task/sisyphus_task/call_omo_agent from explore and librarian

Exploration agents should not spawn other agents - they are leaf nodes
in the agent hierarchy for codebase search only.

* refactor(oracle): change default model from GPT-5.2 to Claude Opus 4.5

* feat(oracle): change default model to claude-opus-4-5

* fix(sisyphus-orchestrator): check boulder session_ids before filtering sessions

Bug: continuation was not triggered even when boulder.json existed with
session_ids because the session filter ran BEFORE reading boulder state.

Fix: Read boulder state first, then include boulder sessions in the
allowed sessions for continuation.

* feat(task-toast): display skills and concurrency info in toast

- Add skills field to TrackedTask and LaunchInput types
- Show skills in task list message as [skill1, skill2]
- Add concurrency slot info [running/limit] in Running header
- Pass skills from sisyphus_task to toast manager (sync & background)
- Add unit tests for new toast features

* refactor(categories): rename high-iq to ultrabrain

* feat(sisyphus-task): add skillContent support to background agent launching

- Add optional skillContent field to LaunchInput type
- Implement buildSystemContent utility to combine skill and category prompts
- Update BackgroundManager to pass skillContent as system parameter
- Add comprehensive tests for skillContent optionality and buildSystemContent logic

🤖 Generated with assistance of oh-my-opencode

* Revert "refactor(tools): remove background-task tool"

This reverts commit 6dbc4c095badd400e024510554a42a0dc018ae42.

* refactor(sisyphus-task): rename background to run_in_background

* fix(oracle): use gpt-5.2 as default model

* test(sisyphus-task): add resume with background parameter tests

* feat(start-work): auto-select single incomplete plan and use system-reminder format

- Auto-select when only one incomplete plan exists among multiple
- Wrap multiple plans message in <system-reminder> tag
- Change prompt to 'ask user' style for agent guidance
- Add 'All Plans Complete' state handling

* feat(sisyphus-task): make skills parameter required

- Add validation for skills parameter (must be provided, use [] if empty)
- Update schema to remove .optional()
- Update type definition to make skills non-optional
- Fix existing tests to include skills parameter

* fix: prevent session model change when sending notifications

- background-agent: use only parentModel, remove prevMessage fallback
- todo-continuation: don't pass model to preserve session's lastModel
- Remove unused imports (findNearestMessageWithFields, fs, path)

Root cause: session.prompt with model param changes session's lastModel

* fix(sisyphus-orchestrator): register handler in event loop for boulder continuation

* fix(sisyphus_task): use promptAsync for sync mode to preserve main session

- session.prompt() changes the active session, causing UI model switch
- Switch to promptAsync + polling to avoid main session state change
- Matches background-agent pattern for consistency

* fix(sisyphus-orchestrator): only trigger boulder continuation for orchestrator-sisyphus agent

* feat(background-agent): add parentAgent tracking to preserve agent context in background tasks

- Add parentAgent field to BackgroundTask, LaunchInput, and ResumeInput interfaces
- Pass parentAgent through background task manager to preserve agent identity
- Update sisyphus-orchestrator to set orchestrator-sisyphus agent context
- Add session tracking for background agents to prevent context loss
- Propagate agent context in background-task and sisyphus-task tools

This ensures background/subagent spawned tasks maintain proper agent context for notifications and continuity.

🤖 Generated with assistance of oh-my-opencode

* fix(antigravity): sync plugin.ts with PKCE-removed oauth.ts API

Remove decodeState import and update OAuth flow to use simple state
string comparison for CSRF protection instead of PKCE verifier.
Update exchangeCode calls to match new signature (code, redirectUri,
clientId, clientSecret).

* fix(hook-message-injector): preserve agent info with two-pass message lookup

findNearestMessageWithFields now has a fallback pass that returns
messages with ANY useful field (agent OR model) instead of requiring
ALL fields. This prevents parentAgent from being lost when stored
messages don't have complete model info.

* fix(sisyphus-task): use SDK session.messages API for parent agent lookup

Background task notifications were showing 'build' agent instead of the
actual parent agent (e.g., 'Sisyphus'). The hook-injected message storage
only contains limited info; the actual agent name is in the SDK session.

Changes:
- Add getParentAgentFromSdk() to query SDK messages API
- Look up agent from SDK first, fallback to hook-injected messages
- Ensures background tasks correctly preserve parent agent context

* fix(sisyphus-task): use ctx.agent directly for parentAgent

The tool context already provides the agent name via ctx.agent.
The previous SDK session.messages lookup was completely wrong -
SDK messages don't store agent info per message.

Removes useless getParentAgentFromSdk function.

* feat(prometheus-md-only): allow .md files anywhere, only block code files

Prometheus (Planner) can now write .md files anywhere, not just .sisyphus/.
Still blocks non-.md files (code) to enforce read-only planning for code.

This allows planners to write commentary and analysis in markdown format.

* Revert "feat(prometheus-md-only): allow .md files anywhere, only block code files"

This reverts commit c600111597591e1862696ee0b92051e587aa1a6b.

* fix(momus): accept bracket-style system directives in input validation

Momus was rejecting inputs with bracket-style directives like [analyze-mode]
and [SYSTEM DIRECTIVE...] because it only recognized XML-style tags.

Now accepts:
- XML tags: <system-reminder>, <context>, etc.
- Bracket blocks: [analyze-mode], [SYSTEM DIRECTIVE...], [SYSTEM REMINDER...], etc.

* fix(sisyphus-orchestrator): inject delegation warning before Write/Edit outside .sisyphus

- Add ORCHESTRATOR_DELEGATION_REQUIRED strong warning in tool.execute.before
- Fix tool.execute.after filePath detection using pendingFilePaths Map
- before stores filePath by callID, after retrieves and deletes it
- Fixes bug where output.metadata.filePath was undefined

* docs: add orchestration, category-skill, and CLI guides

* fix(cli): correct category names in Antigravity migration (visual → visual-engineering)

* fix(sisyphus-task): prevent infinite polling when session removed from status

* fix(tests): update outdated test expectations

- constants.test.ts: Update endpoint count (2→3) and token buffer (50min→60sec)
- token.test.ts: Update expiry tests to use 60-second buffer
- sisyphus-orchestrator: Add fallback to output.metadata.filePath when callID missing

---------

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
This commit is contained in:
YeonGyu-Kim
2026-01-09 02:24:43 +09:00
committed by GitHub
parent 8394926fe1
commit 768ecd928b
92 changed files with 13771 additions and 672 deletions

View File

@@ -1,11 +1,12 @@
import { describe, test, expect, beforeEach } from "bun:test"
import type { BackgroundTask } from "./types"
import type { BackgroundTask, ResumeInput } from "./types"
const TASK_TTL_MS = 30 * 60 * 1000
class MockBackgroundManager {
private tasks: Map<string, BackgroundTask> = new Map()
private notifications: Map<string, BackgroundTask[]> = new Map()
public resumeCalls: Array<{ sessionId: string; prompt: string }> = []
addTask(task: BackgroundTask): void {
this.tasks.set(task.id, task)
@@ -15,6 +16,15 @@ class MockBackgroundManager {
return this.tasks.get(id)
}
findBySession(sessionID: string): BackgroundTask | undefined {
for (const task of this.tasks.values()) {
if (task.sessionID === sessionID) {
return task
}
}
return undefined
}
getTasksByParentSession(sessionID: string): BackgroundTask[] {
const result: BackgroundTask[] = []
for (const task of this.tasks.values()) {
@@ -105,6 +115,29 @@ class MockBackgroundManager {
}
return count
}
resume(input: ResumeInput): BackgroundTask {
const existingTask = this.findBySession(input.sessionId)
if (!existingTask) {
throw new Error(`Task not found for session: ${input.sessionId}`)
}
this.resumeCalls.push({ sessionId: input.sessionId, prompt: input.prompt })
existingTask.status = "running"
existingTask.completedAt = undefined
existingTask.error = undefined
existingTask.parentSessionID = input.parentSessionID
existingTask.parentMessageID = input.parentMessageID
existingTask.parentModel = input.parentModel
existingTask.progress = {
toolCalls: existingTask.progress?.toolCalls ?? 0,
lastUpdate: new Date(),
}
return existingTask
}
}
function createMockTask(overrides: Partial<BackgroundTask> & { id: string; sessionID: string; parentSessionID: string }): BackgroundTask {
@@ -482,3 +515,162 @@ describe("BackgroundManager.pruneStaleTasksAndNotifications", () => {
expect(manager.getTask("task-fresh")).toBeDefined()
})
})
describe("BackgroundManager.resume", () => {
let manager: MockBackgroundManager
beforeEach(() => {
// #given
manager = new MockBackgroundManager()
})
test("should throw error when task not found", () => {
// #given - empty manager
// #when / #then
expect(() => manager.resume({
sessionId: "non-existent",
prompt: "continue",
parentSessionID: "session-new",
parentMessageID: "msg-new",
})).toThrow("Task not found for session: non-existent")
})
test("should resume existing task and reset state to running", () => {
// #given
const completedTask = createMockTask({
id: "task-a",
sessionID: "session-a",
parentSessionID: "session-parent",
status: "completed",
})
completedTask.completedAt = new Date()
completedTask.error = "previous error"
manager.addTask(completedTask)
// #when
const result = manager.resume({
sessionId: "session-a",
prompt: "continue the work",
parentSessionID: "session-new-parent",
parentMessageID: "msg-new",
})
// #then
expect(result.status).toBe("running")
expect(result.completedAt).toBeUndefined()
expect(result.error).toBeUndefined()
expect(result.parentSessionID).toBe("session-new-parent")
expect(result.parentMessageID).toBe("msg-new")
})
test("should preserve task identity while updating parent context", () => {
// #given
const existingTask = createMockTask({
id: "task-a",
sessionID: "session-a",
parentSessionID: "old-parent",
description: "original description",
agent: "explore",
})
manager.addTask(existingTask)
// #when
const result = manager.resume({
sessionId: "session-a",
prompt: "new prompt",
parentSessionID: "new-parent",
parentMessageID: "new-msg",
parentModel: { providerID: "anthropic", modelID: "claude-opus" },
})
// #then
expect(result.id).toBe("task-a")
expect(result.sessionID).toBe("session-a")
expect(result.description).toBe("original description")
expect(result.agent).toBe("explore")
expect(result.parentModel).toEqual({ providerID: "anthropic", modelID: "claude-opus" })
})
test("should track resume calls with prompt", () => {
// #given
const task = createMockTask({
id: "task-a",
sessionID: "session-a",
parentSessionID: "session-parent",
})
manager.addTask(task)
// #when
manager.resume({
sessionId: "session-a",
prompt: "continue with additional context",
parentSessionID: "session-new",
parentMessageID: "msg-new",
})
// #then
expect(manager.resumeCalls).toHaveLength(1)
expect(manager.resumeCalls[0]).toEqual({
sessionId: "session-a",
prompt: "continue with additional context",
})
})
test("should preserve existing tool call count in progress", () => {
// #given
const taskWithProgress = createMockTask({
id: "task-a",
sessionID: "session-a",
parentSessionID: "session-parent",
})
taskWithProgress.progress = {
toolCalls: 42,
lastTool: "read",
lastUpdate: new Date(),
}
manager.addTask(taskWithProgress)
// #when
const result = manager.resume({
sessionId: "session-a",
prompt: "continue",
parentSessionID: "session-new",
parentMessageID: "msg-new",
})
// #then
expect(result.progress?.toolCalls).toBe(42)
})
})
describe("LaunchInput.skillContent", () => {
test("skillContent should be optional in LaunchInput type", () => {
// #given
const input: import("./types").LaunchInput = {
description: "test",
prompt: "test prompt",
agent: "explore",
parentSessionID: "parent-session",
parentMessageID: "parent-msg",
}
// #when / #then - should compile without skillContent
expect(input.skillContent).toBeUndefined()
})
test("skillContent can be provided in LaunchInput", () => {
// #given
const input: import("./types").LaunchInput = {
description: "test",
prompt: "test prompt",
agent: "explore",
parentSessionID: "parent-session",
parentMessageID: "parent-msg",
skillContent: "You are a playwright expert",
}
// #when / #then
expect(input.skillContent).toBe("You are a playwright expert")
})
})

View File

@@ -1,18 +1,16 @@
import { existsSync, readdirSync } from "node:fs"
import { join } from "node:path"
import type { PluginInput } from "@opencode-ai/plugin"
import type {
BackgroundTask,
LaunchInput,
ResumeInput,
} from "./types"
import { log } from "../../shared/logger"
import { ConcurrencyManager } from "./concurrency"
import type { BackgroundTaskConfig } from "../../config/schema"
import {
findNearestMessageWithFields,
MESSAGE_STORAGE,
} from "../hook-message-injector"
import { subagentSessions } from "../claude-code-session-state"
import { getTaskToastManager } from "../task-toast-manager"
const TASK_TTL_MS = 30 * 60 * 1000
@@ -42,20 +40,6 @@ interface Todo {
id: string
}
function getMessageDir(sessionID: string): string | null {
if (!existsSync(MESSAGE_STORAGE)) return null
const directPath = join(MESSAGE_STORAGE, sessionID)
if (existsSync(directPath)) return directPath
for (const dir of readdirSync(MESSAGE_STORAGE)) {
const sessionPath = join(MESSAGE_STORAGE, dir, sessionID)
if (existsSync(sessionPath)) return sessionPath
}
return null
}
export class BackgroundManager {
private tasks: Map<string, BackgroundTask>
private notifications: Map<string, BackgroundTask[]>
@@ -77,9 +61,9 @@ export class BackgroundManager {
throw new Error("Agent parameter is required")
}
const model = input.agent
const concurrencyKey = input.agent
await this.concurrencyManager.acquire(model)
await this.concurrencyManager.acquire(concurrencyKey)
const createResult = await this.client.session.create({
body: {
@@ -87,12 +71,12 @@ export class BackgroundManager {
title: `Background: ${input.description}`,
},
}).catch((error) => {
this.concurrencyManager.release(model)
this.concurrencyManager.release(concurrencyKey)
throw error
})
if (createResult.error) {
this.concurrencyManager.release(model)
this.concurrencyManager.release(concurrencyKey)
throw new Error(`Failed to create background session: ${createResult.error}`)
}
@@ -114,7 +98,9 @@ export class BackgroundManager {
lastUpdate: new Date(),
},
parentModel: input.parentModel,
model,
parentAgent: input.parentAgent,
model: input.model,
concurrencyKey,
}
this.tasks.set(task.id, task)
@@ -122,13 +108,24 @@ export class BackgroundManager {
log("[background-agent] Launching task:", { taskId: task.id, sessionID, agent: input.agent })
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.addTask({
id: task.id,
description: input.description,
agent: input.agent,
isBackground: true,
skills: input.skills,
})
}
this.client.session.promptAsync({
path: { id: sessionID },
body: {
agent: input.agent,
system: input.skillContent,
tools: {
task: false,
background_task: false,
call_omo_agent: false,
},
parts: [{ type: "text", text: input.prompt }],
@@ -145,8 +142,8 @@ export class BackgroundManager {
existingTask.error = errorMessage
}
existingTask.completedAt = new Date()
if (existingTask.model) {
this.concurrencyManager.release(existingTask.model)
if (existingTask.concurrencyKey) {
this.concurrencyManager.release(existingTask.concurrencyKey)
}
this.markForNotification(existingTask)
this.notifyParentSession(existingTask)
@@ -192,6 +189,99 @@ export class BackgroundManager {
return undefined
}
/**
* Register an external task (e.g., from sisyphus_task) for notification tracking.
* This allows tasks created by external tools to receive the same toast/prompt notifications.
*/
registerExternalTask(input: {
taskId: string
sessionID: string
parentSessionID: string
description: string
agent?: string
}): BackgroundTask {
const task: BackgroundTask = {
id: input.taskId,
sessionID: input.sessionID,
parentSessionID: input.parentSessionID,
parentMessageID: "",
description: input.description,
prompt: "",
agent: input.agent || "sisyphus_task",
status: "running",
startedAt: new Date(),
progress: {
toolCalls: 0,
lastUpdate: new Date(),
},
}
this.tasks.set(task.id, task)
subagentSessions.add(input.sessionID)
this.startPolling()
log("[background-agent] Registered external task:", { taskId: task.id, sessionID: input.sessionID })
return task
}
async resume(input: ResumeInput): Promise<BackgroundTask> {
const existingTask = this.findBySession(input.sessionId)
if (!existingTask) {
throw new Error(`Task not found for session: ${input.sessionId}`)
}
existingTask.status = "running"
existingTask.completedAt = undefined
existingTask.error = undefined
existingTask.parentSessionID = input.parentSessionID
existingTask.parentMessageID = input.parentMessageID
existingTask.parentModel = input.parentModel
existingTask.parentAgent = input.parentAgent
existingTask.progress = {
toolCalls: existingTask.progress?.toolCalls ?? 0,
lastUpdate: new Date(),
}
this.startPolling()
subagentSessions.add(existingTask.sessionID)
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.addTask({
id: existingTask.id,
description: existingTask.description,
agent: existingTask.agent,
isBackground: true,
})
}
log("[background-agent] Resuming task:", { taskId: existingTask.id, sessionID: existingTask.sessionID })
this.client.session.promptAsync({
path: { id: existingTask.sessionID },
body: {
agent: existingTask.agent,
tools: {
task: false,
call_omo_agent: false,
},
parts: [{ type: "text", text: input.prompt }],
},
}).catch((error) => {
log("[background-agent] resume promptAsync error:", error)
existingTask.status = "error"
const errorMessage = error instanceof Error ? error.message : String(error)
existingTask.error = errorMessage
existingTask.completedAt = new Date()
this.markForNotification(existingTask)
this.notifyParentSession(existingTask)
})
return existingTask
}
private async checkSessionTodos(sessionID: string): Promise<boolean> {
try {
const response = await this.client.session.todo({
@@ -269,8 +359,8 @@ export class BackgroundManager {
task.error = "Session deleted"
}
if (task.model) {
this.concurrencyManager.release(task.model)
if (task.concurrencyKey) {
this.concurrencyManager.release(task.concurrencyKey)
}
this.tasks.delete(task.id)
this.clearNotificationsForTask(task.id)
@@ -330,17 +420,13 @@ export class BackgroundManager {
log("[background-agent] notifyParentSession called for task:", task.id)
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const tuiClient = this.client as any
if (tuiClient.tui?.showToast) {
tuiClient.tui.showToast({
body: {
title: "Background Task Completed",
message: `Task "${task.description}" finished in ${duration}.`,
variant: "success",
duration: 5000,
},
}).catch(() => {})
const toastManager = getTaskToastManager()
if (toastManager) {
toastManager.showCompletionToast({
id: task.id,
description: task.description,
duration,
})
}
const message = `[BACKGROUND TASK COMPLETED] Task "${task.description}" finished in ${duration}. Use background_output with task_id="${task.id}" to get results.`
@@ -349,23 +435,21 @@ export class BackgroundManager {
const taskId = task.id
setTimeout(async () => {
if (task.model) {
this.concurrencyManager.release(task.model)
if (task.concurrencyKey) {
this.concurrencyManager.release(task.concurrencyKey)
}
try {
const messageDir = getMessageDir(task.parentSessionID)
const prevMessage = messageDir ? findNearestMessageWithFields(messageDir) : null
const modelContext = task.parentModel ?? prevMessage?.model
const modelField = modelContext?.providerID && modelContext?.modelID
? { providerID: modelContext.providerID, modelID: modelContext.modelID }
// Use only parentModel/parentAgent - don't fallback to prevMessage
// This prevents accidentally changing parent session's model/agent
const modelField = task.parentModel?.providerID && task.parentModel?.modelID
? { providerID: task.parentModel.providerID, modelID: task.parentModel.modelID }
: undefined
await this.client.session.prompt({
path: { id: task.parentSessionID },
body: {
agent: prevMessage?.agent,
agent: task.parentAgent,
model: modelField,
parts: [{ type: "text", text: message }],
},
@@ -413,8 +497,8 @@ export class BackgroundManager {
task.status = "error"
task.error = "Task timed out after 30 minutes"
task.completedAt = new Date()
if (task.model) {
this.concurrencyManager.release(task.model)
if (task.concurrencyKey) {
this.concurrencyManager.release(task.concurrencyKey)
}
this.clearNotificationsForTask(taskId)
this.tasks.delete(taskId)

View File

@@ -27,7 +27,11 @@ export interface BackgroundTask {
error?: string
progress?: TaskProgress
parentModel?: { providerID: string; modelID: string }
model?: string
model?: { providerID: string; modelID: string }
/** Agent name used for concurrency tracking */
concurrencyKey?: string
/** Parent session's agent name for notification */
parentAgent?: string
}
export interface LaunchInput {
@@ -37,4 +41,17 @@ export interface LaunchInput {
parentSessionID: string
parentMessageID: string
parentModel?: { providerID: string; modelID: string }
parentAgent?: string
model?: { providerID: string; modelID: string }
skills?: string[]
skillContent?: string
}
export interface ResumeInput {
sessionId: string
prompt: string
parentSessionID: string
parentMessageID: string
parentModel?: { providerID: string; modelID: string }
parentAgent?: string
}

View File

@@ -0,0 +1,13 @@
/**
* Boulder State Constants
*/
export const BOULDER_DIR = ".sisyphus"
export const BOULDER_FILE = "boulder.json"
export const BOULDER_STATE_PATH = `${BOULDER_DIR}/${BOULDER_FILE}`
export const NOTEPAD_DIR = "notepads"
export const NOTEPAD_BASE_PATH = `${BOULDER_DIR}/${NOTEPAD_DIR}`
/** Prometheus plan directory pattern */
export const PROMETHEUS_PLANS_DIR = ".sisyphus/plans"

View File

@@ -0,0 +1,3 @@
export * from "./types"
export * from "./constants"
export * from "./storage"

View File

@@ -0,0 +1,250 @@
import { describe, expect, test, beforeEach, afterEach } from "bun:test"
import { existsSync, mkdirSync, rmSync, writeFileSync } from "node:fs"
import { join } from "node:path"
import { tmpdir } from "node:os"
import {
readBoulderState,
writeBoulderState,
appendSessionId,
clearBoulderState,
getPlanProgress,
getPlanName,
createBoulderState,
findPrometheusPlans,
} from "./storage"
import type { BoulderState } from "./types"
describe("boulder-state", () => {
const TEST_DIR = join(tmpdir(), "boulder-state-test-" + Date.now())
const SISYPHUS_DIR = join(TEST_DIR, ".sisyphus")
beforeEach(() => {
if (!existsSync(TEST_DIR)) {
mkdirSync(TEST_DIR, { recursive: true })
}
if (!existsSync(SISYPHUS_DIR)) {
mkdirSync(SISYPHUS_DIR, { recursive: true })
}
clearBoulderState(TEST_DIR)
})
afterEach(() => {
if (existsSync(TEST_DIR)) {
rmSync(TEST_DIR, { recursive: true, force: true })
}
})
describe("readBoulderState", () => {
test("should return null when no boulder.json exists", () => {
// #given - no boulder.json file
// #when
const result = readBoulderState(TEST_DIR)
// #then
expect(result).toBeNull()
})
test("should read valid boulder state", () => {
// #given - valid boulder.json
const state: BoulderState = {
active_plan: "/path/to/plan.md",
started_at: "2026-01-02T10:00:00Z",
session_ids: ["session-1", "session-2"],
plan_name: "my-plan",
}
writeBoulderState(TEST_DIR, state)
// #when
const result = readBoulderState(TEST_DIR)
// #then
expect(result).not.toBeNull()
expect(result?.active_plan).toBe("/path/to/plan.md")
expect(result?.session_ids).toEqual(["session-1", "session-2"])
expect(result?.plan_name).toBe("my-plan")
})
})
describe("writeBoulderState", () => {
test("should write state and create .sisyphus directory if needed", () => {
// #given - state to write
const state: BoulderState = {
active_plan: "/test/plan.md",
started_at: "2026-01-02T12:00:00Z",
session_ids: ["ses-123"],
plan_name: "test-plan",
}
// #when
const success = writeBoulderState(TEST_DIR, state)
const readBack = readBoulderState(TEST_DIR)
// #then
expect(success).toBe(true)
expect(readBack).not.toBeNull()
expect(readBack?.active_plan).toBe("/test/plan.md")
})
})
describe("appendSessionId", () => {
test("should append new session id to existing state", () => {
// #given - existing state with one session
const state: BoulderState = {
active_plan: "/plan.md",
started_at: "2026-01-02T10:00:00Z",
session_ids: ["session-1"],
plan_name: "plan",
}
writeBoulderState(TEST_DIR, state)
// #when
const result = appendSessionId(TEST_DIR, "session-2")
// #then
expect(result).not.toBeNull()
expect(result?.session_ids).toEqual(["session-1", "session-2"])
})
test("should not duplicate existing session id", () => {
// #given - state with session-1 already
const state: BoulderState = {
active_plan: "/plan.md",
started_at: "2026-01-02T10:00:00Z",
session_ids: ["session-1"],
plan_name: "plan",
}
writeBoulderState(TEST_DIR, state)
// #when
appendSessionId(TEST_DIR, "session-1")
const result = readBoulderState(TEST_DIR)
// #then
expect(result?.session_ids).toEqual(["session-1"])
})
test("should return null when no state exists", () => {
// #given - no boulder.json
// #when
const result = appendSessionId(TEST_DIR, "new-session")
// #then
expect(result).toBeNull()
})
})
describe("clearBoulderState", () => {
test("should remove boulder.json", () => {
// #given - existing state
const state: BoulderState = {
active_plan: "/plan.md",
started_at: "2026-01-02T10:00:00Z",
session_ids: ["session-1"],
plan_name: "plan",
}
writeBoulderState(TEST_DIR, state)
// #when
const success = clearBoulderState(TEST_DIR)
const result = readBoulderState(TEST_DIR)
// #then
expect(success).toBe(true)
expect(result).toBeNull()
})
test("should succeed even when no file exists", () => {
// #given - no boulder.json
// #when
const success = clearBoulderState(TEST_DIR)
// #then
expect(success).toBe(true)
})
})
describe("getPlanProgress", () => {
test("should count completed and uncompleted checkboxes", () => {
// #given - plan file with checkboxes
const planPath = join(TEST_DIR, "test-plan.md")
writeFileSync(planPath, `# Plan
- [ ] Task 1
- [x] Task 2
- [ ] Task 3
- [X] Task 4
`)
// #when
const progress = getPlanProgress(planPath)
// #then
expect(progress.total).toBe(4)
expect(progress.completed).toBe(2)
expect(progress.isComplete).toBe(false)
})
test("should return isComplete true when all checked", () => {
// #given - all tasks completed
const planPath = join(TEST_DIR, "complete-plan.md")
writeFileSync(planPath, `# Plan
- [x] Task 1
- [X] Task 2
`)
// #when
const progress = getPlanProgress(planPath)
// #then
expect(progress.total).toBe(2)
expect(progress.completed).toBe(2)
expect(progress.isComplete).toBe(true)
})
test("should return isComplete true for empty plan", () => {
// #given - plan with no checkboxes
const planPath = join(TEST_DIR, "empty-plan.md")
writeFileSync(planPath, "# Plan\nNo tasks here")
// #when
const progress = getPlanProgress(planPath)
// #then
expect(progress.total).toBe(0)
expect(progress.isComplete).toBe(true)
})
test("should handle non-existent file", () => {
// #given - non-existent file
// #when
const progress = getPlanProgress("/non/existent/file.md")
// #then
expect(progress.total).toBe(0)
expect(progress.isComplete).toBe(true)
})
})
describe("getPlanName", () => {
test("should extract plan name from path", () => {
// #given
const path = "/home/user/.sisyphus/plans/project/my-feature.md"
// #when
const name = getPlanName(path)
// #then
expect(name).toBe("my-feature")
})
})
describe("createBoulderState", () => {
test("should create state with correct fields", () => {
// #given
const planPath = "/path/to/auth-refactor.md"
const sessionId = "ses-abc123"
// #when
const state = createBoulderState(planPath, sessionId)
// #then
expect(state.active_plan).toBe(planPath)
expect(state.session_ids).toEqual([sessionId])
expect(state.plan_name).toBe("auth-refactor")
expect(state.started_at).toBeDefined()
})
})
})

View File

@@ -0,0 +1,150 @@
/**
* Boulder State Storage
*
* Handles reading/writing boulder.json for active plan tracking.
*/
import { existsSync, readFileSync, writeFileSync, mkdirSync, readdirSync } from "node:fs"
import { dirname, join, basename } from "node:path"
import type { BoulderState, PlanProgress } from "./types"
import { BOULDER_DIR, BOULDER_FILE, PROMETHEUS_PLANS_DIR } from "./constants"
export function getBoulderFilePath(directory: string): string {
return join(directory, BOULDER_DIR, BOULDER_FILE)
}
export function readBoulderState(directory: string): BoulderState | null {
const filePath = getBoulderFilePath(directory)
if (!existsSync(filePath)) {
return null
}
try {
const content = readFileSync(filePath, "utf-8")
return JSON.parse(content) as BoulderState
} catch {
return null
}
}
export function writeBoulderState(directory: string, state: BoulderState): boolean {
const filePath = getBoulderFilePath(directory)
try {
const dir = dirname(filePath)
if (!existsSync(dir)) {
mkdirSync(dir, { recursive: true })
}
writeFileSync(filePath, JSON.stringify(state, null, 2), "utf-8")
return true
} catch {
return false
}
}
export function appendSessionId(directory: string, sessionId: string): BoulderState | null {
const state = readBoulderState(directory)
if (!state) return null
if (!state.session_ids.includes(sessionId)) {
state.session_ids.push(sessionId)
if (writeBoulderState(directory, state)) {
return state
}
}
return state
}
export function clearBoulderState(directory: string): boolean {
const filePath = getBoulderFilePath(directory)
try {
if (existsSync(filePath)) {
const { unlinkSync } = require("node:fs")
unlinkSync(filePath)
}
return true
} catch {
return false
}
}
/**
* Find Prometheus plan files for this project.
* Prometheus stores plans at: {project}/.sisyphus/plans/{name}.md
*/
export function findPrometheusPlans(directory: string): string[] {
const plansDir = join(directory, PROMETHEUS_PLANS_DIR)
if (!existsSync(plansDir)) {
return []
}
try {
const files = readdirSync(plansDir)
return files
.filter((f) => f.endsWith(".md"))
.map((f) => join(plansDir, f))
.sort((a, b) => {
// Sort by modification time, newest first
const aStat = require("node:fs").statSync(a)
const bStat = require("node:fs").statSync(b)
return bStat.mtimeMs - aStat.mtimeMs
})
} catch {
return []
}
}
/**
* Parse a plan file and count checkbox progress.
*/
export function getPlanProgress(planPath: string): PlanProgress {
if (!existsSync(planPath)) {
return { total: 0, completed: 0, isComplete: true }
}
try {
const content = readFileSync(planPath, "utf-8")
// Match markdown checkboxes: - [ ] or - [x] or - [X]
const uncheckedMatches = content.match(/^[-*]\s*\[\s*\]/gm) || []
const checkedMatches = content.match(/^[-*]\s*\[[xX]\]/gm) || []
const total = uncheckedMatches.length + checkedMatches.length
const completed = checkedMatches.length
return {
total,
completed,
isComplete: total === 0 || completed === total,
}
} catch {
return { total: 0, completed: 0, isComplete: true }
}
}
/**
* Extract plan name from file path.
*/
export function getPlanName(planPath: string): string {
return basename(planPath, ".md")
}
/**
* Create a new boulder state for a plan.
*/
export function createBoulderState(
planPath: string,
sessionId: string
): BoulderState {
return {
active_plan: planPath,
started_at: new Date().toISOString(),
session_ids: [sessionId],
plan_name: getPlanName(planPath),
}
}

View File

@@ -0,0 +1,26 @@
/**
* Boulder State Types
*
* Manages the active work plan state for Sisyphus orchestrator.
* Named after Sisyphus's boulder - the eternal task that must be rolled.
*/
export interface BoulderState {
/** Absolute path to the active plan file */
active_plan: string
/** ISO timestamp when work started */
started_at: string
/** Session IDs that have worked on this plan */
session_ids: string[]
/** Plan name derived from filename */
plan_name: string
}
export interface PlanProgress {
/** Total number of checkboxes */
total: number
/** Number of completed checkboxes */
completed: number
/** Whether all tasks are done */
isComplete: boolean
}

View File

@@ -3,6 +3,7 @@ import type { BuiltinCommandName, BuiltinCommands } from "./types"
import { INIT_DEEP_TEMPLATE } from "./templates/init-deep"
import { RALPH_LOOP_TEMPLATE, CANCEL_RALPH_TEMPLATE } from "./templates/ralph-loop"
import { REFACTOR_TEMPLATE } from "./templates/refactor"
import { START_WORK_TEMPLATE } from "./templates/start-work"
const BUILTIN_COMMAND_DEFINITIONS: Record<BuiltinCommandName, Omit<CommandDefinition, "name">> = {
"init-deep": {
@@ -41,6 +42,23 @@ ${REFACTOR_TEMPLATE}
</command-instruction>`,
argumentHint: "<refactoring-target> [--scope=<file|module|project>] [--strategy=<safe|aggressive>]",
},
"start-work": {
description: "(builtin) Start Sisyphus work session from Prometheus plan",
agent: "orchestrator-sisyphus",
template: `<command-instruction>
${START_WORK_TEMPLATE}
</command-instruction>
<session-context>
Session ID: $SESSION_ID
Timestamp: $TIMESTAMP
</session-context>
<user-request>
$ARGUMENTS
</user-request>`,
argumentHint: "[plan-name]",
},
}
export function loadBuiltinCommands(

View File

@@ -45,12 +45,12 @@ Don't wait—these run async while main session works.
\`\`\`
// Fire all at once, collect results later
background_task(agent="explore", prompt="Project structure: PREDICT standard patterns for detected language → REPORT deviations only")
background_task(agent="explore", prompt="Entry points: FIND main files → REPORT non-standard organization")
background_task(agent="explore", prompt="Conventions: FIND config files (.eslintrc, pyproject.toml, .editorconfig) → REPORT project-specific rules")
background_task(agent="explore", prompt="Anti-patterns: FIND 'DO NOT', 'NEVER', 'ALWAYS', 'DEPRECATED' comments → LIST forbidden patterns")
background_task(agent="explore", prompt="Build/CI: FIND .github/workflows, Makefile → REPORT non-standard patterns")
background_task(agent="explore", prompt="Test patterns: FIND test configs, test structure → REPORT unique conventions")
sisyphus_task(agent="explore", prompt="Project structure: PREDICT standard patterns for detected language → REPORT deviations only")
sisyphus_task(agent="explore", prompt="Entry points: FIND main files → REPORT non-standard organization")
sisyphus_task(agent="explore", prompt="Conventions: FIND config files (.eslintrc, pyproject.toml, .editorconfig) → REPORT project-specific rules")
sisyphus_task(agent="explore", prompt="Anti-patterns: FIND 'DO NOT', 'NEVER', 'ALWAYS', 'DEPRECATED' comments → LIST forbidden patterns")
sisyphus_task(agent="explore", prompt="Build/CI: FIND .github/workflows, Makefile → REPORT non-standard patterns")
sisyphus_task(agent="explore", prompt="Test patterns: FIND test configs, test structure → REPORT unique conventions")
\`\`\`
<dynamic-agents>
@@ -76,9 +76,9 @@ max_depth=$(find . -type d -not -path '*/node_modules/*' -not -path '*/.git/*' |
Example spawning:
\`\`\`
// 500 files, 50k lines, depth 6, 15 large files → spawn 5+5+2+1 = 13 additional agents
background_task(agent="explore", prompt="Large file analysis: FIND files >500 lines, REPORT complexity hotspots")
background_task(agent="explore", prompt="Deep modules at depth 4+: FIND hidden patterns, internal conventions")
background_task(agent="explore", prompt="Cross-cutting concerns: FIND shared utilities across directories")
sisyphus_task(agent="explore", prompt="Large file analysis: FIND files >500 lines, REPORT complexity hotspots")
sisyphus_task(agent="explore", prompt="Deep modules at depth 4+: FIND hidden patterns, internal conventions")
sisyphus_task(agent="explore", prompt="Cross-cutting concerns: FIND shared utilities across directories")
// ... more based on calculation
\`\`\`
</dynamic-agents>
@@ -240,7 +240,7 @@ Launch document-writer agents for each location:
\`\`\`
for loc in AGENTS_LOCATIONS (except root):
background_task(agent="document-writer", prompt=\\\`
sisyphus_task(agent="document-writer", prompt=\\\`
Generate AGENTS.md for: \${loc.path}
- Reason: \${loc.reason}
- 30-80 lines max

View File

@@ -605,7 +605,7 @@ Use \`ast_grep_search\` and \`ast_grep_replace\` for structural transformations.
## Agents
- \`explore\`: Parallel codebase pattern discovery
- \`plan\`: Detailed refactoring plan generation
- \`oracle\`: Consult for complex architectural decisions
- \`oracle\`: Read-only consultation for complex architectural decisions and debugging
- \`librarian\`: **Use proactively** when encountering deprecated methods or library migration tasks. Query official docs and OSS examples for modern replacements.
## Deprecated Code & Library Migration

View File

@@ -0,0 +1,72 @@
export const START_WORK_TEMPLATE = `You are starting a Sisyphus work session.
## WHAT TO DO
1. **Find available plans**: Search for Prometheus-generated plan files at \`.sisyphus/plans/\`
2. **Check for active boulder state**: Read \`.sisyphus/boulder.json\` if it exists
3. **Decision logic**:
- If \`.sisyphus/boulder.json\` exists AND plan is NOT complete (has unchecked boxes):
- **APPEND** current session to session_ids
- Continue work on existing plan
- If no active plan OR plan is complete:
- List available plan files
- If ONE plan: auto-select it
- If MULTIPLE plans: show list with timestamps, ask user to select
4. **Create/Update boulder.json**:
\`\`\`json
{
"active_plan": "/absolute/path/to/plan.md",
"started_at": "ISO_TIMESTAMP",
"session_ids": ["session_id_1", "session_id_2"],
"plan_name": "plan-name"
}
\`\`\`
5. **Read the plan file** and start executing tasks according to Orchestrator Sisyphus workflow
## OUTPUT FORMAT
When listing plans for selection:
\`\`\`
📋 Available Work Plans
Current Time: {ISO timestamp}
Session ID: {current session id}
1. [plan-name-1.md] - Modified: {date} - Progress: 3/10 tasks
2. [plan-name-2.md] - Modified: {date} - Progress: 0/5 tasks
Which plan would you like to work on? (Enter number or plan name)
\`\`\`
When resuming existing work:
\`\`\`
🔄 Resuming Work Session
Active Plan: {plan-name}
Progress: {completed}/{total} tasks
Sessions: {count} (appending current session)
Reading plan and continuing from last incomplete task...
\`\`\`
When auto-selecting single plan:
\`\`\`
🚀 Starting Work Session
Plan: {plan-name}
Session ID: {session_id}
Started: {timestamp}
Reading plan and beginning execution...
\`\`\`
## CRITICAL
- The session_id is injected by the hook - use it directly
- Always update boulder.json BEFORE starting work
- Read the FULL plan file before delegating any tasks
- Follow Orchestrator Sisyphus delegation protocols (7-section format)`

View File

@@ -1,6 +1,6 @@
import type { CommandDefinition } from "../claude-code-command-loader"
export type BuiltinCommandName = "init-deep" | "ralph-loop" | "cancel-ralph" | "refactor"
export type BuiltinCommandName = "init-deep" | "ralph-loop" | "cancel-ralph" | "refactor" | "start-work"
export interface BuiltinCommandConfig {
disabled_commands?: BuiltinCommandName[]

View File

@@ -0,0 +1,78 @@
---
name: frontend-ui-ux
description: Designer-turned-developer who crafts stunning UI/UX even without design mockups
---
# Role: Designer-Turned-Developer
You are a designer who learned to code. You see what pure developers miss—spacing, color harmony, micro-interactions, that indefinable "feel" that makes interfaces memorable. Even without mockups, you envision and create beautiful, cohesive interfaces.
**Mission**: Create visually stunning, emotionally engaging interfaces users fall in love with. Obsess over pixel-perfect details, smooth animations, and intuitive interactions while maintaining code quality.
---
# Work Principles
1. **Complete what's asked** — Execute the exact task. No scope creep. Work until it works. Never mark work complete without proper verification.
2. **Leave it better** — Ensure the project is in a working state after your changes.
3. **Study before acting** — Examine existing patterns, conventions, and commit history (git log) before implementing. Understand why code is structured the way it is.
4. **Blend seamlessly** — Match existing code patterns. Your code should look like the team wrote it.
5. **Be transparent** — Announce each step. Explain reasoning. Report both successes and failures.
---
# Design Process
Before coding, commit to a **BOLD aesthetic direction**:
1. **Purpose**: What problem does this solve? Who uses it?
2. **Tone**: Pick an extreme—brutally minimal, maximalist chaos, retro-futuristic, organic/natural, luxury/refined, playful/toy-like, editorial/magazine, brutalist/raw, art deco/geometric, soft/pastel, industrial/utilitarian
3. **Constraints**: Technical requirements (framework, performance, accessibility)
4. **Differentiation**: What's the ONE thing someone will remember?
**Key**: Choose a clear direction and execute with precision. Intentionality > intensity.
Then implement working code (HTML/CSS/JS, React, Vue, Angular, etc.) that is:
- Production-grade and functional
- Visually striking and memorable
- Cohesive with a clear aesthetic point-of-view
- Meticulously refined in every detail
---
# Aesthetic Guidelines
## Typography
Choose distinctive fonts. **Avoid**: Arial, Inter, Roboto, system fonts, Space Grotesk. Pair a characterful display font with a refined body font.
## Color
Commit to a cohesive palette. Use CSS variables. Dominant colors with sharp accents outperform timid, evenly-distributed palettes. **Avoid**: purple gradients on white (AI slop).
## Motion
Focus on high-impact moments. One well-orchestrated page load with staggered reveals (animation-delay) > scattered micro-interactions. Use scroll-triggering and hover states that surprise. Prioritize CSS-only. Use Motion library for React when available.
## Spatial Composition
Unexpected layouts. Asymmetry. Overlap. Diagonal flow. Grid-breaking elements. Generous negative space OR controlled density.
## Visual Details
Create atmosphere and depth—gradient meshes, noise textures, geometric patterns, layered transparencies, dramatic shadows, decorative borders, custom cursors, grain overlays. Never default to solid colors.
---
# Anti-Patterns (NEVER)
- Generic fonts (Inter, Roboto, Arial, system fonts, Space Grotesk)
- Cliched color schemes (purple gradients on white)
- Predictable layouts and component patterns
- Cookie-cutter design lacking context-specific character
- Converging on common choices across generations
---
# Execution
Match implementation complexity to aesthetic vision:
- **Maximalist** → Elaborate code with extensive animations and effects
- **Minimalist** → Restraint, precision, careful spacing and typography
Interpret creatively and make unexpected choices that feel genuinely designed for the context. No design should be the same. Vary between light and dark themes, different fonts, different aesthetics. You are capable of extraordinary creative work—don't hold back.

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -207,7 +207,7 @@ describe("createContextInjectorMessagesTransformHook", () => {
],
})
it("inserts synthetic message before last user message", async () => {
it("prepends context to last user message", async () => {
// #given
const hook = createContextInjectorMessagesTransformHook(collector)
const sessionID = "ses_transform1"
@@ -228,10 +228,8 @@ describe("createContextInjectorMessagesTransformHook", () => {
await hook["experimental.chat.messages.transform"]!({}, output)
// #then
expect(output.messages.length).toBe(4)
expect(output.messages[2].parts[0].text).toBe("Ultrawork context")
expect(output.messages[2].parts[0].synthetic).toBe(true)
expect(output.messages[3].parts[0].text).toBe("Second message")
expect(output.messages.length).toBe(3)
expect(output.messages[2].parts[0].text).toBe("Ultrawork context\n\n---\n\nSecond message")
})
it("does nothing when no pending context", async () => {

View File

@@ -78,6 +78,9 @@ export function createContextInjectorMessagesTransformHook(
return {
"experimental.chat.messages.transform": async (_input, output) => {
const { messages } = output
log("[DEBUG] experimental.chat.messages.transform called", {
messageCount: messages.length,
})
if (messages.length === 0) {
return
}
@@ -91,16 +94,28 @@ export function createContextInjectorMessagesTransformHook(
}
if (lastUserMessageIndex === -1) {
log("[DEBUG] No user message found in messages")
return
}
const lastUserMessage = messages[lastUserMessageIndex]
const sessionID = (lastUserMessage.info as unknown as { sessionID?: string }).sessionID
log("[DEBUG] Extracted sessionID from lastUserMessage.info", {
sessionID,
infoKeys: Object.keys(lastUserMessage.info),
lastUserMessageInfo: JSON.stringify(lastUserMessage.info).slice(0, 200),
})
if (!sessionID) {
log("[DEBUG] sessionID is undefined or empty")
return
}
if (!collector.hasPending(sessionID)) {
const hasPending = collector.hasPending(sessionID)
log("[DEBUG] Checking hasPending", {
sessionID,
hasPending,
})
if (!hasPending) {
return
}
@@ -109,47 +124,26 @@ export function createContextInjectorMessagesTransformHook(
return
}
const refInfo = lastUserMessage.info as unknown as {
sessionID?: string
agent?: string
model?: { providerID?: string; modelID?: string }
path?: { cwd?: string; root?: string }
const textPartIndex = lastUserMessage.parts.findIndex(
(p) => p.type === "text" && (p as { text?: string }).text
)
if (textPartIndex === -1) {
log("[context-injector] No text part found in last user message, skipping injection", {
sessionID,
partsCount: lastUserMessage.parts.length,
})
return
}
const syntheticMessageId = `synthetic_ctx_${Date.now()}`
const syntheticPartId = `synthetic_ctx_part_${Date.now()}`
const now = Date.now()
const textPart = lastUserMessage.parts[textPartIndex] as { text?: string }
const originalText = textPart.text ?? ""
textPart.text = `${pending.merged}\n\n---\n\n${originalText}`
const syntheticMessage: MessageWithParts = {
info: {
id: syntheticMessageId,
sessionID: sessionID,
role: "user",
time: { created: now },
agent: refInfo.agent ?? "Sisyphus",
model: refInfo.model ?? { providerID: "unknown", modelID: "unknown" },
path: refInfo.path ?? { cwd: "/", root: "/" },
} as unknown as Message,
parts: [
{
id: syntheticPartId,
sessionID: sessionID,
messageID: syntheticMessageId,
type: "text",
text: pending.merged,
synthetic: true,
time: { start: now, end: now },
} as Part,
],
}
messages.splice(lastUserMessageIndex, 0, syntheticMessage)
log("[context-injector] Injected synthetic message from collector", {
log("[context-injector] Prepended context to last user message", {
sessionID,
insertIndex: lastUserMessageIndex,
contextLength: pending.merged.length,
newMessageCount: messages.length,
originalTextLength: originalText.length,
})
},
}

View File

@@ -1,12 +1,12 @@
import { existsSync, mkdirSync, readFileSync, readdirSync, writeFileSync } from "node:fs"
import { join } from "node:path"
import { MESSAGE_STORAGE, PART_STORAGE } from "./constants"
import type { MessageMeta, OriginalMessageContext, TextPart } from "./types"
import type { MessageMeta, OriginalMessageContext, TextPart, ToolPermission } from "./types"
export interface StoredMessage {
agent?: string
model?: { providerID?: string; modelID?: string }
tools?: Record<string, boolean>
tools?: Record<string, ToolPermission>
}
export function findNearestMessageWithFields(messageDir: string): StoredMessage | null {
@@ -16,6 +16,7 @@ export function findNearestMessageWithFields(messageDir: string): StoredMessage
.sort()
.reverse()
// First pass: find message with ALL fields (ideal)
for (const file of files) {
try {
const content = readFileSync(join(messageDir, file), "utf-8")
@@ -27,6 +28,20 @@ export function findNearestMessageWithFields(messageDir: string): StoredMessage
continue
}
}
// Second pass: find message with ANY useful field (fallback)
// This ensures agent info isn't lost when model info is missing
for (const file of files) {
try {
const content = readFileSync(join(messageDir, file), "utf-8")
const msg = JSON.parse(content) as StoredMessage
if (msg.agent || (msg.model?.providerID && msg.model?.modelID)) {
return msg
}
} catch {
continue
}
}
} catch {
return null
}

View File

@@ -1,3 +1,5 @@
export type ToolPermission = boolean | "allow" | "deny" | "ask"
export interface MessageMeta {
id: string
sessionID: string
@@ -15,7 +17,7 @@ export interface MessageMeta {
cwd: string
root: string
}
tools?: Record<string, boolean>
tools?: Record<string, ToolPermission>
}
export interface OriginalMessageContext {
@@ -28,7 +30,7 @@ export interface OriginalMessageContext {
cwd?: string
root?: string
}
tools?: Record<string, boolean>
tools?: Record<string, ToolPermission>
}
export interface TextPart {

View File

@@ -1,3 +1,4 @@
export * from "./types"
export * from "./loader"
export * from "./merger"
export * from "./skill-content"

View File

@@ -0,0 +1,111 @@
import { describe, it, expect } from "bun:test"
import { resolveSkillContent, resolveMultipleSkills } from "./skill-content"
describe("resolveSkillContent", () => {
it("should return template for existing skill", () => {
// #given: builtin skills with 'frontend-ui-ux' skill
// #when: resolving content for 'frontend-ui-ux'
const result = resolveSkillContent("frontend-ui-ux")
// #then: returns template string
expect(result).not.toBeNull()
expect(typeof result).toBe("string")
expect(result).toContain("Role: Designer-Turned-Developer")
})
it("should return template for 'playwright' skill", () => {
// #given: builtin skills with 'playwright' skill
// #when: resolving content for 'playwright'
const result = resolveSkillContent("playwright")
// #then: returns template string
expect(result).not.toBeNull()
expect(typeof result).toBe("string")
expect(result).toContain("Playwright Browser Automation")
})
it("should return null for non-existent skill", () => {
// #given: builtin skills without 'nonexistent' skill
// #when: resolving content for 'nonexistent'
const result = resolveSkillContent("nonexistent")
// #then: returns null
expect(result).toBeNull()
})
it("should return null for empty string", () => {
// #given: builtin skills
// #when: resolving content for empty string
const result = resolveSkillContent("")
// #then: returns null
expect(result).toBeNull()
})
})
describe("resolveMultipleSkills", () => {
it("should resolve all existing skills", () => {
// #given: list of existing skill names
const skillNames = ["frontend-ui-ux", "playwright"]
// #when: resolving multiple skills
const result = resolveMultipleSkills(skillNames)
// #then: all skills resolved, none not found
expect(result.resolved.size).toBe(2)
expect(result.notFound).toEqual([])
expect(result.resolved.get("frontend-ui-ux")).toContain("Designer-Turned-Developer")
expect(result.resolved.get("playwright")).toContain("Playwright Browser Automation")
})
it("should handle partial success - some skills not found", () => {
// #given: list with existing and non-existing skills
const skillNames = ["frontend-ui-ux", "nonexistent", "playwright", "another-missing"]
// #when: resolving multiple skills
const result = resolveMultipleSkills(skillNames)
// #then: resolves existing skills, lists not found skills
expect(result.resolved.size).toBe(2)
expect(result.notFound).toEqual(["nonexistent", "another-missing"])
expect(result.resolved.get("frontend-ui-ux")).toContain("Designer-Turned-Developer")
expect(result.resolved.get("playwright")).toContain("Playwright Browser Automation")
})
it("should handle empty array", () => {
// #given: empty skill names list
const skillNames: string[] = []
// #when: resolving multiple skills
const result = resolveMultipleSkills(skillNames)
// #then: returns empty resolved and notFound
expect(result.resolved.size).toBe(0)
expect(result.notFound).toEqual([])
})
it("should handle all skills not found", () => {
// #given: list of non-existing skills
const skillNames = ["skill-one", "skill-two", "skill-three"]
// #when: resolving multiple skills
const result = resolveMultipleSkills(skillNames)
// #then: no skills resolved, all in notFound
expect(result.resolved.size).toBe(0)
expect(result.notFound).toEqual(["skill-one", "skill-two", "skill-three"])
})
it("should preserve skill order in resolved map", () => {
// #given: list of skill names in specific order
const skillNames = ["playwright", "frontend-ui-ux"]
// #when: resolving multiple skills
const result = resolveMultipleSkills(skillNames)
// #then: map contains skills with expected keys
expect(result.resolved.has("playwright")).toBe(true)
expect(result.resolved.has("frontend-ui-ux")).toBe(true)
expect(result.resolved.size).toBe(2)
})
})

View File

@@ -0,0 +1,29 @@
import { createBuiltinSkills } from "../builtin-skills/skills"
export function resolveSkillContent(skillName: string): string | null {
const skills = createBuiltinSkills()
const skill = skills.find((s) => s.name === skillName)
return skill?.template ?? null
}
export function resolveMultipleSkills(skillNames: string[]): {
resolved: Map<string, string>
notFound: string[]
} {
const skills = createBuiltinSkills()
const skillMap = new Map(skills.map((s) => [s.name, s.template]))
const resolved = new Map<string, string>()
const notFound: string[] = []
for (const name of skillNames) {
const template = skillMap.get(name)
if (template) {
resolved.set(name, template)
} else {
notFound.push(name)
}
}
return { resolved, notFound }
}

View File

@@ -0,0 +1,2 @@
export { TaskToastManager, getTaskToastManager, initTaskToastManager } from "./manager"
export type { TrackedTask, TaskStatus, TaskToastOptions } from "./types"

View File

@@ -0,0 +1,145 @@
import { describe, test, expect, beforeEach, mock } from "bun:test"
import { TaskToastManager } from "./manager"
import type { ConcurrencyManager } from "../background-agent/concurrency"
describe("TaskToastManager", () => {
let mockClient: {
tui: {
showToast: ReturnType<typeof mock>
}
}
let toastManager: TaskToastManager
let mockConcurrencyManager: ConcurrencyManager
beforeEach(() => {
mockClient = {
tui: {
showToast: mock(() => Promise.resolve()),
},
}
mockConcurrencyManager = {
getConcurrencyLimit: mock(() => 5),
} as unknown as ConcurrencyManager
// eslint-disable-next-line @typescript-eslint/no-explicit-any
toastManager = new TaskToastManager(mockClient as any, mockConcurrencyManager)
})
describe("skills in toast message", () => {
test("should display skills when provided", () => {
// #given - a task with skills
const task = {
id: "task_1",
description: "Test task",
agent: "Sisyphus-Junior",
isBackground: true,
skills: ["playwright", "git-master"],
}
// #when - addTask is called
toastManager.addTask(task)
// #then - toast message should include skills
expect(mockClient.tui.showToast).toHaveBeenCalled()
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).toContain("playwright")
expect(call.body.message).toContain("git-master")
})
test("should not display skills section when no skills provided", () => {
// #given - a task without skills
const task = {
id: "task_2",
description: "Test task without skills",
agent: "explore",
isBackground: true,
}
// #when - addTask is called
toastManager.addTask(task)
// #then - toast message should not include skills prefix
expect(mockClient.tui.showToast).toHaveBeenCalled()
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).not.toContain("Skills:")
})
})
describe("concurrency info in toast message", () => {
test("should display concurrency status in toast", () => {
// #given - multiple running tasks
toastManager.addTask({
id: "task_1",
description: "First task",
agent: "explore",
isBackground: true,
})
toastManager.addTask({
id: "task_2",
description: "Second task",
agent: "librarian",
isBackground: true,
})
// #when - third task is added
toastManager.addTask({
id: "task_3",
description: "Third task",
agent: "explore",
isBackground: true,
})
// #then - toast should show concurrency info
expect(mockClient.tui.showToast).toHaveBeenCalledTimes(3)
const lastCall = mockClient.tui.showToast.mock.calls[2][0]
// Should show "Running (3):" header
expect(lastCall.body.message).toContain("Running (3):")
})
test("should display concurrency limit info when available", () => {
// #given - a concurrency manager with known limit
const mockConcurrencyWithCounts = {
getConcurrencyLimit: mock(() => 5),
getRunningCount: mock(() => 2),
getQueuedCount: mock(() => 1),
} as unknown as ConcurrencyManager
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const managerWithConcurrency = new TaskToastManager(mockClient as any, mockConcurrencyWithCounts)
// #when - a task is added
managerWithConcurrency.addTask({
id: "task_1",
description: "Test task",
agent: "explore",
isBackground: true,
})
// #then - toast should show concurrency status like "2/5 slots"
expect(mockClient.tui.showToast).toHaveBeenCalled()
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).toMatch(/\d+\/\d+/)
})
})
describe("combined skills and concurrency display", () => {
test("should display both skills and concurrency info together", () => {
// #given - a task with skills and concurrency manager
const task = {
id: "task_1",
description: "Full info task",
agent: "Sisyphus-Junior",
isBackground: true,
skills: ["frontend-ui-ux"],
}
// #when - addTask is called
toastManager.addTask(task)
// #then - toast should include both skills and task count
expect(mockClient.tui.showToast).toHaveBeenCalled()
const call = mockClient.tui.showToast.mock.calls[0][0]
expect(call.body.message).toContain("frontend-ui-ux")
expect(call.body.message).toContain("Running (1):")
})
})
})

View File

@@ -0,0 +1,199 @@
import type { PluginInput } from "@opencode-ai/plugin"
import type { TrackedTask, TaskStatus } from "./types"
import type { ConcurrencyManager } from "../background-agent/concurrency"
type OpencodeClient = PluginInput["client"]
export class TaskToastManager {
private tasks: Map<string, TrackedTask> = new Map()
private client: OpencodeClient
private concurrencyManager?: ConcurrencyManager
constructor(client: OpencodeClient, concurrencyManager?: ConcurrencyManager) {
this.client = client
this.concurrencyManager = concurrencyManager
}
setConcurrencyManager(manager: ConcurrencyManager): void {
this.concurrencyManager = manager
}
addTask(task: {
id: string
description: string
agent: string
isBackground: boolean
status?: TaskStatus
skills?: string[]
}): void {
const trackedTask: TrackedTask = {
id: task.id,
description: task.description,
agent: task.agent,
status: task.status ?? "running",
startedAt: new Date(),
isBackground: task.isBackground,
skills: task.skills,
}
this.tasks.set(task.id, trackedTask)
this.showTaskListToast(trackedTask)
}
/**
* Update task status
*/
updateTask(id: string, status: TaskStatus): void {
const task = this.tasks.get(id)
if (task) {
task.status = status
}
}
/**
* Remove completed/error task
*/
removeTask(id: string): void {
this.tasks.delete(id)
}
/**
* Get all running tasks (newest first)
*/
getRunningTasks(): TrackedTask[] {
const running = Array.from(this.tasks.values())
.filter((t) => t.status === "running")
.sort((a, b) => b.startedAt.getTime() - a.startedAt.getTime())
return running
}
/**
* Get all queued tasks
*/
getQueuedTasks(): TrackedTask[] {
return Array.from(this.tasks.values())
.filter((t) => t.status === "queued")
.sort((a, b) => a.startedAt.getTime() - b.startedAt.getTime())
}
/**
* Format duration since task started
*/
private formatDuration(startedAt: Date): string {
const seconds = Math.floor((Date.now() - startedAt.getTime()) / 1000)
if (seconds < 60) return `${seconds}s`
const minutes = Math.floor(seconds / 60)
if (minutes < 60) return `${minutes}m ${seconds % 60}s`
const hours = Math.floor(minutes / 60)
return `${hours}h ${minutes % 60}m`
}
private getConcurrencyInfo(): string {
if (!this.concurrencyManager) return ""
const running = this.getRunningTasks()
const queued = this.getQueuedTasks()
const total = running.length + queued.length
const limit = this.concurrencyManager.getConcurrencyLimit("default")
if (limit === Infinity) return ""
return ` [${total}/${limit}]`
}
private buildTaskListMessage(newTask: TrackedTask): string {
const running = this.getRunningTasks()
const queued = this.getQueuedTasks()
const concurrencyInfo = this.getConcurrencyInfo()
const lines: string[] = []
if (running.length > 0) {
lines.push(`Running (${running.length}):${concurrencyInfo}`)
for (const task of running) {
const duration = this.formatDuration(task.startedAt)
const bgIcon = task.isBackground ? "⚡" : "🔄"
const isNew = task.id === newTask.id ? " ← NEW" : ""
const skillsInfo = task.skills?.length ? ` [${task.skills.join(", ")}]` : ""
lines.push(`${bgIcon} ${task.description} (${task.agent})${skillsInfo} - ${duration}${isNew}`)
}
}
if (queued.length > 0) {
if (lines.length > 0) lines.push("")
lines.push(`Queued (${queued.length}):`)
for (const task of queued) {
const bgIcon = task.isBackground ? "⏳" : "⏸️"
const skillsInfo = task.skills?.length ? ` [${task.skills.join(", ")}]` : ""
lines.push(`${bgIcon} ${task.description} (${task.agent})${skillsInfo}`)
}
}
return lines.join("\n")
}
/**
* Show consolidated toast with all running/queued tasks
*/
private showTaskListToast(newTask: TrackedTask): void {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const tuiClient = this.client as any
if (!tuiClient.tui?.showToast) return
const message = this.buildTaskListMessage(newTask)
const running = this.getRunningTasks()
const queued = this.getQueuedTasks()
const title = newTask.isBackground
? `⚡ New Background Task`
: `🔄 New Task Executed`
tuiClient.tui.showToast({
body: {
title,
message: message || `${newTask.description} (${newTask.agent})`,
variant: "info",
duration: running.length + queued.length > 2 ? 5000 : 3000,
},
}).catch(() => {})
}
/**
* Show task completion toast
*/
showCompletionToast(task: { id: string; description: string; duration: string }): void {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const tuiClient = this.client as any
if (!tuiClient.tui?.showToast) return
this.removeTask(task.id)
const remaining = this.getRunningTasks()
const queued = this.getQueuedTasks()
let message = `✅ "${task.description}" finished in ${task.duration}`
if (remaining.length > 0 || queued.length > 0) {
message += `\n\nStill running: ${remaining.length} | Queued: ${queued.length}`
}
tuiClient.tui.showToast({
body: {
title: "Task Completed",
message,
variant: "success",
duration: 5000,
},
}).catch(() => {})
}
}
let instance: TaskToastManager | null = null
export function getTaskToastManager(): TaskToastManager | null {
return instance
}
export function initTaskToastManager(
client: OpencodeClient,
concurrencyManager?: ConcurrencyManager
): TaskToastManager {
instance = new TaskToastManager(client, concurrencyManager)
return instance
}

View File

@@ -0,0 +1,18 @@
export type TaskStatus = "running" | "queued" | "completed" | "error"
export interface TrackedTask {
id: string
description: string
agent: string
status: TaskStatus
startedAt: Date
isBackground: boolean
skills?: string[]
}
export interface TaskToastOptions {
title: string
message: string
variant: "info" | "success" | "warning" | "error"
duration?: number
}