Define once. Evaluate everywhere. Learn continuously.
The open infrastructure for making every task observable, retryable, and improvable - across humans, AI agents, and automated systems.
# TaskOSS Spec v1.0
task_id: "deploy-auth-service"
version: "2.1.0"
goal: "Deploy authentication service to production"
owner:
  type: "agent"
  id: "claude-ops-v3"
qa:
  eval_type: "functional"
  criteria:
    - "All endpoints return 200 OK"
    - "Auth tokens validate correctly"
    - "Latency under 200ms p99"
retry:
  enabled: true
  max_attempts: 3
  backoff: "exponential"
status: "completed"
eval_score: 0.96
Every task gets an ID, a score, and a chance to improve. No more black-box workflows.
Declarative YAML specs that work across CI/CD, agents, and manual workflows.
Every task carries its own success criteria and automatic evaluation logic.
Full lineage from definition to execution to evaluation results.
Native support for AI agents with structured feedback and retry loops.
One YAML file. Full observability. Every task becomes a learning opportunity.
task_id: "onboard-new-user"
version: "1.2.0"
goal: "Complete user onboarding with verification"
owner:
  type: "agent"
  id: "onboarding-agent-v2"
inputs:
  user_email: "required"
  plan_type: "optional"
qa:
  eval_type: "functional"
  criteria:
    - "User record created in database"
    - "Welcome email sent successfully"
    - "Session token generated"
timeout: "30s"
retry:
  enabled: true
  max_attempts: 3
---
task_id: "onboard-new-user"
execution_id: "exec-7f3a9b2c"
evaluation:
  overall_score: 0.94
  status: "passed"
  criteria_results:
    - name: "User record created"
      passed: true
      latency: "45ms"
    - name: "Welcome email sent"
      passed: true
      latency: "1.2s"
    - name: "Session token generated"
      passed: true
      latency: "12ms"
completed_at: "2024-01-15T14:32:07Z"
---
task_id: "onboard-new-user"
retry_history:
  - attempt: 1
    status: "failed"
    reason: "Email service timeout"
    timestamp: "14:31:45Z"
  - attempt: 2
    status: "failed"
    reason: "Email service timeout"
    backoff: "2s"
  - attempt: 3
    status: "success"
    backoff: "4s"
learning:
  pattern: "email_service_latency"
  recommendation: "Increase timeout to 5s"
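The retry history above (a 2s wait before attempt 2, then 4s before attempt 3) is consistent with a doubling backoff schedule. The following is a minimal Python sketch of that idea, not TaskOSS's actual runtime: the function names `backoff_delays` and `run_with_retry`, the 2-second base delay, and the shape of the returned history entries are all assumptions made for illustration.

```python
import time
from typing import Callable


def backoff_delays(max_attempts: int, base_seconds: float = 2.0) -> list[float]:
    """Waits before attempts 2..max_attempts, doubling each time (assumed base: 2s)."""
    return [base_seconds * 2 ** i for i in range(max_attempts - 1)]


def run_with_retry(
    task: Callable[[], None],
    max_attempts: int = 3,
    base_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Run `task` until it succeeds or attempts run out; return a retry history."""
    history: list[dict] = []
    # The first attempt runs immediately; later attempts wait for their backoff.
    delays = [0.0] + backoff_delays(max_attempts, base_seconds)
    for attempt, delay in enumerate(delays, start=1):
        if delay:
            sleep(delay)
        try:
            task()
        except Exception as exc:
            # Record the failure and move on to the next attempt, if any.
            history.append({"attempt": attempt, "status": "failed", "reason": str(exc)})
        else:
            history.append({"attempt": attempt, "status": "success"})
            break
    return history
```

Passing a custom `sleep` callable (for example `list.append`) lets the schedule be inspected in tests without actually waiting.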
Create a task spec with task_id, goal, owner, and QA criteria in YAML.
Execute the task and stream updates - status, progress, logs, and retries.
Run QA evaluations, capture scores, and feed learnings back into the system.
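The define and evaluate steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a TaskOSS API: the required-field list is taken from the first step (task_id, goal, owner, QA criteria), while `missing_fields`, `evaluate`, and the `checks` mapping of criterion text to a callable are hypothetical names. No overall score is computed, because the scoring formula is not specified here.

```python
from typing import Callable

# Fields the "define" step names as part of a task spec.
REQUIRED_FIELDS = ("task_id", "goal", "owner", "qa")


def missing_fields(spec: dict) -> list[str]:
    """Return the required fields absent from a spec (empty list means valid)."""
    missing = [field for field in REQUIRED_FIELDS if field not in spec]
    # A qa section without any criteria cannot be evaluated.
    if "qa" in spec and not spec["qa"].get("criteria"):
        missing.append("qa.criteria")
    return missing


def evaluate(spec: dict, checks: dict[str, Callable[[], bool]]) -> dict:
    """Run one check per criterion; a criterion with no check counts as failed."""
    results = []
    for name in spec["qa"]["criteria"]:
        check = checks.get(name)
        passed = bool(check()) if check is not None else False
        results.append({"name": name, "passed": passed})
    status = "passed" if all(r["passed"] for r in results) else "failed"
    return {"task_id": spec["task_id"], "status": status, "criteria_results": results}
```

In this sketch a spec would be loaded from YAML into a plain dict first; any YAML parser that yields dictionaries would fit.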
From product launches to AI agent orchestration - TaskOSS brings structure to chaos.
See every task's real status - not just "done" but quality, retries, and blockers.
Define criteria once, run on every execution. Track quality trends over time.
Built-in retry policies with exponential backoff. Tasks self-heal without intervention.
Give AI agents clear success signals. Enable learning loops with versioned specs.
"Work without feedback is wasted. TaskOSS makes every task scoreable, retryable, and future-aware."
The open standard for task intelligence - from definition to evaluation to continuous improvement.
Whether you're shipping features, coordinating agents, or automating platforms - TaskOSS gives your team the power to track, learn, and improve.