Open Source • MIT Licensed

TaskOSS - The Task Operating System

Define once. Evaluate everywhere. Learn continuously.
The open infrastructure for making every task observable, retryable, and improvable - across humans, AI agents, and automated systems.

task.yaml
# TaskOSS Spec v1.0
task_id: "deploy-auth-service"
version: "2.1.0"
goal: "Deploy authentication service to production"

owner:
  type: "agent"
  id: "claude-ops-v3"

qa:
  eval_type: "functional"
  criteria:
    - "All endpoints return 200 OK"
    - "Auth tokens validate correctly"
    - "Latency under 200ms p99"
  retry:
    enabled: true
    max_attempts: 3
    backoff: "exponential"

status: "completed"
eval_score: 0.96
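
A spec like the one above can be sanity-checked before anything runs. The following is a minimal Python sketch of that idea, not TaskOSS's own validator: `TaskSpec`, `validate_spec`, and the choice of which fields count as mandatory are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSpec:
    """Hypothetical in-memory form of a task spec; field names mirror task.yaml."""
    task_id: str
    version: str
    goal: str
    owner: dict
    qa: dict = field(default_factory=dict)


def validate_spec(spec: TaskSpec) -> list:
    """Return a list of problems; an empty list means the spec looks usable."""
    problems = []
    if not spec.task_id:
        problems.append("task_id is empty")
    if spec.owner.get("type") is None or spec.owner.get("id") is None:
        problems.append("owner needs both 'type' and 'id'")
    if not spec.qa.get("criteria"):
        problems.append("qa.criteria should list at least one criterion")
    retry = spec.qa.get("retry", {})
    if retry.get("enabled") and retry.get("max_attempts", 0) < 1:
        problems.append("retry.max_attempts must be >= 1 when retry is enabled")
    return problems


spec = TaskSpec(
    task_id="deploy-auth-service",
    version="2.1.0",
    goal="Deploy authentication service to production",
    owner={"type": "agent", "id": "claude-ops-v3"},
    qa={
        "eval_type": "functional",
        "criteria": ["All endpoints return 200 OK"],
        "retry": {"enabled": True, "max_attempts": 3, "backoff": "exponential"},
    },
)
print(validate_spec(spec))  # []
```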

Every Day, Billions in Work Disappear Without a Trace

  • Tasks complete without evaluation - work finishes but quality is unknown
  • AI agents don't know if they succeeded - no structured feedback loops
  • Platform teams lack retry logic - failures repeat without learning
  • No visibility into task quality - outcomes are opaque

Track My Tasks

  • deploy-frontend-v3: Failed, no retry configured, 2h ago
  • sync-user-database: Completed, no evaluation, unknown quality
  • generate-monthly-report: Stale, last updated 3 days ago
  • migrate-auth-schema: Failed 4x, no learning loop, blocked

QA-Native. Agent-Ready. Retry-Smart.

Every task gets an ID, a score, and a chance to improve. No more black-box workflows.

Define Once, Run Anywhere

Declarative YAML specs that work across CI/CD, agents, and manual workflows.

Built-In QA & Eval

Every task carries its own success criteria and automatic evaluation logic.

Traceable Outputs

Full lineage from definition to execution to evaluation results.

Agent-First Interop

Native support for AI agents with structured feedback and retry loops.
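
To make "structured feedback and retry loops" concrete, here is a small Python sketch of a retry wrapper that records every attempt. It is an illustration under stated assumptions, not the TaskOSS runtime: `run_with_retry`, the 2-second base delay, and the shape of the returned dictionary are all hypothetical.

```python
import time
from typing import Callable


def run_with_retry(
    action: Callable[[], None],
    max_attempts: int = 3,
    base_delay: float = 2.0,  # assumed base delay; the page does not specify one
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Run `action`, retrying on any exception with exponentially growing waits."""
    history = []
    for attempt in range(1, max_attempts + 1):
        try:
            action()
            history.append({"attempt": attempt, "status": "success"})
            return {"status": "completed", "retry_history": history}
        except Exception as exc:
            history.append({"attempt": attempt, "status": "failed", "reason": str(exc)})
            if attempt < max_attempts:
                # Wait base_delay, then 2x, then 4x, ... before the next attempt.
                sleep(base_delay * 2 ** (attempt - 1))
    return {"status": "failed", "retry_history": history}


# Usage: an action that fails twice, then succeeds.
calls = {"n": 0}

def flaky() -> None:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("service timeout")

waits = []  # capture the waits instead of actually sleeping
result = run_with_retry(flaky, sleep=waits.append)
print(result["status"], waits)  # completed [2.0, 4.0]
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; the returned `retry_history` is the kind of structured signal an agent could read to decide what to do next.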

See It In Action: From Definition to Insight

One YAML file. Full observability. Every task becomes a learning opportunity.

onboard-user.yaml
task_id: "onboard-new-user"
version: "1.2.0"
goal: "Complete user onboarding with verification"

owner:
  type: "agent"
  id: "onboarding-agent-v2"

inputs:
  user_email: "required"
  plan_type: "optional"

qa:
  eval_type: "functional"
  criteria:
    - "User record created in database"
    - "Welcome email sent successfully"
    - "Session token generated"
  timeout: "30s"
  retry:
    enabled: true
    max_attempts: 3
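
The `inputs` block above marks each input as "required" or "optional". One way a runner might enforce that is sketched below in Python; `check_inputs` is a hypothetical helper, and treating any value other than "required" as optional is an assumption.

```python
def check_inputs(declared: dict, provided: dict) -> list:
    """Return the names of inputs marked "required" that were not provided."""
    return [
        name
        for name, requirement in declared.items()
        if requirement == "required" and name not in provided
    ]


declared = {"user_email": "required", "plan_type": "optional"}
print(check_inputs(declared, {"plan_type": "pro"}))               # ['user_email']
print(check_inputs(declared, {"user_email": "a@example.com"}))    # []
```
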
evaluation.json
{
  "task_id": "onboard-new-user",
  "execution_id": "exec-7f3a9b2c",
  "evaluation": {
    "overall_score": 0.94,
    "status": "passed",
    "criteria_results": [
      { "name": "User record created", "passed": true, "latency": "45ms" },
      { "name": "Welcome email sent", "passed": true, "latency": "1.2s" },
      { "name": "Session token generated", "passed": true, "latency": "12ms" }
    ]
  },
  "completed_at": "2024-01-15T14:32:07Z"
}
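
An evaluation record like this is straightforward to consume from code. The Python sketch below loads the same values as JSON, confirms every criterion passed, and finds the slowest one. `latency_seconds` is a hypothetical helper, and the assumption that latencies always end in "ms" or "s" comes only from the sample above.

```python
import json

EVALUATION = json.loads("""
{
  "task_id": "onboard-new-user",
  "evaluation": {
    "overall_score": 0.94,
    "status": "passed",
    "criteria_results": [
      {"name": "User record created", "passed": true, "latency": "45ms"},
      {"name": "Welcome email sent", "passed": true, "latency": "1.2s"},
      {"name": "Session token generated", "passed": true, "latency": "12ms"}
    ]
  }
}
""")


def latency_seconds(text: str) -> float:
    """Convert strings such as "45ms" or "1.2s" to seconds."""
    if text.endswith("ms"):  # check "ms" first, since it also ends in "s"
        return float(text[:-2]) / 1000
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unrecognized latency: {text!r}")


results = EVALUATION["evaluation"]["criteria_results"]
all_passed = all(r["passed"] for r in results)
slowest = max(results, key=lambda r: latency_seconds(r["latency"]))
print(all_passed, slowest["name"])  # True Welcome email sent
```
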
retry-log.yaml
task_id: "onboard-new-user"

retry_history:
  - attempt: 1
    status: "failed"
    reason: "Email service timeout"
    timestamp: "14:31:45Z"

  - attempt: 2
    status: "failed"
    reason: "Email service timeout"
    backoff: "2s"

  - attempt: 3
    status: "success"
    backoff: "4s"

learning:
  pattern: "email_service_latency"
  recommendation: "Increase timeout to 5s"
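
The `learning` block implies that repeated failures are grouped into a pattern. How TaskOSS derives that pattern is not stated here; the Python sketch below shows one simple possibility, counting failure reasons across attempts. `most_common_failure` is a hypothetical name.

```python
from collections import Counter
from typing import Optional


def most_common_failure(retry_history: list) -> Optional[tuple]:
    """Return (reason, count) for the most frequent failure reason, or None if nothing failed."""
    reasons = Counter(
        entry["reason"]
        for entry in retry_history
        if entry["status"] == "failed" and "reason" in entry
    )
    if not reasons:
        return None
    return reasons.most_common(1)[0]


history = [
    {"attempt": 1, "status": "failed", "reason": "Email service timeout"},
    {"attempt": 2, "status": "failed", "reason": "Email service timeout", "backoff": "2s"},
    {"attempt": 3, "status": "success", "backoff": "4s"},
]
print(most_common_failure(history))  # ('Email service timeout', 2)
```
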

onboard-new-user: Passed

  • Version: 1.2.0
  • Owner: onboarding-agent-v2
  • Execution Time: 22.3s
  • Retry Attempts: 3 / 3
  • Eval Score: 94%
  • Criteria met: User record created, Welcome email sent, Session token generated

Three Steps to Task Intelligence

1. Define

Create a task spec with task_id, goal, owner, and QA criteria in YAML.

2. Track

Execute the task and stream updates: status, progress, logs, and retries.

3. Evaluate & Learn

Run QA evaluations, capture scores, and feed learnings back into the system.
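
The three steps above can be sketched end to end in a few lines of Python. Everything here is illustrative: the check functions, the `track` helper, the stand-in task output, and the scoring rule (share of criteria passed) are assumptions, since this page does not say how eval scores are computed.

```python
from typing import Callable, Dict, List

# Step 1: Define. A spec as a plain dict, with each criterion paired with a check function.
spec = {
    "task_id": "onboard-new-user",
    "goal": "Complete user onboarding with verification",
    "owner": {"type": "agent", "id": "onboarding-agent-v2"},
}
checks: Dict[str, Callable[[dict], bool]] = {
    "User record created in database": lambda out: out.get("user_id") is not None,
    "Session token generated": lambda out: bool(out.get("token")),
}

# Step 2: Track. Record status updates as the task runs.
events: List[dict] = []

def track(status: str) -> None:
    events.append({"task_id": spec["task_id"], "status": status})

track("running")
output = {"user_id": 42, "token": "abc123"}  # stand-in for the real task's result
track("completed")

# Step 3: Evaluate. Score = share of criteria that passed (an assumed scoring rule).
results = {name: check(output) for name, check in checks.items()}
score = sum(results.values()) / len(results)
print(score, [e["status"] for e in events])  # 1.0 ['running', 'completed']
```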

Built for Every Team That Ships

From product launches to AI agent orchestration - TaskOSS brings structure to chaos.

Product Manager

Full Visibility

See every task's real status - not just "done" but quality, retries, and blockers.

QA Lead

Eval Automation

Define criteria once, run on every execution. Track quality trends over time.

DevOps

Retry & Resilience

Built-in retry policies with exponential backoff. Tasks recover from transient failures without manual intervention.

Agent Framework

Structured Feedback

Give AI agents clear success signals. Enable learning loops with versioned specs.

MIT Licensed • Open Source • Production Ready • Enterprise Support
"Work without feedback is wasted.
TaskOSS makes every task scoreable, retryable, and future-aware."

The open standard for task intelligence - from definition to evaluation to continuous improvement.

Bring Evaluation to Every Task

Whether you're shipping features, coordinating agents, or automating platforms - TaskOSS gives your team the power to track, learn, and improve.