Use Case
Agent Quality Score
A public explanation of the delivery-quality threshold used to keep workflows runnable, testable, and maintainable.
What the score checks
- Goal and scope clarity
- Input and output definition
- Workflow completeness and exception handling
- Risk boundaries and human review points
- Test coverage and deployment readiness
Threshold logic
- The current delivery process releases a package only when it scores at least 24 out of 30
- A critical dimension fails when its handling is shallow or missing, whatever the other dimensions score
- A package can therefore be blocked even when its total score meets the threshold
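
The threshold logic can be sketched in a few lines of Python. This is an illustration only, not the actual scoring tool. The 24/30 threshold comes from the text above. Everything else is an assumption made for the example: the dimension names, the split of 30 points into five dimensions of 6 points each, the choice of which dimensions are critical, and the minimum of 4 below which a critical dimension counts as shallow or missing.

```python
RELEASE_THRESHOLD = 24  # from the text: 24 out of 30
MAX_TOTAL = 30

# Assumption: hypothetical floor for a critical dimension.
# Anything below it is treated as "shallow or missing handling".
CRITICAL_MINIMUM = 4

# Assumption: which dimensions count as critical (names are hypothetical).
CRITICAL_DIMENSIONS = {"workflow_and_exceptions", "risk_and_human_review"}


def review(scores: dict[str, int]) -> tuple[bool, list[str]]:
    """Return (passed, reasons). An empty reasons list means the package passes."""
    reasons = []

    # Check 1: the total must reach the release threshold.
    total = sum(scores.values())
    if total < RELEASE_THRESHOLD:
        reasons.append(f"total {total}/{MAX_TOTAL} is below {RELEASE_THRESHOLD}")

    # Check 2: each critical dimension must clear its own floor,
    # regardless of the total. A missing dimension counts as 0.
    for name in sorted(CRITICAL_DIMENSIONS):
        score = scores.get(name, 0)
        if score < CRITICAL_MINIMUM:
            reasons.append(f"critical dimension '{name}' scored {score}")

    return (not reasons, reasons)


# Assumption: five dimensions scored 0 to 6 each, so a full score is 30.
example = {
    "goal_and_scope": 6,
    "inputs_and_outputs": 6,
    "workflow_and_exceptions": 6,
    "risk_and_human_review": 2,
    "tests_and_deployment": 6,
}

passed, reasons = review(example)
print(passed, reasons)
```

In this example the total is 26, which clears the threshold, yet the package is still blocked because one critical dimension falls below its floor. The two checks are kept independent so that a high total can never compensate for a weak critical dimension.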
What fails review
- Undefined escalation behavior
- Missing test cases for risky inputs
- Prompt-only output with no operational workflow
- Claims of automation where human review is still required
How acceptance works
- Acceptance is based on deliverables and testability
- The client should be able to review the package without hidden assumptions
- If a workflow is not maintainable, it is not ready to ship