Human Evaluation for AI Teams
Reference-free human evaluation for LLM outputs.
Create custom rubrics, assign reviewers, and get structured data from your team's qualitative feedback. No integration code required.
Just upload a CSV to get started.
How it works
A streamlined workflow for technical teams.
1. Upload dataset
Import your model outputs as CSV.
2. Configure rubric
Define custom criteria with Likert scales, checkboxes, or free text.
3. Invite reviewers
Share unique links with your team to grade outputs.
4. Export results
Download structured judgments as CSV.
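The workflow above can be sketched end to end. This is a minimal illustration only: the column names (`prompt`, `output`, `output_id`, `reviewer_id`, `rating`, `justification`) are hypothetical assumptions, not the product's actual CSV schema.

```python
import csv
import io
from statistics import mean

# Hypothetical input CSV of model outputs (column names are assumptions).
inputs = io.StringIO()
writer = csv.DictWriter(inputs, fieldnames=["prompt", "output"])
writer.writeheader()
writer.writerow({"prompt": "Summarize the report", "output": "The report covers..."})

# Hypothetical exported judgments CSV: one row per (output, reviewer) pair.
exported = """output_id,reviewer_id,rating,justification
1,alice,4,Clear and concise
1,bob,5,Accurate summary
2,alice,2,Misses key points
"""

# Aggregate ratings per output to turn qualitative feedback into structured data.
scores = {}
for row in csv.DictReader(io.StringIO(exported)):
    scores.setdefault(row["output_id"], []).append(int(row["rating"]))
averages = {oid: mean(vals) for oid, vals in scores.items()}
print(averages)  # {'1': 4.5, '2': 2}
```

Because judgments export as plain CSV with reviewer IDs attached, this kind of aggregation works in any spreadsheet or analysis stack.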
Features
Tools to manage human evaluation.
Reviewer Tracking
Monitor individual progress through completion dashboards.
Blind Comparison
Run side-by-side (A/B) comparisons with randomized output positioning.
Custom Rubrics
Build evaluation forms with ratings, multi-select, and text justifications.
Audit Trail
Timestamps and reviewer IDs for every judgment.
RBAC
Role-based access control for Admins and Reviewers.
AI Drafting (Optional)
Draft initial rubrics based on your dataset.