Job Description
Generalist - English ($45/hour)
Location: Restricted to the US, UK, and Canada
Type: Full-time or Part-time Contract Work
Fluent Language Skills Required: English
Role Responsibilities
- Evaluate LLM-generated responses on their ability to effectively answer user queries
- Conduct fact-checking using trusted public sources and external tools
- Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuracies
- Assess reasoning quality, clarity, tone, and completeness of responses
- Ensure model responses align with expected conversational behavior and system guidelines
- Apply consistent annotations by following clear taxonomies, benchmarks, and detailed evaluation guidelines
Ideal Candidate
- Bachelor’s degree holder
- Significant experience using large language models (LLMs) and an understanding of how and why people use them
- Excellent writing skills and the ability to articulate nuanced feedback clearly
- Strong attention to detail and a consistent ability to notice subtle issues others may overlook
- Adaptable and comfortable moving across topics, domains, and customer requirements
- Background or experience in domains requiring structured analytical thinking (e.g., research, policy, analytics, linguistics, engineering)
- Excellent college-level mathematics skills
Nice-to-Have Specialties
- Prior experience with RLHF, model evaluation, or data annotation work
- Experience writing or editing high-quality written content
- Experience comparing multiple outputs and making fine-grained qualitative judgments
- Familiarity with evaluation rubrics, benchmarks, or quality scoring systems