Scientific AI Evaluation & Computational Problem Designer

Weekday AI • United States • Posted June 01, 2026

About the Role

This role is for one of our clients.
Compensation: $45-$100 per hour
We are building a large-scale evaluation benchmark to test advanced AI reasoning across scientific and engineering domains. This role focuses on designing rigorous, research-grade computational problems that assess how effectively AI systems can use real scientific software tools to solve complex problems.
Unlike traditional annotation roles, this position requires creating original, graduate-level problems rooted in real-world scientific workflows. You will iteratively refine these problems through calibration against state-of-the-art AI models, ensuring the right balance of difficulty, depth, and reasoning complexity.
Requirements
What You’ll Do
 Design advanced computational problems requiring the use of domain-specific scientific software 
 Create tasks that test both precise execution ...