
STEM (Professors)

STEM Tutoring

$55 - $85 per hour

Job Description

Join a leading AI lab’s cutting-edge GenAI team to be at the core of the AI revolution, where your expertise fuels the development of the most advanced Large Language Models.

1. Overview

We are looking for professors across STEM domains, including ML, coding, and data science, to contribute to a frontier-model evaluation project focused on coding and agentic workflows. You’ll design and validate challenging benchmark tasks that surface and diagnose reasoning and problem-solving gaps in a target model. The work centers on building robust, real-world tasks with executable tests and then analyzing model and agent behavior on them.

This is a W2 employment position with Cincinnatus LLC, with the opportunity to be placed at a leading AI lab as part of its extended workforce. You will join a team of domain experts, and together you will guide the next generation of frontier AI tools.

2. Key Responsibilities

  • Task Design and Development: Design challenging, real-world, domain-specific problems that serve as the foundation for agentic tasks. Each problem should target a specific core capability loss identified in a frontier AI model.
  • Spec & Golden Solution Generation: Integrate the problems into an agentic development environment, preparing all necessary components in Python, including:
      • Detailed instructions and an overview of the required task.
      • A golden solution that follows the instructions.
      • Any consultations and feedback that require domain-specific knowledge.
  • Evaluation and Analysis: Evaluate cross-model performance on the tasks:
      • Headroom Identification: Identify tasks where the target model fails to pass all tests, specifically those where the failure can be classified as a logical reasoning failure.
      • Loss Extraction: Analyze the agent’s steps (the agent trajectory) to observe and extract the model’s core capability loss patterns.

3. Core Qualificat

Details

Category: STEM Tutoring

Location: Remote

Employment Type: Independent Contractor

Posted: 2026/4/11