← Back to Jobs

Member of Technical Staff (Research Engineering)

Code Evaluation
Apply Now →

Description

Role details

  • Job type: Full-time.
  • Location: Remote.
  • We are seeking a Research Engineer to operate at the frontier of reinforcement learning (RL), developing novel environments, training pipelines, and evaluation systems that advance the capabilities of modern AI models.
  • This role sits at the intersection of research and production, translating experimental ideas into scalable, high-performance systems.

Day-to-day

  • Architect self-contained RL environments that capture complex, real-world tasks, including reward functions, verifiers, and evaluation logic.
  • Design episode pipelines and multi-component training processes (MCPs) to support reproducible experimentation.
  • Build automated data generation systems, leveraging synthetic data to accelerate training cycles without compromising quality.
  • Build and integrate AI-driven evaluation and quality assurance systems for automated grading, validation, and feedback loops.
  • Fine-tune and optimize open-source RL models using internally generated datasets and custom training strategies.
  • Establish benchmarking frameworks to measure model capability, robustness, and data quality across tasks.
  • Contribute to the release and analysis of evaluations on internal and external benchmark platforms (e.g., the company benchmarks).

You bring

  • Deep background in reinforcement learning, including environment design and training dynamics.
  • Strong track record of building and scaling RL systems, pipelines, or experimentation frameworks.
  • Proficiency in automation and data generation, including synthetic data pipelines.
  • Familiarity with automated evaluation systems, model validation, and quality assurance workflows.
  • Experience fine-tuning and evaluating open-source ML models.
  • Clear, concise communication and strong technical writing skills.
  • Comfort operating in fast-paced, research-driven, and highly collaborative environments.

Bonus

  • Experience publishing benchmarks, evaluations, or research artifacts.
  • Familiarity with evaluation ecosystems (e.g., the company benchmarks or similar frameworks).
  • Background in scalable infrastructure for large-scale RL experimentation.

Details

Category

Code Evaluation

Location

Remote

Employment Type

Independent Contractor

Posted

April 15, 2026