Description
Role details
- Job type: Full-time.
- Location: Remote.
- We are seeking a Research Engineer to operate at the frontier of reinforcement learning (RL), developing novel environments, training pipelines, and evaluation systems that advance the capabilities of modern AI models.
- This role sits at the intersection of research and production, translating experimental ideas into scalable, high-performance systems.
Day-to-day
- Architect self-contained RL environments that capture complex, real-world tasks, including reward functions, verifiers, and evaluation logic.
- Design episode pipelines and multi-component training processes (MCPs) to support reproducible experimentation.
- Build automated data generation systems, leveraging synthetic data to accelerate training cycles without compromising quality.
- Build and integrate AI-driven evaluation and quality assurance systems for automated grading, validation, and feedback loops.
- Fine-tune and optimize open-source RL models using internally generated datasets and custom training strategies.
- Establish benchmarking frameworks to measure model capability, robustness, and data quality across tasks.
- Contribute to the release and analysis of evaluations on internal and external benchmark platforms (e.g., the company benchmarks).
You bring
- Deep background in reinforcement learning, including environment design and training dynamics.
- Strong track record of building and scaling RL systems, pipelines, or experimentation frameworks.
- Proficient in automation and data generation, including synthetic data pipelines.
- Familiar with automated evaluation systems, model validation, and quality assurance workflows.
- Experienced in fine-tuning and evaluating open-source ML models.
- Clear, concise communicator with strong technical writing skills.
- Comfortable operating in fast-paced, research-driven, and highly collaborative environments.
Bonus
- Experience publishing benchmarks, evaluations, or research artifacts.
- Familiarity with evaluation ecosystems (e.g., the company benchmarks or similar frameworks).
- Background in scalable infrastructure for large-scale RL experimentation.
Details
- Category: Code Evaluation
- Location: Remote
- Employment Type: Independent Contractor
- Posted: 15/4/2026