Job Description

AI Benchmarking & Evaluation Engineer

Join a team at the forefront of AI model evaluation, setting the standard for how large language models are tested and validated. In this role, you'll assess the latest AI models, design new benchmarks, and develop advanced evaluation methodologies. You'll work closely with engineers, AI researchers, and enterprise clients to ensure cutting-edge AI systems meet the highest standards. The role bridges research and practical implementation and will suit someone who enjoys turning academic papers into working models.

Key Responsibilities:

Analyze and benchmark newly released AI models (DeepSeek, Gemini, etc.)
Develop and implement novel evaluation frameworks
Build datasets, manage labeling processes, and publish findings
Enhance automated evaluation techniques for AI-generated content
Collaborate with top AI labs and enterprise partners to refine best practices

Who You Are:

MSc or PhD from a leading Computer Science or Machine Learning program
At least 3 years of experience in applied AI, with a focus on benchmarking or model evaluation
Strong background in designing evaluation methodologies
Passion for advancing AI assessment standards
Solid skills in Python, PyTorch or TensorFlow, and Django

Make a real impact in AI research and development. Apply today!

Research Scientist - Model Evaluation Job at Lumicity, Fremont, CA
