Research Scientist - Model Evaluation Job at Lumicity, Fremont, CA

  • Lumicity
  • Fremont, CA

Job Description

AI Benchmarking & Evaluation Engineer

Join a team at the forefront of AI model evaluation, setting the standard for how large language models are tested and validated. In this role, you'll assess the latest AI models, design new benchmarks, and develop advanced evaluation methodologies. You'll work closely with engineers, AI researchers, and enterprise clients to ensure cutting-edge AI systems meet the highest standards. This role is a bridge between research and practical implementation and will suit someone who enjoys taking academic papers and creating working models.

Key Responsibilities:

  • Analyze and benchmark newly released AI models (DeepSeek, Gemini, etc.)
  • Develop and implement novel evaluation frameworks
  • Build datasets, manage labeling processes, and publish findings
  • Enhance automated evaluation techniques for AI-generated content
  • Collaborate with top AI labs and enterprise partners to refine best practices

Who You Are:

  • MSc or PhD from a leading Computer Science or Machine Learning program
  • At least 3 years of experience in applied AI, with a focus on benchmarking or model evaluation
  • Strong background in designing evaluation methodologies
  • Passion for advancing AI assessment standards
  • Solid skills in Python, PyTorch/TensorFlow, and Django

Make a real impact in AI research and development—apply today!
