Senior Machine Learning Engineer

General Motors • Full-time • 3w ago

The Role:  

We are seeking an experienced, technically oriented, impact-driven expert in ML training infrastructure with a strong ability to execute hands-on technical work. In this role, you will be responsible for designing and building scalable, reliable, and high-performance AI/ML platform infrastructure to support advanced AI research and model development initiatives. As a Senior ML System Engineer, you will collaborate closely with machine learning engineers, research scientists, and other partners to develop state-of-the-art AI solutions that enable the future of intelligent driving technologies across General Motors vehicles.

What You'll Do:

Design and develop a scalable, reliable, high-performance ML framework to support model training at scale.
Analyze model training performance and build optimization solutions that scale distributed training workflows, maximize resource utilization across heterogeneous hardware environments, and reduce cost.
Raise the bar on system observability, debuggability, operational excellence, and user experience.
Collaborate with cross-functional teams to integrate new features and technologies into the platform.

Your Skills & Abilities (Required Qualifications)

Bachelor's degree or higher in Computer Science or an equivalent major, or equivalent experience
5+ years of professional software engineering experience
2+ years of specialized experience in AI/ML infrastructure, e.g., enabling distributed training for scaling large ML models
Strong programming skills in Python, with proficiency in frameworks such as PyTorch (preferred), TensorFlow, or similar
Experience with distributed computing, GPU computing, and cloud environments (AWS, GCP, Azure)
Willingness to travel to Sunnyvale, CA as needed
Comfortable working in highly ambiguous and dynamic environments

What Will Give You a Competitive Edge (Preferred Qualifications)

Self-motivated, with strong execution and a focus on delivering impact
Extensive knowledge of and experience with PyTorch 2.x+ and distributed training frameworks
Experience designing and developing training frameworks that support FSDP, pipeline parallelism, and other scalable solutions for training large foundation models
Experience profiling, analyzing, debugging, and optimizing training and data-loading performance
Excellent communication skills to resolve disagreements, build consensus, communicate risks, and give constructive feedback

Compensation: The compensation information is a good faith estimate only. It is based on what a successful applicant might be paid in accordance with applicable state laws. The compensation may not be representative for positions located outside of the California Bay Area.

The salary range for this role is $134,000 to $235,900. The actual base salary a successful candidate will be offered within this range will vary based on factors relevant to the position.

Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance.

Relocation: This job may be eligible for relocation benefits.

Benefits: GM offers a variety of health and wellbeing benefit programs. Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts and more.

Remote: This role is based remotely, but if you live within a 50-mile radius of Mountain View, Detroit, Warren, or Milford, you are expected to report to that location at least three times a week.
