[Remote] Machine Learning Systems Engineer
Note: This is a remote position open to candidates in the USA. Motional is a driverless technology company focused on making autonomous vehicles a safe and reliable reality. The company is seeking a Machine Learning Systems Engineer to join its ML Acceleration team, which optimizes the systems that enable large-scale model training, with an emphasis on speed, cost, reliability, and throughput.
Responsibilities
- Use profiling tools (e.g., Nsight, PyTorch Profiler) to identify bottlenecks in data loading, gradient computation, and communication, and implement optimizations such as kernel fusion, sharding, and tiling to improve step time
- Optimize distributed training pipelines using frameworks such as PyTorch Distributed
- Design and maintain high-performance GPU kernels in Triton or CUDA for state-of-the-art ML workloads
- Optimize robust data loading pipelines that maximize training throughput
Skills
- Bachelor's, Master's, or PhD degree in Computer Science, Computer Engineering, or a related technical discipline
- Strong proficiency in Python
- Extensive hands-on experience with PyTorch
- Experience optimizing machine learning model execution during training and inference, alongside a strong understanding of fundamental machine learning concepts, architectures, and processes
- Exceptional analytical and problem-solving skills, with a bias for action and a data-driven approach to technical challenges
Benefits
- Candidates for certain positions are eligible to participate in Motional’s benefits program.
- Motional’s benefits include but are not limited to medical, dental, vision, 401(k) with a company match, health savings accounts, life insurance, pet insurance, and more.
Company Overview