AI Engineer, Distributed Training, Singapore

Description
We are seeking an AI Infrastructure Engineer to build and scale the foundation for large-scale AI model training at a leading AI platform provider. In this role, you will develop and standardize high-performance training workflows that enable customers to train advanced AI models efficiently in production environments. You will work closely with internal engineering teams and external customers to optimize training performance, establish benchmarks, and deliver best practices for distributed AI training at scale.

Responsibilities: As an AI Infrastructure Engineer, you will design and implement scalable training frameworks for large AI models. You will develop reusable training "recipes," benchmarks, and performance baselines to guide customers toward optimal results. You will collaborate with engineering, product, and customer-facing teams to troubleshoot and optimize training workloads across distributed systems. You will also document best practices, support customer deployments, and contribute to continuous improvements in training efficiency, cost optimization, and system performance.

Requirements: You should have at least 5 years of experience in machine learning engineering, AI infrastructure, or distributed systems. Strong hands-on experience with deep learning frameworks such as PyTorch or TensorFlow, together with familiarity with distributed training techniques, is essential. Experience with large-scale training environments, GPUs, and performance optimization is highly desirable. A solid understanding of MLOps, model training pipelines, and benchmarking methodologies is an advantage. Strong problem-solving skills and the ability to work cross-functionally with both technical and non-technical stakeholders are critical. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is required.

To Apply: Interested candidates, please send your CV to (HIDDEN TEXT). Due to the high volume of applications, only short-listed candidates will be notified.

Registration No: R1983436 License No: 16S8060
More info about this ad

AI Engineer, Distributed Training has been posted in the Bishan Transportation & Logistics category on Locanto.
