Senior Release Engineer, AI Infrastructure, Singapore

Description
Our client operates within the AI infrastructure space, supporting large-scale, GPU-accelerated environments that power advanced machine learning workloads. Their platforms are built to run distributed training and inference across high-performance compute clusters, enabling scalable and reliable delivery of AI features. This role focuses on strengthening release engineering practices to support consistent, secure, and efficient deployments in compute-intensive environments.
Role
As a Senior Release Engineer, AI Infrastructure, you will lead the development of release engineering capabilities, with a focus on CI/CD pipelines and deployment workflows designed for GPU-accelerated, distributed systems. You will design and manage automated pipelines, testing gates, and release processes that support multi-node workloads and distributed ML training use cases. This includes building test infrastructure for long-running, compute-intensive jobs and ensuring deployment practices align with GPU scheduling and orchestration standards within Kubernetes environments. You will implement structured release practices such as GitOps workflows, automated rollback mechanisms, and change controls, while defining standards for code quality, repository management, and dependency governance. Working closely with infrastructure teams, you will ensure deployments meet security, compliance, and performance requirements across large-scale GPU clusters. The role also involves mentoring engineers on deployment safety, incident response, and embedding consistent release practices across the organization.
Requirements
You should bring 5-7 years of experience in DevOps or release engineering, with hands-on exposure to GPU-accelerated or high-performance computing environments. Strong expertise in CI/CD tools such as GitHub Actions, GitLab CI, Jenkins, or ArgoCD is required, including experience designing multi-stage pipelines with robust testing gates. Proficiency in scripting languages such as Python, Go, or Bash is essential for automation and orchestration. A solid understanding of Kubernetes is critical, particularly in managing and troubleshooting GPU-enabled workloads, including scheduling, scaling, and rollout strategies in distributed environments. Experience with configuration and secret management, secure deployment practices, and artifact/version control is important. You should also have experience integrating testing frameworks into pipelines, including validation of long-running or resource-intensive workloads. Familiarity with optimizing reliability, performance, and cost efficiency in GPU-accelerated clusters will be advantageous.
To Apply
To apply, please submit your resume to Yien Quek at (HIDDEN TEXT). We regret that only shortlisted candidates will be notified.
Licence No: 16S8060 | Registration No: R1109830
More info about this ad

Senior Release Engineer, AI Infrastructure has been posted in the Bishan Engineering category on Locanto.
