[AI] AIGC Distributed Training & Optimization Engineer …, Singapore
-
Singapore
-
Posted: yesterday
Sea Group is establishing a new, strategic AI department dedicated to exploring how generative AI can transform human connection, self-expression, communication diversity, and social interaction. We are building the next generation of AI-native applications and a comprehensive Model-as-a-Service (MaaS) product support system. Drawing on massive multi-country data, we are building a multilingual AI ecosystem from the ground up, with the aim of developing leading Southeast Asian multilingual models and innovative AI-native applications.
About the Job
- Toolchain Development: Design and build distributed training toolchains to support ultra-large-scale AIGC model training.
- System Optimization: Optimize distributed training performance across computation, communication, and storage layers.
- Stability & Scalability: Analyze and resolve technical bottlenecks in the training process, with a focus on improving training stability and efficiency.
- Frontier Research: Track and explore cutting‑edge distributed training technologies, leading project planning and production‑grade implementation.
Requirements
- Master’s degree or above in Computer Science or related fields; Bachelor’s degree may be considered with strong industrial experience.
- Minimum 2 years of relevant experience.
- Distributed Expertise: Deep understanding of distributed training principles (Data/Pipeline/Tensor/Expert Parallelism) with proven hands‑on experience.
- Framework Proficiency: Expert in deep learning frameworks such as PyTorch, DeepSpeed, and Megatron‑LM.
- Low‑level Knowledge: Familiar with GPU hardware architecture and CUDA programming; experience in CUDA kernel development/debugging and familiarity with NCCL and cuDNN.
- AIGC Background: Understanding of AIGC pre‑training methodologies, Transformer architectures, and Diffusion models (e.g., Stable Diffusion, Flux).
- Core Competency: Strong problem‑solving skills, innovative thinking, and excellent team collaboration/communication skills.
-
Company name: Shopee
-
Job position: [AI] AIGC Distributed Training & Optimization Engineer (Pre-training)
[AI] AIGC Distributed Training & Optimization Engineer … has been posted in the Bishan Transportation & Logistics category on Locanto.