Research Fellow / Engineer (Vision-Language Models) - WS1, Singapore

Description

Organisation/Company: SINGAPORE INSTITUTE OF TECHNOLOGY (SIT)

Research Field: Computer science

Researcher Profile: First Stage Researcher (R1)

Application Deadline: 4 Jun 2026 - 00:00 (UTC)

Country: Singapore

Type of Contract: Other

Job Status: Full-time

Is the job funded through the EU Research Framework Programme? Not funded by a EU programme

Is the Job related to staff position within a Research Infrastructure? No

Offer Description

Schemes of Service: Research

Division: Engineering

Employment Type: Fixed Term

As a University of Applied Learning, the Singapore Institute of Technology (SIT) works closely with industry in its research pursuits. This position is situated within the SIT x NVIDIA AI Centre (SNAIC).

This role is part of an industry innovation project with a large consumer goods company, in which you will develop an evaluation framework for vision-language models (VLMs) with applications in the personal care sector. The research focuses on fine-grained VLM capabilities such as spatial reasoning, temporal grounding, event tracking, and domain knowledge, evaluated using a curated multimodal dataset.

Key Responsibilities
  • Manage the research project together with the Principal Investigator (PI) and industry partner to ensure all project deliverables are met
  • Design and implement evaluation frameworks and metrics for vision-language models
  • Develop annotated video datasets and capability-tagged evaluation tasks
  • Build end-to-end evaluation pipelines and failure mode analysis tools to analyze VLM performance across reasoning dimensions
  • Prepare technical reports, publications, and industry-facing deliverables
  • Communicate with internal and external parties as needed to ensure project deliverables are met
  • Perform any other ad hoc duties assigned by the Supervisor
Requirements
  • PhD in Computer Science or related field
  • Expertise in computer vision and vision-language models
  • Experience with ML evaluation metrics and benchmarking
  • Proficiency in Python and deep learning frameworks (e.g., PyTorch)
  • Interest in applied, industry-collaborative research