Oman, Oman


Gender Not Mentioned

Education Bachelor of Technology/Engineering (Computers)

Category Information Technology

Industry Information Technology and Services

Job Description

Roles & Responsibilities

Lead and mentor a team of highly talented junior ML engineers

Design, deploy, and operate scalable AI systems with a focus on reliability and performance

Lead production deployment of LLMs and multimodal systems (RAG, OCR, voice)

Own model performance end-to-end, combining evaluation, observability, and hardware optimization:

Build evaluation pipelines (benchmarks, regression testing, LLM-as-judge)
Implement deep observability (tracing, latency, error tracking)
Optimize GPU utilization (multi-GPU serving, batching, quantization, memory tuning)
Continuously improve throughput, latency, and cost efficiency

Architect and manage GPU infrastructure

Build and maintain robust MLOps pipelines

Engage directly with clients and stakeholders to:

Gather and clarify business requirements
Translate non-technical needs into well-defined technical problems
Communicate solutions, trade-offs, and progress through clear documentation, reports, and proposals

Contribute hands-on to system design, implementation, debugging, and production incident resolution