AI Engineer – Agentic AI & Model Infrastructure
Role Overview
We are looking for an AI Engineer who will own AI systems end-to-end — from speech intelligence and model evaluation to agentic workflows, self-hosted model serving, and production reliability.
This role requires strong ownership across the modern AI stack. You will be responsible for building, deploying, optimizing, and monitoring AI systems that operate reliably in real-world production environments.
This is not a model-tuning-only role. The expectation is to own the complete AI system — model selection, integration, infrastructure, evaluation, observability, and continuous improvement.
Key Responsibilities
AI Model Development & Optimization
- Evaluate, integrate, and optimize AI models across different use cases.
- Own model selection decisions based on accuracy, performance, cost, and scalability.
- Build production-grade AI systems using modern LLM and ML technologies.
Speech AI Systems
- Build and improve speech-to-text (STT) systems that convert audio into accurate transcripts.
- Optimize STT models against real-world quality metrics such as word error rate (WER).
- Improve speaker diarization and speaker identification accuracy for complex multi-speaker environments.
- Build systems capable of handling noisy, real-world audio scenarios.
Agentic AI & LLM Systems
- Build agentic AI workflows that transform conversations into memory, insights, and actions.
- Develop memory and retrieval pipelines.
- Design LLM orchestration workflows and agent architectures.
- Work with LLM gateways and agent orchestration frameworks.
- Build scalable AI applications using retrieval-augmented generation (RAG) patterns.
Model Serving & Infrastructure
- Deploy and optimize self-hosted AI models.
- Work with model serving frameworks such as:
  - vLLM
  - Triton
  - Similar production serving platforms
- Own model performance, including:
  - Latency
  - Throughput
  - Infrastructure efficiency
  - Cost optimization
Observability & Production Monitoring
- Build visibility across the complete AI pipeline:
  - Speech-to-text
  - Diarization
  - Retrieval
  - LLM interactions
- Define and monitor AI quality metrics and production SLOs.
- Track:
  - Model quality
  - Transcription accuracy
  - Retrieval performance
  - Latency
  - Throughput
  - Cost per user
- Build dashboards, monitoring, and alerting systems.
- Identify and resolve failures across distributed AI systems using metrics, logs, and traces.
- Create feedback loops to continuously improve model performance.
AI Evaluation Frameworks
- Build evaluation systems to measure model quality before production release.
- Create ground-truth datasets and benchmarking frameworks.
- Measure AI performance using metrics such as:
  - WER (Word Error Rate)
  - DER (Diarization Error Rate)
  - Recall
  - F1 Score
- Run regression testing and A/B evaluations for:
  - Model changes
  - Prompt updates
  - Pipeline improvements
- Drive decisions using measurable performance data.
Required Qualifications
- 3–5 years of experience as an AI/ML Engineer building production AI systems.
- Strong software engineering skills with experience delivering scalable systems.
- Strong programming skills in Python.
- Hands-on experience with:
  - Large Language Models (LLMs)
  - Agentic AI systems
  - Vector databases and retrieval systems
  - Model serving infrastructure
- Mandatory experience with:
  - Self-hosting AI models
  - LLM gateways
  - Agent orchestration frameworks
- Experience deploying AI systems in production environments.
- Experience with cloud infrastructure (GCP preferred).
- Experience with containerized deployments using Kubernetes.
- Strong understanding of AI observability and evaluation practices.
- Ability to use metrics and experimentation to drive AI improvements.
Must Have Skills
- Agentic AI development
- Self-hosted model deployment
- LLM gateway experience
- LLM orchestration
- Production ML systems
- Python
- Model serving and optimization
- AI evaluation frameworks
Good to Have Skills
- Speech AI experience:
  - Speech-to-text
  - Speaker diarization
  - Voice AI
- Research background or AI publications.
- Experience optimizing open-source AI models.
- Experience building AI monitoring platforms.
- Experience with production model evaluation pipelines.
Preferred Mindset
- Strong first-principles problem solving.
- Habit of measuring before making decisions.
- Focus on production reliability, not experimentation alone.
- Ownership mindset for building AI systems from development to production.
