Artificial Intelligence Engineer

HDFC securities

🇮🇳 Mumbai, IN · Senior

  • awq
  • aws bedrock
  • azure openai
  • cohere rerank
  • deepeval
  • deepseek
  • dpo
  • gguf
  • go
  • gptq
  • guardrails ai
  • haystack
  • java
  • kotlin
  • langchain
  • langfuse
  • langgraph
  • llama
  • llama.cpp
  • llama guard
  • llamaindex
  • lora
  • milvus
  • mistral
  • nemo guardrails
  • nemotron
  • node.js
  • ollama
  • opensearch
  • pgvector
  • pinecone
  • promptfoo
  • python
  • qlora
  • qwen
  • ragas
  • sagemaker
  • tensorrt-llm
  • tgi
  • typescript
  • vertex ai
  • vllm
  • weaviate

We are building production-grade AI systems for capital markets, including an AI-powered investing assistant that runs on cloud-native infrastructure and integrates with regulated trading and research platforms. We are hiring a Senior AI Engineer to build, evaluate, and operate LLM-based products end to end.

This is a deeply hands-on role. You will write code, debug live systems, run evaluations, and ship to production. We are not looking for someone whose AI experience is limited to wiring up a hosted chat API — we expect you to have personally built, broken, and fixed LLM systems in production.

Experience

  • 5–8 years in software engineering, with 2+ years on LLM/AI products in production.
  • Strong track record of shipping AI features that are actually used by real users at scale.

Required Skills

1. LLM Hosting & Serving

  • Hands-on experience hosting LLMs for testing, evaluation, and production inference.
  • Working knowledge of inference servers and runtimes: vLLM, TGI (Text Generation Inference), TensorRT-LLM, Ollama, llama.cpp.
  • Experience deploying open-weight models (Llama, Mistral, Qwen, Nemotron, GPT-OSS, DeepSeek, etc.) on GPU instances — knowledge of quantization (GPTQ, AWQ, GGUF, FP8), batching strategies (continuous batching, paged attention), and KV-cache management.
  • Experience with managed model hosting platforms: AWS Bedrock, SageMaker, Azure OpenAI, Vertex AI, or equivalent.
  • Ability to choose between hosted APIs and self-hosted inference based on cost, latency, throughput, and data-residency constraints — and to defend that choice with numbers.

2. LLM Evaluation & Testing

  • Designing and running case-specific test suites for LLM-based applications — not just generic benchmarks.
  • Building eval datasets from production traffic, edge cases, and adversarial prompts. Experience curating golden datasets and maintaining them as the product evolves.
  • Hands-on with evaluation frameworks: Langfuse, Promptfoo, DeepEval, RAGAS, OpenAI Evals, LM-Eval-Harness, or equivalents.
  • LLM-as-a-judge pipelines — including knowing the failure modes (judge bias, position bias, verbosity bias) and how to mitigate them.
  • Regression testing for prompts, models, and tool chains. Catching silent quality drift between model versions.
  • Quantitative metrics: faithfulness, groundedness, answer relevance, tool-selection accuracy, hallucination rates, latency percentiles, token cost per query.

3. LLM Frameworks & Orchestration

  • Working knowledge of LangChain, LangGraph, LlamaIndex, Haystack, or equivalent orchestration frameworks.
  • Experience with agentic patterns: ReAct, ReWOO, Reflexion, Plan-and-Execute, multi-agent workflows.
  • MCP (Model Context Protocol) and tool-calling: building tool schemas, handling tool-selection failures, recovering from malformed tool calls.
  • Comfortable working outside the Python ecosystem: building LLM applications in Go, Java/Kotlin, TypeScript/Node, or custom in-house frameworks. We do not assume Python is the right answer for production services.
  • Streaming responses (SSE, WebSockets), session management, and handling long-running agentic loops gracefully.

4. Retrieval & Context Engineering

  • Hands-on with embedding models, vector databases (pgvector, OpenSearch, Pinecone, Weaviate, Milvus), and hybrid search (BM25 + dense).
  • Chunking strategies, re-ranking (Cohere Rerank, cross-encoders), and query rewriting.
  • Knowledge of when RAG is the wrong answer (and what to do instead).

5. Good-to-Have

  • Fine-tuning / instruction-tuning / LoRA / QLoRA / DPO on open-weight models.
  • RLHF or RLAIF exposure.
  • Prompt distillation, model routing, and cost optimization at scale.
  • Guardrails: PII redaction, jailbreak detection, output validation (Guardrails AI, NeMo Guardrails, Llama Guard).
  • Experience with multimodal models (vision, audio, ASR/TTS).
  • Contributions to open-source AI/ML projects.

Responsibilities

  • Build and ship LLM-powered features: agentic workflows, RAG pipelines, tool-using assistants, summarization and classification services.
  • Host, serve, and benchmark LLMs — both hosted (Bedrock, Azure OpenAI) and self-hosted (vLLM, TGI) — with measurable latency, throughput, and cost targets.
  • Write and maintain case-specific test suites; create eval datasets from real traffic; gate model and prompt changes on regression results.
  • Instrument production: traces, prompts, tool calls, token usage, error taxonomy. Build dashboards that tell you when quality is degrading.
  • Collaborate with product, design, and domain experts to translate fuzzy requirements into concrete prompts, tools, and evals.
  • Mentor junior engineers and review code with care.
  • Participate in on-call for AI services and contribute to runbooks and RCAs.