Voice AI Engineer

TechKareer

Delhi, India · mid · hybrid

  • Azure Speech
  • Deepgram
  • ElevenLabs
  • Google STT/TTS
  • LiveKit
  • LLM
  • STT
  • TTS
  • Twilio
  • WebRTC

Role: Voice AI Engineer

Location: Delhi, India (On-site)

Workplace type: 6 days a week (5 days in-office, 1 day remote)

We’re hiring a Voice AI Engineer to own and scale real-time voice AI systems across STT, TTS, streaming audio, and low-latency conversational pipelines. We need someone who has built production voice systems, handled real users, and understands the engineering tradeoffs behind latency, accuracy, concurrency, and reliability.

What you’ll work on

You’ll build and improve real-time voice AI pipelines involving:

  • STT/TTS integrations
  • WebRTC / LiveKit / streaming audio infrastructure
  • Low-latency voice agents
  • Hindi, Hinglish, and Indian vernacular speech workflows
  • Turn-taking, interruptions, barge-in, silence detection, and call handling
  • Production monitoring for latency, accuracy, concurrency, and reliability

What we’re looking for

  • 2+ years of production experience in voice AI specifically
  • Hands-on experience with STT/TTS pipelines, WebRTC, LiveKit, or similar real-time audio infra
  • Experience building systems used in a real organization at scale, such as telco, consumer app, B2B SaaS, healthcare, fintech, or contact center environments
  • Demonstrable work with Hindi/Hinglish or other Indian vernacular voice systems is a strong plus
  • Ability to show quantified impact, such as:
      • P50/P95/P99 latency
      • concurrent calls/users handled
      • WER / accuracy improvements
      • call volume or user scale
      • uptime/reliability improvements

Strong signals

  • Built or owned a real-time voice agent in production
  • Worked with LiveKit, Twilio, WebRTC, Deepgram, ElevenLabs, Google STT/TTS, Azure Speech, or similar tools
  • Optimized streaming latency end-to-end
  • Handled noisy audio, accents, code-switching, or multilingual speech
  • Can debug production issues across audio, infra, LLM, STT, and TTS layers

Not a fit if

  • Your voice AI experience is limited to hackathons or demos
  • You have only built chatbot/LLM apps without real-time audio
  • Your resume has no quantified outcomes around latency, scale, accuracy, or reliability

Ideal profile

Someone who has already shipped voice AI to real users, knows why production audio systems break, and can independently own a low-latency voice stack from prototype to scale.

Apply on LinkedIn
