About Us
Quadrivia is the health technology company behind Q, a comprehensive, controllable, and customizable assistant AI built by clinicians, for clinicians. Addressing the urgent shortage of healthcare professionals, Q provides real-time, personal, and reliable support for clinical tasks across the care continuum. Designed for providers, payers, and pharmaceutical companies, Q is easy to customize and integrates seamlessly into existing workflows.
The Role
You'll build and run Cortex, the core AI architecture behind Q, and the services that sit on top of it: automated AI audits, patient simulators, retrieval (RAG), and the escalation agents that take over in red-guardrail situations. This is a backend role first. The job is to make our AI systems reliable, fast, and observable in production, not to invent new ML. You own the software underneath the agents.
We're a small team that ships real systems. We've built our entire AI-driven evaluation system, our voice orchestrator, and a multi-hierarchical RAG platform from scratch.
What You'll Do
- Design and maintain robust, modular backend systems using clean architectural (SOLID) principles, so the codebase stays maintainable, scalable, and flexible as the agentic stack evolves.
- Own Cortex end-to-end: architecture, API design, service boundaries, reliability targets, and proactive management of failure modes.
- Build the platform services around it: the automated audit and eval pipeline, patient simulators for testing agents at scale, and the retrieval layer.
- Write fast, well-tested Python services with FastAPI, asyncio, and pydantic, and get the queues, caching, and data stores right.
- Wire up the multi-agent orchestration: routing between agents, shared state, and clean tool interfaces.
- Engineer the RAG pipeline for high-signal retrieval (chunking, hybrid search, re-ranking, caching) and prove the grounding holds.
- Make the whole thing observable: structured logs, OTEL tracing across the agent graph, visibility into cost, latency, and token usage, dashboards, and CI gates that catch regressions before they ship.
Minimum Qualifications
- Your core is backend and software engineering. You write clean, maintainable services and you care how they behave in production.
- Deep understanding of architectural design patterns (e.g., Clean/Hexagonal Architecture, Domain-Driven Design, SOLID, event-driven) to manage complex system boundaries.
- At least 2 years of demonstrable experience building or scaling user-facing AI software that real users have touched. We'll want to see it.
- Expert Python, with strong FastAPI, asyncio, pydantic, and production observability.
- Comfortable with agent patterns and eval-driven development.
- You've worked at a startup before and know what wearing several hats actually costs.
Nice to Have
- Real-time and voice: WebRTC, LiveKit, SIP, VAD, barge-in, turn-taking. Useful here, not required.
- Programmatic prompt optimization techniques.
- LLM-as-judge setups and other evaluation tooling.
- GCP: Cloud Run or GKE, Pub/Sub, Vertex AI, GCS, Secret Manager, Cloud Logging and Trace.
- Healthcare data familiarity.
Example Problems You'll Tackle
- Stand up the AI audit pipeline so evals run automatically on slices of production traffic, with regression gates wired into CI.
- Build a patient simulator that lets us stress-test agents at scale before they ever reach a real call.
- Improve the RAG pipeline with hybrid retrieval and re-ranking, then prove the gains with faithfulness and context metrics.
- Get OTEL-first tracing in place across the agent graph, with automated eval triggers on live traffic.
- Turn EHR integrations into reliable tools the agents can call.
Tech Stack
Python, FastAPI, pydantic, asyncio, Redis, Postgres, vector stores, Docker, Kubernetes, Terraform, ArgoCD, OTEL, TypeScript, React. Real-time stacks (WebRTC, LiveKit, SIP, STT/TTS) where the work touches voice.
What Success Looks Like
- Quadrivia's backend becomes a reference for reliability, safety, and performance.
- Your services run above 99.99% availability under strict regulatory constraints.
- Other engineers build new clinical workflows and agent capabilities quickly and safely.
- AI-generated code gets reviewed, corrected, and owned by you.
