Profile overview:

We’re hiring an AI Architect to design and ship the next generation of AI systems coming out of our AI CoE: both the internal platform our consultants build on and the client-facing solutions we deliver. You’ll work across LLM applications, RAG systems, and multi-agent workflows, taking them from whiteboard to production. Our primary cloud is Azure, and you’ll go deep on the Azure AI stack (Foundry, AI Search, Azure OpenAI, AI Content Safety, AML), while staying fluent in the open-source and cross-cloud tools that round out a modern GenAI practice. You’ll lead a small team of AI engineers, drive R&D and enablement across the CoE, partner with sales on presales pursuits, and stay deep in the code yourself.

Key Responsibilities:

• Reference architectures. Build and maintain the canonical blueprints other teams build against: RAG on Azure AI Search, agent runtimes on Azure AI Foundry, LLM-backed assistants, retrieval and data layers on Fabric / Cosmos DB / ADLS. Versioned, opinionated, with working sample implementations.

• Production AI architecture. End-to-end design for client and internal AI systems: model selection, retrieval design, orchestration, identity and data-boundary enforcement, deployment topology (AKS, Container Apps, APIM as the AI gateway), and the cost/latency/reliability tradeoffs underneath them.

• Agent system design. Frameworks, orchestration patterns, tool-use contracts, memory and state management, eval harnesses. Our primary stack is Azure AI Foundry Agent Service and Semantic Kernel; we use NVIDIA NeMo, LangGraph, and AutoGen where they’re the right fit. You’ll set the policy on when to reach for each.

• The eval and observability layer. Tracing, regression eval suites, drift and quality monitoring, cost and latency telemetry, all built on Azure Monitor and Application Insights and complemented by Braintrust, LangSmith, Ragas, or Prompt flow evals where they earn their keep. AI systems we can’t measure don’t ship.

• Responsible AI in practice. Governance patterns, prompt and output safety (Azure AI Content Safety, jailbreak/PII shields), data-boundary enforcement via Entra ID and Private Link, audit trails. Critical for our regulated-industry client work.

• R&D and team enablement. Run an R&D cadence: evaluate new Azure AI releases (Foundry models, agent features, AI Search capabilities) and the broader OSS landscape, prototype what matters, kill what doesn’t, and publish findings back to the CoE. Build and run the internal training program (workshops, code labs, brown-bags, certification paths) that levels up the AI engineering team.

• Presales partnership. Partner with sales and account leads on qualified pursuits: scope AI engagements, run technical discovery, shape solution architectures, contribute to SOWs and proposals, and present to client architects and CTOs. You’re the senior technical voice in the room when it counts.

• Team leadership. Lead and grow a small team of AI engineers. Run architecture reviews and design sessions across the CoE, and partner closely with client delivery teams.

Required Qualifications & Skills:

• 12+ years building software, with the last 2–3 spent shipping real production AI/ML systems, ideally including LLM applications, RAG, or agent systems that handle live traffic.

• Deep on Azure AI. Hands-on with Azure AI Foundry (model catalog, agent service, prompt flow), Azure OpenAI, Azure AI Search (vector + hybrid retrieval), AI Content Safety, Azure Machine Learning, and AI Document Intelligence. You’ve put more than one of these in production.

• Comfortable across the Azure platform. AKS or Azure Container Apps, Azure API Management (as an AI gateway), Azure Functions, Cosmos DB (incl. vector), Microsoft Fabric / Synapse, Entra ID, Key Vault, Private Link. Bicep or Terraform for IaC. GitHub Actions or Azure DevOps for CI/CD.

• Fluent in the broader GenAI stack. Model APIs beyond Azure (Anthropic, OpenAI direct, Bedrock) and open-source orchestration: Semantic Kernel, NVIDIA NeMo, LangGraph, AutoGen, LlamaIndex. Vector stores and eval tooling (Braintrust, LangSmith, Ragas) where they fit better than the Azure-native option.

• Strong systems and data engineering chops. You can design retrieval pipelines, pick the right storage and compute, and reason about latency, cost, and reliability tradeoffs.

• Production discipline. CI/CD, containerization, infra-as-code, observability: the unglamorous things that separate prototypes from products.

• Presales-capable. Comfortable in front of clients: discovery, whiteboarding solutions, defending architecture decisions, contributing to proposals and SOWs.

• Experience leading small engineering teams. You’ve managed at least 2–3 engineers before, balanced shipping with people development, and know when to step in versus step back.

• Clear technical communication. You can explain agent design to a CTO and pair with a junior engineer in the same afternoon.

Strategic Objectives and Expected Outcomes in the first year:

• At least one major AI system you’ve architected is in production on Azure and being used by client delivery teams or end customers.

• A published set of CoE reference architectures (RAG, agent runtime, eval harness) with working Azure sample implementations, measurably shortening the path from PoC to production for downstream teams.

• A running R&D and training cadence: regular evaluations of new Azure AI and OSS releases, an internal workshop / lab program, and visible upskilling across the AI engineering team.

• Direct contribution to won deals: you’ve been the senior architect on at least a handful of presales pursuits that closed.

• The engineering bar for AI work across the CoE is visibly higher because of the patterns, reviews, and standards you’ve established.

Preferred to have:

• Azure certifications (AI Engineer Associate, Solutions Architect Expert) or equivalent depth.

• Open-source contributions in the AI/ML space.

• Experience operating AI systems under regulatory constraints (HIPAA, SOC 2, financial services).

• Background in services or consulting environments where the same architecture serves multiple clients.