Responsibilities

Built and shipped multi-agent systems in production, not prototypes, not demos. Real systems with real failure modes.
Worked with LangGraph, LangChain, CrewAI, AutoGen, or equivalent orchestration frameworks and can explain why you made that choice.
Designed and queried knowledge graphs or graph databases such as Neo4j, or graph layers on relational systems. You understand why a graph is the right data model for relationship-heavy problems, not just that it looks cool.
Built systems that detect absence: not just what's wrong, but what's missing. This is a specific reasoning skill and we'll test for it.
Written production Python that is async, typed, modular, and observable. You write code other engineers can reason about.
Worked with Playwright, browser-use, or equivalent browser automation at a level beyond basic scripting.

Requirements

Experience with RAG systems and specifically their limits. You know why RAG alone fails for temporal reasoning, absence detection, and cross-entity traversal.
Contributed to or built agent evaluation frameworks (RAGAS, custom evals, LLM-as-judge pipelines).
Worked with vector stores (pgvector, Pinecone, Weaviate) alongside graph databases and know when to use each.
Familiarity with software testing concepts, QA workflows, or developer tooling: you don't need to be a QA engineer, but you need to understand what one worries about.
Exposure to GitHub API, Jira API, or similar developer ecosystem integrations.
TypeScript or Node.js exposure; our frontend-adjacent agent requires it occasionally.

(ref:hirist.tech)

Techmatters - AI Engineer - LLM/RAG
