About the Role
We are hiring an AI Engineer to be the lead technical contributor on a personalization and ranking engagement for a large-scale consumer marketplace. You will set the technical direction, make the key modeling decisions, and stay hands-on throughout. You will be a senior technical point of contact with the customer: explaining trade-offs, managing expectations, and turning results into clear recommendations. You will lead a rigorous, POC-first program: engineering user-level features from behavioral data, integrating LLM-generated user profiles into a deep-learning ranking model, and driving the work from offline validation through production readiness.
What You’ll Do
• Own the technical strategy for a personalization program on a production recommendation/ranking system, making the architecture and modeling decisions and being accountable for the results.
• Stay hands-on: build the features, train the models, run the experiments, and write the critical code.
• Set the technical bar and support other engineers through design reviews, mentorship, and pairing.
• Act as a senior technical point of contact with the customer, communicating progress, risks, and results to both engineers and senior stakeholders, and managing expectations through ambiguity.
• Design and run a structured, parallel-track proof of concept that measures the incremental lift of GenAI-based profiles over well-engineered behavioral ML features.
• Engineer user-level features from large-scale behavioral data (category/product affinity, time-of-day and price-sensitivity patterns, per-user click/conversion history, recency/frequency signals).
• Integrate LLM-generated user profiles into ranking models, including embedding generation, projection-layer tuning, gating, and ablation to ensure the signal is properly weighted.
• Own the deep-learning ranking model (multi-task CTR/CVR architectures such as shared-bottom MTL), including feature integration, hyperparameter optimization (Bayesian/grid search), and bias correction (position/popularity).
• Define and run the offline evaluation framework (NDCG, MRR, Precision/Recall at K), with segment-level analysis and ablation studies across user cohorts.
• Establish the path to production: model serving and scheduled inference integration, shadow-mode testing, A/B framework readiness, and guardrail metrics.
• Deliver clear technical documentation and lead knowledge-transfer sessions so the customer’s teams can operate and iterate independently after handoff.
Required Qualifications
• 10+ years in applied machine learning / data science, with deep hands-on experience in recommender systems, learning-to-rank, or large-scale personalization.
• Practical experience building with LLMs in production: generating and integrating model-derived features or profiles, working with embeddings, and reasoning about evaluation, latency, and cost.
• Experience with Amazon Bedrock or comparable managed LLM platforms for production inference.
• Hands-on experience with segment- or cohort-based personalization, including measuring performance at the segment level rather than relying on aggregate metrics.
• Experience designing cold-start strategies for users or items with limited history.
• Strong communication skills: able to explain modeling decisions, trade-offs, and results clearly to engineers, data scientists, and senior business stakeholders, and to manage expectations through ambiguity.
• Customer-facing or stakeholder-facing experience: building trust, navigating competing priorities, and serving as a senior technical voice in high-stakes conversations.
• A track record of technical leadership through mentoring engineers, driving design decisions, and setting standards.
• Strong track record taking ML models from experimentation to production, owning the offline-to-online validation story (ranking metrics, ablations, segment analysis, shadow testing, A/B readiness).
• Deep, hands-on expertise in deep learning for ranking/recommendation (multi-task learning, embedding-based architectures) with a major framework (TensorFlow or PyTorch).
• Strong feature engineering on large behavioral datasets using the modern data stack (PySpark, SQL, distributed data lakes).
• Rigorous experimental methodology: hyperparameter optimization, bias correction, and a disciplined, hypothesis-driven approach to measuring true lift.
• Hands-on AWS experience across the ML lifecycle, and strong proficiency in Python.
Preferred Qualifications
• Experience personalizing ranking for marketplaces or consumer platforms at scale (e-commerce, food delivery, media, or similar).
• MLOps maturity: model versioning, monitoring, and reproducible training pipelines.
• Advanced degree in Computer Science, Machine Learning, Statistics, or a related quantitative field.
• Prior experience in a client-facing consulting or professional-services delivery environment.
