About The Company
At Capital One, we are dedicated to transforming the banking experience through innovative and responsible AI systems. As an industry leader, we have pioneered the integration of machine learning to deliver real-time, personalized customer interactions. Our substantial investments in cutting-edge technology infrastructure and in attracting world-class talent position us at the forefront of enterprise AI adoption. Our mission is to leverage emerging AI capabilities to enhance our products and services, making banking more accessible, efficient, and human-centric. We are committed to building a diverse, inclusive, and collaborative environment that fosters continuous innovation and excellence.
About The Role
We are seeking a highly skilled Senior Distinguished Engineer, AI Compute, to join our dynamic team remotely. In this strategic technical leadership role, you will focus on engineering and scaling foundational compute capabilities for our enterprise AI and machine learning platform. Your expertise will drive the development of large-scale, high-performance, and highly available distributed systems that power a wide array of AI workloads, from model training and inference to data processing and generative AI applications. You will collaborate closely with cross-functional teams, including developers, product managers, and stakeholders, to design, implement, and optimize compute infrastructure that supports our innovative AI initiatives. Your leadership will help shape the future of AI computing at Capital One, ensuring robust, scalable, and secure systems that meet the evolving needs of our business and customers.
Qualifications
- Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 10 years of experience developing AI and ML algorithms or technologies, or
- Master's degree in relevant fields plus at least 8 years of experience in AI/ML development
- Minimum of 10 years of programming experience with Python, Go, Scala, or Java
- Deep understanding of distributed compute frameworks such as Spark, Dask, Ray, or Flink
- Experience with container orchestration platforms like Kubernetes and serverless environments such as AWS Lambda
- Strong background in building scalable AI/ML infrastructure supporting large-scale training, inference, and data processing workloads
- Proven track record of leading complex technical projects and mentoring engineering teams
- Excellent problem-solving skills with a focus on operational excellence and automation
Responsibilities
- Architect and develop control and data plane implementations to realize a highly available, multi-tenant, and secure machine learning platform
- Design and implement Ray- and Spark-based distributed compute solutions to accelerate diverse workloads, including LLM pre-training, reinforcement learning, and large-scale data processing
- Optimize compute resource utilization and cost-efficiency across cloud environments, including AWS-specific primitives
- Lead systemic improvements to automate routine operational workflows, enhancing reliability and efficiency
- Oversee the technical execution of multiple projects, collaborating with developers working on distributed microservices and foundation models
- Engage with product and program management teams to align technical solutions with business objectives
- Stay at the forefront of technological advancements, experimenting with new tools and leading design and code review sessions
- Contribute to the growth of the team through mentorship of its members and participation in talent acquisition efforts
- Promote a culture of innovation, continuous learning, and technical excellence within the engineering community
Benefits
- Competitive salary and performance-based incentives
- Comprehensive health, dental, and vision insurance plans
- Retirement savings plans with employer contributions
- Paid time off and flexible work arrangements, including remote work options
- Professional development opportunities, including training and conferences
- Inclusive and collaborative work environment that values diversity
- Support for work-life balance and employee well-being programs
Equal Opportunity
Capital One is an equal opportunity employer committed to fostering an inclusive environment. We do not discriminate based on race, color, religion, sex, national origin, age, disability, or protected veteran status. We promote diversity and inclusion in all aspects of employment and are dedicated to providing a workplace free from discrimination and harassment. We also comply with all applicable laws regarding criminal background checks and fair employment practices. If you require accommodations during the application process, please contact our recruiting team. We welcome qualified applicants from all backgrounds to join our team and help us build a better banking future through innovative AI solutions.
