Description
In this role, you will achieve state-of-the-art throughput for critical models using advanced techniques such as model parallelism and distributed training. You will reduce inference time for new model architectures using optimizations like quantization and pruning. You will collaborate closely with Applied AI engineering to optimize the internal inference stack. You will also mentor top-tier AI systems engineers while fostering a culture of continuous learning and innovation. You will coordinate the inference needs of JPMC's research teams, ensuring alignment with business goals.
As a Senior Lead Software Engineer at JPMorgan Chase within the Corporate Sector, AI/Technologies team, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. In this role, you will drive significant business impact through your capabilities and contributions, and apply deep technical prowess and problem-solving skills and methodologies to tackle a diverse array of challenges that span multiple technologies and applications in the AI/ML space.
Job responsibilities
- Architects and implements distributed ML infrastructure, including inference, training, scheduling, orchestration, and storage.
- Develops advanced monitoring and management tools for high reliability and scalability.
- Optimizes system performance by identifying and resolving inefficiencies and bottlenecks.
- Collaborates with product teams to deliver tailored, technology-driven solutions.
- Drives the adoption and execution of ML Platform tools across various teams.
- Integrates Generative AI within the ML Platform using state-of-the-art techniques.
- Develops secure and high-quality production code, and reviews and debugs code written by others.
- Drives decisions that influence the product design, application functionality, and technical operations and processes.
- Serves as a function-wide subject matter expert in one or more areas of focus.
- Actively contributes to the engineering community as an advocate of firmwide frameworks, tools, and SDLC practices.
- Analyzes, writes, develops, tests, and releases products using Python on AWS.
Required qualifications, capabilities, and skills
- Formal training or certification on software engineering concepts and 5+ years of applied experience.
- Extensive hands-on experience with ML frameworks (TensorFlow, PyTorch, JAX, Ray).
- Deep expertise in AWS / GCP and Kubernetes ecosystem, including EKS, Helm, and custom operators.
- Strong coding skills and experience in developing large-scale ML systems.
- Background in High Performance Computing, ML Hardware Acceleration (e.g., GPU, TPU, RDMA), or ML for Systems.
- Proven track record in contributing to and optimizing open-source ML frameworks.
- Strategic thinker with the ability to craft and drive a technical vision for maximum business impact.
- Demonstrated leadership in working effectively with engineers, data scientists, and ML practitioners.
- Advanced in one or more programming languages (Python or Java); intermediate Python is a must.
- Proven ability to identify trade-offs, clarify project ambiguities, and drive decision-making.
Preferred qualifications, capabilities, and skills
- Master’s degree in Computer Science or Data Science.
- Experience building Generative AI-based systems.
- Experience with continuous integration and continuous deployment (CI/CD) platforms.