Senior AI Performance Engineer

Sorry, this job was removed at 12:10 a.m. (PST) on Tuesday, Jun 10, 2025

In-Office

San Francisco, CA

Similar Jobs

Hewlett Packard Enterprise

Sr AI/HPC Applications and Performance Engineer

Yesterday

In-Office or Remote

162K-371K Annually

Senior level

Artificial Intelligence • Cloud • Information Technology • Consulting

The role involves developing architectures for AI/HPC software systems, mentoring team members, and influencing technology strategies within the organization.

Top Skills: AI, Cloud Architectures, DevOps, Distributed Computing, Full Stack Development, HPC, Microservices

Block

Solutions Engineer

7 Hours Ago

In-Office or Remote

143K-258K Annually

Senior level

Blockchain • eCommerce • Fintech • Payments • Software • Financial Services • Cryptocurrency

Lead technical delivery management for enterprise integrations, manage projects, coordinate global teams, and influence product roadmaps.

Top Skills: E-Commerce Platforms, Payment Processing Workflows, Payment Service Providers, RESTful APIs

Block

Software Engineering Program Management, Lead

7 Hours Ago

In-Office or Remote

240K-359K Annually

Expert/Leader

Blockchain • eCommerce • Fintech • Payments • Software • Financial Services • Cryptocurrency

Lead the Software Engineering Program Management team to drive complex cross-functional programs across payment and Bitcoin hardware products, ensuring successful delivery and strategic alignment.

Top Skills: Consumer Electronics, Embedded Platforms, Firmware, Hardware-Software Integration, Software

We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the boundaries of what's possible in video generation.

Role overview:

As a Deep Learning Performance Engineer at Genmo, you will play a critical role in optimizing the performance of our large generative AI models. Your expertise will ensure that our models run efficiently on clusters, leveraging advanced techniques and tools to enhance their performance. This role is perfect for someone with a deep understanding of deep learning performance bottlenecks, kernel optimization, and distributed training strategies.

Key responsibilities:

Analyze and optimize the performance of massively parallel and distributed systems
Implement and fine-tune distributed training strategies for multi-GPU and multi-node environments
Implement high-performance CUDA, Triton, C++, and PyTorch code
Profile model performance and identify bottlenecks using tools like NVIDIA Nsight Systems, PyTorch Profiler, and TensorFlow Profiler
Develop and maintain benchmarking suites for continuous performance monitoring

Qualifications:

Master's or PhD in Computer Science, Electrical Engineering, or a related field
5+ years of experience in optimizing deep learning models, preferably in a production environment
Must have:
- Strong programming skills in Python and C++. Experience in training large models using Python & PyTorch and/or TensorFlow including their distributed training frameworks.
- Proven track record of optimizing large-scale models (10B+ parameters)
- Deep understanding of GPU architecture and CUDA programming
- Experience across the entire development pipeline, from data processing, preparation, and data loading to training and inference
- Experience optimizing and deploying inference workloads for throughput and latency across the stack (inputs, model inference, outputs, parallel processing, etc.)
- Demonstrated expertise in high-performance computing using NVIDIA Triton and CUDA
- Demonstrated ability to significantly improve model inference and training speeds through low-level optimizations
Ideal candidates will have:
- Knowledge of distributed inference systems for handling high-volume workloads
- Strong background in linear algebra, optimization, and machine learning algorithms
- Experience with generative AI models (GANs, Diffusion Models, Transformers)
- Knowledge of hardware-aware neural architecture design
- Experience with high-performance computing (HPC) environments
- Contributions to relevant open source projects or research publications

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

2261 Market Street, San Francisco, CA, United States

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms, and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology, thanks to a thriving art and music scene, excellent food, and a short drive to several of the country’s most beautiful recreational areas.

Key Facts About San Francisco Tech

Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Google, Apple, Salesforce, Meta
Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
Funding Landscape: $50.5 billion in venture capital funding in 2024 (Pitchbook)
Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
