WitnessAI

ML Engineer - Infrastructure

Posted 5 Days Ago

7 Locations

140K-170K Annually

Mid level

The ML Infrastructure Engineer will optimize, deploy and scale machine learning models in production. Responsibilities include designing scalable GPU infrastructures, implementing advanced inference solutions, and collaborating with applied scientists and software engineers to enhance model efficiency and workflow automation.

The summary above was generated by AI

ML Engineer (Infrastructure)

Location: San Francisco - Bay Area Hybrid

About the Role:

WitnessAI is a leader in providing innovative networking solutions designed to enhance security, performance, and reliability for businesses of all sizes. We are seeking an ML Infrastructure Engineer to optimize, deploy, and scale machine learning models in production environments. You will play a critical role in scaling GPU resources, building continuous learning pipelines, and integrating a variety of inference frameworks. Your expertise in model quantization, pruning, and other optimization techniques will ensure our models run efficiently and effectively.

You will contribute to our mission through the following:

Develop and Optimize: Design and manage scalable GPU infrastructures for model training and inference. Build automated pipelines that accelerate ML workflows, implement feedback loops for continuous learning, and enhance model efficiency in resource-constrained environments.
Implement Advanced Inference Solutions: Evaluate and integrate inference platforms like NVIDIA Triton and vLLM to ensure high availability, scalability, and reliability of deployed models.
Collaborate for Impact: Work closely with applied scientists, software engineers, and DevOps professionals to deploy models that drive our company's mission forward. Document best practices to support team knowledge sharing and improve code quality and reproducibility.

The ideal candidate will have expertise in designing, developing, and maintaining scalable ML infrastructure components, including data pipelines and deployment systems. You should have a demonstrated track record of optimizing ML workflows for performance and resource utilization, and stay up to date on best practices for model management and reproducibility. Strong communication skills and the ability to collaborate across functions to execute complex projects are essential.

Qualifications

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

2+ years of experience building and scaling machine learning systems.
Proven experience in scaling GPU resources for machine learning applications.
Experience with inference platforms like NVIDIA Triton, vLLM, or similar.
Demonstrated expertise in model quantization, pruning, and other optimization techniques with frameworks such as TensorRT, ONNX or others.

Skilled in automating data collection, preprocessing, model retraining, and deployment.
Proficient with cloud platforms such as AWS (preferred), GCP, or Azure, especially in deploying and managing GPU instances.
Strong skills in Python; familiarity with other scripting languages is a plus.
Experience with CUDA packages.
Experience with PyTorch, TensorFlow, or similar frameworks.
Proficient in Docker and Kubernetes.
Experience with Jenkins, GitHub CI/CD, or similar tools.
Experience with Prometheus, Grafana, or similar monitoring solutions.

Soft Skills

Strong problem-solving and analytical abilities.
Excellent communication and teamwork skills.
Ability to work independently and manage multiple tasks effectively.
Proactive attitude toward learning and adopting new technologies.

Benefits:

Hybrid work environment.
Competitive salary.
Health, dental, and vision insurance.
401(k) plan.
Opportunities for professional development and growth.
Generous vacation policy.

Salary range:

$140,000-$170,000

Top Skills

AWS

Azure

CUDA

Docker

GCP

GitHub CI/CD

Grafana

Jenkins

Kubernetes

NVIDIA Triton

ONNX

Prometheus

Python

PyTorch

TensorFlow

TensorRT

vLLM

San Mateo, CA, United States, 94403
