Voltage Park

Manager of Infrastructure Operations

Posted 18 Days Ago
Remote
2 Locations
180K-240K Annually
Senior level
Lead and mentor the 24/7 Infrastructure Operations team, ensuring system stability and performance while implementing best practices and automation.
The summary above was generated by AI

Voltage Park is seeking a highly skilled and proactive Manager of Infrastructure Operations to lead our 24/7 Infrastructure Operations team responsible for the stability, scalability, and performance of compute, storage, and platform infrastructure. This role plays a key part in delivering always-on, high-performance environments that support AI/ML training, inference, and HPC workloads at scale. The ideal candidate combines technical depth with strong leadership skills and a passion for operational excellence. 

This position offers full remote flexibility, although candidates must be based in the continental US and available to work during PST hours. Unfortunately, we are unable to provide sponsorship for this role.

Responsibilities:

  • Establish and uphold the standard practices for our expanding InfraOps team.

  • Lead and mentor a 24/7 Infrastructure Operations team responsible for monitoring, maintaining, and supporting our infrastructure.

  • Develop and maintain operational runbooks, escalation procedures, and documentation for critical systems.

  • Collaborate with the Infrastructure Engineering, Network Operations, Datacenter Operations, and Customer Success teams to support infrastructure rollouts, upgrades, and scaling efforts.

  • Oversee observability systems (monitoring, logging, alerting) and drive continuous improvements in automation and root-cause analysis.

  • Drive adoption of “Infrastructure as Code” and automated workflows to reduce manual intervention.

  • Implement and enforce best practices for system availability, performance tuning, capacity planning, and lifecycle management.

  • Be available for on-call support during urgent system incidents.

  • Ensure compliance with security, regulatory, and organizational standards across all environments.

Qualifications:

  • Proficiency in Puppet, Terraform, and Ansible.

  • Strong scripting skills in Bash, Python, or Go.

  • Extensive experience in setting up, deploying, and managing Kubernetes clusters.

  • Proven track record of architecting, building, and delivering complex systems from inception.

  • Ability to strike a balance between pragmatic development and ideal architectures.

  • Skilled at navigating trade-offs between design, risk, cost, and outcomes.

  • Deep understanding of network protocols, network programming, Unix variants, monitoring, and security systems.

  • Excellent written and verbal communication skills.

Leadership Requirements:

  • Demonstrated ability to inspire and lead a team towards common goals, fostering a positive and collaborative work environment.

  • Proven track record of effectively delegating tasks, providing constructive feedback, and developing team members' skills.

  • Strong decision-making skills, capable of guiding the team through complex technical challenges and strategic initiatives.

  • Ability to communicate a clear vision and align team efforts with broader company objectives.

  • Experience in conflict resolution and team building, promoting diversity, equity, and inclusion within the team and the organization.

Culture:

  • Enjoy collaborating with a growing, motivated team focused on execution.

  • Comfortable operating with a high degree of autonomy and able to independently prioritize tasks that align with company objectives.

  • Possess a breadth of knowledge in your domain while also embracing the opportunity to take on diverse responsibilities.

  • Value the importance of clear communication and documentation in driving success.

Team Charter:

The 24/7 Infrastructure Operations Team ensures the stability, scalability, and performance of Voltage Park’s compute, storage, and platform systems across data centers, cloud, and edge. Supporting AI and HPC GPU environments, the team delivers proactive monitoring, automation toolsets, and continuous optimization to maintain high availability and operational excellence at all times, ensuring the best possible customer experience.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $180K - $240K

Top Skills

Ansible
Bash
Go
Kubernetes
Puppet
Python
Terraform
HQ

Voltage Park San Francisco, California, USA Office

555 Montgomery Street, San Francisco, CA, United States
