Top Reliability Engineer Jobs in San Francisco, CA
As a Site Reliability Engineer at RunPod, you will design, implement, and maintain scalable and highly available systems, troubleshoot complex issues, manage large-scale infrastructure, and automate processes to ensure reliability and performance.
The SRE Manager will lead and mentor a team of Site Reliability Engineers, overseeing the design and maintenance of distributed systems, ensuring reliability and scalability, and managing infrastructure security. Responsibilities include strategic planning, establishing SLIs, SLOs, and SLAs, and collaborating with cross-functional teams to meet organizational goals.
As a Site Reliability Engineer II, you will enhance site and system reliability through monitoring, automation, and infrastructure management. Your role includes managing IaC, ensuring performance metrics are met, and collaborating across teams while participating in on-call rotations to address incidents.
Invisible Technologies is seeking a Principal Software Engineer specializing in SRE/DevOps. This role involves leading technical initiatives, mentoring team members, and developing cloud-based architecture while ensuring security and networking considerations are met. Candidates should have strong experience with Kubernetes, cloud providers like AWS and GCP, and infrastructure as code tools such as Terraform and Ansible.
The Staff Software Site Reliability Engineer will lead incident management, oversee change and problem management processes, develop reliability engineering tools, and promote SRE best practices across teams, ensuring system reliability and stability at Credit Karma.
As a Staff Software Engineer on the Compute Reliability and Efficiency team, you will focus on Linux and Kubernetes systems engineering, enhancing the performance and reliability of Reddit's infrastructure and tools. You will write and design software for availability and efficiency, collaborate with engineers, and automate development processes.
The Staff Software Engineer will focus on lower-level systems engineering, particularly in Linux and Kubernetes, to enhance the performance, reliability, and scalability of Reddit's infrastructure, while collaborating with other engineers to automate and improve critical processes.
The Senior Reliability Test Engineer will design and execute reliability test plans for humanoid robots, focusing on modules like actuators, batteries, and sensors. Responsibilities include developing specifications, conducting electrical diagnostics, and utilizing CAD for test fixture designs, with a strong requirement for Python scripting to automate testing processes.
Featured Jobs
In this role, you will build and maintain scalable infrastructure that ensures reliability and low-latency experiences, collaborating with leadership and software developers to implement best practices in cloud technologies and infrastructure management.
As a Senior Site Reliability Engineer, you will automate processes, improve team workflows, manage deployment tools, and maintain secure infrastructure while working with a diverse technology stack in a hybrid cloud environment. You will be responsible for monitoring and improving system performance and ensuring the reliability of enterprise SaaS offerings.
As a Site Reliability Engineer at Boomi, you will develop systems and software to meet customer goals. Your responsibilities include maintaining infrastructure as code, ensuring system reliability, improving operational processes, and collaborating on product features while participating in on-call rotations and monitoring systems.
As a Site Reliability Engineer at Boomi, you'll develop sophisticated systems and software, working collaboratively in an Agile team. You'll ensure the reliability of production systems and work on infrastructure management, observability, and automation processes, while actively participating in on-call rotations and improving operational workflows.
The Staff Site Reliability Engineer for the Data Engineering team will maintain and enhance the reliability of data infrastructure, collaborating closely with engineers to implement automation, monitoring, and best practices for data platform performance.
As a Senior Platform Engineer at Mux, you will design and operate infrastructure for high-traffic platforms, improve usability and automation for CI/CD systems, lead cross-functional projects, debug production issues, and establish engineering standards.