Roblox

Sr. Site Reliability Engineer, Compute SRE

Job Posted 6 Days Ago Reposted 6 Days Ago
Hybrid
San Mateo, CA
193K-239K Annually
Senior level
As a Senior SRE, you'll enhance Roblox's infrastructure, automate processes, build monitoring tools, and ensure production readiness while tackling technical challenges.
The summary above was generated by AI

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators.

At Roblox, we’re building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We’re on a mission to connect a billion people with optimism and civility, and we’re looking for amazing talent to help us get there.

A career at Roblox means you’ll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.

What You’ll Do:

The Infra Compute SRE mission is to own and manage the successful operation of our underlying cell infrastructure system, along with elements of service discovery, secrets management and related software layers. We’re looking for skilled Site Reliability Engineers with strong programming skills to help us build Roblox's private cloud, productionize our growing Kubernetes-based infrastructure, and institute reliability best practices across the Roblox Compute team.

You Will:

  • Design and Develop systems & libraries that promote fault-tolerance and resilience, automate much of the management and lifecycle of our clusters, and ensure systems are observable.
  • Promote and Institute reliability best practices across the Infra Compute group, drive common reliability initiatives, and provide collaborative technical reviews and operational guidance to strengthen system reliability.
  • Build, Automate and Standardize processes to create a "golden path" of tooling and platform support that powers the fundamental Roblox ecosystem.
  • Create Tooling that provides production guardrails by evaluating release candidate capacity with load testing tooling before deploying to production.
  • Create Performance Monitoring Services and observability tooling to understand capacity issues and platform degradations, and to monitor production services and their changes, such as generalized canarying services with alerting.
  • Analyze systems and system designs for production readiness.

You Have:

  • A Bachelor's degree (or equivalent professional experience) in Computer Science or a related engineering field, with a proven track record including at least 6 years as an SRE or Software Engineer.
  • Fluency with high-level programming languages like Go, Java, and C#.
  • Experience with Kubernetes or similar orchestration systems. Experience with Nomad, Vault, and Consul is strongly desired.
  • Experience and good habits around building software and tools and getting them adopted. Your systems focus informs a view that code needs to be deeply reliable.

You Are:

  • A Partner: You know that the best tools integrate broadly with the tooling ecosystem. You approach partners and processes with curiosity and seek to understand a problem deeply before you start coding.
  • A Developer: You love building durable and reliable complex systems.
  • Passionate about problem-solving, finding creative work solutions, and addressing unexpected challenges as part of a team.
  • A Problem Solver: You ask the right questions to tackle issues within your expertise and you use data to test your theories.
  • A Planner: You have experience in large project lifecycles. You have experience working in sprints, breaking down complex tasks into achievements, and reporting status to keep project scheduling accurate.

For roles that are based at our headquarters in San Mateo, CA: The starting base pay for this position is as shown below. The actual base pay is dependent upon a variety of job-related factors such as professional background, training, work experience, location, business needs and market demand. Therefore, in some circumstances, the actual salary could fall outside of this expected range. This pay range is subject to change and may be modified in the future. All full-time employees are also eligible for equity compensation and for benefits.

Annual Salary Range

$192,890-$238,520 USD

Roles that are based in our San Mateo, CA Headquarters are in-office Tuesday, Wednesday, and Thursday, with optional in-office on Monday and Friday (unless otherwise noted).

You’ll Love: 

  • Industry-leading compensation package
  • Excellent medical, dental, and vision coverage
  • A rewarding 401k program
  • Flexible vacation policy (varies by exemption status)
  • Roflex - Flexible and supportive work policy 
  • Roblox Admin badge for your avatar
  • At Roblox HQ: 
    • Free catered lunches five times a week and several fully stocked kitchens with unlimited snacks
    • Onsite fitness center and fitness program credit
    • Annual CalTrain Go Pass

Roblox provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. Roblox also provides reasonable accommodations for all candidates during the interview process.

Top Skills

C#
Go
Java
Rust

Roblox San Mateo, California, USA Office

3150 South Delaware Street, San Mateo, CA, United States, 94403
