MetroStar

Site Reliability Engineer (5667)

Posted 7 Days Ago

Remote

Mid level

The Site Reliability Engineer will design and manage scalable systems, optimize performance, automate processes, and ensure service availability. Responsibilities also include incident response and documentation maintenance.

As a Site Reliability Engineer, you’ll lead the design, implementation, and management of highly available and scalable systems, applying industry best practices and reliability engineering principles.

We know that you can’t have great technology services without amazing people. At MetroStar, we are obsessed with our people and have led a two-decade legacy of building the best and brightest teams. Because we know our future relies on our deep understanding and relentless focus on our people, we live by our mission: A passion for our people. Value for our customers.

If you think you can see yourself delivering our mission and pursuing our goals with us, then check out the job description below!

What you’ll do:

Collaborate with cross-functional teams to identify performance bottlenecks, troubleshoot complex issues, and optimize system performance to meet defined service level objectives.
Design and implement monitoring, alerting, and incident response strategies to proactively identify and mitigate potential issues, ensuring uninterrupted service availability.
Drive automation initiatives to streamline deployment, configuration management, and infrastructure provisioning processes.
Develop and maintain comprehensive documentation for system configurations, processes, and procedures.
Participate in on-call rotations and respond to incidents, working diligently to resolve issues and prevent recurrence.

What you’ll need to succeed:

Active U.S. Government Secret security clearance or higher.
Bachelor’s degree in Computer Science, Information Technology, or a related field.
Minimum of 3 years of professional experience in a Site Reliability Engineering role or similar capacity.
Strong experience with cloud technologies (e.g., AWS, Azure, GCP) and infrastructure as code (e.g., Terraform, Ansible).
Proficiency in managing, leading, and engineering incident and outage response.
Strong engineering experience with network protocols and services (e.g., TCP/IP, DNS, HTTP/HTTPS, load balancing).
Proficiency in programming and scripting languages (e.g., Python, Go, Bash) and RPA tools (e.g., Blue Prism, UiPath) to automate tasks and develop tools.
Deep understanding of containerization and orchestration technologies (e.g., Kubernetes, Docker).
Expertise in implementing and managing monitoring and logging solutions (e.g., Splunk, Prometheus, Grafana, ELK stack).
Familiarity with CI/CD pipeline development and management (e.g., GitLab CI, Azure DevOps, AWS Lambda, Jenkins).
Proven track record of designing, building, and maintaining highly available and scalable systems.
Expert proficiency in developing automated functional, regression, and performance tests and in defining automated testing standards for development teams.
Experience facilitating change and configuration management processes to drive reliability.
Strong problem-solving skills, with the ability to diagnose complex issues and implement effective solutions.
Excellent communication skills, with the ability to collaborate effectively across diverse teams.

Like we said, we are big fans of our people. That’s why we offer a generous benefits package, professional growth, and valuable time to recharge. Learn more about our company culture code and benefits. Plus, check out our accolades.

Commitment to Non-Discrimination
All qualified applicants will receive consideration for employment based on merit and without regard to sex, race, ethnicity, age, national origin, citizenship, religion, physical or mental disability, medical condition, genetic information, pregnancy, family structure, marital status, ancestry, domestic partner status, sexual orientation, gender identity or expression, veteran or military status, status as a protected veteran, or any other status protected by applicable federal, state, local, or international law.

What we want you to know:

In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

Top Skills: Ansible, AWS, AWS Lambda, Azure, Azure DevOps, Bash, Blue Prism, Docker, ELK Stack, GCP, GitLab CI, Grafana, Jenkins, Kubernetes, Prometheus, Python, Splunk, Terraform, UiPath
