MetroStar

Site Reliability Engineer (5667)

Posted 7 Days Ago
Remote
Mid level
The Site Reliability Engineer will design and manage scalable systems, optimize performance, automate processes, and ensure service availability. Responsibilities also include incident response and documentation maintenance.
The summary above was generated by AI

As a Site Reliability Engineer, you’ll lead the design, implementation, and management of highly available and scalable systems, applying industry best practices and reliability engineering principles.

We know that you can’t have great technology services without amazing people. At MetroStar, we are obsessed with our people and have led a two-decade legacy of building the best and brightest teams. Because we know our future relies on our deep understanding and relentless focus on our people, we live by our mission: A passion for our people. Value for our customers.

If you think you can see yourself delivering our mission and pursuing our goals with us, then check out the job description below!

What you’ll do:

  • Collaborate with cross-functional teams to identify performance bottlenecks, troubleshoot complex issues, and optimize system performance to meet defined service level objectives.
  • Design and implement monitoring, alerting, and incident response strategies to proactively identify and mitigate potential issues, ensuring uninterrupted service availability.
  • Drive automation initiatives to streamline deployment, configuration management, and infrastructure provisioning processes.
  • Develop and maintain comprehensive documentation for system configurations, processes, and procedures.
  • Participate in on-call rotations and respond to incidents, working diligently to resolve issues and prevent recurrence.

What you’ll need to succeed:

  • Possess an active Secret U.S. Government security clearance or higher.
  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • Minimum of 3 years of professional experience in a Site Reliability Engineering role or similar capacity.
  • Strong experience with cloud technologies (e.g., AWS, Azure, GCP) and infrastructure as code (e.g., Terraform, Ansible).
  • Proficiency in managing, leading, and engineering incident and outage response.
  • Strong engineering experience with network protocols and load balancing (e.g., TCP/IP, DNS, HTTP/HTTPS).
  • Proficiency in programming and scripting languages (e.g., Python, Go, Bash) and RPA (e.g., Blue Prism, UiPath) to automate tasks and develop tools.
  • Deep understanding of containerization and orchestration technologies (e.g., Kubernetes, Docker).
  • Expertise in implementing and managing monitoring and logging solutions (e.g., Splunk, Prometheus, Grafana, ELK stack).
  • Familiarity with CI/CD pipeline development and management (e.g., GitLab CI, Azure DevOps, AWS Lambda, Jenkins).
  • Proven track record of designing, building, and maintaining highly available and scalable systems.
  • Expert proficiency in developing automated functional, regression, and performance tests, and in defining automated testing standards for development teams.
  • Experience facilitating change and configuration management processes to drive reliability.
  • Strong problem-solving skills, with the ability to diagnose complex issues and implement effective solutions.
  • Excellent communication skills, with the ability to collaborate effectively across diverse teams.

Like we said, we are big fans of our people. That’s why we offer a generous benefits package, professional growth, and valuable time to recharge. Learn more about our company culture code and benefits. Plus, check out our accolades.

Commitment to Non-Discrimination
All qualified applicants will receive consideration for employment based on merit and without regard to sex, race, ethnicity, age, national origin, citizenship, religion, physical or mental disability, medical condition, genetic information, pregnancy, family structure, marital status, ancestry, domestic partner status, sexual orientation, gender identity or expression, veteran or military status, status as a protected veteran, or any other status protected by applicable federal, state, local, or international law.

What we want you to know:

In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

Not ready to apply now?

Sign up to join our newsletter here.

Top Skills

Ansible
AWS
AWS Lambda
Azure
Azure DevOps
Bash
Blue Prism
Docker
ELK Stack
GCP
GitLab CI
Go
Grafana
Jenkins
Kubernetes
Prometheus
Python
Splunk
Terraform
UiPath
