Kentik

Staff Site Reliability Engineer, Infrastructure

Posted 4 Days Ago
Remote
Hiring Remotely in United States
Senior level
As a Staff Site Reliability Engineer, you will manage core infrastructure, improve reliability, automate operations, and support engineering teams in a remote environment.
The summary above was generated by AI
Who we are

Kentik is the network observability company. Our platform is a must-have for the network front line, whether digital business, corporate IT, or service provider. Network professionals turn to the Kentik Network Observability Cloud to plan, run, and fix any network, relying on our infinite granularity, AI-driven insights, and insanely fast search.

Kentik makes sense of network, cloud, host, and container flow, Internet routing, performance tests, and network metrics. We show network pros what they need to know about their network performance, health, and security to make their business-critical services shine. Networks power the world’s most valuable companies, and those companies trust Kentik. Market leaders like IBM, Box, and Zoom rely on Kentik for network observability. Visit us at kentik.com and follow us at @kentikinc.

What we do

Kentik is looking for an experienced software engineer with an operational mindset to join our Infrastructure team as a Staff SRE.

This infra team is in charge of the software stack that powers Kentik - from configuration management and orchestration, API and service fabric, to datastores and data pipelines, developer experience and internal observability. In partnership with our hardware and network operations team, we provide a reliable platform for other engineering teams to build on.

We are an international group of collaborative, experienced developers and operations practitioners, with broad and deep knowledge of networks, systems and applications.

What you'll do
The role is a mix of development and operations. You will write code for internal API services and tools and operate these services in production, along with third-party components like Envoy or Postgres.

  • Own, scale and maintain our core infrastructure, both on bare metal deployments and major public clouds; keep everything healthy and up to date
  • Bring our overall reliability to new levels, streamline and simplify our stack, and contribute to our efforts automating all the things
  • Define and refine our platform offering - reliable, easy to use "paved path" solutions we provide to the rest of the engineering organization 
  • Identify needs, spec plans, and deliver value to our internal customer teams on an ongoing basis
  • Participate in our low-noise on-call rotation, and help other teams with their internal monitoring needs and practices
  • Contribute to our nascent efforts to make customer On-Prems a repeatable, scalable product
  • Work with the hardware and network operations team to keep our datacenter humming and our systems provisioned
  • Collaborate with other engineers in a dynamic, fast-paced and very collaborative remote environment
What you'll bring

Studies have shown that some candidates tend to apply to jobs only if they meet 100% of the qualifications. We encourage you to apply if you meet most of the criteria - even if you don’t match all of the qualifications, your skills and experience could be valuable in this role!

  • 5+ years of relevant experience
  • An SRE mindset and the drive to build reliable, easy-to-operate systems
  • Experience running, scaling, and tuning Postgres, Kafka, or other datastores and third-party applications
  • Experience building apps and services (e.g. Go or Node.js, gRPC, Postgres or MySQL)
  • A track record of shipping projects from scratch and maintaining them over time
  • Clear communication, both synchronous and via technical plans
  • A practice of instrumenting your services and setting up the right monitoring and alerts
  • Passion for building and providing amazing tools and platforms to other engineers

Our tech stack

  • Our core data engine and platform are primarily written in Go
  • We use Node.js for mid tier and most public-facing APIs, and React for UI
  • Datastores: Postgres, Kafka, MySQL, and Redis, as well as Consul and Vault
  • HAProxy and Envoy for API traffic routing and balancing, with a custom control plane written in Go
  • Internal observability and monitoring: logs, metrics and traces, dashboards, alerts, and SLIs/SLOs with VictoriaMetrics, Grafana, ELK, and Honeycomb
  • Puppet and HashiCorp Nomad for config management and orchestration
  • GitHub for source control, PRs, and issues; Jenkins for automated builds; and lots of good custom tooling for deployments
  • Most of our systems run on Linux bare-metal hosts managed with Puppet; we also have IaaS and managed k8s services in major public clouds

What we offer

Kentik is a fully remote company that operates globally. We seek professionals who will help us thrive as an organization, and in turn we aim to broaden and enhance your career. We’re very thorough in the interview process to understand your skills and how they will relate to your successful growth here at Kentik. Our compensation philosophy encompasses a fair program for all in order to attract, engage and retain talented individuals who will drive our business and wow our customers.

The compensation range for this position is: $185,000 - $250,000. This range reflects the low and high end of the U.S. compensation range Kentik reasonably and generally expects to pay the hired candidate in this role. The actual compensation offered may be lower or higher than the stated range depending on various factors, including but not limited to:

  • Experience with the skill sets required for success
  • Demonstrated competencies and potential 
  • A geographic market-based approach

In addition to a great career opportunity, Kentik offers stellar benefits for our employees, which include:

  • The company pays 100% of premiums for health, vision and dental coverage for you and your dependents
  • Additionally, an annual Health Reimbursement Account (HRA) of $3,000 for an individual or $4,500 for a family
  • Paid family & medical leave 
  • Open PTO, a quarterly Wellness Day, and a minimum of 10 paid holidays
  • 401(k) retirement account
  • Home office reimbursement 
  • Stock options

Note: Benefits are as listed for all US full-time employees. For compensation, international applicants will be treated equitably in relation to the laws applicable within the countries in which we operate.

 

Come work with us

The true meaning of Kentik is visibility. We’re committed to making sure everyone feels empowered to use their voice, has a sense of belonging, and is represented at Kentik. 

We don’t look for individuals who fit the culture, but those who will continue to add to the culture.
We encourage everyone to apply, especially those individuals who are underrepresented in the industry: people of color, LGBTQI+ community, women, individuals with disabilities (both seen and unseen), veterans, and people of any age or family status. 

Kentik is committed to creating an inclusive interview process. If you require a reasonable accommodation during the application or interview process, please reach out to recruiting@kentik.com.

Come as you are!
You will be working at a fast-growing, well-funded startup alongside industry thought leaders and network aficionados as we build the future of observability and set the high bar for how network operations and digital businesses should run. With a competitive salary and amazing benefits on top of the meaningful and challenging projects you’ll take on, we’re sure you’ll enjoy joining the Kentik team.

Top Skills

ELK
Envoy
Go
Grafana
gRPC
HAProxy
HashiCorp Nomad
Honeycomb
Jenkins
Kafka
Linux
MySQL
Node.js
Postgres
Puppet
Redis
HQ

Kentik San Francisco, California, USA Office

548 Market St, Pmb 78595, San Francisco, CA, United States, 94104
