Databricks

Staff Software Engineer - Distributed Data Systems

Reposted 17 Days Ago

San Francisco, CA

192K-260K Annually

Senior level

As a Staff Software Engineer at Databricks, you will develop advanced distributed data storage and processing systems and enhance query performance across various ETL and data science workloads.

The summary above was generated by AI

P-186

At Databricks, we are obsessed with enabling data teams to solve the world's toughest problems, from security threat detection to cancer drug development. We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their own missions.

Founded in 2013 by the original creators of Apache Spark™, Databricks has grown from a tiny corner office in Berkeley, California to a global organization with over 1000 employees. Thousands of organizations, from small to Fortune 100, trust Databricks with their mission-critical workloads, making us one of the fastest growing SaaS companies in the world.

Our engineering teams build highly technical products that fulfill real, important needs in the world. We constantly push the boundaries of data and AI technology, while simultaneously operating with the resilience, security, and scale that are critical to making customers successful on our platform.

We develop and operate one of the largest-scale software platforms. The fleet consists of millions of virtual machines, generating terabytes of logs and processing exabytes of data per day. At our scale, we regularly observe cloud hardware, network, and operating system faults, and our software must gracefully shield our customers from all of them.

Modern data analysis employs sophisticated methods such as machine learning that go well beyond the roll-up and drill-down capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will build the next generation of distributed data storage and processing systems that can outperform specialized SQL query engines in relational query performance, yet provide the expressiveness and programming abstractions to support diverse workloads ranging from ETL to data science.

Below are some example projects:

Apache Spark™: Develop the de facto open source standard framework for big data.

Data Plane Storage: Deliver reliable, high-performance services and client libraries for storing and accessing very large amounts of data on cloud storage backends, e.g., AWS S3 and Azure Blob Storage.

Delta Lake: A storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, drastically reduce the complexity of real-world data engineering architectures.

Delta Pipelines: Managing even a single data engineering pipeline is difficult. The goal of the Delta Pipelines project is to make it simple to orchestrate and operate tens of thousands of data pipelines. It provides a higher-level abstraction for expressing data pipelines and lets customers deploy, test, and upgrade pipelines while eliminating the operational burden of building and managing high-quality data pipelines.

Performance Engineering: Build the next-generation query optimizer and execution engine that is fast, tuning-free, scalable, and robust.

What we look for:

BS in Computer Science, a related technical field, or equivalent practical experience.
Optional: MS or PhD in databases or distributed systems.
Comfortable working towards a multi-year vision with incremental deliverables.
Driven by delivering customer value and impact.
8+ years of production-level experience in Java, Scala, or C++.
Strong foundation in algorithms and data structures and their real-world use cases.
Experience with distributed systems, databases, and big data systems (Apache Spark™, Hadoop).

Pay Range Transparency

Databricks is committed to fair and equitable compensation practices. The pay range(s) for this role are listed below and represent the base salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to job-related skills, depth of experience, relevant certifications and training, and specific work location. Based on the factors above, Databricks utilizes the full width of the range. The total compensation package for this position may also include eligibility for an annual performance bonus, equity, and the benefits listed above. For more information regarding which range your location is in, visit our page here.

Local Pay Range

$192,000–$260,000 USD

About Databricks

Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake, and MLflow. To learn more, follow Databricks on Twitter, LinkedIn, and Facebook.

Benefits

At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks.

Our Commitment to Diversity and Inclusion

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Compliance

If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.

Top Skills

C++

Java

Scala

160 Spear Street, San Francisco, CA, United States, 94105

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms, and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology, thanks to a thriving art and music scene, excellent food, and a short drive to several of the country’s most beautiful recreational areas.

Key Facts About San Francisco Tech

Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Google, Apple, Salesforce, Meta
Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
Funding Landscape: $50.5 billion in venture capital funding in 2024 (Pitchbook)
Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
