Staff Data Engineer #4009

Posted 7 Days Ago
Menlo Park, CA
Hybrid
$178K-$223K Annually
Senior level
Artificial Intelligence • Big Data • Healthtech • Machine Learning • Software • Biotech
GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured.
The Role
The Staff Data Engineer at GRAIL will manage the data lifecycle from ingestion to analysis, ensuring data integrity and compliance in a regulated biotechnology environment. Responsibilities include developing ETL pipelines, collaborating with scientists and clinical teams, and maintaining data quality and documentation for research and regulatory purposes.
Summary Generated by Built In

GRAIL is focused on improving lives by developing pioneering technologies to detect cancer early. As a member of our team, you will help manage the end-to-end data lifecycle, ensuring data integrity, reliability, and compliance in a regulated environment. You will work closely with cross-functional teams including lab scientists, data scientists, biostatisticians, medical directors, and software engineers to create critical datasets and data solutions that drive our product pipeline.


We are seeking a Staff Data Engineer to develop, optimize, and manage GRAIL’s data lifecycle from sample ingestion to analysis, ensuring compliance with regulatory and clinical standards. You will partner with cross-functional teams to ensure that data solutions are high-quality, scalable, and aligned with our regulatory requirements, including those of the FDA and other global health authorities.


This is a hybrid role and requires you to be onsite at least 2 days a week in Menlo Park, CA.

Responsibilities:

  • Lead the design, development, and optimization of scalable ETL pipelines and data configurations to support the ingestion, transformation, and analysis of clinical and research datasets, ensuring alignment with regulatory and product requirements.
  • Collaborate with data scientists, biostatisticians, and clinical teams to understand and address the data needs of various programs, including clinical trials, research studies, and regulatory submissions.
  • Ensure data integrity, traceability, and quality through robust validation procedures that comply with FDA guidelines and other regulatory requirements.
  • Proactively identify new technologies, methodologies, and processes to address evolving data management challenges within a regulated biotechnology environment.
  • Manage the generation and maintenance of metadata, data navigation tools, and documentation to support operational objectives and streamline study processes.
  • Support study operations by ensuring that datasets are structured to meet clinical, scientific, and regulatory milestones, including data locks, submissions, and monitoring.

Preferred Qualifications:

  • BS/MS in a quantitative scientific field (Computer Science, Engineering, Mathematics, Statistics, Bioinformatics, etc.) with 8+ years of experience in data engineering, ideally within a regulated environment such as biotechnology, pharmaceuticals, medical devices, or healthcare.
  • Strong understanding of ETL processes, data pipeline development, and database management, with proven experience delivering data solutions in support of clinical or regulatory requirements.
  • Expertise in SQL and in either Python or R.
  • Experience working with cloud-based data platforms (AWS, Azure, Google Cloud) with a strong understanding of compliance frameworks (e.g., HIPAA, 21 CFR Part 11, GDPR).
  • Excellent problem-solving skills with a track record of ensuring data quality and integrity across complex datasets.
  • Demonstrated success working in cross-functional, collaborative teams, with the ability to translate user requirements into scalable, high-quality data solutions.

Highly Desired Qualifications:

  • 3+ years of experience working in a regulated industry (biotechnology, medical devices, healthcare) with knowledge of compliance and regulatory requirements for data management.
  • Proven experience in data lifecycle management for clinical trials, including understanding of regulatory submissions (e.g., FDA PMA, IDMC reports).
  • Familiarity with tools like Apache Airflow and dbt for data pipeline orchestration and transformation in regulated environments.

The expected full-time annual base pay scale for this position is $178K-$223K. Actual base pay will consider skills, experience, and location.

Top Skills

Python
R
SQL

The Company
Menlo Park, CA
1,300 Employees
Hybrid Workplace
Year Founded: 2016

What We Do

GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured. GRAIL is using the power of high-intensity sequencing, population-scale clinical studies, and state-of-the-art computer science and data science to enhance the scientific understanding of cancer biology, and to develop and commercialize pioneering products.

Why Work With Us

Everything we do is guided by our mission to detect cancer early, when it can be cured. It’s the reason we’re here, and it’s no small task.

The right people make all the difference. That’s why we’re looking for those who strive to share their knowledge, contribute their skills, inspire each other and commit to something bigger than themselves.

GRAIL Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

GRAIL offers a variety of work arrangements depending on the role: some roles, such as lab positions, are onsite; others are hybrid; and still others are remote.

Typical time on-site: Flexible
Menlo Park, CA
