Research Engineer, LLM

Posted 24 Days Ago
San Francisco, CA
Hybrid
170K-237K Annually
Senior level
Artificial Intelligence • Cloud • Machine Learning
Accelerate development and production of any AI app, on any cloud, at any scale — with Ray.
The Role
The Research Engineer will develop advanced features for Ray, focusing on large-scale training and ML systems. Responsibilities include coding, research, and collaborating with teams to enhance Ray's capabilities and performance for machine learning applications.
Summary Generated by Built In

About Anyscale:


At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray, a popular open-source project that's creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, Cruise, and many more have Ray in their tech stacks to accelerate the progress of AI applications out into the real world.


With Anyscale, we’re building the best place to run Ray, so that any developer or data scientist can scale an ML application from their laptop to the cluster without needing to be a distributed systems expert.


Proud to be backed by Andreessen Horowitz, NEA, and Addition with $250+ million raised to date.


About the role


The Anyscale Research team is looking for a strong ML Engineer and Researcher who is passionate about pushing the boundaries of what’s possible with Ray. In this role, you will be instrumental in developing cutting-edge features, such as our new Accelerated DAG (ADAG) API, to establish Ray as a leader in large-scale training. You will play a key role in exploring new directions and driving our vision for Ray’s future.


This position is ideal for individuals with a strong engineering background and a passion for ML systems. You will spend approximately 70% of your time coding and engineering, while the remainder will focus on research to support our strategic vision and innovation. The role demands a deep understanding of ML systems and applications, including LLMs and multimodal models, with a strong emphasis on technical skills and engineering expertise.


About the team


We are a newly formed team dedicated to applied research in both ML modeling and systems. Our mission is to advance the capabilities of Ray and advance machine learning workloads on the Anyscale Platform. We operate at the intersection of research and engineering, collaborating closely with the Ray Core and Ray Train teams to bridge gaps and develop innovative solutions that push the frontier of ML and systems research.


We'd love to hear from you if you have

  • Production-level experience in Machine Learning and in distributed ML systems (Python/PyTorch)
  • 4+ years of experience in one of these fields: Machine Learning, NLP, CV, or ML Systems
  • Graduate degree (MSc or PhD) in one of the fields above
  • Publications at a top-tier AI conference (NeurIPS, ICML, ICLR, CVPR, ACL, etc.)

A snapshot of projects you may work on

  • Enhancing Ray for Large-Scale Training: Collaborate with the Ray Core and Ray Train teams to adapt and optimize Ray for efficient, large-scale GPU-heavy training, addressing current limitations and expanding its capabilities.
  • Developing the ADAG API: Explore and potentially implement an Accelerated DAG (ADAG) API for Ray, aiming to improve performance and scalability for complex ML workflows.
  • System Integration and Optimization: Create and refine integrations between Ray and other components, such as Ray Data, to streamline large-scale ML processes and ensure seamless operation across different systems.
  • Research and Innovation: Contribute to cutting-edge research in ML systems, identifying new opportunities and methods to push the boundaries of what Ray can achieve in large-scale training environments.
  • Prototype and Benchmarking: Design and build prototypes to test new features or enhancements, and conduct benchmarking to assess performance improvements and validate the effectiveness of your solutions.
  • Work on applied research, pushing state-of-the-art on large-scale model training
  • Advance Ray as the best open source library for large-scale machine learning

Compensation

  • At Anyscale, we take a market-based approach to compensation. We are data-driven, transparent, and consistent. The target salary for this role is $170,112 to $237,000. As the market data changes over time, the target salary for this role may be adjusted.
  • This role is also eligible to participate in Anyscale's Equity and Benefits offerings, including:

  • Stock Options
  • Healthcare plans, with premiums covered by Anyscale at 99%
  • 401k Retirement Plan
  • Wellness stipend
  • Education stipend
  • Paid Parental Leave
  • Fertility Benefits
  • Flexible Time Off
  • Commute Reimbursement
  • 100% of in-office meals covered

Anyscale Inc. is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. 


Anyscale Inc. is an E-Verify company, and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

Top Skills

Python, PyTorch
The Company
San Francisco, CA
0 Employees
Hybrid Workplace
Year Founded: 2019

What We Do

Distributed computing made simple. Anyscale enables developers of all skill levels to easily build AI applications that run at any scale, from a laptop to a data center.
