Research Scientist - Voice AI Foundations

Sorry, this job was removed at 12:07 p.m. (PST) on Monday, Dec 16, 2024
Remote
150K-220K Annually
Internship
Artificial Intelligence • Natural Language Processing • Software
Unlocking the power of voice data to fuel the world’s big ideas.
Company Overview

Deepgram is a foundational AI company on a mission to transform human-machine interaction using natural language. We give any developer access to the fastest, most powerful voice AI platform, including models for speech-to-text, text-to-speech, and spoken language understanding, with just an API call. From transcription to sentiment analysis to voice synthesis, Deepgram is the preferred partner for builders of voice AI applications.

The Opportunity

Despite the proliferation of text-based communication, voice remains the preferred medium for humans to interact with machines. Delivering real-world voice AI solutions to our customers' most challenging problems ultimately drives our mission. At Deepgram, you will have the unique opportunity to innovate, experiment, and build -- significantly shaping our products and AI capabilities. We value tenacious problem-solving and the ability to iterate, learn, and adapt. Domain-specific expertise in speech or language AI is not required. You're encouraged to deepen your skills on the job, broadening your knowledge and expertise through constant iteration and invention. Our start-up environment offers a stunning growth trajectory, thanks to a level of ownership and an on-the-ground connection with end customers that larger research labs simply cannot provide. Embark on a journey to redefine voice technology with us at Deepgram.

The Role

Deepgram is currently looking for strong Research Scientists who have demonstrated experience in solving hard problems using deep learning. At Deepgram, you will apply your skills to uncover breakthroughs that define the future of voice-enabled applications and experiences. Your work will revolve around harnessing vast audio and text datasets to train foundation models that go beyond transcribing speech and comprehending text -- the models you’ll be building will unlock nuanced meanings in complex conversation, adapt robustly to diverse speech patterns, and generate empathic responses with human-like, contextualized speech. You will collaborate with product & engineering to help deploy these models in the most scalable voice API on the planet. We look forward to you bringing your whole self to work, sharing learnings from your latest experiments, and collaborating with us to advance the state of AI and voice technology.

What You’ll Do

  • Design and carry out experimental programs to build new speech and language AI foundation models, across modalities and tasks, that solve critical problems for our customers.

  • Drive large-scale training jobs successfully on massive distributed computing infrastructure.

  • Optimize model architectures to make them as fast and memory-efficient as possible; deploy new models into production for use at massive scale.

  • Document and present results and complex technical concepts clearly for internal and external audiences.

  • Stay up to date with the latest advances in deep learning with a particular eye towards their implications and applications within our products.

You’ll Love This Role If You

  • Are passionate about AI and interested in leveraging data to solve hard problems

  • Enjoy building from the ground up and love to create new systems from scratch

  • Are data-driven and prefer to solve problems using iterative experimentation

It’s Important To Us That You Have

  • PhD in Physics, Electrical Engineering, Computer Science or another related field

  • Prior experience in designing and conducting experimental programs aimed at understanding complex phenomena, with the ability to rapidly iterate and change course as needed. 

  • Proven experience building models from a blank page and owning the entire deep learning stack including data curation, characterization and cleaning, architecture design and model building, distributed large-scale training, and model optimization for inference.

  • Strong communication skills and the ability to translate complex concepts into simple terms, depending on the target audience

  • Strong software engineering skills with particular emphasis on developing clean, modular code in Python and working with PyTorch.

It Would Be Great if You Had

  • Prior industry experience in building deep learning models to solve complex problems, with a solid understanding of the applications and implications of different neural network types, architectures, and loss mechanisms.

  • Deep understanding of, and experience working with, state-of-the-art network architectures including transformers.

  • Understanding of different parallelism paradigms for efficient distributed training.

Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $85 million in total funding after closing our Series B funding round last year. If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you!

Deepgram is an equal opportunity employer. We want all voices and perspectives represented in our workforce. We are a curious bunch focused on collaboration and doing the right thing. We put our customers first, grow together and move quickly. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, gender identity or expression, age, marital status, veteran status, disability status, pregnancy, parental status, genetic information, political affiliation, or any other status protected by the laws or regulations in the locations where we operate.

We are happy to provide accommodations for applicants who need them.

Compensation Range: $150K - $220K

The Company
109 Employees
Hybrid Workplace
Year Founded: 2015

What We Do

Legacy speech recognition tech is slow, inaccurate, and expensive. It’s time to stop settling for out-of-the-box solutions that don’t meet enterprise needs. Deepgram is the only true end-to-end Deep Learning ASR offering real-time transcription, built to scale. Use it alone or on top of your existing tech and see results in weeks, not months or longer. When speech recognition that’s “good enough for everyone” isn’t good enough for you, try Deepgram.

Deepgram is an NVIDIA partner and a Y Combinator company.

Why Work With Us

Our culture, like our product, is constantly learning and evolving, but the heart of our team is enduring. We are a self-motivated, positive, passionate, and competitive group of people. We have the best technology on the market and are determined to help customers leverage it.

Deepgram Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We currently have a hybrid business model with a nationally distributed workforce and one physical office in Ann Arbor, MI.

Typical time on-site: Not Specified
US
