Responsible for enhancing data quality by designing scalable data pipelines, implementing evaluation frameworks, and conducting research to improve data collection methods.
As an MLE on Luma's Data team, you are responsible for raising the bar for our data quality. Data is the critical foundation of our products, and we are looking for individuals who can identify creative approaches to data and captioning and then implement solutions for processing at petabyte scale. Strong candidates have exceptional general Python engineering skills alongside a combination of industry ML experience, data experience, and a passion for building AI products.
Responsibilities
- Design data pipelines, including finding appropriate data sources, scraping, filtering, post-processing, de-duplicating, and versioning. The system should be robust and scalable for production use.
- Design and implement frameworks to evaluate the effectiveness of our models and data. For example, set up the standards for an automated evaluation pipeline to run before any new model gets deployed into the API.
- Work closely with research and product teams who might be data contributors or consumers to incorporate their data usage needs on a variety of tasks.
- Conduct open-ended research to improve the quality of collected data, including but not limited to semi-supervised learning, human-in-the-loop machine learning, and fine-tuning with human feedback.
Experience
- 5+ years of relevant experience, or demonstrated high-impact projects, as a Data Engineer, Machine Learning Engineer, or Data Scientist working with large amounts of data on a daily basis.
- Experience with visual media and computer vision algorithms is a plus.
- A strong belief in the criticality of high-quality data and high motivation to work on the associated challenges.
- Experience with end-to-end ML training pipelines.
- Experience working with large distributed systems.
- Strong generalist Python and PyTorch skills.
Compensation
- The pay range for this position in California is $180,000 - $250,000 per year; however, base pay offered may vary depending on job-related knowledge, skills, candidate location, and experience. We also offer competitive equity packages in the form of stock options and a comprehensive benefits plan.
Top Skills
Python
PyTorch