Founding Applied Research Engineer

Rox

Posted 16 Days Ago

In-Office

San Francisco, CA, USA

Mid level

Design and run research programs focused on agent systems, signal classification, cost-efficient inference, and behavioral benchmarking in applied AI. Build evaluation frameworks, conduct experiments, and convert findings into production systems to define applied AI research agendas.

The summary above was generated by AI

Why This Role Exists

Foundation models are commoditizing. Defensibility comes from specialized models, proprietary training signals, and evaluation ownership. Every applied AI company we benchmark against, including Decagon, Harvey, Sierra, and Cursor, has already moved. The window to claim frontier applied AI for revenue will close within the next few months.

Rox is in market. We run agents against enterprise data at scale, every day. We see exactly where research meets production and where the data is dirty, state is changing, and being wrong costs (a lot of) money.
The Applied Research team exists to close that gap permanently.

What This Team Works On

Four problems we care about right now:

Cost-efficient inference for Clever Columns. Distill a Rox-trained model from frontier teachers so per-account enrichment runs at 1/20th the cost without quality loss. Ships first. Doesn't require trajectory attribution.

Signal classification across the public knowledge graph. A small, fast classifier that distinguishes genuine buying signals from noise across the news, jobs, and filings corpus we already ingest at scale. Powers Recommended Next Moves and Auto Prospecting. Cleanest data subset.

Personalization grounding and hallucination detection. A reward model that catches fabricated prospect context in Sequences in real time. This is the most underrated production failure mode in outbound AI. Trained on cross-customer consensus edits.

Sequencing policy under sparse, delayed rewards. Offline-to-online RL on multi-touch trajectories with intermediate signals as proxies for terminal outcomes. Long-horizon flagship. Hard. [Depends on trajectory instrumentation in progress with Platform Eng.]

These are not benchmark problems. They have real SLAs and real customers depending on them.

What You'll Do

Design and run research programs tied directly to the four problems above.
Build evaluation frameworks that measure trajectory quality, not just final output, because most eval infrastructure measures end results and we care about the path.
Work on agent memory, retrieval, and context systems alongside an elite, competitive engineering team.
Translate findings into infrastructure with measurable production impact.
Help define where Rox Research goes next.

What We're Looking For

You have spent real time thinking about how agents fail in practice, not just on benchmarks. You have built evaluation systems and know exactly where standard approaches break down. You can write code well enough to implement your own ideas, run your own experiments, and ship things that make it into production.

You move fast. The environment changes monthly and the team ships continuously.

Particularly relevant: agent evaluation and behavioral benchmarking; retrieval-augmented generation and knowledge graph systems; RL applied to real-world agent behavior; production ML systems (latency, reliability, observability); post-training and model adaptation for production use cases.

A PhD is not required. Strong research instincts and the ability to ship are.

What Success Looks Like

First few weeks: you understand Rox's architecture, where the production problems are, and where the research gaps are. You have opinions and you share them.

First few months: you are running experiments that directly inform how we build. Something you worked on is in production.

Over time: you are defining the research agenda for the most interesting applied AI problem in the enterprise. The systems you build are things no one else has built before, because no one else has the structural data position to build them.

Why Join Now

We are at an unusual moment. Large enough to have real scale, real customers, and genuinely interesting research problems. Small enough that you are one of a handful of people shaping what the Applied Research function looks like and what it prioritizes.

The team is extraordinary: IMO, IOI, and ICPC medalists, researchers from DeepMind and OpenAI. The feedback loop is a live enterprise system, not a leaderboard. If that's not more interesting to you than publishing for the sake of publishing, this probably isn't the right fit.

San Francisco, onsite. We relocate exceptional people.

San Francisco, CA, United States, 94105
