Transluce

Research Engineer - Scalable Interpretability

Transluce  •  $250k - $500k/yr  •  San Francisco, CA (Onsite)  •  3 hours ago

Job Description

Salary range: $250,000 - $500,000/year + benefits

Transluce is a non-profit research lab building tools for scalable, end-to-end oversight of AI systems. We build world-class, AI-backed analysis tools and use them to set industry standards for evaluation. Our tools are integrated with core agent benchmarks like SWE-bench, and our evaluations directly underpin regulation, including through our role as the EU AI Office’s main evaluation developer for harmful manipulation risks.

About the role: We are looking for strong scientists and engineers to help advance our vision of scalable end-to-end oversight assistants, building on our recent advances such as predictive concept decoders and user model extractors. As part of our highly collaborative team, you will learn and grow quickly, creating technology at the frontier of AI research with high direct impact.

Core responsibility: Help us develop and train scalable interpretability assistants that can predict and detect unexpected and subtle behaviors from models’ activations. This includes:
  • Creating diverse evaluations that range in difficulty. This involves finding naturally occurring interesting and undesirable behaviors exhibited by open-source models.
  • Developing novel architectures and objectives for training interpretability assistants.
  • Scaling up the training and inference pipelines to support up to 1T-scale models.

Qualities of a strong candidate:
  • Experience with fine-tuning language models, designing new architectures, and creating evaluations
  • Reliable results: good experimental design, epistemic self-awareness, and transparency
  • Generativeness: coming up with original, productive ideas for unblocking progress
  • Curiosity: a desire to understand ML systems and how they work
  • Strong programming ability, including navigating trade-offs between prototyping speed and maintainability
  • Strong communication skills, low ego, openness to giving and receiving feedback

We are located in San Francisco and are enthusiastic about working together in person. We are open to sponsoring international visas.

About Transluce

Transluce is an independent research lab that builds open, scalable technology for understanding AI systems and steering them in the public interest. Transluce means to shine light through something to reveal its structure. Today’s complex AI systems are difficult to understand: not even experts can reliably predict their behavior once deployed. Given AI’s extraordinary consequences for society, we need scalable and open analyses of the capabilities and risks of AI systems.

We are building open source, AI-driven tools to understand and analyze AI systems. We will apply these tools to open-weight models, so the world can vet our analyses and improve their reliability. Once our technology has been vetted, we will work with frontier AI labs and governments to ensure that internal assessments reach the same standards as our publicly vetted procedures.

Email: info@transluce.org

Industry: Biotech & Life Sciences
Company Size: 11-50 employees
Headquarters: San Francisco, California
Year Founded: 2024