Tilde

Technology, Information and Internet

Palo Alto, CA · 222 followers

Interpreter models to optimize AI deployments

About us

Interpretability is the most important problem in the world. We build moonshot, applied interpretability solutions to make maximally safe and performant AI.

Website
https://www.tilderesearch.com/
Industry
Technology, Information and Internet
Company size
2-10 employees
Headquarters
Palo Alto, CA
Type
Self-Employed

Updates

We're thrilled to announce Tilde. We're building interpreter models and control tech to unlock deep reasoning and command of models, enabling the next generation of human-AI interaction.

The bottleneck in AI today isn't intelligence; it's communication. Today's interactions rely on black-box prompting, an inherently lossy method. Imagine trying to optimize a software system without access to its source code: it's not only inefficient, but absurdly limiting.

Our vision at Tilde is to go beyond these constraints with applied interpretability: we're building solutions to reveal a model's inner workings, enabling us to directly steer behavior and enhance performance beyond what's possible with traditional post-training or fine-tuning methods. So far, our interpreter models have already enabled improvements in reasoning with Llama-3.1-8B and more fine-grained control in generation with SOTA text-to-video models.

By advancing a new paradigm of insight and control, we're creating AI that can tackle tasks genuinely beyond human reach, elevating, not replacing, existing methods. Our team combines unique backgrounds, spanning academia and both small and large AI infrastructure companies, all driven by a shared goal: achieving a deeper understanding of the universe within these models.

If this sounds interesting to you, come build the future with us: tilderesearch.com/join

Over the past few weeks, we've been using this graph theory problem in interviews and figured we'd open it up to everyone here: https://lnkd.in/gXGdMpYW

If you solve it, we'll move you directly to the last rounds of our process! And if you don't like graph theory but do like interpretability, we have plenty of other fun problems, so feel free to email us at [email protected].

We're doing a lot of applied interpretability work like this: https://lnkd.in/gGEEWYQC, which was the first application of SAE-based interventions to a downstream task that outperforms classical baselines.

Can Mechanistic Interpretability Be Useful?

At Tilde, we've been exploring how Sparse Autoencoders (SAEs), a mechanistic interpretability tool for surfacing monosemantic, interpretable features in AI models, can directly improve downstream performance. Today, we're excited to share Sieve, our latest research demonstrating the practical power of interpretability (joint work with Adam Karvonen). Read our post here: https://lnkd.in/gGEEWYQC

Our work focuses on a real-world challenge provided by Benchify, a YC-backed code-generation company. The task? Write code to fuzz test Python functions under strict constraints (e.g., no regex usage). Unfortunately, models often fail to follow instructions when operating in long-context environments with multiple constraints.

Here's where SAEs come in: we can use mechanistic insights surfaced by SAEs to intervene and improve instruction following in code generation, demonstrating Pareto dominance over existing techniques like steering and prompting.

Why this matters

This is the first time mechanistic interpretability has counterfactually enabled a downstream task, demonstrating that SAEs can do more than provide insights: they can directly drive performance improvements. By leveraging interpretable features surfaced by SAEs, we:

1. Conditionally targeted failure modes (e.g., regex usage) without disrupting other behaviors.
2. Achieved surgical precision, with over 99.9% activation accuracy, preventing interference with unrelated tasks.

We delivered lightning-fast interventions that require less than 20 minutes end-to-end to implement, with minimal additional overhead.

Results

Using GemmaScope SAEs (for Gemma) and custom lightweight SAEs (for Llama), we compared against strong baselines across thousands of code generations. The results? Conditional SAE-based interventions showed clear Pareto dominance:

- 99.9% precision in eliminating regex usage.
- Models remained unaffected during standard downstream tasks.

This means we can precisely target undesirable behaviors without compromising performance, a breakthrough in controllable AI.

In the spirit of open science, we're sharing all models, data, evaluation results, and analyses. Mechanistic interpretability is often seen as theoretical, but our work shows how it can deliver practical, actionable benefits. This research is an exciting step toward closing the gap between theory and application, empowering fine-grained control over AI models.

Let us know your thoughts; we'd love to hear your feedback and ideas!
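For readers curious what a conditional SAE intervention looks like in practice, here is a minimal sketch of the general technique. This is an illustration, not Tilde's actual code: the SAE interface (an encode method plus a decoder matrix W_dec, names borrowed from common open-source SAE libraries), the hook point, and the constants REGEX_FEATURE_IDX, THRESHOLD, and CLAMP_VALUE are all assumptions made for the example.

```python
import torch

# Hypothetical constants for the example; real values would come from
# inspecting a trained SAE's features on actual model activations.
REGEX_FEATURE_IDX = 1234  # assumed index of a feature that fires on regex usage
THRESHOLD = 1.0           # only intervene when the feature activates above this
CLAMP_VALUE = -2.0        # target activation that suppresses the behavior

def make_conditional_hook(sae):
    """Build a forward hook that applies a conditional SAE edit.

    Assumes `sae.encode(x)` maps residual-stream activations of shape
    (batch, seq, d_model) to feature activations (batch, seq, n_features),
    and `sae.W_dec` is the decoder matrix of shape (n_features, d_model).
    """
    def hook(module, inputs, output):
        resid = output[0] if isinstance(output, tuple) else output
        acts = sae.encode(resid)[..., REGEX_FEATURE_IDX]  # (batch, seq)
        mask = acts > THRESHOLD
        if mask.any():
            # Move the feature's activation to CLAMP_VALUE, but only at
            # positions where it fired: everywhere else delta is zero, so
            # unrelated tokens pass through unchanged.
            delta = torch.where(mask, CLAMP_VALUE - acts, torch.zeros_like(acts))
            resid = resid + delta.unsqueeze(-1) * sae.W_dec[REGEX_FEATURE_IDX]
        return (resid,) + output[1:] if isinstance(output, tuple) else resid
    return hook

# Usage sketch with a HuggingFace-style model (layer index is arbitrary):
# handle = model.model.layers[12].register_forward_hook(make_conditional_hook(sae))
# outputs = model.generate(**inputs)
# handle.remove()
```

The conditional check is the key design choice: unlike always-on steering, the residual stream is only edited at token positions where the target feature actually fires, which is what allows an intervention of this shape to suppress one behavior while leaving unrelated tasks untouched.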
