Does your Dev Team have the skills for AI Development? Ask Simon Last, co-founder and CTO of Notion.

Nagesh Nama

CEO at xLM | Transforming Life Sciences with AI & ML | Pioneer in GxP Continuous Validation

Published: December 22, 2024

This article summarizes insights from a podcast interview with Simon Last, co-founder and CTO of Notion, discussing the development and implementation of Notion AI. Key themes include the iterative and experimental nature of building AI products, the importance of robust evaluation and testing, and the shift in required skill sets for AI product development. Notion's philosophy centers on viewing AI as a "primitive" or building block that lets users create custom workflows, and it emphasizes a user-centered approach to adopting this new technology.

Key Themes & Ideas

AI as a Primitive:

Notion views AI as a fundamental building block, similar to a relational database or table view. The goal is to integrate AI into the existing Notion framework, allowing users to create their own custom software and workflows.

"For Notion, we think a lot about primitives, or building blocks. Our goal as a company is to try to break the pattern of these rigid vertical SaaS tools and instead reconstruct them from their underlying primitives and allow users to make their own custom software. So things like a relational database or a table view or different blocks within a page. I think of AI as another primitive in the toolbox; it lets you do really useful automations on top."

Iterative Product Development:

Notion’s approach to AI development has been highly iterative, starting with a small team that moved quickly to ship products and learned along the way. They began with an AI writing assistant, then added AI autofill for databases, and later Q&A.

"I'm a big fan of starting a tiger team for a new initiative. I think it's good to have a small group of people that can move really fast and have a clear, focused mandate, so that's how it started."

Organizational Structure & Democratization:

The initial AI team was a centralized "tiger team," but Notion is now working on democratizing AI usage internally by embedding AI specialists into other teams. This is proving challenging because it is difficult to share the mental model around best practices for AI development. Having new people join the core AI team temporarily to get hands-on experience has been more successful.

"What we've had most success with, actually, is when someone new wants to work on AI, we've had a lot of success with just having them join the AI team temporarily, and then they just join our stand-ups and get in the weeds with us every day and then quickly pick up all the context around how to do evals, how to iterate on prompts."

The Challenge of Evaluation and Testing:

Unlike traditional software, AI features are much more difficult to evaluate and test. It’s crucial to have systems for logging, reproducing failures, collecting failures into datasets, and running evaluations to determine whether a change is an improvement.

"It's super hard. It's a really different experience than the pre-AI world... In the AI world I can often have an idea and then be surprised that it didn't work in some way I didn't expect."

Logging and Reproducibility:

Exact reproduction of errors is extremely important, and this requires thorough logging. Notion allows users to opt in to sharing data that is used for evaluation purposes (not training). This shared data is crucial for creating regression datasets.

"Extremely important to be able to exactly reproduce the failure situation. The log has to be an exact reproduction of the error, and then you should be able to just rerun that inference with the exact same input. That's really important; if you can't do that, you're totally screwed."

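The logging requirement described above can be sketched roughly as follows. This is a minimal illustration in Python, not Notion's implementation: the `InferenceRecord` schema, its field names, and the `call_model` callable are all assumptions made for the example. The idea is that one log line stores every input to the model call, so a failure can be rerun with exactly the same request.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InferenceRecord:
    """Everything needed to rerun one model call (hypothetical schema)."""
    model: str
    prompt: str
    temperature: float
    output: str

    def request_key(self) -> str:
        # Hash only the inputs, so identical requests share the same key.
        payload = json.dumps(
            {"model": self.model, "prompt": self.prompt,
             "temperature": self.temperature},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def to_log_line(record: InferenceRecord) -> str:
    """Serialize a record as a single JSON line."""
    return json.dumps(asdict(record), sort_keys=True)


def from_log_line(line: str) -> InferenceRecord:
    """Rebuild the record from a logged JSON line."""
    return InferenceRecord(**json.loads(line))


def replay(line: str, call_model):
    """Rerun a logged request with the exact same inputs.

    `call_model` stands in for whatever client function is in use; it is
    assumed to accept (model, prompt, temperature) and return a string.
    """
    record = from_log_line(line)
    new_output = call_model(record.model, record.prompt, record.temperature)
    return record, new_output
```

A real record would also need every other input that influences the output (system prompt, tool definitions, retrieved context), since a replay is only exact if nothing is left out.
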
Focus on Regression Testing:

Notion prioritizes identifying and fixing regressions over collecting "good" cases. The focus is on continuous improvement and on ensuring that previously fixed errors do not resurface.

"I tend not to care about good cases that much, because what's the actionable value of that? ... To me, the main goal is to collect concrete regressions and then fix them and then make sure that they stay fixed."

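One way to picture "collect concrete regressions and make sure they stay fixed" is a small suite of past failures, each paired with a check that must pass on every new prompt or model version. The sketch below is an assumption-laden illustration, not a description of Notion's tooling: `RegressionCase`, `run_regressions`, and the `generate` callable are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RegressionCase:
    """One previously observed failure plus a check that detects it."""
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_regressions(cases: List[RegressionCase],
                    generate: Callable[[str], str]) -> List[str]:
    """Run every case through `generate` and return the names that fail."""
    failures = []
    for case in cases:
        output = generate(case.prompt)
        if not case.check(output):
            failures.append(case.name)
    return failures
```

An empty result means no known failure has resurfaced; any returned name points directly at the case to debug.
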
Deterministic and Non-Deterministic Evaluation:

Notion uses both deterministic and non-deterministic (model-graded) evaluations and prefers deterministic ones whenever possible. Deterministic evals include constraining the output to a structured format (e.g., XML, JSON) so that invalid output is impossible, as well as using classifiers to label results and compare them with ground-truth values. Model-graded evals work best when the grading criteria are specific and concrete; otherwise, the grader itself introduces a new layer of problems.

"You can have deterministic or non-deterministic evals. Definitely best to use deterministic when you can, and then you can use a model-graded eval when not. For model-graded evals, they work best when it's as concrete and specific as possible."

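Two deterministic checks of the kind described above can be sketched in a few lines of Python. The function names and the required-keys convention are assumptions for illustration: the first check verifies that an output parses as a JSON object with the expected fields, and the second compares predicted labels against ground-truth labels.

```python
import json
from typing import List, Set


def is_valid_json_object(output: str, required_keys: Set[str]) -> bool:
    """True if `output` parses as a JSON object containing all required keys."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required_keys.issubset(parsed)


def label_accuracy(predicted: List[str], expected: List[str]) -> float:
    """Fraction of predicted labels that exactly match the ground truth."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)
```

Both checks give the same answer on every run, which is what makes them usable as pass/fail gates.
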
Fine-Tuning vs. In-Context Learning:

Notion initially experimented heavily with fine-tuning but now prefers in-context learning, because of the rapid pace of new model releases and the difficulty of debugging fine-tuned models. Notion tries to use the best model available at any given time. Prompt engineering is more crucial than changing models.

"I'm honestly not that bullish on companies outside of the foundation model companies doing fine-tuning, and if I meet a startup and they say they're doing fine-tuning, I actually now think of that as a negative update, a lack of experience."

Importance of Task Specificity:

Clearly defining the task and the desired output is crucial for getting good results from AI models. The evolution of Notion's Q&A feature highlights the importance of having a detailed rubric explaining what constitutes a "good" answer within their product experience.

"In general, yeah, the model doesn't know what you want until you tell it... I think Q&A is probably a really big example."

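As a rough illustration of telling the model what a good answer looks like, the sketch below turns an explicit rubric into part of the prompt. The `build_prompt` helper and the example criteria are hypothetical; the article does not describe the contents of Notion's actual rubric.

```python
from typing import List


def build_prompt(task: str, rubric: List[str], question: str) -> str:
    """Assemble a prompt that states the task and an explicit quality rubric."""
    criteria = "\n".join(
        f"{i}. {item}" for i, item in enumerate(rubric, start=1)
    )
    return (
        f"Task: {task}\n\n"
        f"A good answer meets all of these criteria:\n{criteria}\n\n"
        f"Question: {question}"
    )


# Example criteria, invented for illustration only.
EXAMPLE_RUBRIC = [
    "Cite the page each claim comes from.",
    "Say so plainly when the workspace does not contain the answer.",
]
```

Calling `build_prompt("Answer from workspace pages.", EXAMPLE_RUBRIC, "When is the launch?")` yields a prompt whose expectations can be edited one criterion at a time as failures are found.
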
New Skill Sets for AI Product Development:

The people building AI products need to be comfortable with uncertainty and able to iterate quickly. They need to be empirical in their approach and not overly attached to old ways of thinking from pre-language-model-era machine learning.

"Especially on the more producty side, if you're looking for someone that's going to be able to lead an AI product team, you definitely need someone that's going to be very okay with a lot of uncertainty and really pushing to think through every day: are we doing the right thing?"

Trust and Transparency:

Notion focuses on building user trust by making AI actions visible (e.g., diffs before accepting edits), providing citations in Q&A, and having the AI communicate uncertainty rather than provide incorrect information.

"The most obvious tools are: one, if the AI is taking an action, have the user confirm that and be able to see, in a nicely visualized way, what action is being performed."

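The "show a diff, then ask for confirmation" pattern can be sketched with Python's standard `difflib` module. This is only an illustration of the idea: the `render_diff` and `apply_if_confirmed` helpers and the `confirm` callback are assumptions, and a real product would render the diff visually rather than as text.

```python
import difflib
from typing import Callable


def render_diff(original: str, proposed: str) -> str:
    """Return a unified diff of the proposed edit, or '' if nothing changed."""
    diff = difflib.unified_diff(
        original.splitlines(),
        proposed.splitlines(),
        fromfile="current",
        tofile="proposed",
        lineterm="",
    )
    return "\n".join(diff)


def apply_if_confirmed(original: str, proposed: str,
                       confirm: Callable[[str], bool]) -> str:
    """Apply the AI's edit only after the user has seen and accepted the diff."""
    diff_text = render_diff(original, proposed)
    if not diff_text:
        return original  # nothing to confirm
    return proposed if confirm(diff_text) else original
```

Here `confirm` receives the diff text and returns the user's decision, so the edit is never applied without the change having been shown first.
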
Garbage In, Garbage Out:

Notion takes a multi-pronged approach to poor-quality data in its knowledge base. A ranking and filtering step prioritizes more up-to-date, frequently viewed documents from higher-authority users, and all the relevant data is then given to the model to evaluate. Notion prefers not to show the model opaque black-box scores, because these can easily lead to incorrect answers.
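
A ranking and filtering step of this kind might look roughly like the sketch below. Everything specific here is an assumption for illustration: the `Doc` fields, the one-year freshness cutoff, and the sort order (authority, then recency, then views) are not taken from the interview. The point it illustrates is the last one in the paragraph above: the model is shown readable metadata such as the last-edited date, not an opaque relevance score.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Doc:
    title: str
    last_edited: date
    views: int
    author_is_authority: bool  # hypothetical stand-in for "higher authority"
    text: str


def rank_docs(docs: List[Doc], today: date,
              max_age_days: int = 365, top_k: int = 3) -> List[Doc]:
    """Drop stale documents, then prefer authority, recency, and views."""
    fresh = [d for d in docs if (today - d.last_edited).days <= max_age_days]
    fresh.sort(
        key=lambda d: (d.author_is_authority, d.last_edited, d.views),
        reverse=True,
    )
    return fresh[:top_k]


def render_context(docs: List[Doc]) -> str:
    """Format documents with plain metadata instead of a numeric ranking value."""
    blocks = [
        f"Title: {d.title}\n"
        f"Last edited: {d.last_edited.isoformat()}\n"
        f"Views: {d.views}\n\n"
        f"{d.text}"
        for d in docs
    ]
    return "\n---\n".join(blocks)
```

Passing the metadata through in plain form lets the model weigh a recent page against an older one itself, instead of trusting a number it cannot interpret.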

User-Centered Design:

Notion is working to create AI experiences that are embedded in existing user habits. They aim to create products that do not require a change in behavior to be adopted and also to provide users with a journey to more complex workflows and automations.

Future Vision:

Notion envisions AI automating tedious tasks, allowing users to focus on higher-level, more strategic activities, and using Notion as a host for workflows. They also want databases to become an implementation detail, where AI manages them in the background and the human user is free to set higher-level strategies and observe outputs.

"The North Star for me for AI is basically: can you automate all the things that people don't want to be doing, all the cumbersome, tedious tasks that people do every day, and ultimately lift humans up into a high level of abstraction where they're doing high-leverage, more fulfilling activities?"

AI in Development:

Notion engineers are actively using AI tools such as Cursor for autocomplete and other code-editing tasks. They are also experimenting with tools from various coding-agent startups to further automate development processes. Simon uses Claude for coding and ChatGPT (GPT-4) for chat tasks.

Conclusion

Notion’s experience building Notion AI underscores the significant shift in product development introduced by AI. The focus on rapid iteration, robust evaluation, and user-centered design highlights the challenges and opportunities in this evolving landscape. Notion’s vision for the future involves AI not only enhancing but fundamentally transforming how users interact with the platform and approach knowledge work. The company believes that AI is key to abstracting away tedious work and empowering users to focus on higher-level strategic thinking.
