
Reproducible AI: How and Why?

Vincent Granville

AI/LLM Disruptive Leader | GenAI Tech Lab

Published: August 14, 2024

Most LLMs are not reproducible because the underlying deep neural networks are not, and reproducibility is not something most LLM creators care about. We do, and ours are reproducible, including our GenAI systems that use GANs.

With traditional AI, you save the weights, which represent the model. They take a massive amount of space, and you can't reproduce them if you don't save them. You could if you had saved the 3-4 seeds that generate them, along with the other hyperparameters. But in practice, AI creators don't even specify the seeds, relying on default ones that change each time you re-run the steps that create the model. This does not help with fine-tuning, especially for models sensitive to seeds.
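To make the point about seeds concrete, here is a minimal sketch, not the author's code, of a toy model whose weights can be regenerated from a single seed plus its hyperparameters. The function name `train`, the toy data, and the hyperparameter values are all illustrative assumptions, and only Python's standard library is used.

```python
import random

def train(seed: int, lr: float = 0.05, epochs: int = 20) -> list[float]:
    """Fit y = w*x + b by SGD; all randomness comes from one seeded generator."""
    rng = random.Random(seed)  # private generator, independent of global state

    # Hypothetical toy data: y = 2x + 1 plus seeded Gaussian noise.
    xs = [i / 10 for i in range(20)]
    data = [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]

    # Seeded weight initialization.
    w, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)

    for _ in range(epochs):
        rng.shuffle(data)  # seeded shuffling of the training order
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]

# The "recipe" (seed + hyperparameters) is enough to rebuild the weights.
recipe = {"seed": 42, "lr": 0.05, "epochs": 20}
weights_a = train(**recipe)
weights_b = train(**recipe)
reproduced = weights_a == weights_b               # same recipe, same weights
seed_sensitive = train(seed=43) != weights_a      # another seed, other weights
```

In this single-threaded toy setting, storing the recipe is equivalent to storing the weights. Real deep learning stacks add other sources of nondeterminism, such as parallel GPU kernels, that a seed alone does not remove.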

Yet all you have to do is let the user specify the seeds of every random number generator involved, whether it comes from PyTorch, Pandas, NumPy, Random, the GPU, base Python, or any other source of randomness (libraries) you use to create your model. First, you need a good random number generator that you fully control, one better than numpy.random, whose updates can also cause previous seeds to stop working as expected, depending on the library version. See our random number generator, with infinite period and one line of code, faster and better than what's in Python and elsewhere. New tests also uncovered more flaws in numpy.random.
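As a rough illustration of letting the user set the seeds, the sketch below seeds the generators that are commonly involved. The helper name `seed_everything` is hypothetical, NumPy and PyTorch are treated as optional and skipped if not installed, and this is not the author's own generator, which is described in the full article.

```python
import random

def seed_everything(seed: int) -> list[str]:
    """Seed the global generators that are available; return which ones were seeded."""
    seeded = []

    random.seed(seed)  # base Python
    seeded.append("random")

    try:
        import numpy as np
        np.random.seed(seed)  # legacy global NumPy generator
        seeded.append("numpy")
    except ImportError:
        pass  # NumPy not installed: nothing to seed

    try:
        import torch
        torch.manual_seed(seed)  # PyTorch CPU generator
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)  # every visible GPU
        seeded.append("torch")
    except ImportError:
        pass  # PyTorch not installed: nothing to seed

    return seeded

# Usage: the same user-supplied seed yields the same random draws.
seed_everything(123)
first_run = [random.random() for _ in range(3)]
seed_everything(123)
second_run = [random.random() for _ in range(3)]
identical = first_run == second_run
```

Seeding alone fixes the sequence only for a given library version and hardware. That limitation is the reason for wanting a generator you fully control.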

Read the full article, with access to Python code on GitHub, and case studies.


A word from our sponsor: for a very fast, versatile database that can handle large AI applications in real time and works with various architectures (JSON, graph, vector, SQL and so on) more efficiently than in their native environment, I invite you to attend this event.

This hands-on workshop is for developers and AI professionals, featuring state-of-the-art technology, case studies, code-share, and live demos. Recording and GitHub material will be available to registrants who cannot attend the free 60-min session.

GenAI and Machine Learning

204,993 followers

Sukrit Goel

Founder & CEO @InteligenAI | Customized AI Solutions and AI Strategy tailored for your business | Hiring across multiple profiles

3 months ago

Interesting insights Vincent!

Gregory H.

TOGAF | Microsoft COTS Enterprise | Cybersecurity

3 months ago

Vincent Granville, this could be an advantage or a disadvantage depending on the LLM's corpus? Any disadvantage of LLMs being reproducible should be solved using SLMs? I guess I am confused about the issue reproducible neural networks would solve. Or is this a statement of fact?

1 reply

Boris A Roginsky

Controlled & Ethical AI & Cyber Integration Executive with full range of the Govern, Policies, Finance, Risk & Compliance Acumen over multiple business spheres. I make the impossible possible, the improbable probable.

3 months ago

Great article Vincent, bravo for sharing and publishing it. The notion of the random number generator is very specific to the use cases in many AI, Gen AI, and ML algorithms and practicalities. Python handles it extremely well with appropriate library inclusion. Other very useful approaches for Gen AI and ML are available as well. It all depends on the specific use case and the constraints on computing power.

1 reply

David Sherr

3 months ago

Golden Copy is possible if reproducible

1 reply

Corporates Guide

3 months ago

Nicely explained Vincent

1 reply
