How to Train AI Models Faster and Cheaper with New Math

Artificial intelligence (AI) is transforming the world with its amazing applications, such as natural language processing, computer vision, and generative art. However, training AI models to perform these tasks is often costly and time-consuming, requiring millions of dollars and months of computation on powerful hardware.

But what if we could train AI models faster and cheaper by using new math? That is the question that researchers and engineers are trying to answer by exploring new ways of representing and computing numbers in AI algorithms. By using smaller and simpler numbers, they hope to reduce the memory, energy, and chip area required for training AI models, while maintaining the accuracy and quality of the results.

One example of this approach is Nvidia’s per-vector scaled quantization (VSQ) technique, which uses a combination of 8-bit and 4-bit numbers to train AI models. The technique scales each number by a factor shared across the small vector it belongs to, and then rounds the result to one of the 16 values that 4 bits can represent. This way, the technique preserves the relative magnitude and sign of each number, which are important for AI calculations. Nvidia’s chip prototype using VSQ achieved accuracy comparable to 8-bit computation while using 4-bit precision, at least for the inference stage [1].
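To make the idea concrete, here is a minimal sketch of per-vector scaled quantization in Python with NumPy. This is not Nvidia’s implementation: the vector length of 16, the signed 4-bit range of -8 to 7, and the function names vsq_quantize and vsq_dequantize are illustrative assumptions.

```python
import numpy as np

def vsq_quantize(x: np.ndarray, vec_len: int = 16):
    """Quantize a 1-D array to 4-bit integer codes with one scale per vector."""
    x = x.reshape(-1, vec_len)                   # split into small vectors
    # One scale per vector, chosen so the largest magnitude maps to 7,
    # the largest positive value a signed 4-bit integer can hold.
    scales = np.abs(x).max(axis=1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid dividing by zero
    # NumPy has no 4-bit dtype, so int8 containers hold the 4-bit codes.
    q = np.clip(np.round(x / scales), -8, 7).astype(np.int8)
    return q, scales

def vsq_dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover approximate real values from the 4-bit codes and their scales."""
    return q.astype(np.float32) * scales

# Usage: quantize a random tensor, reconstruct it, and check the error.
x = np.random.randn(64).astype(np.float32)
q, s = vsq_quantize(x)
x_hat = vsq_dequantize(q, s).reshape(-1)
print("max absolute round-trip error:", np.abs(x - x_hat).max())
```

The round-trip error printed at the end gives a feel for how much precision the 4-bit codes give up in exchange for memory and bandwidth savings.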

Another example of this approach is posits, a new kind of number format proposed by John Gustafson, a computer scientist and mathematician. Posits are similar to floating-point numbers, but the number of bits devoted to the exponent and the significand varies with the value being represented. This lets posits cover a wide dynamic range while offering higher precision and lower error than floating-point numbers of the same width for values near one, which is where most neural-network weights and activations live. Posits also have a special run of bits called the regime, whose length indicates the order of magnitude of the number. Researchers have shown that posits can improve the performance and accuracy of AI algorithms, such as matrix multiplication and convolutional neural networks [2].
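The decoding rule is easier to see in code. Below is a minimal sketch of a posit decoder in Python, assuming the standard posit layout (sign, regime, exponent, fraction) with n total bits and es exponent bits that Gustafson describes; it is not the official softposit library, and the function name decode_posit is my own.

```python
def decode_posit(word: int, n: int = 8, es: int = 1) -> float:
    """Decode an n-bit posit with es exponent bits into a Python float."""
    word &= (1 << n) - 1
    if word == 0:
        return 0.0
    if word == 1 << (n - 1):
        return float("nan")                    # NaR, "not a real"

    # Negative posits are stored as the two's complement of their magnitude.
    sign = -1.0 if word >> (n - 1) else 1.0
    if sign < 0:
        word = (-word) & ((1 << n) - 1)

    bits = format(word, f"0{n}b")[1:]          # everything after the sign bit
    # Regime: a run of identical bits whose length sets the coarse magnitude.
    first = bits[0]
    run = len(bits) - len(bits.lstrip(first))
    regime = run - 1 if first == "1" else -run

    tail = bits[run + 1:]                      # skip regime and its terminator
    exp_str = tail[:es]
    exponent = int(exp_str.ljust(es, "0"), 2) if es else 0
    frac_str = tail[es:]
    fraction = int(frac_str, 2) / (1 << len(frac_str)) if frac_str else 0.0

    useed = 2 ** (2 ** es)                     # useed = 2^(2^es)
    return sign * useed ** regime * 2.0 ** exponent * (1.0 + fraction)

# Usage: 0b01000000 decodes to 1.0 and 0b01110000 to 16.0 in an 8-bit posit.
print(decode_posit(0b01000000), decode_posit(0b01110000))
```

Because the regime grows or shrinks with the magnitude, values near one keep more fraction bits, while very large or very small values trade fraction bits for extra range.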

These are just a few examples of how new math can help us train AI models faster and cheaper. By using new number formats and new ways of performing basic computations, we can unlock new possibilities, and new challenges, for AI research and development. If you are interested in learning more about this topic, you can check out these resources:

I hope you enjoyed this post and learned something new. Please feel free to share your thoughts, questions, or feedback in the comments below. Thank you for reading!

Asra Islam

Sr. Consultant

1 year ago

Thanks for sharing. It has been a while since I worked with matrices, but I am interested in AI. Even though the variable bit width is exciting, I assume it would yield only a deterministic result, limiting the scaling size even with supercomputers.
