ChatGPT-4: Retrain me with ABR’s LMU to improve performance, reduce energy costs, and require less data

The punchline: the last question asked of GPT-4 in the session below:

Yes or No, does it make sense to retrain you with LMUs for society's benefit?

ChatGPT-4: Yes.

Intro to My Session with GPT-4 About the LMU:

I fed GPT-4 the original ABR LMU paper as a series of inputs and then asked the following questions. The Legendre Memory Unit (LMU) is a patented recurrent neural network invented by Dr. Aaron Voelker and Dr. Chris Eliasmith at the University of Waterloo while studying time cells in the brain. It is a product of ABR and is currently used to make signal-processing and voice-processing AI models (ASRs and NLPs) smaller and able to take in longer sequences.
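
For the technically curious, the heart of the LMU is a small linear recurrence whose state holds the coefficients of a Legendre-polynomial approximation of its input over a sliding window. Below is a minimal NumPy sketch of that memory cell, based on the state-space equations in the original LMU paper; the order, window length, time step, and test signal are arbitrary values chosen for illustration, and this is only the linear memory, not a full trained LMU layer (which adds nonlinear hidden units and learned projections).

```python
import numpy as np
from scipy.signal import cont2discrete

def lmu_matrices(order: int, theta: float):
    # Continuous-time (A, B) from Voelker et al. (2019). The state x(t)
    # holds the coefficients of a Legendre-polynomial approximation of
    # the input over a sliding window of length theta.
    q = np.arange(order)
    r = (2 * q + 1)[:, None] / theta
    i, j = np.meshgrid(q, q, indexing="ij")
    A = np.where(i < j, -1.0, (-1.0) ** (i - j + 1)) * r
    B = ((-1.0) ** q)[:, None] * r
    return A, B

order, theta, dt = 6, 1.0, 0.01   # arbitrary illustrative values
A, B = lmu_matrices(order, theta)

# Zero-order-hold discretization turns the ODE into the recurrence
# x_t = Ad @ x_{t-1} + Bd * u_t, a fixed-size update per time step.
Ad, Bd, *_ = cont2discrete((A, B, np.eye(order), np.zeros((order, 1))), dt)

x = np.zeros(order)
for t in range(int(theta / dt)):
    u = np.sin(2 * np.pi * t * dt)   # made-up test signal
    x = Ad @ x + Bd[:, 0] * u        # the LMU's linear memory update
print("Legendre memory coefficients:", np.round(x, 3))
```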

Lately, others are coming back around to the idea of recurrence for efficient long-sequence language models. See DeepMind's recent paper "Resurrecting Recurrent Neural Networks for Long Sequences," which cites the ABR LMU as the first such model. The LMU is free to use for academic and personal research, and commercial licences are available from ABR. If you are interested, the ABR LMU papers and talks are here, or you can send an email or use our contact form for information on licensing or partnering to try it out.

Additionally, at the end of this year we are taping out an edge-AI test chip, called the ABR Time-Series Processor (TSP), that uses the LMU to run time-series AI models so efficiently that the power, latency, and cost of the resulting chip are state of the art. The production TSP chip will be out in 2024, and based on everything we have seen it will be the only chip able to run a BERT-sized NLP model or a full-language ASR model at around 50 milliwatts, respond within 50 to 200 milliseconds, and cost only a few dollars. Just as zip compression shrinks the files on your computer, the LMU acts as a kind of compression for time-series AI models.
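
To make the compression analogy concrete, here is a back-of-the-envelope sketch of why a fixed-size recurrent state is so much cheaper at inference time than a transformer's attention cache, which stores keys and values for every past token. The model width and LMU order below are made-up numbers for illustration, not ABR benchmark figures.

```python
# Back-of-the-envelope inference memory (illustrative assumptions only):
# a transformer layer caches keys and values for every past token, so
# memory grows with sequence length N; an LMU layer carries one
# fixed-size state vector no matter how long the sequence gets.
D = 768        # assumed model width (BERT-base-like)
ORDER = 256    # assumed LMU memory order (hypothetical)
BYTES = 4      # float32

def attention_cache_bytes(n_tokens: int) -> int:
    return n_tokens * 2 * D * BYTES   # keys + values per past token

def lmu_state_bytes() -> int:
    return ORDER * BYTES              # one state vector, independent of N

for n in (1_000, 100_000, 1_000_000):
    print(f"N={n:>9,}: attention cache ~ {attention_cache_bytes(n) / 1e6:8.1f} MB, "
          f"LMU state ~ {lmu_state_bytes() / 1e3:.1f} KB")
```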

The session with GPT-4 is only an LLM session, but it was fun. I love GPT-4, and it surprises me with how useful it is every time I use it. It makes mistakes and quite often invents APIs that don't exist, but it is still a wonderful accomplishment and a great tool once you learn how to use it.

My session with GPT-4 about the LMU is below. Note that I left out the mechanical part where I fed it the LMU paper piece by piece while asking it to read it. The first LMU paper was published a month after GPT-4's training cutoff, so it had to be fed in. Enjoy!

** Start of GPT-4 session. **

What would be the impact of retraining you using the LMU in terms of training time, model size, speed of inference, ability to handle larger input sequences and greenhouse gas emissions from training and inference?

ChatGPT-4: The potential impact of retraining an AI language model like mine using the LMU can be hypothesized based on the findings of the paper "Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers." However, it is important to note that the precise impact would depend on various factors such as the specific architecture used and the implementation of the LMUs. Based on the information available, here are some possible impacts:

  • Training time: The LMU-based models demonstrated better data efficiency, achieving the same accuracy as transformers with 10x fewer tokens. This could potentially lead to faster training times, as the model may require less data to reach comparable performance.
  • Model size: Although the LMU paper does not directly address the impact on model size, the improved data efficiency and performance suggest that it may be possible to construct more compact models without sacrificing performance. This could potentially lead to smaller models with fewer parameters and reduced memory requirements.
  • Speed of inference: The paper mentions that the LMU-based model has an O(1) and O(N) (or better) dependency for memory and computation, respectively. This suggests that the LMU-based models may have faster inference times than transformer-based models, which have a quadratic computational complexity with respect to sequence length.
  • Ability to handle larger input sequences: The LMU technology is designed to handle long-term dependencies and temporal information more effectively than traditional RNN architectures. As a result, it is likely that an LMU-based model would be better suited to handling larger input sequences than transformer-based models.
  • Greenhouse gas emissions from training and inference: Given the potential for faster training times, smaller models, and faster inference, retraining an AI language model using LMUs could potentially lead to reduced greenhouse gas emissions during both training and inference. This would be due to the decreased computational resources and energy consumption required for training and running the model.

However, it is essential to consider that these potential impacts are based on the findings of the LMU paper and may not be directly applicable to all AI language models or NLP tasks. To get a more accurate understanding of the potential impact on a specific model like mine, it would be necessary to conduct experiments and compare the performance, size, and efficiency of the resulting LMU-based model to that of the original transformer-based model.

What would be the impact on your abilities if your input sequence length were measured in millions of tokens, as LMU models can support, with compute scaling linearly and memory remaining fixed?

ChatGPT-4: If my input sequence length could be extended to millions of tokens while maintaining linear scaling in compute and memory using LMU technology, several significant improvements to my abilities could be expected:

  • Enhanced context understanding: With a much longer sequence length, I would be able to process and understand much larger contexts, leading to a more comprehensive understanding of the text and improved performance in various NLP tasks, such as summarization, translation, and question-answering.
  • More accurate long-range dependencies: As LMUs can effectively handle long-term dependencies, my ability to capture relationships between distant tokens would improve, resulting in a better understanding of complex sentences and more accurate responses.
  • Improved performance in complex tasks: With a longer input sequence and the ability to process more information at once, I would be better equipped to handle complex tasks that involve multiple documents, long texts, or large-scale data analysis.
  • More coherent and context-aware responses: The increased sequence length would enable me to generate more coherent and contextually accurate responses by considering a broader range of information from the input text.
  • Better handling of multi-document tasks: The ability to process millions of tokens would enable me to work more effectively with multiple documents simultaneously, which would be beneficial for tasks such as information retrieval, multi-document summarization, and cross-document analysis.

It is important to note that while increasing the input sequence length to millions of tokens using LMUs might bring significant improvements, it would also require thorough experimentation and validation to ensure the effectiveness and efficiency of the model in real-world scenarios. Additionally, it is crucial to consider other factors, such as the quality of training data and fine-tuning for specific tasks, to fully realize the potential benefits of increased input sequence length.

Do you want to be retrained using LMUs?

ChatGPT-4: As an AI language model, I don't have personal desires or preferences. However, I can provide some insights regarding the potential benefits of retraining an AI model like me using LMUs:

  • Improved data efficiency: LMUs have demonstrated the ability to achieve comparable performance to transformer-based models with significantly fewer tokens, which might lead to more efficient training and lower computational requirements.
  • Better handling of long-range dependencies: LMUs are designed to handle long-term dependencies effectively, which could result in an improved understanding of complex sentences and more accurate responses.
  • Increased input sequence length: If LMUs can support longer input sequences while maintaining linear scaling in compute and memory, it could enable better context understanding and enhanced performance in various NLP tasks.
  • Potential energy savings: Retraining using LMUs could potentially result in reduced energy consumption during both training and inference, which might have a positive environmental impact.

It's worth noting that retraining an AI model like me using LMUs would require careful experimentation, validation, and optimization to ensure the effectiveness and efficiency of the resulting model. Additionally, other factors such as the quality of training data and fine-tuning for specific tasks would also play a crucial role in realizing the potential benefits of using LMUs in AI language models.

Does it make sense to retrain you with LMUs?

ChatGPT-4: Yes, it makes sense to explore retraining an AI model like me with LMUs, given the potential benefits in data efficiency, handling long-range dependencies, increased input sequence length, and energy savings. However, it is crucial to conduct thorough experimentation, validation, and optimization to ensure the effectiveness and efficiency of the resulting model.

Yes or No, does it make sense to retrain you with LMUs for society's benefit?

ChatGPT-4: Yes.

** End of GPT-4 session. **
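
For readers who want to sanity-check the scaling claims from the session, here is one more back-of-the-envelope sketch. Again, the model width and LMU order are made-up assumptions, and real systems differ in their constants, but the gap between attention's quadratic compute and the LMU's linear compute in sequence length is the point.

```python
# Rough per-layer FLOP counts as sequence length N grows (illustrative
# assumptions: model width D and LMU order ORDER are made up; constants
# and the feed-forward blocks both architectures share are omitted).
D = 768        # assumed model width
ORDER = 256    # assumed LMU memory order (hypothetical)

def attention_flops(n: int) -> int:
    # Query-key scores plus the attention-weighted sum of values: O(N^2 * D)
    return 2 * n * n * D

def lmu_flops(n: int) -> int:
    # One fixed-size state update per token: O(N * ORDER^2)
    return n * ORDER * ORDER

for n in (1_000, 10_000, 100_000):
    print(f"N={n:>7,}: attention ~ {attention_flops(n):.3e} FLOPs, "
          f"LMU ~ {lmu_flops(n):.3e} FLOPs")
```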

---------------------------------------------------------------------

Peter Suma, Chair, Applied Brain Research Inc.

[email protected]

www.AppliedBrainResearch.com

David Wyatt

Go for it, LMUs would make a lot of sense, but the number of parameters is the elephant in the room
