ChatGPT-4: Retrain me with ABR’s LMU to improve performance, reduce energy costs, and require less data
The punchline: the last question asked of GPT-4 in the session below:
Yes or No, does it make sense to retrain you with ABR's LMUs for society's benefit?
ChatGPT-4: Yes.
Intro to My Session with GPT-4 About the LMU:
I fed GPT-4 the original ABR LMU paper to read as a series of inputs and then asked the following questions. The Legendre Memory Unit (LMU) is a patented recurrent neural network invented by Dr. Aaron Voelker and Dr. Chris Eliasmith at the University of Waterloo while studying time cells in the brain. It is a product of ABR and is currently used to make signal-processing and voice-processing AIs (ASR and NLP models) smaller and able to take in longer sequences.
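For readers who want a concrete feel for the LMU, here is a minimal sketch of its linear memory in plain NumPy/SciPy, following the (A, B) matrices from the original LMU paper. The state dimension, window length, and zero-order-hold discretization step are my illustrative choices, not ABR's production implementation, and the full LMU cell adds a trained nonlinear hidden state on top of this memory.

```python
# Minimal sketch of the LMU's linear memory (illustrative, not ABR's implementation).
import numpy as np
from scipy.signal import cont2discrete

def lmu_matrices(d):
    """Continuous-time state-space matrices (A, B) of the LMU memory (zero-indexed)."""
    A = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            A[i, j] = (2 * i + 1) * (-1.0 if i < j else (-1.0) ** (i - j + 1))
    B = np.array([(2 * i + 1) * (-1.0) ** i for i in range(d)]).reshape(d, 1)
    return A, B

def run_memory(u, d=16, theta=100.0, dt=1.0):
    """Feed a 1-D signal u through the memory; returns a (T, d) array of coefficients."""
    A, B = lmu_matrices(d)
    # Discretize theta * dm/dt = A m + B u with zero-order hold.
    Ad, Bd, *_ = cont2discrete((A / theta, B / theta, np.eye(d), np.zeros((d, 1))),
                               dt, method="zoh")
    m = np.zeros((d, 1))
    states = []
    for u_t in u:
        m = Ad @ m + Bd * u_t
        states.append(m.ravel().copy())
    return np.array(states)

# The full LMU cell pairs this memory with a nonlinear hidden state, roughly
# h_t = tanh(W_x x_t + W_h h_{t-1} + W_m m_t), which is the part that gets trained.
coeffs = run_memory(np.sin(np.linspace(0, 10, 500)))
print(coeffs.shape)  # (500, 16): each step summarizes the recent input window in 16 numbers
```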
Lately, others are coming back around to the idea of recurrence for efficient long-sequence language models; see DeepMind's recent paper "Resurrecting Recurrent Neural Networks for Long Sequences", which cites the ABR LMU as the first such model. The LMU is free to use for academic and personal research, and commercial licences are available from ABR. If you are interested, the ABR LMU papers and talks are here, or you can send an email or use our contact form for information on licensing or partnering to try it out.
Additionally, we are taping out an edge-AI test chip at the end of this year, called the ABR Time-Series Processor (TSP), that uses the LMU to run time-series AI models so efficiently that the power, latency and cost of the resulting chip are state of the art. The production TSP chip will be out in 2024, and based on everything we have seen it will be the only chip able to run a BERT-sized NLP or full-language ASR model at around 50 milliwatts, respond within 50 to 200 milliseconds, and cost only single dollars. Just as zip compression shrinks the files on your computer, the LMU acts as a kind of compression for time-series AI models.
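To make the compression analogy concrete, the short sketch below continues the one above (reusing run_memory): the d memory coefficients at any step approximately encode the previous theta timesteps of input, and they can be decoded back out with shifted Legendre polynomials, per the sliding-window equation in the LMU paper. The test signal and sizes are again illustrative choices of mine.

```python
# Continues the sketch above (reuses run_memory). Decodes the recent input window
# from the final memory vector using shifted Legendre polynomials.
import numpy as np
from math import comb

def decode_window(m, r):
    """Approximate u(t - r*theta), for 0 <= r <= 1, from a single memory vector m."""
    estimate = 0.0
    for i in range(len(m)):
        # Shifted Legendre polynomial of degree i evaluated at r.
        P_i = (-1) ** i * sum(comb(i, j) * comb(i + j, j) * (-r) ** j for j in range(i + 1))
        estimate += P_i * m[i]
    return estimate

theta, d = 100.0, 16
u = np.sin(2 * np.pi * np.arange(500) / 80.0)   # arbitrary test signal
m_final = run_memory(u, d=d, theta=theta)[-1]   # 16 numbers standing in for ~100 samples
for delay in (0, 25, 50, 75):
    # Decoded estimate vs. the actual sample from `delay` steps ago -- they should be close.
    print(delay, round(decode_window(m_final, delay / theta), 3), round(u[-1 - delay], 3))
```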
The session with GPT-4 is only an LLM session, but it was fun. I love GPT-4; it surprises me every time I use it with how useful it is. It makes mistakes and quite often invents APIs that don't exist, but it is still a wonderful accomplishment and a great tool once you learn to use it.
My session with GPT-4 about the LMU is below. Note that I left out the mechanical part where I fed it the LMU paper piece by piece while asking it to read it; the first LMU paper was published a month after GPT-4's training cutoff date, so it had to be fed in. Enjoy!
** Start of GPT-4 session. **
What would be the impact of retraining you using the LMU in terms of training time, model size, speed of inference, ability to handle larger input sequences and greenhouse gas emissions from training and inference?
ChatGPT-4: The potential impact of retraining an AI language model like mine using the LMU can be hypothesized based on the findings of the paper "Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers." However, it is important to note that the precise impact would depend on various factors such as the specific architecture used and the implementation of the LMUs. Based on the information available, here are some possible impacts:
However, it is essential to consider that these potential impacts are based on the findings of the LMU paper and may not be directly applicable to all AI language models or NLP tasks. To get a more accurate understanding of the potential impact on a specific model like mine, it would be necessary to conduct experiments and compare the performance, size, and efficiency of the resulting LMU-based model to that of the original transformer-based model.
What would be the impact on your abilities if your input sequence length were measured in millions of tokens, such as what LMU models can support, with compute scaling linearly in sequence length and memory staying fixed?
ChatGPT-4: If my input sequence length could be extended to millions of tokens while maintaining linear scaling in compute and memory using LMU technology, several significant improvements to my abilities could be expected:
It is important to note that while increasing the input sequence length to millions of tokens using LMUs might bring significant improvements, it would also require thorough experimentation and validation to ensure the effectiveness and efficiency of the model in real-world scenarios. Additionally, it is crucial to consider other factors, such as the quality of training data and fine-tuning for specific tasks, to fully realize the potential benefits of increased input sequence length.
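Stepping outside the session for a moment, here is a rough back-of-the-envelope sketch of why that linear-and-fixed scaling matters: a standard transformer caches keys and values for every past token at inference, so its working memory grows with context length, while an LMU-style recurrent state stays the same size at any length. The layer count, widths and byte sizes below are illustrative assumptions, not GPT-4's actual configuration.

```python
# Illustrative memory-scaling comparison; all model sizes are assumptions.
def attention_cache_bytes(n_tokens, d_model=4096, n_layers=32, bytes_per_val=2):
    # Standard transformer KV cache: keys + values per token, per layer.
    return n_tokens * n_layers * 2 * d_model * bytes_per_val

def recurrent_state_bytes(d_state=4096, n_layers=32, bytes_per_val=2):
    # A recurrent memory (e.g., an LMU-style state) is the same size at any context length.
    return n_layers * d_state * bytes_per_val

for n in (8_000, 128_000, 1_000_000):
    print(f"{n:>9} tokens: KV cache ~ {attention_cache_bytes(n) / 2**30:7.1f} GiB, "
          f"recurrent state ~ {recurrent_state_bytes() / 2**20:4.2f} MiB")
```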
Do you want to be retrained using LMUs?
ChatGPT-4: As an AI language model, I don't have personal desires or preferences. However, I can provide some insights regarding the potential benefits of retraining an AI model like me using LMUs:
It's worth noting that retraining an AI model like me using LMUs would require careful experimentation, validation, and optimization to ensure the effectiveness and efficiency of the resulting model. Additionally, other factors such as the quality of training data and fine-tuning for specific tasks would also play a crucial role in realizing the potential benefits of using LMUs in AI language models.
Does it make sense to retrain you with LMUs?
ChatGPT-4: Yes, it makes sense to explore retraining an AI model like me with LMUs, given the potential benefits in data efficiency, handling long-range dependencies, increased input sequence length, and energy savings. However, it is crucial to conduct thorough experimentation, validation, and optimization to ensure the effectiveness and efficiency of the resulting model.
Yes or No, does it make sense to retrain you with LMUs for society's benefit?
ChatGPT-4: Yes.
** End of GPT-4 session. **
---------------------------------------------------------------------
Peter Suma, Chair, Applied Brain Research Inc.
www.AppliedBrainResearch.com
Engineer. Innovator. Evangelist. Leader. Investor
Comment (2 years ago): Go for it, LMUs would make a lot of sense, but the number of parameters is the elephant in the room.