Footprint of Generative AI NLP Models
One of the pleasures (and challenges) of working at the intersection of innovation and sustainability is always checking the footprint of the innovations. #generativeai, and particularly #chatgpt, has been all the rage over the past few months.
Everyone is amazed by the outcomes and keen to find real-life use cases. I am equally tempted to find use cases for #generativeai to help with #sustainability issues. But with my sustainability hat on, the first question that comes to me is: what is the footprint of these models?
Teams from Google and UC Berkeley have published a paper, Carbon Emissions and Large Neural Network Training, reviewing the footprint of training the top NLP models currently in use.
Per the authors, "The computation demand for machine learning (ML) has grown rapidly recently, which comes with a number of costs. Estimating the energy cost helps measure its environmental impact and finding greener strategies, yet it is challenging without detailed information."
"We calculate the energy use and carbon footprint of several recent large models—T5, Meena, GShard, Switch Transformer, and GPT-3—and refine earlier estimates for the neural architecture search that found Evolved Transformer. We highlight the following opportunities to improve energy efficiency and CO2 equivalent emissions (CO2e):
● Large but sparsely activated DNNs can consume <1/10th the energy of large, dense DNNs without sacrificing accuracy despite using as many or even more parameters.
● Geographic location matters for ML workload scheduling since the fraction of carbon-free energy and resulting CO2e vary ~5X-10X, even within the same country and the same organization. We are now optimizing where and when large models are trained.
● Specific datacenter infrastructure matters, as Cloud datacenters can be ~1.4-2X more energy efficient than typical datacenters, and the ML-oriented accelerators inside them can be ~2-5X more effective than off-the-shelf systems."
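The paper's accounting boils down to a simple chain: energy = chip-hours × chip power × datacenter PUE, and CO2e = energy × grid carbon intensity. Here is a minimal back-of-the-envelope sketch of that formula; the function name and all input figures (chip count, wattage, PUE, grid intensity) are illustrative assumptions of mine, not values from the paper.

```python
# Rough CO2e estimate for one training run:
#   energy (kWh) = chip-hours x chip watts / 1000 x PUE
#   CO2e (tonnes) = energy x grid intensity (kg CO2e/kWh) / 1000
# All numbers below are hypothetical, for illustration only.

def training_co2e_tonnes(chip_hours, chip_watts, pue, grid_kg_per_kwh):
    """Return estimated metric tonnes of CO2e for a training run."""
    energy_kwh = chip_hours * chip_watts / 1000 * pue
    return energy_kwh * grid_kg_per_kwh / 1000

# Hypothetical run: 10,000 accelerators for 15 days at 300 W each,
# in a datacenter with PUE 1.1 on a grid at 0.4 kg CO2e/kWh.
chip_hours = 10_000 * 15 * 24
print(round(training_co2e_tonnes(chip_hours, 300, 1.1, 0.4), 1))
```

The sketch also makes the paper's bullets concrete: halving `grid_kg_per_kwh` (moving the job to a cleaner grid) or lowering `pue` (a more efficient datacenter) cuts the estimate proportionally, without touching the model at all.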
It is really interesting to see how energy-intensive GPT-3 is compared to the other four models.