Footprint of Generative AI NLP Models
https://arxiv.org/ftp/arxiv/papers/2104/2104.10350.pdf

One of the pleasures (and challenges) of working at the intersection of innovation and sustainability is the need to always check the footprint of new innovations. #generativeai, and particularly #chatgpt, has been all the rage over the past few months.


Everyone is amazed by the results and keen to find real-life use cases. I am equally tempted to find use cases where #generativeai can help with #sustainability issues. But with my sustainability hat on, the first question I ask is: what is the footprint of these models?


Teams from Google and UC Berkeley have published this paper on Carbon Emissions and Large Neural Network Training, reviewing the footprint of training the top NLP models currently in use.

Per the authors, "The computation demand for machine learning (ML) has grown rapidly recently, which comes with a number of costs. Estimating the energy cost helps measure its environmental impact and finding greener strategies, yet it is challenging without detailed information."

"We calculate the energy use and carbon footprint of several recent large models—T5, Meena, GShard, Switch Transformer, and GPT-3—and refine earlier estimates for the neural architecture search that found Evolved Transformer. We highlight the following opportunities to improve energy efficiency and CO2 equivalent emissions (CO2e):

● Large but sparsely activated DNNs can consume <1/10th the energy of large, dense DNNs without sacrificing accuracy despite using as many or even more parameters.

● Geographic location matters for ML workload scheduling since the fraction of carbon-free energy and resulting CO2e vary ~5X-10X, even within the same country and the same organization. We are now optimizing where and when large models are trained.

● Specific datacenter infrastructure matters, as Cloud datacenters can be ~1.4-2X more energy efficient than typical datacenters, and the ML-oriented accelerators inside them can be ~2-5X more effective than off-the-shelf systems."

It is really interesting to see how energy-intensive GPT-3 is compared to the other four models.

