Accelerating Supply Chain Carbon Accounting: LLMs to the Rescue

Scope 3 emissions, often called value chain emissions, are indirect greenhouse gas emissions that stem from an organization's activities but come from sources it does not own or directly control. In simpler terms, they arise outside the company's core operations, for example from its suppliers and customers.

Scope 3 emissions are significant. A 2022 Carbon Disclosure Project (CDP) study [1] found that, for companies reporting to CDP, supply chain emissions are the biggest contributor to greenhouse gas emissions, on average 11.4 times higher than operational emissions.

However, most enterprises leave Scope 3 emissions largely unaddressed and focus primarily on reducing Scope 1 and Scope 2 emissions [2]. Some supply-chain-reliant companies attempt to estimate Scope 3 emissions by collecting data from suppliers, but progress is hindered by the large number of stakeholders, complex data collection, a lack of standardized emission factors, and the substantial investment and resources required.

Conventional Approaches to Scope 3 Emission Estimation

Traditional methods for estimating Scope 3 emissions involve collecting data from suppliers and customers, including information on energy use, transportation, and other emission-generating activities. The collected data is then used to calculate the company's Scope 3 emissions with methods such as input-output analysis and life cycle assessment.

Challenges Associated with Traditional Methods

Traditional methods for estimating Scope 3 emissions are often both inaccurate and time-consuming: they require collecting data from a large number of suppliers and customers, and the data collected is frequently incomplete or inaccurate.

A New Approach Using Large Language Models

One alternative is to estimate Scope 3 emissions from financial transactions between enterprises. Transaction amounts can serve as a proxy for the quantity of goods and services purchased and, in turn, for their embedded greenhouse gas emissions.

The US Environmentally-Extended Input-Output (USEEIO) model [3] is a life cycle assessment (LCA) framework that traces economic and environmental flows of goods and services within the United States. USEEIO offers a comprehensive dataset and methodology that merges economic input-output analysis with environmental data to quantify the environmental consequences of economic activities. Within USEEIO, goods and services are categorized into 66 broad summary groups, referred to as commodity classes, based on their shared environmental characteristics. Each commodity class is associated with emission factors that are used to estimate environmental impacts from expenditure data.

However, while spend-based Scope 3 emission estimation offers a way to address this complex issue, it depends on recognizing the commodity behind each transaction and mapping it to one of these 66 broad commodity classes. Manually mapping financial ledger entries to these classes is exceptionally difficult, if not practically impossible.

This is where large language models (LLMs) come into play. Recent years have seen remarkable progress in large foundation language models for natural language processing (NLP). These models have consistently outperformed conventional machine learning models, particularly where labelled data is in short supply. Combining large pre-trained NLP models with domain adaptation techniques that make efficient use of limited data offers significant potential for tackling the Scope 3 challenge in the domain of climate and sustainability.

Rather than depending on raw data from supply chain participants, our approach fine-tunes foundation models to recognize the EEIO (Environmentally-Extended Input-Output) commodity class of purchase orders or ledger entries written in natural language. We then calculate their Scope 3 emissions by combining the expenditure amount with EEIO emission factors (emissions per $ spent) sourced from Supply Chain GHG Emission Factors for US Commodities and Industries v1.1 [3]. This framework promises to simplify and streamline how businesses monitor their Scope 3 emissions.
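The emission computation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the paper: the `LedgerEntry` type, the `estimate_scope3` function, the sample ledger lines, and above all the emission factor values are hypothetical placeholders. Real factors would be looked up in the published EEIO emission factor tables.

```python
from collections import defaultdict
from dataclasses import dataclass

# HYPOTHETICAL emission factors in kg CO2e per USD spent, keyed by commodity
# class. The numbers are placeholders for illustration only; they are NOT
# values from the EEIO emission factor dataset.
EMISSION_FACTORS_KG_PER_USD = {
    "Air transportation": 1.0,
    "Computer and electronic products": 0.25,
    "Legal services": 0.125,
}


@dataclass(frozen=True)
class LedgerEntry:
    description: str      # free-text line from a purchase order or ledger
    spend_usd: float      # expenditure size in US dollars
    commodity_class: str  # commodity class predicted by the classifier


def estimate_scope3(entries, factors):
    """Return (total, per-class breakdown) of estimated emissions in kg CO2e."""
    by_class = defaultdict(float)
    for entry in entries:
        if entry.commodity_class not in factors:
            raise KeyError(f"No emission factor for {entry.commodity_class!r}")
        # Spend-based estimate: dollars spent times emissions per dollar.
        by_class[entry.commodity_class] += entry.spend_usd * factors[entry.commodity_class]
    return sum(by_class.values()), dict(by_class)


# Made-up ledger lines, already labelled with a commodity class.
entries = [
    LedgerEntry("Flight tickets for sales team", 1000.0, "Air transportation"),
    LedgerEntry("Laptops for new hires", 5000.0, "Computer and electronic products"),
    LedgerEntry("Charter flight to supplier site", 500.0, "Air transportation"),
]

total_kg, breakdown = estimate_scope3(entries, EMISSION_FACTORS_KG_PER_USD)
print(f"Estimated Scope 3 emissions: {total_kg:.1f} kg CO2e")
for commodity_class, kg in sorted(breakdown.items()):
    print(f"  {commodity_class}: {kg:.1f} kg CO2e")
```

Raising an error for a class without a factor, rather than silently skipping it, is a design choice in this sketch: a missing factor would otherwise understate the total without any warning.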

The figure illustrates the proposed framework for Scope 3 emission estimation employing a large language model. This framework comprises four distinct modules: data preparation, domain adaptation, classification, and emission computation.

We conducted extensive experiments involving several cutting-edge large language models, including roberta-base, bert-base-uncased, and distilroberta-base-climate-f. Additionally, we explored non-foundation classical models based on TF-IDF and Word2Vec vectorization approaches. Our objective was to assess the potential of foundation models in estimating Scope 3 emissions using financial transaction records as a proxy for goods and services. The experimental results indicate that fine-tuned large language models exhibit significant improvements over the zero-shot classification approach. Furthermore, they outperform state-of-the-art classical text mining techniques like TF-IDF and Word2Vec, delivering performance on par with domain-expert classification.
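To give a feel for the classical baselines mentioned above, here is a minimal, dependency-free sketch of a TF-IDF text classifier for ledger lines. It is an assumption-laden toy, not the setup used in the experiments: the training lines are invented, the function names are hypothetical, and the nearest-centroid decision rule is our own choice for brevity, since the post does not say which classifier sat on top of the TF-IDF features.

```python
import math
import re
from collections import Counter, defaultdict

# TOY labelled ledger lines, invented for illustration (not the paper's data).
TRAIN = [
    ("flight tickets for sales conference", "Air transportation"),
    ("airline fare client visit", "Air transportation"),
    ("laptop purchase for engineering team", "Computer and electronic products"),
    ("server hardware and monitors", "Computer and electronic products"),
    ("outside counsel contract review", "Legal services"),
    ("attorney fees patent filing", "Legal services"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency for each token seen in training."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def tfidf(tokens, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen tokens are dropped."""
    counts = Counter(tok for tok in tokens if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}


def fit_centroids(train, idf):
    """Sum the TF-IDF vectors per class; only direction matters for cosine."""
    centroids = defaultdict(lambda: defaultdict(float))
    for text, label in train:
        for tok, v in tfidf(tokenize(text), idf).items():
            centroids[label][tok] += v
    return centroids


def classify(text, idf, centroids):
    """Return the class whose centroid is most cosine-similar to the text."""
    vec = tfidf(tokenize(text), idf)

    def cosine(centroid):
        dot = sum(v * centroid.get(tok, 0.0) for tok, v in vec.items())
        norm = math.sqrt(sum(c * c for c in centroid.values()))
        return dot / norm if norm else 0.0

    return max(centroids, key=lambda label: cosine(centroids[label]))


idf = fit_idf([tokenize(text) for text, _ in TRAIN])
centroids = fit_centroids(TRAIN, idf)
print(classify("airline tickets to Chicago", idf, centroids))
```

A purely lexical model like this can only match words it saw during training. A ledger line made entirely of unseen vocabulary scores zero against every class and falls back to an arbitrary one, which is one intuition for why pre-trained language models help when labelled data is scarce.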


Benefits of the New Approach

The proposed method offers several advantages over conventional techniques for Scope 3 emission estimation. It relies on readily available financial transaction data, treating transactions as a proxy for the embodied emissions of global supply chains. This reduces the need for extensive manual data collection from hundreds of upstream and downstream suppliers within a supply chain.

Additional Insights

The proposed framework for estimating Scope 3 emissions using large language models is a promising new approach. It holds the potential to assist enterprises in reducing their environmental footprint and contributing to a more sustainable future.

Furthermore, this approach has the potential to enhance the transparency and accountability of supply chains. By employing a data-driven methodology to estimate Scope 3 emissions, enterprises can gain a deeper understanding of the environmental impact associated with their suppliers and customers. In turn, this information can guide more sustainable decision-making throughout the entire supply chain.

The proposed approach is highly scalable and adaptable to large and intricate supply chains. This adaptability arises from its reliance on financial transaction data, which is readily available for the majority of enterprises.

In conclusion, the proposed framework for estimating Scope 3 emissions through large language models represents a significant advancement in the realm of sustainability. It has the capacity to aid businesses in reducing their environmental impact, enhancing supply chain transparency, and contributing to a more sustainable future.

****************


Co-authors: Ayush Jain, Jagabondhu Hazra, Shantanu Godbole, Kommy Weldemariam, Tamara Robinson

Tags: #ibm, #ibmresearch, #IBMResearchIndia, #climateandsustainability, #FragileEarth2023, #KDD2023

This blog is based on a paper recently accepted at the Fragile Earth: AI for Climate Sustainability workshop at KDD 2023.

https://ai4good.org/fragile-earth-2023/

https://arxiv.org/pdf/2308.01741.pdf





References:

[1] “Scoping Out: Tracking Nature Across the Supply Chain”, CDP Supply Chain Report 2022, March 2023

[2] https://impact.economist.com/sustainability/net-zero-and-energy/no-business-decarbonisation-without-supply-chain-buy-in

[3] https://www.epa.gov/land-research/us-environmentally-extended-input-output-useeio-models

