Investing in the Age of AI

How Machine Learning Can Expand Sustainability Data Coverage

Published: January 8, 2022
Modified: August 13, 2025
Key Takeaways

Using estimation models to improve sustainability reporting

Limited data coverage is a major hurdle, and machine learning can help overcome it. Today, 80% of listed companies do not report required sustainability data. In other words, even setting reliability issues aside, only 20% of publicly listed companies report comprehensive sustainability data as a baseline. Many providers then work with partial or missing information, which makes it difficult to score peers consistently and can skew scores in favor of companies that disclose selectively, omitting the indicators on which they lag. This is why Clarity AI combines available company information with machine-learning algorithms to fill the information gaps and give the fullest available picture.

Geographically, Europe has led the way with national regulations on corporate climate change reporting, which has resulted in the highest GHG reporting coverage among major world regions. Meanwhile, the US Securities and Exchange Commission is preparing a specific climate reporting regulation for 2022. The expectation is that reporting coverage in North America will catch up with the European rate within the next couple of years.

GHG Reporting Coverage, by region

Clarity AI’s Estimation Models

One application of machine learning is our estimation models. The underlying principle is to work out how sustainability performance metrics can be derived from other corporate attributes. The models take a wide range of data sources and features (information about the organization) as input, including, for example:

  • What industry are you in?
  • What types of products and services do you sell?
  • Are you a manufacturer?
  • Where do you make your products?
  • Where do you sell your products?
  • What are your labor costs?
  • What are other environmental features that may be correlated with the metric of interest? (This depends on the metric.)

Flowchart of Clarity AI’s Estimation Model Process
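
To make the idea of features concrete, here is a minimal sketch of how answers to questions like those above might be turned into a numeric input vector. It is an illustration only, not Clarity AI’s actual pipeline: the `Company` fields, the category lists, and the `to_features` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical category lists, chosen only for illustration.
INDUSTRIES = ["utilities", "software", "retail"]
REGIONS = ["europe", "north_america", "asia"]


@dataclass
class Company:
    # Illustrative fields mirroring the questions in the list above.
    industry: str
    is_manufacturer: bool
    production_region: str
    sales_region: str
    labor_cost_musd: float  # labor costs in millions of USD (assumed unit)


def one_hot(value, categories):
    """Return a 0/1 list with a 1 at the position of `value` (all zeros if unseen)."""
    return [1.0 if value == c else 0.0 for c in categories]


def to_features(company):
    """Concatenate encoded attributes into one flat feature vector."""
    return (
        one_hot(company.industry, INDUSTRIES)
        + [1.0 if company.is_manufacturer else 0.0]
        + one_hot(company.production_region, REGIONS)
        + one_hot(company.sales_region, REGIONS)
        + [company.labor_cost_musd]
    )


# A made-up company, not real data.
example = Company("utilities", True, "europe", "asia", 120.0)
features = to_features(example)
print(features)
```

Categorical answers (industry, regions) become one-hot indicators, yes/no answers become a single 0/1 value, and numeric answers such as labor costs pass through unchanged. A real model would likely use many more features and some scaling of the numeric ones.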

Key differentiators of Clarity AI’s methodology are the estimation of the intensity of the metric, the use of holdout data to test the predictive accuracy of the model, and accounting for both non-linear and interaction effects. These are crucial for estimating certain sustainability metrics, such as CO2 emissions.
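
The first two ideas, estimating an intensity and testing it on holdout data, can be sketched in a few lines. The example below is a deliberately simple stand-in, not Clarity AI’s model: it assumes intensity means emissions per unit of revenue, uses a peer-group median instead of a machine-learning algorithm, and runs on synthetic, made-up numbers. Grouping by the (industry, manufacturer) pair is only a crude way of letting two attributes interact; it does not capture non-linear effects.

```python
from collections import defaultdict
from statistics import median

# Synthetic records, invented for illustration:
# (industry, is_manufacturer, revenue_musd, emissions_tco2e)
reported = [
    ("utilities", True, 100.0, 50000.0),
    ("utilities", True, 200.0, 90000.0),
    ("utilities", True, 150.0, 72000.0),
    ("software", False, 100.0, 1000.0),
    ("software", False, 300.0, 3300.0),
    ("software", False, 200.0, 1800.0),
]

# Hold out every third record; the model never sees these during fitting.
holdout = reported[2::3]
train = [r for i, r in enumerate(reported) if i % 3 != 2]

# "Fit": median emissions intensity (tCO2e per million USD) per peer group.
intensities = defaultdict(list)
for industry, is_manufacturer, revenue, emissions in train:
    intensities[(industry, is_manufacturer)].append(emissions / revenue)
model = {group: median(values) for group, values in intensities.items()}

# Predict absolute emissions for holdout companies as intensity * revenue,
# then compare against the values they actually reported.
errors = []
for industry, is_manufacturer, revenue, emissions in holdout:
    predicted = model[(industry, is_manufacturer)] * revenue
    errors.append(abs(predicted - emissions) / emissions)

# Mean absolute percentage error on the holdout set.
mape = sum(errors) / len(errors)
print(f"Holdout MAPE: {mape:.1%}")
```

Estimating the intensity rather than the absolute value keeps company size out of the thing being learned, and the holdout error gives an honest estimate of how well the model would perform for companies that do not report at all.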

Data Coverage by PAI

