Sustainable Investing: How Data Science can Improve Reliability of Reported Data

Published: February 3, 2022
Modified: February 3, 2022
Key Takeaways

Clarity AI standardizes ESG data to establish a reliable database prior to the implementation of CSRD reporting

Sustainability performance data are still in their early days. The Corporate Sustainability Reporting Directive (CSRD) will eventually make these data part of companies’ annual reports, subject to third-party auditing. However, CSRD will not be fully implemented until 2025, and in the meantime, limited reliability in reported data is to be expected. This limited reliability applies even to broadly used quantitative metrics such as Scope 1 CO2 emissions, which, despite being a highly material metric, can suffer from high variability among data sources. This variability is due to errors, lack of standardization, and overall poor data quality from provider to provider, and it appears even in data reported by the companies themselves. The higher the variability, the less reliable the data.

[Figure: Variability among data providers used by Clarity AI]
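
One simple way to put a number on "variability among data sources" is the coefficient of variation: the standard deviation of the values different providers give for the same metric, divided by their mean. The sketch below is only an illustration. The function name and the input values are hypothetical, and the article does not say which variability measure Clarity AI actually uses.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unitless).

    A higher result means the providers disagree more about the metric.
    """
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(values) / abs(avg)


# Hypothetical values for one metric as quoted by two providers.
print(coefficient_of_variation([90.0, 110.0]))
```

Because the measure is unitless, it can be compared across metrics reported in different units, which is convenient when ranking metrics by how much providers disagree.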

 

Clarity AI leverages three key differentiators to establish the most reliable database available today:

  • Assemble the largest collection of structured and unstructured data sources in a global database.
  • Use in-house and external technical data expertise to aggregate, clean, and standardize this database.
  • Leverage proprietary machine-learning algorithms and data science techniques to detect outliers and automatically select the best source for overlapping data, as well as to obtain accurate estimates for non-reported data.

Data Sources

Clarity AI draws on more than two million data points of various types (for example, quantitative, qualitative, and news). These include proprietary data from machine-learning models that estimate metrics organizations do not disclose, and exclusive data sources obtained through Clarity AI partnerships with recognized data providers worldwide (for example, for controversial news), which allow deeper and richer insight generation.

How Clarity AI Achieves the Most Reliable Database

Technical Data Expertise

Clarity AI’s data engineering and DevOps teams are experts in data life-cycle management, and they leverage bleeding-edge technology and tools for automated data ingestion, processing, validation, and storage. The teams also clean and standardize a company’s other data, classifying it into peer groups and identifying key operating metrics.

Artificial Intelligence

Confirmed data are great; triple-confirmed data are better. Clarity AI uses its multiple sources, as well as overlapping coverage of key metrics, to ensure data consistency and reliability. To remove potential inconsistencies within this consolidated database, Clarity AI’s proprietary machine-learning algorithms choose the best sources and detect outliers, just as an analyst would do based on domain expertise, but at scale and without human bias.
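
The idea of "triple-confirmed" data can be sketched as a count of how many sources agree on a value within a tolerance. The example below is a minimal illustration, not Clarity AI’s proprietary method: the function name, the 5% relative tolerance, and the sample values are all assumptions.

```python
import math


def largest_agreement(values, rel_tol=0.05):
    """Return the largest number of values lying within rel_tol of one value.

    Each value counts as agreeing with itself, so the result is at least 1
    for a non-empty input.
    """
    return max(
        sum(math.isclose(v, other, rel_tol=rel_tol) for other in values)
        for v in values
    )


# Hypothetical values from four sources: three roughly agree, one does not.
print(largest_agreement([100.0, 101.0, 99.0, 500.0]))
```

A result of 3 or more would correspond to a triple-confirmed value under this assumed tolerance; a result of 1 means no two sources agree.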

Case Study

The number for Salesforce’s 2019 Scope 1 CO2 emissions was reported inconsistently in a variety of data sources. Two data providers offered a value of 5,800 tons. A third provider said 5,000 tons, and a fourth reported 50,000 tons. Clarity AI’s algorithm concluded that the 5,000-ton value was the most reliable, and this conclusion was then backed up by Salesforce’s own annual report.
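
A common robust check for this kind of disagreement is the modified z-score, which measures each value’s distance from the median in units of the median absolute deviation (MAD). The sketch below applies it to the four values quoted above. It is a generic statistical technique, not Clarity AI’s algorithm; the function name is hypothetical, and the 0.6745 constant and 3.5 cutoff are the conventional defaults for this score.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical, so the score is undefined.
        # This sketch flags nothing; a real pipeline would need a fallback.
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


# The four provider values from the case study, in tons.
reported = [5800, 5800, 5000, 50000]
print(flag_outliers(reported))  # [False, False, False, True]
```

The check isolates the 50,000-ton value as an outlier but cannot tell 5,000 from 5,800. Selecting the best of the remaining sources, as described in the case study, requires additional information that this sketch does not model.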

