Clarity AI: Legacy Data Providers Have Discrepancies of More Than 20% in 13% of Direct Emissions Data
In a sample of more than 30,000 data points from three leading data providers, discrepancies were present more than 40% of the time.
Clarity AI, the leading global sustainability tech platform, announced today that, across approximately 6,500 public companies that report direct emissions, leading data providers have discrepancies in this reported data 42% of the time, where a discrepancy is any difference of more than 1%. When the discrepancy threshold is raised to more than 20%, discrepancies appear in one out of every eight data points.
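To make the two thresholds concrete, the following is a minimal sketch of how a discrepancy rate could be computed across providers. The announcement does not specify how differences are measured, so the choice of the larger value as the base of the relative difference, the function names, and the sample values are all assumptions made for illustration, not Clarity AI's methodology.

```python
from itertools import combinations


def relative_difference(a: float, b: float) -> float:
    """Relative difference between two values, using the larger magnitude as the base.

    Assumption: the source does not say which base is used; this is one common choice.
    """
    base = max(abs(a), abs(b))
    if base == 0:
        return 0.0
    return abs(a - b) / base


def discrepancy_rate(data_points, threshold: float) -> float:
    """Share of data points where any two providers differ by more than `threshold`."""
    if not data_points:
        return 0.0
    flagged = 0
    for values in data_points:
        if any(relative_difference(a, b) > threshold
               for a, b in combinations(values, 2)):
            flagged += 1
    return flagged / len(data_points)


# Hypothetical direct-emissions values from three providers for four companies.
sample = [
    (1000.0, 1000.0, 1000.0),  # providers agree
    (1000.0, 1005.0, 1000.0),  # about 0.5% apart: below the 1% threshold
    (1000.0, 1050.0, 1000.0),  # about 4.8% apart: above 1%, below 20%
    (1000.0, 1000.0, 1500.0),  # about 33% apart: above 20%
]

print(discrepancy_rate(sample, 0.01))  # 0.5
print(discrepancy_rate(sample, 0.20))  # 0.25
```

In this toy sample, half of the data points exceed the 1% threshold and a quarter exceed the 20% threshold; the figures reported in the announcement are 42% and 13%, respectively.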
“These significant discrepancies in 13% of the data can make the carbon footprint of a climate fund increase or decrease by more than 20% and highlight the real challenges that investors face when selecting a data provider,” said Patricia Pina, Head of Product Research and Innovation at Clarity AI. “Data reliability is core to what we do, and we see huge benefits in using advanced technology to help ensure quality.”
Clarity AI identified three problem areas legacy data providers can encounter when collecting data:
- Human Error: Human errors account for more than 80% of the errors found. These vary in nature; examples include incorrect addition of category values, misinterpretation of report details, and incorrect units of measurement (e.g., tons vs. gigatons)
- Inconsistent Reporting Boundaries: Data providers apply reporting boundaries (i.e., the rules for deciding which entities from the group to include, what to do with joint ventures, investments, etc.) inconsistently when collecting emissions data
- Incomplete Disclosures: Companies publish incomplete disclosures that omit relevant emissions (e.g., Scope 3 categories, regions/offices, business lines)
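As an illustration of the unit errors listed above (tons vs. gigatons), the sketch below flags pairs of values whose ratio is close to a power of 1,000, which often signals a unit mix-up rather than a genuine disagreement. This is a simple heuristic written for this example; the function name and tolerance are hypothetical, and it is not a description of how Clarity AI detects errors.

```python
import math


def looks_like_unit_error(a: float, b: float, tolerance: float = 0.01) -> bool:
    """Return True if two positive values differ by roughly a power of 1,000.

    `tolerance` is measured on the base-1000 exponent and is an arbitrary
    choice made for this sketch.
    """
    if a <= 0 or b <= 0:
        return False
    ratio = max(a, b) / min(a, b)
    exponent = math.log10(ratio) / 3  # logarithm base 1000
    nearest = round(exponent)
    return nearest >= 1 and abs(exponent - nearest) <= tolerance


# Hypothetical values reported by two providers for the same company.
print(looks_like_unit_error(5.0, 5.0e9))       # True: factor of one billion
print(looks_like_unit_error(1200.0, 1.2e6))    # True: factor of one thousand
print(looks_like_unit_error(1000.0, 1050.0))   # False: an ordinary 5% difference
```

A check like this would catch only scale mistakes; addition errors and misread report details need other validation.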
“At Clarity AI we rely on technology and data to solve reliability issues. First, we curate a robust dataset of sustainability data points, which have gone through rigorous quality checks. Then, we have trained, calibrated, and validated an expert-supervised machine learning model to select the most reliable data points and filter out non-reliable data,” added Ron Potok, Head of Data Science at Clarity AI. “The flexibility of a machine learning model allows for much more complex relationships between the reliability of a datapoint and its features. The model analyzes the data from all angles at every level of granularity, which boosts its performance.”
Advanced technology, like machine learning, is the only scalable and efficient way to create clean, reliable data that investors can depend on. Clarity AI has trained state-of-the-art machine learning algorithms leveraging input from sustainability experts and is the only sustainability tech provider in the market with a sophisticated machine learning reliability algorithm. Moreover, the algorithms and models improve over time with continuous care from advanced technology experts: when a data point is detected as non-reliable, it is sent for external review (i.e., to the team of experts) and corrected if necessary. The reviewed data then re-enters the system and is used to further train and improve the model in a virtuous cycle.
About Clarity AI
Clarity AI is a sustainability technology platform that uses machine learning and big data to deliver environmental and social insights to investors, organizations, and consumers. As of August 2022, Clarity AI’s platform analyzes more than 50,000 companies, 320,000 funds, 198 countries and 188 local governments (2 to 13 times more than any other player in the market) and delivers data and analytics for investing, corporate research, benchmarking, consumer ecommerce and reporting. Clarity AI has offices in North America, Europe and the Middle East, and its client network accounts for tens of trillions in assets under management. clarity.ai