
Natural Language Processing (NLP) to Improve SFDR Reporting

Published: January 25, 2022
Modified: August 14, 2025
Key Takeaways

SFDR Reporting Solution: Clarity AI’s controversy scoring system provides consistent incident assessment

Most of the solutions available in the market for the analysis of companies’ behavior rely heavily on the manual assessment of news or on sentiment analysis, leading to weaknesses like subjective assessment, volume limitation, and limited ability to interpret metrics. These limitations reduce the capability to provide timely and meaningful information in a consistent and transparent way.

Controversy scoring system

Clarity AI addresses these weaknesses with a controversy scoring system. Its scores are built on a global news monitoring service as the main data source, which gives access to more than 8,500 media publishers covering 200 countries, with 100,000 new articles added per day from more than 33,000 sources. This adds up to approximately 70 million articles related to the Clarity AI company universe over the last three years.

Clarity AI’s controversy scoring system breaks down into four major steps:

1- Incident detection

2- Incident classification

3- Incident severity scoring

4- Event severity scoring
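
The way these steps chain together can be sketched in a few lines of Python. Clarity AI's models are proprietary, so this is only a structural illustration under stated assumptions: the keyword rules, the 0 to 3 severity scale, the two example categories, and every function name below are hypothetical stand-ins, not the actual system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword rules standing in for the proprietary AI models.
CATEGORY_KEYWORDS = {
    "business ethics": ("bribery", "fraud", "corruption"),
    "labor rights": ("unsafe working", "child labor", "wage theft"),
}
# Hypothetical severity cues on a toy 0-3 scale.
SEVERITY_TERMS = {"alleged": 1, "fined": 2, "convicted": 3}


@dataclass
class Incident:
    company: str
    category: str
    severity: int
    headline: str


def detect_incident(headline: str) -> bool:
    """Step 1: decide whether an article reports a controversial incident."""
    text = headline.lower()
    return any(kw in text for kws in CATEGORY_KEYWORDS.values() for kw in kws)


def classify_incident(headline: str) -> Optional[str]:
    """Step 2: assign the incident to one controversy category."""
    text = headline.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return category
    return None


def score_incident_severity(headline: str) -> int:
    """Step 3: score a single incident (0 when no severity cue is found)."""
    text = headline.lower()
    return max((s for term, s in SEVERITY_TERMS.items() if term in text), default=0)


def process_article(company: str, headline: str) -> Optional[Incident]:
    """Run steps 1-3 on one article; return None if no incident is detected."""
    if not detect_incident(headline):
        return None
    category = classify_incident(headline)
    if category is None:
        return None
    return Incident(company, category, score_incident_severity(headline), headline)


def group_into_events(incidents):
    """Prepare step 4: collect incidents per (company, category) pair."""
    events = defaultdict(list)
    for incident in incidents:
        events[(incident.company, incident.category)].append(incident)
    return dict(events)
```

In this sketch, step 4 would then score each group returned by `group_into_events`, giving one result per company and category.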

This system tracks how controversial incidents evolve over time, along with their severity, for a given company in a specific category among the 39 categories it assesses. For example, a company like Tesla has thousands of articles written about it (23,189 from summer 2017 to winter 2020). Of those, 3,767 articles (about 16%) are relevant to the business ethics category, but they vary substantially in severity. The AI model considers all relevant articles for each category and then uses severity as one key proxy for a principal adverse impact (PAI) breach.

An “event” is considered to be the whole three-year series of incidents that refers to a specific company in one ESG controversy category. The event score is then calculated through the combination of the resulting maximum severities for the most relevant incidents within the event. As an output, we obtain an overall score at the company-category level.
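
As a rough illustration of this aggregation, the sketch below combines the highest incident severities inside a three-year window into a single company-category score. The article does not specify the combination rule, so the averaging of the top `top_k` severities, the 0 to 100 scale, and the names `ScoredIncident` and `event_score` are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ScoredIncident:
    day: date        # date the incident was reported
    severity: float  # maximum severity the incident reached (assumed 0-100 scale)


def event_score(incidents, as_of, window_days=3 * 365, top_k=3):
    """Combine the top_k highest severities within the look-back window.

    Averaging the top_k values is an illustrative assumption, not the
    documented Clarity AI rule.
    """
    cutoff = as_of - timedelta(days=window_days)
    in_window = [i.severity for i in incidents if cutoff <= i.day <= as_of]
    if not in_window:
        return 0.0  # no incidents in the window: nothing to score
    top = sorted(in_window, reverse=True)[:top_k]
    return sum(top) / len(top)


# Hypothetical incidents for one company in one controversy category.
history = [
    ScoredIncident(date(2017, 6, 1), 90.0),   # older than three years, ignored
    ScoredIncident(date(2019, 3, 1), 80.0),
    ScoredIncident(date(2020, 2, 1), 10.0),
    ScoredIncident(date(2020, 5, 1), 60.0),
    ScoredIncident(date(2020, 11, 1), 40.0),
]
score = event_score(history, as_of=date(2020, 12, 31))
```

Restricting the combination to the most severe incidents keeps a long tail of minor articles from diluting the score, which is one plausible reading of "most relevant incidents".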

Each step relies on Clarity AI’s proprietary artificial-intelligence models, which are purpose-built to detect incidents, classify them, and assign the corresponding severity. These algorithms are key to our objective analysis: they were trained on subject-matter-expert judgment through a human-in-the-loop process, using a selection of more than 30,000 articles covering all controversy categories and controversy levels, so that the model learns the relevant criteria for each step.

The combination of this controversy methodology and SFDR rules is illustrated in the following figure:

Figure: Clarity AI’s controversy model for SFDR compliance

