
The Data Challenge of Ethical Investment Screening

Basel Ismail · 3/27/2026 · 5 min read

Screening a company for ethical compliance sounds straightforward until you actually try to do it. You need to know what a company makes, how it makes it, where its supply chain runs, how it treats its workers, what its environmental footprint looks like, and how its governance structures function. Getting reliable answers to these questions for a single company is a research project. Doing it across an investable universe of thousands of companies is an industrial-scale data problem that screening providers are still working out.

The Disclosure Gap

The fundamental issue is that companies control the information they release. Mandatory financial reporting covers revenue, expenses, assets, and liabilities in standardized formats. But ethical screening requires data that falls largely outside mandatory disclosure requirements in most jurisdictions.

Revenue source granularity is a persistent problem. An analyst screening for Shariah compliance needs to know what percentage of revenue comes from interest-bearing activities. A Christian investor needs to know whether a pharmaceutical company produces abortifacient drugs. An ESG analyst needs to know what share of a utility's generation comes from coal versus renewables. Companies may report this information voluntarily, but the level of detail, the reporting boundaries, and the definitions used vary enormously.
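To make the granularity point concrete, here is a minimal sketch of a revenue-share screen. The segment names, the figures, and the 5% tolerance are illustrative assumptions, not values taken from any particular standard or company. The point is that the calculation is trivial once a segment breakdown exists, and impossible when it does not.

```python
from typing import Optional


def flagged_revenue_share(segments: dict[str, float], flagged: set[str]) -> Optional[float]:
    """Return the fraction of total revenue from flagged segments.

    Returns None when no usable breakdown was disclosed, so that a
    missing disclosure is never silently treated as zero exposure.
    """
    total = sum(segments.values())
    if not segments or total <= 0:
        return None
    flagged_total = sum(value for name, value in segments.items() if name in flagged)
    return flagged_total / total


def screen(segments: dict[str, float], flagged: set[str], threshold: float = 0.05) -> str:
    """Classify a company as 'pass', 'fail', or 'unknown'.

    The 5% default threshold is an assumption for illustration only.
    """
    share = flagged_revenue_share(segments, flagged)
    if share is None:
        return "unknown"
    return "fail" if share > threshold else "pass"


# Hypothetical company with a disclosed segment breakdown (figures invented).
disclosed = {"software": 900.0, "interest_income": 100.0}
print(screen(disclosed, {"interest_income"}))

# Hypothetical company that discloses no breakdown at all.
print(screen({}, {"interest_income"}))
```

Returning "unknown" instead of "pass" for the non-disclosing company is a deliberate choice in this sketch: it keeps the disclosure gap visible rather than hiding it inside a compliance result.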

Some companies provide excellent sustainability reports with detailed breakdowns of revenue by product category, emissions by scope, and workforce demographics by region. Others publish glossy reports that say very little of analytical value. And a significant number of companies, particularly in emerging markets and among smaller-cap stocks, publish nothing at all beyond what regulators require.

Supply Chain Opacity

The further you move down a company's supply chain, the harder the data problem gets. A consumer electronics company might have excellent labor practices at its headquarters and first-tier assembly facilities. But what about the mining operations that produce the rare earth minerals in its components? What about the subcontractors that its contract manufacturers use during peak production periods?

Modern supply chains are deep, geographically dispersed, and often deliberately opaque. Companies themselves frequently do not have complete visibility into their own supply chain practices beyond the first tier. Asking external analysts to evaluate those practices without the company's own internal data is asking for information that often does not exist in any accessible form.

Regulatory pressure is starting to change this. The EU's Corporate Sustainability Due Diligence Directive will require large companies to identify and address human rights and environmental impacts throughout their value chains. But implementation is gradual, and the directive applies mainly to large EU-based companies and major non-EU companies operating in Europe. The vast majority of listed companies globally remain outside mandatory supply chain disclosure requirements.

Estimation and Inference

When direct data is unavailable, screening providers resort to estimation. The methods vary in sophistication. At the basic end, you might assign industry-average values to non-disclosing companies, assuming that a steel manufacturer that does not report emissions has roughly the same carbon intensity as its peers that do. This is better than nothing, but it obscures meaningful variation within industries.
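The industry-average approach can be sketched in a few lines. The record layout (the "industry" and "intensity" keys) and the sample figures are assumptions made for illustration; real providers will have their own schemas and peer-group definitions.

```python
from statistics import mean


def impute_industry_average(companies):
    """Fill missing carbon intensity with the mean of disclosing peers.

    Each company is a dict with 'name', 'industry', and 'intensity'
    (None when the company does not disclose). Every output record
    carries an 'estimated' flag so downstream users can tell measured
    values from imputed ones.
    """
    disclosed_by_industry = {}
    for company in companies:
        if company["intensity"] is not None:
            disclosed_by_industry.setdefault(company["industry"], []).append(company["intensity"])
    averages = {industry: mean(values) for industry, values in disclosed_by_industry.items()}

    result = []
    for company in companies:
        if company["intensity"] is not None:
            result.append({**company, "estimated": False})
        elif company["industry"] in averages:
            result.append({**company, "intensity": averages[company["industry"]], "estimated": True})
        else:
            # No disclosing peers in this industry: leave the value missing.
            result.append({**company, "estimated": False})
    return result


# Invented sample data: two disclosing steel makers and one that does not report.
sample = [
    {"name": "Steel A", "industry": "steel", "intensity": 1.8},
    {"name": "Steel B", "industry": "steel", "intensity": 2.2},
    {"name": "Steel C", "industry": "steel", "intensity": None},
]
for row in impute_industry_average(sample):
    print(row)
```

Note that Steel C receives exactly the peer mean, which is the weakness described above: whatever makes it better or worse than its peers disappears from the data.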

More sophisticated approaches use statistical models that incorporate available data points, including company size, geographic footprint, product mix, and partial disclosures, to estimate missing values. These models improve with better training data, but they introduce model risk. An estimate is not a measurement, and screening decisions based on estimated data carry uncertainty that is often not communicated to end investors.
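As a deliberately simplified stand-in for such models, the sketch below fits a one-variable least-squares line (say, emissions against revenue) and returns each estimate together with a crude uncertainty figure. The choice of a single predictor, the function names, and the use of the residual standard error as the uncertainty measure are all assumptions for illustration. A production model would use more features and a proper prediction interval.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, residual_standard_error). Needs at least
    three observations and some variation in xs.
    """
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard error with n - 2 degrees of freedom.
    rse = sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, rse


def estimate(x, model):
    """Return an estimate that carries its own uncertainty and provenance."""
    slope, intercept, rse = model
    return {"value": intercept + slope * x, "uncertainty": rse, "estimated": True}


# Invented training data from disclosing companies: revenue vs. emissions.
model = fit_line([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(estimate(2.5, model))
```

Passing the uncertainty and the "estimated" flag along with the value is the design point here: it gives whoever consumes the number a chance to communicate that it is an estimate, not a measurement.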

Natural language processing of company filings, news sources, and regulatory databases is increasingly used to fill gaps. An NLP system can scan thousands of annual reports to identify mentions of specific products, practices, or controversies that manual analysis would miss. But NLP has its own limitations. Sarcasm, context-dependent meaning, and the difference between a company describing a practice it has adopted versus one it aspires to adopt can trip up automated text analysis.
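The adopted-versus-aspired distinction can be illustrated with a very naive rule-based sketch. The cue list and the sample text are invented for this example, and real systems rely on far more capable language models. The sketch is useful mainly for showing how easily simple rules go wrong.

```python
import re

# Hypothetical cue phrases suggesting a statement is about the future.
ASPIRATIONAL_CUES = re.compile(
    r"\b(aims? to|plans? to|intends? to|will|targets?|committed to|by 20\d{2})\b",
    re.IGNORECASE,
)


def classify_mentions(text, topic):
    """Find sentences mentioning `topic` and label each one.

    A sentence is labelled 'aspirational' if it contains a forward-looking
    cue, otherwise 'stated'. Matching is by plain substring, so the topic
    'coal' would also match 'coalition'; that is one of many limitations.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    results = []
    for sentence in sentences:
        if topic.lower() in sentence.lower():
            label = "aspirational" if ASPIRATIONAL_CUES.search(sentence) else "stated"
            results.append((label, sentence))
    return results


# Invented report excerpt.
excerpt = (
    "We eliminated coal from our generation mix in 2022. "
    "We aim to source all cobalt from audited mines by 2030. "
    "Our offices are in Oslo."
)
print(classify_mentions(excerpt, "coal"))
print(classify_mentions(excerpt, "cobalt"))
```

A sentence such as "we will continue our existing audit program" would be labelled aspirational by these rules even though it describes current practice, which is exactly the kind of error the paragraph above warns about.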

Timeliness and Frequency

Most company disclosures are annual. Sustainability reports typically lag the fiscal year by three to six months, and each report remains the latest available until the next one appears a year later. As a result, the most recent ESG or compliance data available for a company can be up to 18 months old by the time an analyst uses it.
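The 18-month figure follows from simple date arithmetic: a report published six months after fiscal year-end stays the newest available for another twelve months. The sketch below works through that calculation; the function names and the example dates are assumptions chosen for illustration.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def staleness_window(lag_months: int, cycle_months: int = 12) -> tuple[int, int]:
    """Return (minimum, maximum) age in months of the latest available data.

    The data is `lag_months` old on the day the report is published and
    keeps ageing until the next report arrives one reporting cycle later.
    """
    return lag_months, lag_months + cycle_months


# Hypothetical company: fiscal year ends 31 Dec 2024, report published June 2025,
# next report not expected until June 2026.
fiscal_year_end = date(2024, 12, 31)
analyst_uses_data_on = date(2026, 6, 15)

print(months_between(fiscal_year_end, analyst_uses_data_on))  # age of the data in months
print(staleness_window(6))  # range of ages for a six-month reporting lag
print(staleness_window(3))  # range of ages for a three-month reporting lag
```

Even at the fast end of the range, a three-month lag still leaves data that is up to 15 months old just before the next report appears.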

A lot can change in 18 months. A company can acquire a subsidiary in an excluded industry, take on significant debt, face a labor dispute, or suffer an environmental incident. Relying solely on annual disclosure cycles means screening decisions are based on stale data, and the staleness is structural rather than accidental.

Real-time monitoring through news feeds, regulatory filings, and social media can partially address the timeliness problem. But real-time data is noisy. A news article about a labor dispute at a company's supplier does not tell you whether the dispute is material, whether the company bears responsibility, or whether the situation has been resolved. Converting real-time signals into reliable screening inputs requires analytical infrastructure that goes well beyond simple news aggregation.

Standardization Efforts

The proliferation of reporting frameworks has been both a blessing and a curse for ethical screening data. GRI (Global Reporting Initiative) provides comprehensive sustainability reporting guidelines. SASB (whose standards are now maintained by the ISSB under the IFRS Foundation) focuses on financially material sustainability topics by industry. TCFD addresses climate-related financial disclosures. The EU Taxonomy defines environmentally sustainable activities with specific technical criteria.

Each framework addresses part of the data problem, but their overlap and differences create complexity for data consumers. A company might report its carbon emissions under TCFD guidelines, its water usage under GRI, and its industry-specific metrics under SASB, using slightly different boundaries and methodologies for each. Reconciling these into a consistent picture requires expertise and effort.

The ISSB's work to create a global baseline of sustainability disclosure standards is the most promising consolidation effort to date. If widely adopted, it would significantly reduce the burden on both companies and data consumers. But adoption is voluntary in most markets, and convergence with existing frameworks will take years.

Where Technology Is Helping

Despite the challenges, the quality of ethical screening data is improving faster now than at any point in the past. Satellite imagery can independently verify deforestation claims and monitor industrial emissions. Machine learning models trained on diverse data sources can detect patterns that suggest undisclosed risks. Digital supply chain platforms create auditable records of sourcing decisions. Blockchain-based systems are being tested for supply chain traceability, though practical adoption remains limited.

The combination of better mandatory disclosure requirements, more sophisticated estimation techniques, and technology-enabled monitoring is gradually closing the data gap. It is not closed yet, and investors who rely on ethical screening should understand the limitations of the data underlying their portfolio decisions. But the trajectory is clearly toward more complete, more timely, and more reliable information about how companies actually behave, not just what they report.
