Evading Statistics Propaganda in Business Analysis

Many business and predictive analytics efforts require external factor data, such as macroeconomic statistics at various geographical levels. Relating sales volume and other cash flow drivers to leading economic indicators can help to establish more reliable forecasts for business planning. Unfortunately, many of the compiled statistics made publicly available are misleading for underlying political reasons.

The statistics propaganda problem

There are three kinds of lies: lies, damned lies, and statistics (Mark Twain - 1906)

The fact that statistics are often purposely misleading is by no means new and the reason is quite simple. National and international organizations that collect, compile and disseminate statistics are funded by participating governments accompanied by special project funding from business interests through umbrella organizations. Those who fund the most have the most influence on the analysis undertaken and reported results.

In the ideal, commonly assumed scenario, data is collected and compiled into statistics so that appropriate policies can be developed for the greater good of national and international populations. In reality, the opposite is commonly the case. Vested political and business interests first define the desired policy, and statistics are then creatively compiled into a story that justifies it. It is a simple form of propaganda under a guise of altruism.

This behavior is not simply a theory, but an experienced reality from working in and with international statistical organizations over many years. Organizations that are corrupted in this way include the IMF, World Bank, United Nations, Eurostat, OECD and many national statistical agencies. A case in point at the national level is the misreporting of Greek debt levels up until 2010, which was known to, but not reported by, the above-mentioned international statistical organizations long before that time.

To rub salt in the wound, while these organizations are funded through public taxpayer money, much of their statistical analysis is only available through paid publications and subscriptions.

Implications for business analysis

Of particular interest to business analysis are reliable statistics on economic growth, confidence indicators and forecasts at the geographical and demographic levels where sales demand is sought. Anticipating economic growth in such markets can assist with marketing efforts as well as with cost planning and decisions on where to invest or divest. The data is also applicable to securities investment analysis, as macroeconomic factors influence business profitability and market dynamics.

The issue then becomes isolating and accessing reliable sources of statistics that have not been distorted for political motivations.

Blind faith in compiled statistics provided by statistical organizations is a recipe for classic 'garbage in - garbage out' analysis.

Fortunately, there are some solutions that can help to overcome corrupted analysis from statistics propaganda.

Solutions for clean and reliable statistics

A widely used metric of economic growth is gross domestic product (GDP), which serves as an ideal example for establishing reliable data for business analysis. Because there are three main methods for estimating GDP, compilers have more flexibility to report a desired outcome. The three GDP calculation methods are:

  • Output production as the value added from industry goods and services produced.
  • Income earned from companies, employees and self-employed individuals.
  • Expenditure spent on all finished goods and services produced in the economy.
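
To make the three methods concrete, here is a minimal Python sketch of a toy economy measured each way. All figures are hypothetical, the function names are illustrative only, and taxes and subsidies are assumed to be zero to keep the arithmetic simple; real national accounts include further adjustments.

```python
# Three ways to measure the same toy economy. All figures are hypothetical
# (billions of currency units) and taxes/subsidies are assumed to be zero.

def gdp_output(value_added_by_industry):
    """Output approach: sum of value added across industries."""
    return sum(value_added_by_industry.values())

def gdp_income(compensation, operating_surplus, mixed_income):
    """Income approach: employees + companies + self-employed."""
    return compensation + operating_surplus + mixed_income

def gdp_expenditure(consumption, investment, government, exports, imports):
    """Expenditure approach: C + I + G + (X - M)."""
    return consumption + investment + government + (exports - imports)

def method_spread(estimates):
    """Gap between the highest and lowest estimate, relative to the lowest."""
    return (max(estimates) - min(estimates)) / min(estimates)

output = gdp_output({"agriculture": 50, "manufacturing": 300, "services": 650})
income = gdp_income(compensation=550, operating_surplus=350, mixed_income=100)
expenditure = gdp_expenditure(600, 200, 220, exports=180, imports=200)

# In a perfectly measured economy the three methods agree.
consistent_spread = method_spread([output, income, expenditure])

# Hypothetical measured figures, where the source data differ by method.
measured_spread = method_spread([1000, 985, 1012])

print(output, income, expenditure, consistent_spread, measured_spread)
```

In practice the three estimates rarely match exactly, so the choice of which figure to headline, and how to reconcile the gap, leaves room for discretion.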

The easiest solution is to simply trust the numbers produced by statistics agencies with awareness of the issue. After all, some of the data will be accurate some of the time and the argument can also be made that, since business reacts to the data released, it becomes a self-fulfilling prophecy to some extent even though the message has been manipulated.

A more robust approach is to access data at a more granular source level and re-generate aggregations as necessary. This approach also has the benefit of targeting specific geographical areas and market subsets for a better fit to a given business analysis. In the example of national accounts for GDP growth, we can source data directly from individual national statistical offices, before it has been manipulated at the international organization level. This, however, does not eliminate the problem, since misreporting is also often evident at the national level. Some web APIs and applications, such as the FRED economic data Excel add-in, can assist with data extraction from international and national statistical sources.
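
As a rough illustration of re-generating aggregations from granular data, the Python sketch below sums regional figures into national totals, derives growth rates, and flags periods where the rebuilt growth differs from a published headline figure. The records, the published figure and the tolerance are all hypothetical placeholders, not real data or a real API.

```python
from collections import defaultdict

def aggregate_by_period(records):
    """Sum granular (period, region, value) records into period totals."""
    totals = defaultdict(float)
    for period, _region, value in records:
        totals[period] += value
    return dict(sorted(totals.items()))

def growth_rates(totals):
    """Period-over-period growth computed from the rebuilt totals."""
    periods = list(totals)
    return {
        current: totals[current] / totals[previous] - 1
        for previous, current in zip(periods, periods[1:])
    }

def flag_discrepancies(rebuilt, published, tolerance=0.005):
    """Return rebuilt-minus-published growth where the gap exceeds tolerance."""
    return {
        period: rebuilt[period] - published[period]
        for period in rebuilt
        if period in published
        and abs(rebuilt[period] - published[period]) > tolerance
    }

# Hypothetical regional value-added figures (not real data).
records = [
    ("2023", "North", 400.0),
    ("2023", "South", 600.0),
    ("2024", "North", 410.0),
    ("2024", "South", 620.0),
]
# Hypothetical headline growth figure from an aggregating organization.
published_growth = {"2024": 0.045}

totals = aggregate_by_period(records)
rebuilt_growth = growth_rates(totals)
flags = flag_discrepancies(rebuilt_growth, published_growth)
print(totals, rebuilt_growth, flags)
```

A flagged period is only a prompt for further investigation; differences in definitions, revisions or coverage can produce gaps without any manipulation.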

As the most extreme solution, we can also attempt to estimate proxies for the statistics from publicly available market data. This approach is resource-intensive, but it limits exposure to bias and gives us the flexibility to introduce customized assumptions that suit the analysis at hand. In the case of GDP, the income approach can be approximated by combining company earnings data with tax collection information and other salary growth indicators. Earnings filings of large-capitalization companies can be used as a starting point for estimating economic growth. Nonetheless, work is again required to reverse creative accounting methods by employing approaches such as Economic Value Added.
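
One possible shape for such a proxy is sketched below in Python. It uses the standard Economic Value Added definition (net operating profit after tax minus a capital charge) and then blends earnings, tax receipt and wage growth with fixed weights. The company figures, growth rates and weights are hypothetical assumptions chosen for illustration, not a calibrated method.

```python
def economic_value_added(nopat, invested_capital, wacc):
    """EVA = net operating profit after tax - (WACC * invested capital)."""
    return nopat - wacc * invested_capital

def growth(previous, current):
    """Simple growth rate; only meaningful when `previous` is positive."""
    return current / previous - 1

def income_proxy_growth(component_growth, weights):
    """Weighted average of component growth rates (weights are assumptions)."""
    total_weight = sum(weights.values())
    return sum(component_growth[k] * weights[k] for k in weights) / total_weight

# Hypothetical filings for two large companies: (nopat, capital, wacc).
filings = {
    "2023": [(120.0, 1000.0, 0.08), (90.0, 500.0, 0.10)],
    "2024": [(130.0, 1050.0, 0.08), (92.0, 500.0, 0.10)],
}
eva_by_year = {
    year: sum(economic_value_added(*company) for company in companies)
    for year, companies in filings.items()
}

component_growth = {
    "earnings": growth(eva_by_year["2023"], eva_by_year["2024"]),
    "tax_receipts": 0.04,  # hypothetical tax collection growth
    "wages": 0.03,         # hypothetical salary growth indicator
}
# Hypothetical weights; in practice these would reflect income shares.
weights = {"earnings": 0.3, "tax_receipts": 0.2, "wages": 0.5}

proxy = income_proxy_growth(component_growth, weights)
print(eva_by_year, proxy)
```

The weights carry most of the judgment here, so it is worth testing how sensitive the proxy is to them before relying on it.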

There is a growing demand for clean and unbiased statistical data sources to use for all types of business analysis. Several web services have come online to meet this demand, but so far they lack the ability to efficiently link raw source data together due to the absence of established metadata standards for common dimension definitions. Quandl is an excellent platform, except that the data is supplied by the usual suspects. If you know of good sources of clean and reliable statistical data, please share.