Correlation and Causation in People Analytics

Understanding the difference between correlation and causation

The difference between correlation and causation is one of the most important distinctions for any analytics project. Correlation is quite easy to find, but just because two things are correlated does not mean that one causes the other. Don't get fooled by pretty graphs and vanity metrics; instead, look for robust systems with lots of customisability. Many providers sell on exactly those graphs and metrics, which means companies spend large sums on software because it shows a correlation with their desired outcome. Without actually addressing the causal factors, however, results may end up being far worse than predicted. To put this into context, I'll give you some examples:

Correlation:

"People that own expensive cars are happier"

When tested, this would most likely show a positive correlation. However, if you tried to make people happy by simply giving them an expensive car, you'd see that it has almost no lasting effect. This is because the causal factors haven't been addressed. These could be things like a lack of financial freedom, feelings of low societal worth or no housing security. Addressing these causal factors is much more likely to improve the long-term happiness of the people involved.

To put this into a People Analytics context, you may see that some staff have low engagement, so you look for the reasons why. When you analyse the data, you see that low engagement is most common among people in the bottom 25% of earners in the organisation. From here you could take a 'Tayloristic' approach and simply pay these people more. However, there is no proof this will work, and it is likely to be a short-term fix that fades once the new pay levels become normalised.

You may then find that the real reason staff are losing engagement is the lack of complexity and diversity in their jobs. They happen to be the lowest earners because the pay scale is linked to the complexity of the job role, but pay isn't the cause of their low engagement. Adapting roles, allowing more freedom and implementing internal promotion programmes are therefore more likely to increase engagement amongst these individuals.
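The pay and engagement example can be sketched with simulated data. This is only an illustration under assumptions I have made up for the purpose: the salary figures, the 0 to 10 engagement scale, the noise levels and the function names are all hypothetical. In the simulation, job complexity drives both pay and engagement, and pay has no direct effect on engagement at all. The raw correlation between pay and engagement still looks strong, and it largely disappears once job complexity is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def simulate(n=2000, seed=42):
    """Hypothetical workforce: complexity drives pay AND engagement."""
    rng = random.Random(seed)
    complexity = [rng.random() for _ in range(n)]  # 0 = simple role, 1 = complex role
    pay = [20000 + 30000 * c + rng.gauss(0, 3000) for c in complexity]
    # Pay does not appear in this formula: it has no causal effect here.
    engagement = [2 + 6 * c + rng.gauss(0, 1) for c in complexity]
    return complexity, pay, engagement


complexity, pay, engagement = simulate()

# Correlation you would see on a dashboard.
raw = pearson(pay, engagement)

# Correlation after removing the part of each variable explained by complexity.
adjusted = pearson(residuals(pay, complexity), residuals(engagement, complexity))

print(f"pay vs engagement, raw:      {raw:.2f}")
print(f"pay vs engagement, adjusted: {adjusted:.2f}")
```

The adjustment used here is a simple partial correlation. It only works because the simulation knows what the hidden variable is; with real data, finding that variable is the hard part.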

Causation:

Now, where one variable really does cause another, you should normally expect to see a strong correlation between them; otherwise, there isn't enough of a relationship between the two variables for one to be causing the other. Often, in statistics, outlier results are treated as anomalies and ignored. This isn't the case in people analytics. These outliers could be caused by three things: inaccurate data, an additional variable that only affects certain data points, or a relationship between the two variables that is only a correlation rather than causation.

A true causation factor means that if X happens, Y will happen. This could be something like:

"More Christmas trees are sold in December due to the lead-up to Christmas"

This statement is unequivocally true; it happens without fail every time. The timing of Christmas is therefore the causation factor behind December being the top month for selling Christmas trees. Granted, this one is pretty obvious, but sometimes the cause won't be as obvious and will require lots of digging to find out what is really happening.

For example, I was talking with an engagement expert about a retailer they work with. Revenue increased in the retailer's stores when engagement rose. However, there was one store where this wasn't true, despite the strong correlation between engagement and revenue elsewhere. The reason that store hadn't performed as well had nothing to do with engagement. It simply had a competitor open on the same street.

From what I have seen, there are several vendors making claims that aren't possible to substantiate. A personal favourite was a claim along the lines of "our product provides up to 200% ROI". This statement means nothing: anyone could make this claim even if 90% of their customers have a negative ROI. My recommendation is to look for bespoke solutions, as this will enable you to more accurately predict the scenarios that matter to you and drive a better ROI for your projects.
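To show why an "up to" figure says so little, here is a small sketch with an invented customer list. The ROI values are hypothetical and chosen only to match the scenario above: nine out of ten customers lose money and one does exceptionally well. The `summarise` helper is my own naming, not anything a vendor provides.

```python
from statistics import median

# Hypothetical ROI per customer, as a fraction (2.00 means 200%).
# Nine of the ten customers have a negative return.
roi_by_customer = [-0.30, -0.25, -0.20, -0.15, -0.10,
                   -0.10, -0.05, -0.05, -0.02, 2.00]


def summarise(rois):
    """Compare the 'up to' headline with figures that describe a typical customer."""
    return {
        "best_case": max(rois),  # what "up to X% ROI" reports
        "median": median(rois),  # what a typical customer saw
        "share_negative": sum(r < 0 for r in rois) / len(rois),
    }


summary = summarise(roi_by_customer)

print(f"Headline 'up to' ROI:   {summary['best_case']:.0%}")
print(f"Median customer ROI:    {summary['median']:.0%}")
print(f"Customers losing money: {summary['share_negative']:.0%}")
```

If a vendor can only give you the best case, it is worth asking for the median and the share of customers with a negative return as well.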

If you are interested in talking to me about the subjects in this post use the contact form here.
