Trusting Too Much In Data


In recent weeks I’ve run into multiple posts, articles, and discussions concerning findings that employee morale does not equate to productivity. I’ve read a few of the discussions and a couple of the articles, and the subject and the reports prove to be an excellent example of how easily we can mislead ourselves with data.

By way of background, apparently some of the research groups and “better management” consulting firms have recently assembled some data analyses that refute the assumption that higher employee morale will drive higher employee productivity. Some of us immediately acknowledge that the findings merely state the obvious. Some of us immediately argue that the analyses could not have drawn an accurate conclusion. Some of us immediately demand to see the data and the analyses, suspicious that something obviously complex couldn’t possibly be summed up so simply.

Because of the diversity in response, and the probability that everyone is, at least in part, correct in taking their various sides of the arguments, I thought it would make a good focus for discussing some ways we allow ourselves to be misled. In particular, I want to discuss two major behavioral errors.

    1. Accepting what the data appears to say without understanding the whole picture of the data, including how it was collected and analyzed
    2. Assuming that metrics make meaningful data when the truth may be far more complex

Herein, let’s focus on the first error. The second will make a good discussion for a second post.

I will say that I tried to search out some links or references to actual data, but the articles that I happened to read did not provide any connection to actual analyses or data; they only transmitted findings. That means that the findings may have been paraphrased or otherwise interpreted differently than the original analysis intended. That is the biggest problem with accepting the findings as reported.

We are virtually bombarded by the phenomenon of data-driven headlines. On any given day we might read that cigarette smoking is on the rise, or that book sales are down, but without seeing the data plot over a long time we don’t really know what those headlines mean.

Does the survey data from this year show more or less activity than last year? Is that where the claim comes from? Is this year’s number more than that of three years ago? Is there a genuine trend or does the number just represent a normal fluctuation in a stable phenomenon? Unless the report shows us the data and the analysis, we just don’t know. If we accept the headline at face value, we could be misled.
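One rough way to ask whether a new number is a genuine shift or just normal fluctuation is to compare it against the spread of the numbers that came before it. The sketch below is only an illustration of that idea, in the spirit of a control chart. The function name, the three-sigma threshold, and the yearly counts are all my own assumptions, invented for the example; they do not come from any of the reports discussed here.

```python
from statistics import mean, stdev

def within_normal_variation(history, latest, sigmas=3.0):
    """Return True if `latest` lies inside mean +/- sigmas * stdev of `history`."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation of the past values
    return abs(latest - center) <= sigmas * spread

# Hypothetical yearly survey counts, invented for illustration only.
history = [102, 98, 105, 97, 101, 99, 103]
latest = 104  # "up from last year" would make a fine headline

# True means the new value is consistent with ordinary year-to-year noise.
print(within_normal_variation(history, latest))
```

In this made-up series the latest value is higher than last year's, yet it sits comfortably inside the usual variation. A headline built on the year-over-year comparison alone would overstate what the data shows.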

Consider the assertion that employee morale does not improve productivity. How many ways can we imagine that the assertion might be misleading or that the data might lead to that conclusion without telling the truth?

      • What was the null hypothesis? Was it, “there is no difference in business productivity before morale improved and after morale improved,” for example? Remember that if the statistics cannot reject the null hypothesis, that does not prove the null hypothesis true; it only means the data could not demonstrate an effect. If the data is noisy, we cannot rule out that random chance, rather than the factor we selected, drives the outcome. How noisy was the data?
      • What exactly was compared to come up with the data? Were different businesses with different levels of morale compared for productivity? Was the data generated from organizations with a before and after morale assessment and productivity? How much time passed? What is the integrity of the data?
      • How was employee morale assessed? Was it assessed the same way for each organization? Was it assessed the same way in the before data set and the after data set? If one organization declared a particular morale score, would a different organization with the same score actually demonstrate comparable employee morale? Did the morale assessment and productivity span multiple industries or regions?
      • Did each organization that provided data assess productivity the same way? Does the productivity of a business in a media sector equate to the productivity in a manufacturing sector? Was productivity reported per capita of employees or for the entire organization?
      • Was the data continuous, ordinal, or binary in nature? It is reasonable to report productivity in continuous data terms such as 4500 pieces per day or 2.3 products per man-hour. Morale assessments are harder to imagine in definitive, continuous terms. They are more likely to be opinion numbers such as 1 through 5 where 1 equals “poor” and 5 equals “excellent.” Ordinal data is notoriously difficult (and sometimes impossible) to resolve into a statistically meaningful distinction. Was the data simply a yes or no response to a question like, “did your productivity improve after employee morale improved?” What kind of data was it?
      • Did the analysis only look for a single cause, or did it conduct a components-of-variation investigation? Did morale influence productivity for some, but not all organizations? Is it a contributing cause while not a single cause? Did the analysis look at that?
      • Were some of the organizations already performing near peak potential for productivity and so improving employee morale had little influence on output?
      • Did some of the organizations already display strong morale and, therefore, could not change morale enough to noticeably influence productivity? Alternatively, did some organizations improve from “poor” to “less-poor-but-still-not-good” and, therefore, did not drive a noticeable change in productivity performance?
      • Do some of the participating organizations have other challenges that affect productivity more than morale such as process or equipment failures that reduce the sensitivity to the morale influence?
      • Was there a reason or a demand for increased productivity, or was productivity limited by cash flow or customer demand so that a shift in morale could not affect productivity?
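The first question in the list, about the null hypothesis and noise, can be made concrete with a small permutation test. This is a minimal sketch under my own assumptions, not the method any of the studies used: the function name, the seed, the number of shuffles, and the before and after productivity figures are all hypothetical. The test shuffles the group labels many times and counts how often chance alone produces a difference at least as large as the one observed.

```python
import random
from statistics import mean

def permutation_p_value(before, after, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(after) - mean(before))
    pooled = list(before) + list(after)
    n_before = len(before)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the "before" and "after" labels at random
        diff = abs(mean(pooled[n_before:]) - mean(pooled[:n_before]))
        if diff >= observed:
            at_least_as_extreme += 1
    # The +1 terms keep the estimate from ever being exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical units per day for eight sites, invented for illustration only.
before = [48, 55, 41, 60, 52, 45, 58, 50]
after = [51, 57, 44, 62, 49, 53, 59, 54]

print(permutation_p_value(before, after))
```

With these invented numbers the average rises by 2.5 units, but the site-to-site spread is larger than that, so the p-value lands well above the usual 0.05 cutoff. The honest reading is "this data cannot tell," not "morale has no effect."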

The ten questions listed above all need answers before we know whether to accept the findings of the studies concerning a link between employee morale and productivity. While none of the discussions or articles I happened across directly addressed any of these, the more responsibly penned article did discuss that there are a great many factors within any organization that affect productivity and that employee morale is only one of them. It went on to point out that if you have other problems, fixing employee morale will not single-handedly fix productivity.

One of the discussions included a great many comments about how some of the actions taken to improve morale might drive less employee productivity because they invite distraction from work or take employees away from their work. The discussions and the more responsible article each focused on the common sense elements of the topic, not the data. That is a very important observation.

Anyone who examines the idea of employee morale affecting productivity with a common sense filter will immediately recognize that morale is only a contributing factor. Such an examination results in the “of course the data says it doesn’t drive productivity” response to the headline.

Critically challenging the headline with questions and doubts similar to the ten challenges outlined above leads to the “that analysis can’t possibly be right” response to the headline. We can’t possibly sum up something as complex as human behavior so simply.

We must use our common sense filter and we must critically challenge every analysis and data set if we are to trust the results to lead us to wisdom. That brings me to the most important message I feel I must share. If we are to make data-driven decisions, our leaders and decision makers must be experienced data analyzers.

It is imperative that our leaders know how to collect and critically tear apart and analyze data if they are to intelligently understand, trust, and ultimately use our data and analyses to make important decisions. If they are just reading and accepting the headlines we are writing, then they are not intelligently assessing the data or the analyses. They are just doing what we tell them to do; so who needs them?

That paragraph should spur some comments, I’m sure. Unfortunately, it seems common that leaders do not know how to critically tear apart and challenge an analysis or a data set. They rely too much on “minions” to do the math and feed them a headline. That is not good! My fellow minions should not take offense.

If you are a minion supplying headlines, begin challenging your leaders to challenge the headlines. The better they get at understanding how easily data and statistics can mislead, the more intelligent their decisions, by which you must live, become.

If you are one of your organization’s decision makers, get out of your office and start learning how to build the data yourself and how to analyze it. Challenge the analyses presented to you, not just to make sure they are done well, but to make sure you truly understand how they were done, where the noise might come from, and how to improve your information for the future.

Do not trust the headlines. Examine data and analyses critically before making a decision. Always filter with common sense. Learn how to analyze data so that you know how to ask the right questions.

Stay wise, friends.

If you like what you just read, find more of Alan’s thoughts at www.bizwizwithin.com
