Data scientists model uncertainty in future operations with probabilities. Their models can deliver inferences about the business operations being studied, based on data collection, regression, and hypothesis testing. But one detail must be well understood before applying these models: the decisions they influence are highly sensitive to turning one little input knob, the significance level. And this dial is fully under the control of the experimenter.
The significance level represents the chance of making a specific type of error: rejecting our null hypothesis when, in fact, it is true. For example, we may study a re-engineered component to see if it improves equipment health and contributes to higher reliability and more revenue. Before we collect any data, we establish a null hypothesis: Equipment reliability is unchanged by the new component. We then examine whether there’s enough evidence to reject this hypothesis and replace it with an alternative: The re-engineered component has improved reliability across our equipment set.
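To make this concrete, here is a minimal sketch of such a test in Python. The uptime data is simulated, and every name and number below (sample sizes, means, the choice of a one-sided t-test) is a hypothetical stand-in for a real reliability study, not a prescription.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated daily uptime fractions; both distributions are invented for illustration.
uptime_old = rng.normal(loc=0.92, scale=0.03, size=200)  # old component
uptime_new = rng.normal(loc=0.94, scale=0.03, size=200)  # re-engineered component

# Null hypothesis: reliability is unchanged (equal mean uptime).
# Alternative: the new component improved reliability (greater mean uptime).
t_stat, p_value = stats.ttest_ind(uptime_new, uptime_old, alternative="greater")

alpha = 0.05  # the significance level, chosen by the experimenter
print(f"p-value = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null: evidence the new component improved reliability.")
else:
    print("Fail to reject the null: no significant evidence of improvement.")
```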
This significance level, called “alpha,” also determines the width of confidence intervals around an uncertain metric (a random variable). For example, we may want to estimate the number of assets that are inoperable at any point in time due to failure of this specific component. If we set alpha to 5%, then 95% of the confidence intervals generated by our model will include the true number of inoperable assets.
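That coverage claim is easy to check by simulation. In this sketch the “true” failure count is invented, and the Poisson model and t-interval are illustrative assumptions, not a recipe for real asset data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

true_mean = 12.0          # hypothetical true average count of inoperable assets
alpha = 0.05
n, trials = 50, 10_000
covered = 0

for _ in range(trials):
    sample = rng.poisson(true_mean, size=n)      # simulated daily counts
    se = sample.std(ddof=1) / np.sqrt(n)         # standard error of the mean
    margin = stats.t.ppf(1 - alpha / 2, df=n - 1) * se
    lo, hi = sample.mean() - margin, sample.mean() + margin
    covered += (lo <= true_mean <= hi)           # did this interval cover the truth?

print(f"Observed coverage: {covered / trials:.3f} (expected ~ {1 - alpha:.2f})")
```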
By adjusting the significance level, we can flip the resulting conclusion on its head. Make alpha smaller, and it’s harder to reject our original hypothesis. In our example, a smaller alpha is more likely to tell us that the component upgrade did not make a difference. We have reduced the chance of concluding that the new component is better when it is not. In other words, we’ve reduced the probability of a Type I error.
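The flip is purely mechanical, as this toy sketch shows; the p-value here is a made-up result from a test like the one above.

```python
p_value = 0.03  # hypothetical p-value from the reliability test

for alpha in (0.05, 0.01):
    if p_value < alpha:
        decision = "reject the null: the component helped"
    else:
        decision = "fail to reject the null: no demonstrated improvement"
    print(f"alpha = {alpha:.2f} -> {decision}")
```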
We may also erroneously conclude that the new component is not helping when, in fact, it is. This mistake is a Type II error, whose probability is represented by beta. For a fixed sample size, alpha and beta trade off against each other: every time we reduce the risk of one error, we increase the risk of the other. What’s the correct setting for alpha and beta? How much risk can we accept? We’d better have a good idea, because this small knob controls our conclusions.
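We can estimate beta by simulation once we posit a true effect. Everything here is assumed for illustration: in this simulated world the component genuinely helps by a small amount, and we count how often each alpha setting misses that real improvement.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, trials = 100, 5_000
misses = {0.05: 0, 0.01: 0}   # Type II error counts per alpha setting

for _ in range(trials):
    old = rng.normal(0.92, 0.03, n)
    new = rng.normal(0.93, 0.03, n)   # the component truly helps in this world
    _, p = stats.ttest_ind(new, old, alternative="greater")
    for alpha in misses:
        misses[alpha] += (p >= alpha)  # failed to detect the real improvement

for alpha, count in misses.items():
    print(f"alpha = {alpha:.2f} -> estimated beta = {count / trials:.3f}")
```

In this setup, tightening alpha from 5% to 1% visibly raises the estimated beta, which is exactly the trade-off described above.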
Another type of error is answering the wrong question to begin with, sometimes known as a Type III error. If the data used for this analysis is all from the past, but the question I’m really asking is whether the new component will help our *future* operations, then regardless of alpha and beta, I commit a Type III error. With complex operations and a dynamic environment, the past does not equal the future. So I’m trying to answer a question about the future with the wrong data model. Oops!