The promise of machine learning is simple: Take some operational data, add some ground truth data (labels), put it into a black box and receive wisdom. The reality of machine learning is more complicated.
Data, especially the ground truth data used to label interesting events, is hard to get and can be difficult to interpret. This is enough of a challenge that the Pentagon’s Joint Artificial Intelligence Center calls it out explicitly in its RFI for predictive maintenance, where it accounts for half of the challenge areas presented. We have also encountered this challenge first hand on many occasions. For example, in one case we had two years of maintenance records but could not make effective use of them:
Because these events are from months or even years in the past, incomplete records make it very difficult to investigate what actually happened. Ground truth data is a veritable Gordian knot. Untying it to understand the thread of truth it contains is impractical. However, without ground truth data, machine learning systems quickly starve and the operational excellence projects that depend on them stall as well.
We have seen two common responses to data starvation like this: relying on data scientists and creating a data lake.
Both approaches have drawbacks.
Relying on data scientists does not scale. Data scientists can sometimes help retire technical risk, but that very commonly fails to lead to meaningful adoption. As we have related through our observations about increasing production with zero marginal cost predictive analytics and our experience with Agile approaches, it is important to have operational experts lead operational excellence projects.
Creating a data lake defers the reckoning. The business practices that led to the inconsistencies and gaps will not go away on their own. Nor will the data in the data lake suddenly become enriched based on changes made a year from now. In the meantime, support for the operational excellence program can fade as technology and money are applied but few actionable insights are gained.
Instead of trying to untie the data knot, what if the problem could be solved by cutting it – by removing the need for historical ground truth data altogether? As we discussed in “The importance of working in the now,” our experience is not only that there is a way to do this, but also that doing it this way, using machine learning with realtime data, is more effective than the alternatives.
Our approach is embodied in our Time Series AI platform. Cut the knot.