Machine Learning: The Natural Form for Capturing Tacit Knowledge?

Chris Lee
Apr 6, 2021

Key takeaways:

Tacit knowledge is made of examples from personal experience and interaction with others
Intelligence-first is a machine learning (ML) based approach that leverages the way tacit knowledge is gained: by capturing good examples at low cost

Skilled technicians possess an amazing amount of useful knowledge. This is why aging of the workforce is a serious concern – the loss of these workers means less effective operations. The natural response is to document their know-how before it is lost. However, the information these experts carry is largely tacit knowledge. Such knowledge is very difficult to document because the expert possessing this knowledge doesn’t really know how they know it. In this post, we’ll look at what this means and how machine learning (ML) fits into the picture as a natural choice for capturing tacit knowledge.

To start, it helps to contrast two ways of “knowing” about a system: physics-based and ML-based.

Physics-based models describe a system as a set of equations based on a “complete,” explicit understanding of the system. That means every interaction is captured mathematically, describing the output in terms of some inputs. While this is typically done using a system of partial differential equations, you can get the idea of how it works with a simpler example.

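Say I want to understand the distance ($d$) a block of mass ($m$) will move in time ($t$) across some surface with a coefficient of friction ($\mu$) when pushed by a force ($F$).

I can string together a bunch of basic equations describing motion under a constant force (with $a$ the block's acceleration and $F_{\text{net}}$ the net force on it):

$$F_{\text{net}} = m a, \qquad d = \tfrac{1}{2} a t^2$$

and describing frictional forces (with $g$ the acceleration due to gravity):

$$F_{\text{friction}} = \mu m g, \qquad F_{\text{net}} = F - F_{\text{friction}}$$

to get a physics-based model of the system:

$$d = \frac{(F - \mu m g)\, t^2}{2 m}$$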

Using that physics-based model, I can describe behaviors of the block + surface system to the degree that all the assumptions in the governing equations are met.
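To make the physics-based route concrete, here is a minimal Python sketch of the block model. It assumes the standard constant-force result d = (F - mu*m*g) * t^2 / (2*m) for a block starting essentially from rest on a level, uniform surface. The function name `block_distance` and the value used for g are my own choices for illustration.

```python
G = 9.81  # acceleration due to gravity in m/s^2 (assumed value)


def block_distance(force, mass, mu, duration, g=G):
    """Distance (m) a block travels when pushed by a constant force.

    Assumes a level, uniform surface, kinetic friction equal to
    mu * mass * g, and a block that starts essentially from rest.
    """
    net_force = force - mu * mass * g
    if net_force <= 0:
        # The push does not overcome friction, so the block does not advance.
        return 0.0
    acceleration = net_force / mass
    return 0.5 * acceleration * duration ** 2


# Example: a 2 kg block pushed with 10 N for 2 s on a surface with mu = 0.1.
distance = block_distance(force=10.0, mass=2.0, mu=0.1, duration=2.0)
print(f"predicted distance: {distance:.3f} m")
```

Every input here is an explicit physical quantity, which is exactly what makes the model transparent and also what ties it to its assumptions.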

Machine learning models describe a system as a set of relationships between input variables (call them signals) and outputs but without the user making those relationships explicit. Instead, these relationships are learned by processing training examples through an algorithm (of which there are many). The details of the relationships and of the algorithms are hidden from the user, so knowledge of the underlying physics is not required.

Without getting into details of a specific algorithmic approach, ML might approach the block + surface distance question above by instrumenting the block and the pusher with a number of sensors, by pushing the block a number of times with different forces for different durations and measuring the distance it travels. The ML algorithm would then take these labeled training examples (i.e. distances the block traveled plus all the sensor readings for each push trial) and create a model which relates sensor readings to a distance. This way, given a new set of sensor readings as input, I can say how far the block will travel to the degree that the new system resembles the one I trained on and that the new input values are within the range I used during training. That is, if the new block + surface is unlike the one I learned with or if the push force or duration of push is well outside the range I collected training data for, then the results may not be accurate.
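Here is a rough Python sketch of that learn-from-examples workflow. It is not any particular product's algorithm: I have picked k-nearest-neighbour regression simply because it is easy to read. The push trials are synthetic, generated by a hypothetical `simulate_push` function standing in for real sensor measurements. The predictor itself never sees the physics. It only sees (force, duration) readings and measured distances, and it reports whether a new input falls inside the range covered by training.

```python
import math
import random


def simulate_push(force, duration, mass=2.0, mu=0.1, g=9.81):
    """Stand-in for a real measured trial (synthetic, for illustration only)."""
    net = max(force - mu * mass * g, 0.0)
    return 0.5 * (net / mass) * duration ** 2


def collect_trials(n, rng):
    """Build labeled examples of the form ((force, duration), distance)."""
    trials = []
    for _ in range(n):
        force = rng.uniform(5.0, 15.0)     # newtons
        duration = rng.uniform(0.5, 2.0)   # seconds
        noise = rng.gauss(0.0, 0.02)       # assumed sensor noise, metres
        trials.append(((force, duration), simulate_push(force, duration) + noise))
    return trials


def predict_distance(trials, force, duration, k=5):
    """Average the distances of the k most similar training trials."""
    forces = [x[0] for x, _ in trials]
    durations = [x[1] for x, _ in trials]
    # Scale each signal by its training range so both count equally.
    f_span = max(forces) - min(forces)
    t_span = max(durations) - min(durations)
    # Flag inputs that fall outside the range seen during training.
    in_range = (min(forces) <= force <= max(forces)
                and min(durations) <= duration <= max(durations))

    def similarity_gap(x):
        return math.hypot((x[0] - force) / f_span, (x[1] - duration) / t_span)

    nearest = sorted(trials, key=lambda trial: similarity_gap(trial[0]))[:k]
    estimate = sum(d for _, d in nearest) / k
    return estimate, in_range


rng = random.Random(0)
trials = collect_trials(2000, rng)
estimate, in_range = predict_distance(trials, force=10.0, duration=1.5)
print(f"estimate: {estimate:.2f} m, inside training range: {in_range}")
```

Asking the same predictor about a 50 N push still returns a number, but `in_range` comes back `False`, which is the signal that the estimate should not be trusted.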

Each of these approaches has advantages and limitations. Complexity is the enemy of physics-based models. They require a deep understanding of the system being modeled and are only accurate to the degree that the system’s actual behavior matches the assumptions of the model. Even in a simple example like the block, the assumptions quickly become limiting. For example, the surface must be the same over the entire distance (e.g. without patches of dirt, oil or scratches), the surface must be level, the force must be constant over the entire push, the block must already be in motion at the start (but at a speed very close to zero), and so on. The more complex the system, the more difficult it is to write a full mathematical description and the less likely it is that all of the constraints can be parameterized. The result is diminishing returns on model accuracy as the system’s complexity increases. This is the advantage of machine learning based approaches: for ML, the complexity of the system is not the limiting factor. Given enough input data (signals) and enough training examples, ML can make predictions across the operational space needed. This is a big reason why ML is becoming so popular today: real systems are increasingly complex and data is getting cheap, which are exactly the conditions under which ML shows its advantage over physics.
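To see how quickly one broken assumption matters, here is a small Python sketch comparing the uniform-surface formula, assumed here to be d = (F - mu*m*g) * t^2 / (2*m), against a step-by-step simulation in which friction changes along the path. The oily patch, its location and its friction values are hypothetical numbers chosen purely for illustration.

```python
G = 9.81  # acceleration due to gravity in m/s^2 (assumed value)


def closed_form_distance(force, mass, mu, duration):
    """Uniform-surface model: d = (F - mu*m*g) * t^2 / (2*m), from rest."""
    net = max(force - mu * mass * G, 0.0)
    return net * duration ** 2 / (2.0 * mass)


def simulated_distance(force, mass, mu_at, duration, dt=1e-4):
    """Integrate the motion in small time steps; friction depends on position."""
    x, v = 0.0, 0.0
    for _ in range(int(round(duration / dt))):
        accel = (force - mu_at(x) * mass * G) / mass
        # Clamp the speed so friction never drives the block backwards.
        v = max(v + accel * dt, 0.0)
        x += v * dt
    return x


def patchy_surface(x):
    """Hypothetical surface with a low-friction oily patch from 1 m to 2 m."""
    return 0.02 if 1.0 <= x <= 2.0 else 0.3


uniform = closed_form_distance(10.0, 2.0, 0.3, 2.0)
patchy = simulated_distance(10.0, 2.0, patchy_surface, 2.0)
print(f"uniform-surface formula: {uniform:.2f} m")
print(f"simulated with oily patch: {patchy:.2f} m")
```

Under these made-up numbers the block on the patchy surface travels noticeably farther than the formula predicts, and the formula has no parameter through which the patch could be expressed.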

But what does this have to do with tacit knowledge?

The experienced techs learned by seeing examples, not by creating and solving equations. When the motor hums like this, it is likely to have excess bearing wear like that. Given operational parameters X, the outcome will be Y. Without ever describing the physics or the constraints, the tech’s brain is very good at finding the relationship between input and output. Sound familiar? That’s the same thing machine learning does. Tacit knowledge is built from examples, not from detailed formulae. So rather than trying to get the tech to explain the why behind everything they know (they can’t), might it be better to capture all the whats (if X then Y) and record them in an ML system?

That’s what Falkonry’s intelligence-first approach is about. It codifies what works in tacit learning and captures it in a machine learning framework. Just as the expert’s path to knowledge is built from examples, intelligence-first focuses on identifying the examples and seeking expert knowledge about them. The approach acknowledges that getting good, contextually complete examples is really hard after the fact, so it promotes capturing those examples in the moment, when hearts and minds across organizations are aligned on addressing the problem at hand. It also recognizes that using ML in this way requires a different approach to software. That software should favor subject matter expertise over data science by removing complexities like data cleansing, normalization, gap filling and feature extraction. It should also collect examples in a way that fits with the operations team’s daily workflow, so that staying engaged brings value instead of just extra effort. Contact us today to discuss how you can start more effectively capturing the tacit knowledge in your plant.