In today’s competitive energy markets, operators must run highly efficient operations in order to compete. This means using their existing knowledge effectively to improve asset performance, and making sense of the large volume of data available to them.
Fortunately, machine learning (ML) can be a powerful tool to help make sense of the data and drive the creation of valuable insights about assets. In this article, we’ll examine the deployment of machine learning analytics, challenges faced, suggested best practices, machine learning basics and types, and benefits.
First, however, we need to address some misconceptions about what machine learning is and is not, and how it can help your organization.
Simply defined, machine learning is a type of artificial intelligence (AI) that enables a system to learn from data. Machine learning is a subset of AI, but the two terms are NOT synonymous. In fact, the joke amongst data scientists is: “If it’s written in Python, it’s called machine learning; if it’s written in PowerPoint, it’s called AI.”
It’s also important to clarify some myths surrounding machine learning; for example, the myth that machine learning will replace people’s jobs and take over the human workforce. In fact, we have found that the key to using machine learning effectively is for subject matter experts to apply their knowledge and experience alongside the outputs of machine learning. In this respect, machine learning is a tool to help experts be more effective at their jobs.
Let’s examine some basics and different classes of machine learning:
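Supervised Learning -- machine learning that is trained on data with known outputs (labels), so the model learns to predict those outputs from the inputs.
Unsupervised Learning -- machine learning that is applied to data without known outputs, so the model looks for structure, such as clusters, in the data itself.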
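To make the distinction concrete, here is a minimal sketch in plain Python. The data, the tag names (temperature, power, vibration), and the helper functions fit_line and two_means are all hypothetical and chosen only for illustration; in practice you would more likely reach for a library such as scikit-learn.

```python
# Supervised: known outputs are available, so we fit a model that predicts them.
# Hypothetical data: ambient temperature (C) and measured power output (MW).
temps = [10.0, 15.0, 20.0, 25.0, 30.0]
power = [52.0, 50.0, 48.0, 46.0, 44.0]  # the known outputs

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(temps, power)
predicted = slope * 35.0 + intercept  # prediction for an unseen temperature

# Unsupervised: no known outputs, so we only look for structure in the data.
# Hypothetical data: vibration amplitudes with no labels attached.
vibration = [1.0, 1.2, 0.9, 1.1, 4.8, 5.1, 5.0]

def two_means(values, iterations=10):
    """Minimal 1-D k-means with k=2 (assumes at least two distinct values)."""
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high

centers = two_means(vibration)
print(f"predicted power at 35 C: {predicted:.1f} MW")
print(f"cluster centers: {centers[0]:.2f}, {centers[1]:.2f}")
```

The supervised half produces a prediction that can be checked against measured values. The unsupervised half only reports that two groups exist; deciding whether they mean "normal" and "abnormal" is left to a person who knows the asset.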
There are two well-known challenges to using machine learning: data quality and model interpretation.
First, because machine learning is an automated process, the data you use must be consistent and clean. Data scientists can at times spend up to 50% of their time cleaning the data, which includes removing outliers, filling in missing data, and normalizing data.
Examples of data preparation needs include solving problems around:
Of course, you can also use machine learning and analytics to help with this task. See, for example, the imputer classes in scikit-learn (sklearn), which can be used to fill gaps in missing data, some of them using machine learning. At NarrativeWave, we use this approach along with many other methods, such as automatic substitution of missing data and outlier detection, to preprocess data for machine learning.
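As a rough sketch of what imputation and normalization can look like with scikit-learn, the example below fills gaps with SimpleImputer and rescales with MinMaxScaler. The sensor table and its column meanings are hypothetical, and the median strategy is just one reasonable choice, not a description of NarrativeWave's own pipeline. scikit-learn also offers KNNImputer, which estimates missing values from similar rows.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler

# Hypothetical sensor table: columns are temperature (C) and pressure (bar).
readings = np.array([
    [20.0, 1.0],
    [np.nan, 1.2],   # missing temperature
    [22.0, np.nan],  # missing pressure
    [24.0, 1.4],
])

# Fill each gap with the median of its own column.
imputer = SimpleImputer(strategy="median")
filled = imputer.fit_transform(readings)

# Rescale every column to the range [0, 1] so tags are comparable.
normalized = MinMaxScaler().fit_transform(filled)

print(filled)
print(normalized)
```

Fitting the imputer and scaler on training data only, and then reusing them on new data, helps keep the preprocessing consistent between model building and deployment.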
Second, interpreting the outputs of models can be challenging. For example, an unsupervised technique can create clusters of data across multiple data tags and dimensions, but figuring out what those clusters mean requires human interpretation. Other techniques, however, such as Decision Tree Regressors, excel at finding easier-to-interpret relationships between data tags. Overall, particularly when applying machine learning techniques to critical assets, it is important to combine human interpretation with the results.
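To illustrate why decision trees are comparatively easy to interpret, here is a small sketch using scikit-learn's DecisionTreeRegressor. The data is synthetic and the tag names (load, ambient) are assumptions made for the example: the target is constructed to depend only on load, so we can see whether the tree recovers that.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)

# Synthetic tags: only "load" actually drives the target in this toy data.
load = rng.uniform(0, 100, 200)
ambient = rng.uniform(-10, 40, 200)
X = np.column_stack([load, ambient])
y = np.where(load > 50, 80.0, 60.0)  # target steps up at high load

tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# Human-readable split rules and per-tag importance scores.
tag_names = ["load", "ambient"]
rules = export_text(tree, feature_names=tag_names)
importances = dict(zip(tag_names, tree.feature_importances_))

print(rules)
print(importances)
```

The printed rules read as plain threshold checks on each tag, which an engineer can compare directly against what they know about the asset. A real dataset will rarely be this clean, so the rules are a starting point for discussion with an expert rather than a conclusion.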
Machine learning can provide significant benefits when applied to large datasets and groups of assets.
Benefits and Uses include:
At NarrativeWave we have developed several best practices when applying machine learning techniques:
It’s imperative that you start by examining what delivers value to your organization, and then align your approach to that. The machine learning technology you deploy will then have a clearly defined output and goal, resulting in better adoption. In fact, a key cause of failure in data science efforts is the lack of a defined final goal.
As discussed earlier, the goal here is to identify and remove errors, outliers, and missing data. This includes validating sensor readings (are they believable?), as well as cleaning other data sources from a data engineering perspective. Visualizing the data can help in this process, as can several well-known analytics for outlier detection and imputation of missing data.
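One way to sketch the "are they believable?" check is to combine a physical range test with a simple statistical outlier test. The function name validate_readings, the limits, and the sample values below are hypothetical; the outlier rule is the commonly used modified z-score based on the median absolute deviation, with 3.5 as a conventional threshold.

```python
from statistics import median

def validate_readings(values, low, high, z_limit=3.5):
    """Label each reading as ok, missing, out_of_range, or outlier.

    Assumes at least one reading falls inside [low, high].
    """
    in_range = [v for v in values if v is not None and low <= v <= high]
    center = median(in_range)
    # Median absolute deviation: a spread estimate that resists outliers.
    mad = median([abs(v - center) for v in in_range])
    flags = []
    for v in values:
        if v is None:
            flags.append("missing")
        elif not low <= v <= high:
            flags.append("out_of_range")  # physically implausible value
        elif mad > 0 and 0.6745 * abs(v - center) / mad > z_limit:
            flags.append("outlier")       # plausible, but far from the rest
        else:
            flags.append("ok")
    return flags

# Hypothetical exhaust temperatures (C); sensor limits assumed to be 0..1000.
exhaust = [510.0, 512.0, None, 509.0, 511.0, 9999.0, 560.0, 510.5]
flags = validate_readings(exhaust, low=0.0, high=1000.0)
print(list(zip(exhaust, flags)))
```

Flagging rather than deleting keeps the decision with the engineer: a reading marked as an outlier may be a failing sensor, or it may be the event you are trying to detect.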
When using machine learning, it’s important to take an agile, iterative approach. Not every solution you try will work, so it is essential to have a framework (and organizational understanding) for experimenting, measuring, failing, adjusting, and redeploying models.
Starting with the simplest model that you believe will solve the problem is often best. Simpler models are faster to build and test, require fewer computational resources, and are easier to interpret. We will often start with statistical methods, then move up to machine learning, and then to deep learning, in that order, unless we’re confident that a more complex approach is already proven.
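The "simplest model first" idea can be sketched as a selection rule: list candidate models from simplest to most complex, and keep the first one whose held-out error is close enough to the best. The function pick_simplest, the tolerance, and the prediction values below are hypothetical and stand in for real model outputs.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def pick_simplest(candidates, actual, tolerance):
    """Return the first (simplest) model within `tolerance` of the best error.

    candidates: list of (name, predictions), ordered simplest first.
    tolerance: acceptable extra error, in the same units as the target.
    """
    errors = [(name, mean_absolute_error(actual, preds))
              for name, preds in candidates]
    best = min(err for _, err in errors)
    for name, err in errors:
        if err <= best + tolerance:
            return name, err

# Hypothetical held-out measurements and predictions from three models.
actual = [10.0, 12.0, 14.0, 16.0]
candidates = [
    ("statistical baseline (mean)", [13.0, 13.0, 13.0, 13.0]),
    ("linear regression", [10.1, 11.9, 14.1, 15.9]),
    ("deep learning", [10.1, 11.9, 14.1, 16.0]),
]

name, err = pick_simplest(candidates, actual, tolerance=0.05)
print(f"selected: {name} (MAE {err:.3f})")
```

Here the most complex model is slightly more accurate, but not by enough to justify its extra cost, so the simpler one is kept. The right tolerance depends on what an error of that size costs in your application.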
Machine learning offers significant benefits for energy asset operators, and has been successfully used on multiple asset classes and types. In addition, machine learning can offer exceptional insights using many variables and large amounts of data, helping engineers and asset managers to focus on areas of high importance.
Some of our favorite resources include: