Machine Learning – Introduction for decision makers

Technologies like Machine Learning, Artificial Intelligence and Deep Learning are now accessible to a much broader audience and threaten to disrupt more and more industries. Promises of innovative applications based on these technologies surround us every day. To get a better grasp of how and why these technologies will impact your business, you first need to understand what exactly they are. This post gives you an executive introduction to Machine Learning, before we dive deeper into the business implications during our free annual event Innovation Leader 2018 – Machine Learning Business Applications on April 26th.

What is Machine Learning?

 

“A field of study that gives computers the ability to learn without being explicitly programmed” – Arthur Samuel.

Machine Learning is a branch of Artificial Intelligence. Every day, we analyse data and make decisions based on these analyses. For example, you decided what clothes to wear this morning based on information such as what you plan to do today, where you are going to be (office, beach, gym), the weather at these locations, etc. We have learned to process all this information intuitively. This example is rather straightforward. But what about deciding on your next investment? That decision is made complex by the amount of data to consider. The volume of accessible data has now surpassed our human capacity to make sense of it, so we train computers to do it for us.

In short, the term Machine Learning refers to an algorithm or a program that can make predictions and recommendations based on patterns detected in large amounts of data. The main characteristic of these algorithms is their ability to learn by themselves and continuously improve based on the new data or experiences they receive. Put simply, we can see machine learning as a program that answers questions based on data analysis. It can provide two types of output:
1. Predictions of what will happen. For example, the number of calls a call centre will receive.
2. Recommendations of what to do. For example, which books to buy or where to invest your money.

Why is everybody interested in Machine Learning now?

The foundation of Machine Learning lies in statistics and data modelling, with theories that are far from new to the scientific community. In fact, it was more than 200 years ago, in the early 1800s, that Adrien-Marie Legendre and Carl Friedrich Gauss, a French and a German mathematician respectively, brought the concept of regression to life. This method is today one of the most used machine learning methods, according to surveys by KDnuggets and Kaggle. Unfortunately, at that time we had neither the ability to collect and store a sufficient amount of data nor the computing power to process it. Since then, several technological advancements have made the applications of this method, and of others developed in the meantime, accessible outside of labs.
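For readers who want to see what regression looks like in practice, here is a minimal sketch of a least-squares line fit in plain Python. The call-centre figures are invented for illustration, and the function name fit_line is our own choice, not taken from any library.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: active customers (in thousands) vs. calls received per day.
customers = [1, 2, 3, 4, 5]
calls = [120, 205, 310, 395, 500]

slope, intercept = fit_line(customers, calls)

# Use the fitted line to predict the calls for 6,000 customers.
predicted_calls = slope * 6 + intercept
print(f"Expected calls per day for 6,000 customers: {predicted_calls:.0f}")
```

The fitted line summarises the pattern in the past data, and the prediction is simply that line extended to a value we have not observed yet.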

How does this work?

No matter what method we use, we always go through the same three steps: Data Preparation, Model Creation and Model Validation. To hone the accuracy of our machine learning application, we iterate through this process several times.
Step 1: Data Preparation. The focus of this phase is on data collection and preparation. The quality of the data used is crucial to the precision of the final result, so we need to check for missing or wrong information and use a normalised and clean dataset.
Step 2: Model Creation. At this stage, we test several algorithms to find the one that best fits our dataset and the problem we want to solve.
Step 3: Model Validation. This phase aims to determine how well our model performs on a new set of data. In light of the new results, the model might need adjustments, and therefore we go back to step 1.
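The three steps can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the dataset is invented, the two candidate models (an average and a straight line) are deliberately basic, and the helper names fit_mean, fit_line and mean_abs_error are hypothetical. Normalisation is left out to keep the sketch short.

```python
# Step 1: Data Preparation. Drop records with missing values (None).
raw = [(1, 12.0), (2, 19.5), (3, None), (4, 41.0), (5, 50.5),
       (6, 59.0), (None, 70.0), (8, 81.5), (9, 88.0), (10, 101.0)]
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Keep the last records aside for validation; train on the rest.
train, test = clean[:6], clean[6:]


# Step 2: Model Creation. Fit two candidate models on the training data.
def fit_mean(points):
    """Baseline model: always predict the average output."""
    mean_y = sum(y for _, y in points) / len(points)
    return lambda x: mean_y


def fit_line(points):
    """Least-squares straight line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return lambda x: mean_y + slope * (x - mean_x)


candidates = {"mean baseline": fit_mean(train), "straight line": fit_line(train)}


# Step 3: Model Validation. Measure the error on data the models have not seen.
def mean_abs_error(model, points):
    return sum(abs(model(x) - y) for x, y in points) / len(points)


errors = {name: mean_abs_error(model, test) for name, model in candidates.items()}
best = min(errors, key=errors.get)
print(errors, "-> best model:", best)
```

If neither candidate performed well enough on the held-back data, we would return to step 1 with more or better data, which is the iteration described above.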

What are the different machine learning methods?

There are numerous machine learning algorithms to choose from, but as a manager or executive, you might want to leave this choice to an expert in the domain. Nevertheless, you need to know the three main categories and what types of problems each of them can address.

1. Supervised learning

Supervised learning algorithms use an initial set of labelled examples, that is, inputs paired with their known outputs, to define a model. The algorithm's work comes down to finding the patterns in the input-output relation and building a model that can predict the outputs for another, unlabelled set of data.
You can choose this type of algorithm when you have a predefined set of data with a known result. Examples include cancer tumour recognition in the medical field, a first-step filter of candidates for a job offer in human resources, and credit card fraud detection in finance.
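As a minimal sketch of the idea, the snippet below labels a new transaction by copying the label of the most similar labelled example (a nearest-neighbour approach, chosen here only because it is easy to read). The transactions and the function name classify are invented for illustration.

```python
def classify(example, labelled):
    """Return the label of the labelled example closest to `example`."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    _, label = min(labelled, key=lambda pair: squared_distance(pair[0], example))
    return label


# Invented labelled examples: (amount in EUR, hour of day) -> known result.
labelled = [
    ((12.0, 14), "legitimate"),
    ((35.0, 10), "legitimate"),
    ((8.5, 19), "legitimate"),
    ((950.0, 3), "fraud"),
    ((1200.0, 2), "fraud"),
]

# New, unlabelled transactions the model has never seen.
large_night_payment = classify((1000.0, 4), labelled)
small_day_payment = classify((20.0, 12), labelled)
print(large_night_payment, small_day_payment)
```

A real system would use far more examples and would normalise the inputs first, since here the amount dominates the comparison simply because its numbers are larger.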

2. Unsupervised learning

The second category, unsupervised learning algorithms, receives no specific information about the relation between the input data and the expected output. Instead, the algorithm identifies previously unknown patterns or groups with similar characteristics or behaviours.
We see this type of machine learning used in market research to identify customer segments, or in mobile phone infrastructure to define the best network locations.
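To illustrate customer segmentation without labels, here is a small sketch of a grouping method known as k-means, reduced to a single number per customer. The spend figures, the starting centres and the function name k_means_1d are assumptions made for this example.

```python
def k_means_1d(values, centres, iterations=10):
    """Group values around centres by repeatedly reassigning and re-averaging."""
    for _ in range(iterations):
        groups = [[] for _ in centres]
        # Assign every value to its nearest centre.
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the average of its group (unchanged if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups


# Invented yearly spend per customer (EUR). No labels are provided.
spend = [110, 95, 130, 120, 880, 910, 940, 105, 900]

# We only choose how many groups to look for (two) and rough starting points.
centres, segments = k_means_1d(spend, centres=[0.0, 1000.0])
print(centres)
print(segments)
```

Nobody told the algorithm which customers are "small" or "large" spenders. It found the two segments from the data alone, and it is up to us to interpret what they mean.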

3. Reinforcement learning

The last type of algorithm uses a system of rewards and penalties and goes by the name of Reinforcement learning. The algorithm takes its guidance from its environment: when the environment reacts positively (gives a reward), the algorithm interprets it as a confirmation that its outcome was correct. Conversely, if the reaction is negative (a penalty), the algorithm learns that the result was not the expected one. The emerging industry of self-driving cars integrates this type of machine learning. Another industry that uses this technology, and one closer to our day-to-day life, is eCommerce, where it helps optimise product picking in warehouses.
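The reward-and-penalty loop can be sketched with a heavily simplified form of reinforcement learning, often called an epsilon-greedy bandit: the program mostly repeats the choice that has paid off best so far and occasionally tries another one. The picking routes, their rewards and all names below are hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical environment: the typical reaction to each picking route.
# Positive = reward (fast pick), negative = penalty (slow pick).
TRUE_REWARD = {"route_a": -1.0, "route_b": 1.0, "route_c": -1.0}


def environment(route):
    """Return the environment's reaction: the route's reward plus some noise."""
    return TRUE_REWARD[route] + rng.uniform(-0.2, 0.2)


estimates = {route: 0.0 for route in TRUE_REWARD}  # what the algorithm believes
counts = {route: 0 for route in TRUE_REWARD}       # how often each route was tried
EXPLORE = 0.1  # share of steps spent trying a random route

for _ in range(200):
    if rng.random() < EXPLORE:
        route = rng.choice(list(estimates))        # explore
    else:
        route = max(estimates, key=estimates.get)  # exploit the best known route
    reward = environment(route)
    counts[route] += 1
    # Move the estimate towards the observed reward (running average).
    estimates[route] += (reward - estimates[route]) / counts[route]

best_route = max(estimates, key=estimates.get)
print(best_route, estimates)
```

The program is never told which route is best. It discovers this purely from the rewards and penalties it receives, which is the principle behind the much larger systems used in warehouses and self-driving cars.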

Despite the rise in the number of applications deployed, machine learning is not yet within everyone's reach. First, we often underestimate the amount of data required. To obtain sufficient confidence in the accuracy of an algorithm, we need to provide it with hundreds of thousands of times more information than we humans need to recognise an object or to do a translation. Second, choosing the right model can still be challenging, even for data scientists.
Stay tuned for more articles on the subject and don’t forget to book your place for the Innovation Leader Event.

References:
https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/an-executives-guide-to-ai
http://usblogs.pwc.com/emerging-technology/machine-learning-101/
https://projecteuclid.org/euclid.aos/1176345451
https://towardsdatascience.com/a-tour-of-the-top-10-algorithms-for-machine-learning-newbies-dde4edffae11
https://www.kdnuggets.com/2017/12/top-data-science-machine-learning-methods.html
http://inverseprobability.com/2016/03/04/deep-learning-and-uncertainty