Machine Learning: Why now? What’s in the future?

The following blog post is a transcription of the presentation given by Andréa Dunbar, Director of Embedded Visual Systems, CSEM, and Alumna of the EPFL EMBA program, speaking at the Innovation Leaders 2018 event on Machine Learning. You can also see the full video below.

Andréa Dunbar, Director of Embedded Visual Systems, CSEM, and Alumna of EPFL EMBA program at Innovation Leaders 2018

First I’d like to thank the organizers for inviting me here to speak today, and thank you all for coming. Tonight, I’m going to talk about machine learning. Why is it such a hot topic at the moment? Why is it happening now? And what’s going to happen in the future: first the short term, and then, at the end, some possible long-term futures of AI and machine learning.

First, I’m from CSEM. CSEM is a technology transfer company, and its mission is to help Swiss companies innovate by taking the latest technologies coming out of places like EPFL and ETHZ and pushing them into their products so that they stay on the leading edge of technology. To do this, we have about 450 people who are very multidisciplinary, and we are also very international, with about 40 different nationalities. When a technology has enough potential, we create a startup.

However, let me come back to machine learning, one of the technologies that we’re trying to help companies put into their systems. Chris already gave a very nice introduction to machine learning, but I’m going to go further into the details so you can see why it’s so relevant today, why we’re doing this now. Most machine learning out there today is called supervised machine learning. Supervised machine learning means that you tell the machine what it’s trying to predict. So, first of all, you take your data, which might be a bunch of images, and you label it, saying all of these images are chairs: “Learn that that’s a chair, machine! You need to learn that that’s a chair!” You can take millions of images to do this. Labeling is very important, and this is why it’s called supervised learning.

Supervised learning means you’re labeling the data, which is often a very laborious task. If it works, it says it’s a chair. Great, your system is working well! If it says it’s a bed when in fact it’s a chair, then you say: “Hey, you’re not doing this right,” and you change it, and you come back and train it again and again and again. This is the first part of the machine learning system, the stage that we’ve all called training tonight. Once the system has been trained, you move to the second part, where you test how well your system is predicting things. Does it predict correctly 98 times out of a hundred? 99? And where do you set this accuracy of prediction? That’s for you to decide, but it’s also going to depend on how much data you can feed into your machine and how deep you manage to make your network. So, two separate phases: first we train; when it’s trained, we test; and then we get to the measure of accuracy we want to implement.
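To make the two phases concrete, here is a toy sketch in Python. The classifier (a minimal nearest-centroid model) and the data are invented purely for illustration; real systems use deep networks and millions of labeled images.

```python
# Toy illustration of the two phases of supervised learning:
# 1) train on labeled examples, 2) test prediction accuracy on unseen data.

def train(examples):
    """Compute the centroid (mean point) of each labeled class."""
    sums, counts = {}, {}
    for (x, y), label in examples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}

def predict(centroids, point):
    """Assign the point to the class with the nearest centroid."""
    x, y = point
    return min(centroids,
               key=lambda lbl: (centroids[lbl][0] - x) ** 2
                             + (centroids[lbl][1] - y) ** 2)

# Training phase: labeled data ("this is a chair", "this is a bed").
training_data = [((1, 1), "chair"), ((2, 1), "chair"),
                 ((8, 9), "bed"), ((9, 8), "bed")]
model = train(training_data)

# Testing phase: measure accuracy on data the model has never seen.
test_data = [((1.5, 1.2), "chair"), ((8.5, 8.5), "bed")]
accuracy = sum(predict(model, p) == lbl for p, lbl in test_data) / len(test_data)
print(accuracy)  # 1.0 on this tiny toy set
```

The shape of the workflow is the same at any scale: label, train, then hold out unseen data to measure how often the prediction is right.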

An example: self-driving cars. It doesn’t always work, but what I’m going to show you, if you look down on the left-hand side, is a visual image from a normal camera, and on the main screen you’re going to see what the machine learning system is labeling. As we start moving along in the car, it sees a truck in front, sees the truck turning off; there are cars on the right and left marked with magenta boxes, and there’s a green line that you’ve trained the machine to recognize as the center line. A cyclist comes up on the right, and there is a red block: you’re getting to a red light, you need to stop. It stops. Pedestrians come in on the left. So you see there’s a huge amount of information. With this vast amount of training data, the machine extracts information from a scene it has never seen before and uses it to drive a car; that is why it’s called intelligence. This footage is from Google’s self-driving cars on YouTube. They have driven and labeled data over two million miles at 30 miles an hour. We’re talking about petabytes of data: that’s 10^15 bytes! It’s a massive amount of data! But this is what makes these systems work.

So why now? Why are we getting these systems like this here, today?

There are two main reasons in my opinion:

  • First of all, data is now digital. You could not do this when your data was handwritten. You need digital data, and today most data is digital: we’ve digitized photos, we’ve digitized logistics, and in big companies everything is digital data. So we have access to digital data.
  • The second reason is that we can now process this data with the computers on the market. My desktop computer today is a thousand times more powerful than it was 20 years ago. A thousand times!

So we now have the digital data, and this allows us to treat more and more complex problems. The simplest type of algorithm you might want to run, something like “y = x”, doesn’t care about the amount of data; you don’t need to feed this type of algorithm much data. However, if you go to a more complicated machine learning technique, where you want it to predict or extract information from something it has never seen before, you need data. The algorithm is going to improve as you feed it with data: the more data you give it, the better it gets, but only up to a certain point, and then it plateaus. Once this plateau is reached, it doesn’t get any better.

Now let’s make an even more complicated system with a deeper neural network, where we can look at more complex cases. Then we’re going to need even more data to get an even better result. And on it goes. The large neural networks today have a thousand layers. It takes a considerable amount of data, but you can treat more complex problems, and your accuracy can go up. So that’s why we’re doing machine learning now: BECAUSE WE CAN!

Where are we using machine learning today? Self-driving cars, and anywhere you want to extract information from data that is never quite the same: medical images, translating languages, anomaly detection. When you have a defect in a piece of fabric, it never looks the same, so you can’t just say: “Just look for this.” Every time it’s different.

Now, this is a business conference, so what I want to explain is why data is an asset to your company, and how it can become a protected intangible asset if you build a machine learning product on top of it. You start with your asset, your data; you make a machine learning product that works; and you give it to your clients, saying: “Look, this can detect whether there’s a defect in the material you’re making. Use it!” Every time they use it, they’re collecting more data for you, which you use to train your network again, so it works better next time. The users keep using it, the loop repeats, and soon you have the best product on the market. You go back to them, they come back to you, and you have this wonderful feedback loop. Suddenly you’re ahead of the game and better than anyone else on the market. So there is an urgency to get into the game. Many people in this room are in a unique position: they are in a company that has access to a very particular data set. So you can solve a very particular problem, and if you get onto the market first, I think it gives you an advantage.

Most machine learning today is done in the cloud, and I think we saw tonight what happens when you go up to the cloud: it treats your data, and it comes back down. There are advantages and disadvantages, but that’s where we are today. One of the drawbacks of using the cloud is that it takes quite a lot of energy, because you have to move your data, store it, and bring it back. The computing center in Nevada is six hundred and eighty thousand square meters. Some think that data centers might account for as much as 20% of the world’s energy consumption by the year 2025. So they’re quite energy intensive.

The other thing is that, for the moment, it is the big companies that are using the data. Little companies don’t seem to be using it, and this is a problem. The big companies are getting people to give their data for free, and it’s giving them the advantage. So we need to grab back that data. The third thing is security! We’ve heard about that as well tonight: people are stealing your data, using it without your permission, and we want our data back. So these are three issues that we are facing today. Another issue that comes with transferring to and from the cloud, as we saw tonight: it doesn’t always work! There’s a bandwidth issue, a latency issue, and sometimes that’s a problem.

Where is machine learning going in the medium term?

In my opinion, we have to take machine learning out of the cloud and do it where you’re getting your data. This means moving to the edge: where you take your image is where you treat your data. Then you don’t have to worry about bandwidth problems or latency, you can get automatic feedback, and you can do it instantaneously.

I show here an example of what we do at CSEM: a face recognition system we’ve built as a self-contained vision module. The aim is to detect and track faces, and everything happens inside this module. We measure key features: face detection is based on the physiological distances between the eyes and from the eyes to the nose. We name the people, and the system learns who they are and can recognize them afterwards. However, the key point is that it’s an edge device: it has no communication with the cloud, and all of the electronics are embedded. The sensing, the storage, the processing, and the communication are all on this device. You don’t need to communicate your data to anybody else. You don’t need to share your data. You don’t need to connect to the cloud. One advantage of such a system is energy: it uses a lot less energy because you’re not storing and transmitting your data.

I believe these devices will ultimately be autonomous, like a watch that runs on a solar battery. I think that eventually many edge devices will also work through energy harvesting, and we will no longer need these massive data centers, which cost us dearly in the transmission and storage of energy. The data that is at the moment in the hands of the big players, we can grab back and keep hold of: use our data for our own purposes, even as a small company. Last but not least, security. You’re no longer shifting large amounts of data to places you don’t want it to go, and you take care of your own security. I’m not saying it’s more secure, but you have control over your own security. So these are the advantages of computing at the edge.

What next?

Everything we talked about tonight, apart from the clustering that was mentioned, has been supervised learning. Supervised learning means we label the data, feed it into an algorithm, and train it that way: laborious, tiring, nobody likes doing it. So how do we go beyond that, to a paradigm where we can use our data without this process? As often, we look at biological systems and how they do it. You don’t tell your kid every day: “That’s a horse. That’s a horse. That’s a horse. That’s a dog.” They manage to learn this without that. They do something we call unsupervised learning or weakly supervised learning. For unsupervised learning, you need a lot more data, but you don’t have to label it. What an unsupervised learning algorithm does is learn patterns, like a child learning to categorize dogs: the child doesn’t know it’s called a dog until you say “yes, that’s a dog,” but then they know every other dog fits into that category. Alternatively, we use weakly supervised systems, where we give them a set of rules: “if you do this, you get chocolate.” Quickly they learn that that’s a good thing to do.
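As a concrete sketch of unsupervised learning discovering patterns without labels, here is a minimal one-dimensional k-means clustering in Python. The data and cluster count are invented for illustration; real unsupervised systems work on far richer data.

```python
# Minimal sketch of unsupervised learning: group unlabeled numbers into
# two clusters with a few iterations of k-means. No labels are given;
# the algorithm discovers the two groups on its own.

def kmeans_1d(data, iters=10):
    c1, c2 = min(data), max(data)          # initial cluster centers
    for _ in range(iters):
        g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        c1 = sum(g1) / len(g1)             # move each center to the mean
        c2 = sum(g2) / len(g2)             # of the points assigned to it
    return sorted([c1, c2])

# Unlabeled measurements: e.g. heights of small and large animals, mixed.
measurements = [0.5, 0.6, 0.55, 1.6, 1.7, 1.65]
centers = kmeans_1d(measurements)
print(centers)  # two discovered group centers, roughly 0.55 and 1.65
```

Only after the groups emerge would a human attach a name to each one, just as the child learns the word “dog” after already forming the category.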

These are two approaches that are not used much in machine learning today but are extremely powerful and will, to a certain degree, revolutionize machine learning. Why aren’t they used? Two reasons, in my opinion. Just as we’re doing supervised machine learning here and now because we have digital data and powerful computers, for these next two techniques our computers are not powerful enough, and the processing algorithms we have today are not good enough. So we have two issues again. Look at the human brain: it uses between 10 and 25 watts and performs about 10 to the power of 17 operations per second. That’s a lot, but so do the supercomputers: they can also do 10 to the power of 17 operations per second, but they take 10 million watts to do it. That is a great deal less efficient. On top of that, the human brain has a three-dimensional organization, not two-dimensional like silicon electronics. And the brain learns: it is an adaptive system that changes as you go along. This is not the case for the machine.
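The efficiency gap quoted above is worth working out. Using the round figures from the talk:

```python
# Back-of-the-envelope comparison of energy efficiency (operations per watt),
# using the approximate round figures from the talk.
brain_ops, brain_watts = 1e17, 20          # ~10^17 ops/s at ~20 W
super_ops, super_watts = 1e17, 10e6        # ~10^17 ops/s at ~10 MW

brain_eff = brain_ops / brain_watts        # operations per second per watt
super_eff = super_ops / super_watts
print(brain_eff / super_eff)  # the brain is ~500,000x more energy efficient
```

At equal throughput, the brain delivers roughly half a million times more operations per watt than the supercomputer in this comparison.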

Another limiting factor of silicon electronics is the way electronics are organized today, which we call the von Neumann architecture. All that means is that you have memory on one side and processing on the other, and it’s inefficient: you take a bit of data, store it in memory, send it to the processor; the processor works on it and then sends it back to memory. It’s a very linear, slow process. To improve this memory-processing arrangement, we need to look to the human brain. When we manage to bring processing and memory closer together, we can use more powerful algorithms and move into the new paradigm of unsupervised and weakly supervised learning.

There is already some work being done on this. I want to finish by mentioning projects like the Human Brain Project at EPFL, which I think is an absolutely fantastic project. It really goes to the root of how things work, how the brain works. When you know how something works, then you can start to mimic it artificially. Here, I want to use the parallel of flight: at the very beginning of aviation, we looked at the birds and just tried to copy them. So you can imagine that we could try to build a brain, 1.7 kilos of water and a bit of matter stuck together, but that is not going to get us what we need. What we need is to understand fundamentally how the human brain works; then we can adapt our silicon electronics. We’re already doing it a bit: we are putting dynamic memory into the processing, people are working on neuromorphic computers, and so on. I believe that this is going to be the future.

Thank you for your time! Thank you for listening!

AI is dead, long live AI

Pierre Kauffmann, Senior Enterprise Architect, IBM speaking at Innovation Leaders 2018 event on Machine Learning and Artificial Intelligence.

When it comes to Machine Learning (ML), our brains produce two opposite reactions: one is the fear of an Artificial Intelligence (AI) overlord that will enslave us, and the other is excitement about the capabilities AI can bring to humankind. In other words, horror and fantasy, the themes of numerous bestsellers, Hollywood movies, and press stories. The basic concept of AI, and inherently of ML, is to have one or several machines crunching data. Nothing has changed since the beginning of IT. Therefore, data is the key.

In this talk, we take a look at data, specifically the business data of the enterprise rather than social data. Business data is very different by nature and available in very different forms; consequently, company data remains siloed. Recently a new tendency has emerged: breaking these silos to create so-called Data Lakes or, more ambitiously, Data Oceans. Unfortunately, because of limited governance, these efforts instead create Data Swamps, resulting in hardly usable data. This situation is due to several factors: data is created at speed in huge volumes, it is rarely verified, and it comes with uncertainty, meaning the data is doubtful. Moreover, most data (over 80%) is unstructured and therefore meaningless to non-AI systems.

I then explain why even traditional AI systems do not bring much value if not properly fed. Mainly, they can categorize: an AI system can readily distinguish a human hand from a watch, or easily classify maps. However, the challenge now is to go further and extract meaning from this data. For example, with a 12th-century world map, the point is no longer to classify it as a map but to understand where Africa or Asia is on this map, or even deeper, where Italy or the UK is. My point is that having data does not mean you know what the data means, and this is crucial for any AI system.

Before starting to train your Machine Learning algorithm, you need to spend almost 80% of your project time preparing the data. During this phase, you need to identify the data, connect to its sources, collect it, and understand it before you curate and enrich it. Without these costly, time-consuming tasks, you will have to deal with considerable noise in your data, making Machine Learning less helpful for your organization; see the giant panda example.

The best analogy is to consider any AI or ML system to be like a new employee: you need to teach him or her your organization’s jargon, your rules, the actions to take, the solutions to infer, and so on. This new employee will make mistakes, and a supervisor needs to correct the outcome and tell the employee what is wrong and what is right in your organizational context. You would never let a new employee trained on a specific task take a business-critical decision, so don’t let the machine decide! Instead, let the machine advise you and “augment” you rather than replace you. At the other end of the process, assuming a well-trained AI system is in place, you still need to retrain your AI/ML system over and over, like any new employee, and justify to its human peers why it brings value to the organization.

My final word is about making sure that, when creating an augmented intelligent system rather than an artificial one, you seek expertise across your ecosystem, following a simple motto: “No one is immune to a good idea.”

AI is on the roadmap, where do I start?

Marc Lecoultre, Co-founder, speaking at Innovation Leaders 2018 event on Machine Learning.

Have you ever heard from your board or management “We need to do something with AI, we need to show the market that we are ahead of the competition”? I hear this very often when I visit my clients for the first time. The next sentence is most often “but we don’t know where to start”.

Through this article, I would like to share my experience of implementing AI-based services for businesses. I have practiced AI, or more precisely Machine Learning (ML), for over 15 years. I have worked on dozens of projects in various companies and industries, and I cannot say that any one of them was better prepared than another to implement ML.

You may think that you are reading yet another article on AI and that you are not going to learn anything. Before you turn the page, let me reassure you: this is not an article on AI like those you’ve probably already read. You will not find generic facts about AI, nor descriptions of case studies you have already seen. I am going to talk about AI for ALL, AI for YOU, without math or theory, and I will present a real working example that will demystify the whole subject.

I would like to convince you that AI is concrete, and not only for global or high-tech data-driven companies. Too often, my clients get lost in the overabundance of information; in my opinion, they are not getting the right information.

Have you ever tried to find business information on AI, or to search for companies in Switzerland that could help you? It is not an easy task. It’s hard to find talent that understands what it takes to implement AI-based services. It’s not like a traditional IT project, where the suppliers are well identified.

You hear about AI everywhere and every day: in newspapers, on the radio, on television, and in the dozens of conferences held every month on AI. You have probably read many articles about use cases implemented by others; most of them come from the United States and are far from your daily concerns. You may even be frustrated: everyone tells you that AI is like the Holy Grail, and you feel like a little kid looking at an amazing toy, thinking you can never have it.

When I visit customers, they are often a little nervous at the beginning of the meeting. They feel uncomfortable; they don’t really know what we are going to tell them. They do not master the subject, as might be the case when they meet a vendor who will set up a new IT infrastructure. By the end, we see people smile, as if they were relieved. It motivates me every day to do what I do: democratize AI and make it accessible to all.

But then, what can AI really do for you? For enterprise applications, there are primarily two domains where AI can have an impact: improving decision-making and driving operational improvement. I could make an endless list of use cases you’ve heard about. To illustrate the fact that you do not get the right information about the potential of AI, consider this case: reducing the churn rate by using social media data. At first it seems promising to use social media data to learn more about your customers and infer their behavior. If you are a well-known B2C company selling your product around the world, you may indeed get additional information from this type of data. If you are the manager of a political campaign, you could also get useful information…

But seriously, have you ever tried to search Twitter or Facebook for the number of posts referencing your brand or product? You are probably marketing an excellent product or service, but your customers do not talk about it on social media. We did the test for one of our clients, a health insurance company. We collected all posts on Swiss health insurance and found fewer than a thousand documents; most of them reflected news in the media, and almost none spoke of specific insurance companies. It was useless…

This is not “AI for All” but rather “AI for them”! Remember, you have been asked to do something with AI for your business. Now that you have read the articles, watched the videos, and searched the Internet, you are probably sitting at your desk, and I can hear you say: it’s not for us.

But yes, it is for you. You probably will not use social media data, but your own data is, in most cases, sufficient. You probably have several years of historical data, structured and unstructured. Based on it, we can build useful AI services like the one I’m going to show you now.

The case I want to share with you is the optimization of a back office that we did for a client. Suppose you receive customer requests of varying complexity each day, such as insurance claims, loan applications, tax returns, customs declarations, or subscriptions to a service offered by your company. We encoded the complexity from 0 to 4, 0 being the least complex and 4 the most complex.

In your team you have people with different skills and capacities: some can read Italian and German perfectly but not English, others do not work on Wednesdays, and so on. How do you divide the workload among your team members so that processing time stays short but quality remains high?

How would you do that? With a traditional approach, you would probably design a rule-based system that infers the complexity of each new request from a set of hand-written rules. You can clearly see the disadvantages of this approach: you end up with a static system that cannot take into account changes, either in the behavior of your customers or in the skills of your team. The system may become impossible to maintain because of the large number of rules needed to describe the complexity of the actual process.

We developed a system that learns the complexity of a task and then assigns it to the right person or group of people. Using ML algorithms, we can create a self-adapting system that you can test from Excel. Yes, I just said Excel. I told you, this article aims to demystify AI, and what is more common than an Excel sheet?

How does it work? For this case, we used Microsoft Azure Machine Learning Studio to develop the application. It is one of the well-integrated machine learning tools available in the cloud today, and it is visual and intuitive. The result of this development process is a web service that you can call using the Azure Machine Learning add-in for Excel. All you need to do is add the web service by providing its URL and API key. When you save the Excel workbook, your web services are saved with it, so you can share the workbook with other users and allow them to use the web service.

In this Excel sheet, there are two specific regions: one containing the data of one or more client requests, and the other showing the predicted complexity. In the request region, you find fields like the customer identifier, age, annual income, number of children, … In the prediction zone, you have five columns, one for each complexity level, showing the probability that the request is of that complexity. The last column contains the final prediction: the complexity with the highest probability.

When you change the values of a request, Excel sends the data to the web service and receives its complexity prediction in return. You can play and test different scenarios to validate the predictive model.
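The “final prediction” column is simply the class with the highest probability. A one-line sketch of that step (the probability values below are made up for illustration):

```python
# Pick the complexity class (0..4) with the highest predicted probability,
# as the final-prediction column in the Excel sheet does.
probs = [0.05, 0.10, 0.60, 0.20, 0.05]   # one probability per complexity 0..4
prediction = max(range(len(probs)), key=lambda c: probs[c])
print(prediction)  # 2
```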

I will now present how we did it and give some details on the different steps needed to implement such a service. There are no technical details in what follows; it is a high-level overview of the process needed to get this example to work.

We start the process by creating what we call an experiment. As part of this experiment, we train a model to learn to attribute a complexity to unknown demands, i.e. new customer requests. As with all supervised ML problems, we need historical data from which the system will learn. In this case, we use a set of 20’000 requests with their associated complexity, the label.

After loading the data, the second step is called feature engineering. This is the process of using domain knowledge of the data to create new features that make machine learning algorithms work better. For example, you can extract date and time information from a timestamp and create features such as weekend, day or night, day of the week, year, …
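The timestamp example can be sketched in a few lines of Python. The feature names and thresholds (e.g. what counts as “night”) are illustrative choices, not the ones used in the real project:

```python
from datetime import datetime

# Sketch of timestamp feature engineering: derive weekend, day/night,
# day-of-week, and year features from a raw ISO timestamp.
def timestamp_features(ts):
    dt = datetime.fromisoformat(ts)
    return {
        "year": dt.year,
        "day_of_week": dt.strftime("%A"),
        "is_weekend": dt.weekday() >= 5,        # Saturday=5, Sunday=6
        "is_night": dt.hour < 6 or dt.hour >= 22,  # illustrative cutoff
    }

features = timestamp_features("2018-06-16T23:30:00")
print(features)  # a Saturday night in June 2018
```

Each derived column becomes one extra input the learning algorithm can exploit.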

Then, you need to divide your dataset into at least two subsets, called the training set and the test set. We will not use the third set, called the validation set, in this example, for simplicity, although for your own implementation I recommend using one. The training set is used to train the model, and the test set to measure its accuracy on data that has not been seen. This tells us whether the model is able to generalize.

In our example, we decided to train two algorithms: a neural network and a decision tree. Both models are scored on the test set and then compared to each other. We used overall accuracy to decide which model to put into production. In this case, the decision tree did better, with an overall accuracy of 0.94 versus 0.88 for the neural network.
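The model-selection step above can be sketched as follows. The predictions here are invented for illustration; in the real project they come from the trained neural network and decision tree:

```python
# Sketch of model selection: score each trained model on the held-out
# test set and keep the one with the best overall accuracy.
def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

test_labels = [2, 0, 3, 1, 4, 2, 2, 0, 1, 3]     # true complexities
model_predictions = {                             # made-up model outputs
    "decision_tree":  [2, 0, 3, 1, 4, 2, 2, 0, 1, 0],   # 9/10 correct
    "neural_network": [2, 0, 3, 1, 4, 2, 0, 0, 3, 0],   # 7/10 correct
}
scores = {name: accuracy(p, test_labels) for name, p in model_predictions.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # decision_tree 0.9
```

Overall accuracy is the simplest possible criterion; on imbalanced classes, a per-class metric would be a safer choice.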

And we are done: to deploy our web service, we just click a button, and it becomes available from Excel or any other application via a simple HTTP POST request. As you can see, this is not rocket science, but be very careful here: it requires the right skills, methodology, and management organization to succeed.
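For applications other than Excel, calling the deployed service is just an HTTP POST with a JSON body. The sketch below only builds the request; the URL, API key, field names, and payload shape are placeholders, not the real service’s contract:

```python
import json

# Sketch of preparing an HTTP POST call to a deployed scoring web service.
# URL, API key, and field names are placeholders for illustration.
def build_request(customer):
    body = json.dumps({"Inputs": {"input1": [customer]}})
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",   # placeholder key
    }
    return body, headers

body, headers = build_request({"age": 42, "annual_income": 85000, "children": 2})

# To actually send it (a network call, not executed here):
#   import urllib.request
#   req = urllib.request.Request("https://example.invalid/score",
#                                body.encode(), headers)
#   response = urllib.request.urlopen(req)
print(json.loads(body)["Inputs"]["input1"][0]["age"])  # 42
```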

What are the ingredients necessary for an AI project to succeed? First, you need affordable computing power to run the training phase of your model; some algorithms are greedy in terms of computation. You can find many specialized cloud infrastructures and start right away; you do not need to invest in an on-site infrastructure that costs a lot of money to buy, install, and maintain.

Then you need sophisticated algorithms to build your model. The good news is that the open source community in AI is big: many very good algorithms are available in libraries for languages like R and Python, the two programming languages most commonly used for machine learning.

The previous two components are required but not sufficient to implement an AI project successfully. The third ingredient is data. Data is at the heart of your project and is a very important success factor. If you plan to implement a specific AI service such as the one presented, you will need clean and reliable data from your organization.

I would like to clarify something here that is not always understood. Requiring data does not mean collecting as much data as possible. An AI project does not start with your data but with your strategy: what does your company want to achieve with AI? What data already exists matters less.

This does not mean deleting data that could be useful in the near future, either. I have seen companies delete their data: to save a few megabytes of disk space, one company decided to keep only a summary of orders after one year and delete the order lines. Data storage costs almost nothing today, so please keep the data that you already collect. Even the logs of your website and application logs, which are usually deleted every n days, are full of information.

You have to find a compromise: when you are dealing with data, you need a good data strategy and data governance that determine what data to keep or delete. Data governance refers to the overall management and caretaking of data, from its creation to its deletion, covering usability, integrity, and security.

Beyond the technical and organizational aspects, an important part of good data governance relies on building a data culture within your organization. At every level, there should be a culture of treating data as a valuable asset and taking care of it.

You do not have to wait for the entire data strategy to be in place before starting your AI projects. It can be built in parallel and benefit from your first experiences.

Last but not least, you need talent that knows how to use this technology, manipulate data, and lead an AI implementation project. The problem: the skills required are scarce, and most of them are absorbed by the largest companies in the world. You will likely need to find these skills outside your company and partner with suppliers specialized in AI. For these kinds of projects, experience helps a lot. It’s not just about technical knowledge: we rely a lot on gut feeling, and we often know whether an algorithm will work without always being able to explain why.

Now is the right time to get on board and use AI intelligently for the benefit of your processes, products, and services. I encourage you to discover the potential of AI for your own business and to concretely assess the relevance of this approach.

We created an AI Starter Kit: a flat-rate offering that allows your company to discover the potential of AI by solving one of your problems. At the end of it, you will have an AI-based service implemented, like the one I presented in this article.

In conclusion, I would like you to remember that, yes, AI can be complex and requires specific skills, experience and an adapted management to be implemented and deployed successfully. But I also would like you to feel that “AI for all” can become “AI for you”.