Machine Learning: Why now? What’s in the future?

The following blog post is a transcription of the presentation given by Andréa Dunbar, Director of Embedded Visual Systems at CSEM and alumna of the EPFL EMBA program, at the Innovation Leaders 2018 event on Machine Learning. You can also see the full video below.

Andréa Dunbar, Director of Embedded Visual Systems, CSEM, and Alumna of EPFL EMBA program at Innovation Leaders 2018

First I’d like to thank the organizers for inviting me here to speak today, and thank you all for coming. Tonight, I’m going to talk about machine learning. Why is it such a hot topic at the moment? Why is it happening now? Also, what’s going to happen in the future? I’ll cover the short-term future first, and then at the end I’m going to talk about some possible long-term futures of AI and machine learning.

First, I’m from CSEM. CSEM is a technology transfer company, and its mission is to help Swiss companies become innovative by taking the latest technologies coming out of places like EPFL, ETHZ and pushing them into their products so that they stay on the leading edge of technology. To do this, we have about 450 people who are very multidisciplinary, and we’re also very much international with about 40 different nationalities. When technologies have enough potential, we will make a startup.

However, let me come back to machine learning, one of the technologies that we’re trying to help companies put into their systems. Chris already gave a very nice introduction to machine learning, but I’m going to go further and give you some of the details so I can bring out why it’s so relevant today, why we’re doing this now. Most machine learning that is out there today is called supervised machine learning. Supervised machine learning means that you are telling the code, or the machine, what it’s trying to predict. So, first of all, you’re taking your data, and it might be a bunch of images, and you’re labeling that data, and you’re saying all of these images are chairs: “Learn that that’s a chair, machine! You need to learn that that’s a chair!” You can be taking millions of images to do this. Labeling is very important, and this is why it’s called supervised learning.

Supervised learning means you’re labeling the data, which is often a very laborious task. If the system says it’s a chair, great, your system is working well! If it says it’s a bed when in fact it’s a chair, then you say: “Hey, you’re not doing this right,” and you change it, and you come back and train it again and again and again. This is the first part of the machine learning system. That’s the stage that we’ve all called training tonight. Once the system has been trained, you move to the second part, which is where you test how well your system is predicting things. Is it predicting that it’s a chair 98 times out of a hundred? Is it 99? Also, where do you set the bar for this accuracy of prediction? That’s for you to decide, but it’s also going to depend on how much data you can feed into your machine and how deep you manage to make your code, as we call it. So there are two separate phases: first we train, and when it’s trained we test, and once we’ve reached the level of accuracy we want, we can implement it.
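
As an illustration of the two phases described above, here is a minimal sketch in Python. It is not the kind of deep network discussed in the talk: it assumes a toy data set in which each object is described by two made-up measurements (height and width in metres), and it swaps in a very simple nearest-centroid classifier. All names, numbers, and the 150/50 split are hypothetical.

```python
import random

def make_labeled_data(n, rng):
    """Generate n labeled examples: ((height_m, width_m), label). Toy data, not real."""
    data = []
    for _ in range(n):
        if rng.random() < 0.5:
            data.append(((rng.gauss(1.0, 0.1), rng.gauss(0.5, 0.1)), "chair"))
        else:
            data.append(((rng.gauss(0.5, 0.1), rng.gauss(2.0, 0.1)), "bed"))
    return data

def train(examples):
    """Training phase: learn one average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for (h, w), label in examples:
        sh, sw = sums.get(label, (0.0, 0.0))
        sums[label] = (sh + h, sw + w)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sh / counts[label], sw / counts[label])
            for label, (sh, sw) in sums.items()}

def predict(model, features):
    """Predict the label whose centroid is closest to the given features."""
    h, w = features
    return min(model, key=lambda label: (model[label][0] - h) ** 2
                                        + (model[label][1] - w) ** 2)

def accuracy(model, examples):
    """Testing phase: fraction of held-out examples predicted correctly."""
    correct = sum(1 for features, label in examples
                  if predict(model, features) == label)
    return correct / len(examples)

rng = random.Random(0)                             # fixed seed so the run is repeatable
data = make_labeled_data(200, rng)
train_set, test_set = data[:150], data[150:]       # the test examples are never trained on
model = train(train_set)
test_accuracy = accuracy(model, test_set)
print(f"accuracy on unseen examples: {test_accuracy:.2f}")
```

The point of the split is that accuracy is measured only on examples the model has never seen, which is the "98 or 99 out of a hundred" figure mentioned above.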

An example: self-driving cars. It doesn’t always work, okay, but what I’m going to show you, if you look down on the left-hand side, is an image, a visual image just from a normal camera, and then on the main screen you’re going to see what the machine learning part is labeling. So as the car starts moving along, it sees a truck in front, it sees the truck turning off, there are cars on the right and left shown as magenta boxes, and there’s a green line, because you’ve trained the machine to know that this is the center line. There’s a cyclist coming up on the right, and there is this red block: you’re getting to a red light and you need to stop. It stops. You have pedestrians coming on the left. So you see there’s a huge amount of information. With a vast amount of data, we’ve trained the machine, and it works even though it has never seen this particular data. This is why it’s called intelligence: without ever having seen this particular image, it extracts information from the data and uses it to drive a car. This video of Google’s cars is from YouTube. They have driven for two million miles at 30 miles an hour, labeling data. We’re talking about petabytes. This is like 10^15 bytes! It’s a massive amount of data! However, this is what manages to make these systems work.

So why now? Why are we getting these systems like this here, today?

There are two main reasons in my opinion:

  • First of all, data is now digital. You could not do this when your data was handwritten. You need to have digital data. Today much data is digital: we’ve digitalized photos, we’ve digitalized all logistics, and in big companies it’s all digital data. So we have access to digital data.
  • The second reason is that we can process this data with the computers that are now on the market. My computer today is a thousand times more powerful than it was 20 years ago. A thousand times, my desktop!

So we now have the digital data, and this allows us to tackle more and more complex problems. The simplest type of algorithm that you might want to run, something like “y equals x”, doesn’t care much about the data. You don’t need to feed this type of algorithm with much data. However, if you go to a more complicated machine learning technique, where you want it to predict or to extract information from something it’s never seen before, you need data. The algorithm is going to improve as you feed it with data: the more data you give it, the better it’s going to get, but only up to a certain point, and then it’s going to plateau. Once this plateau is reached, it doesn’t get any better.

Now let’s make an even more complicated system with a deeper neural network, where we can look at more complex cases. Then we’re going to need even more data to have an even better result. And on it goes. The large neural networks today have a thousand layers. That takes a considerable amount of data, but you can tackle more complex problems, and your accuracy can go up. So that’s why we’re doing machine learning now: BECAUSE WE CAN!

Where are we using machine learning today? Self-driving cars, and anywhere you want to extract information from data that is not always the same: medical images, translating languages, anomaly detection. When you have a defect in a piece of fabric it never looks the same, so you can’t just say: “Just look for this.” Every time it’s different.

Now, I’m at a business conference, so what I want to explain is why data is an asset to your companies and how it can be a protected intangible asset for you if you make a machine learning product that works with that asset. So you start with your asset, that is your data, you make a machine learning product that works, and you give it to your clients, and you say: “Hey look, this can detect if there’s a defect in the material that you’re making. Use it!” Every time they use it, they’re collecting more data for you to use to train your network again, and you’re going to do it better next time. Those users are going to use it again and generate more data again, and then you’re going to have the best product on the market. Then you go back to them again, and they come back to you, and you have this wonderful feedback loop. Suddenly you’re ahead of the game, and you’re better than anyone else on the market. So there is an urgency to get into the game. Many people in this room are in a unique position: they are in a company that has access to a very particular data set. So you can solve a very particular problem, and if you get onto the market first, I think it gives you an advantage.

Most machine learning today is done in the cloud, and I think we saw tonight what happens when you go up to the cloud: it processes your data, and it comes back down. There are advantages and disadvantages, but that’s where we are today. One of the drawbacks of using the cloud is that it takes quite a lot of energy, because you have to move your data, store it, and bring it back. The computing center in Nevada is 680,000 square meters. They think that the energy used in data centers might be as much as 20% of the world’s consumption by the year 2025. So they’re quite energy intensive.

The other thing is that it is the big companies that are using the data at the moment. Little companies don’t seem to be using the data. This is a problem. The big companies are getting people to give their data for free, and it’s giving them the advantage. So we need to grab back that data. The third thing is security! We’ve heard about that as well tonight: people are stealing your data, they’re using it without your permission, and we want our data back. So these are three issues that we are facing today. Another issue that goes with transferring to and from the cloud, as we saw tonight: it doesn’t always work! There’s a bandwidth issue, a latency issue, and sometimes that’s a problem.

Where is the medium-term future of machine learning going to be?

In my opinion, we have to take it out of the cloud and do the machine learning where you’re getting your data. This means moving to the edge: where you take your image is where you process your data. Then you don’t have to worry about bandwidth problems or latency, you can get automatic feedback, and you can do it instantaneously.

I show here an example of what we do at CSEM. Here is a face recognition system that we’ve done. It’s in a vision-in-package module. The aim is to detect and track faces. Everything is in this module that we make. So we measure key features: face detection is based on the physiological distances between the eyes and from the eyes to the nose. We name the people, and the system learns who the people are and can recognize them afterwards. However, the key to this is that it’s an edge device: it has no communication to the cloud, and all of the electronics are embedded. The sensing, the storage, the processing, and the communication are all on this device. You don’t need to communicate your data to anybody else. You don’t need to share your data. You don’t need to connect to the cloud. One advantage of such a system is energy. The system uses a lot less energy because you’re not storing and transmitting your data.

I believe these things will ultimately be autonomous. For example, a watch which runs on a solar-charged battery. I think that eventually many edge devices will also work through energy harvesting, and we will no longer need these massive data centers, which are costing us in terms of the energy for transmitting and storing the data. The data, which at the moment is in the big companies’ hands, we can grab back and keep hold of. We can use our data for our own purposes, even as a small company. Last but not least, security. You’re no longer shifting large amounts of data to where you don’t want them to go, and you take care of your own security. So I’m not saying it’s more secure, but you have control over your own security. So these are the advantages of computing at the edge.

What next?

Everything we talked about tonight, apart from the clustering that was mentioned, has been supervised learning. So supervised learning is: we label the data, we feed it into an algorithm, we train it this way. It’s laborious, it’s tiring, we don’t like doing it, nobody likes doing it. So how do we go beyond that? How do we go to a paradigm where we can use our data without this process? As often, we look at biological systems and how they do it. I mean, you don’t tell your kid every day: “That’s a horse. That’s a horse. That’s a horse. That’s a dog.” You know, they manage to learn this without that. They do something we call unsupervised learning or weakly supervised learning. Now, for unsupervised learning you have to have a lot more data, but you don’t have to label it. What the algorithm does in unsupervised learning is learn patterns. It’s like a child who learns to categorize a dog: the child wouldn’t know it’s a dog until you say “yeah, that’s a dog,” but then they know every other dog fits into that category. Alternatively, we use weakly supervised systems, where we give them a set of rules: “If you do this, you get chocolate.” Quickly they learn that that’s a good thing to do.
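
The idea of learning patterns without labels, and then attaching a name to a whole category from a single example, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses plain k-means clustering on one made-up measurement (shoulder height in metres), and the data and function names are hypothetical, not something from the talk.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group unlabeled numbers into k clusters with plain k-means (requires k >= 2)."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly between the smallest and largest value.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def cluster_of(value, centers):
    """Index of the cluster whose center is closest to the value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

# Unlabeled observations: nobody has said which animals these are.
heights = [0.45, 0.55, 0.50, 1.55, 1.65, 1.60, 0.48, 1.58]
centers = kmeans_1d(heights, k=2)

# One named example is enough to label an entire learned category.
names = {cluster_of(0.50, centers): "dog", cluster_of(1.60, centers): "horse"}
print(names[cluster_of(0.52, centers)])   # a new, never-seen small animal
print(names[cluster_of(1.70, centers)])   # a new, never-seen large animal
```

The clustering step never sees a label; the names are attached afterwards, which mirrors the child who already has the category before being told the word for it.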

These are two systems that are not used much today in machine learning but are extremely powerful and will revolutionize machine learning to a certain degree. Why aren’t they used? Well, two reasons in my opinion. Just as here and now we’re using supervised machine learning because we have digital data and powerful computers, for these next two techniques the computers are not powerful enough, and the actual processing algorithms that we have today are not good enough. So we have two issues again. Look at the human brain: it uses between 10 and 25 watts, and it does about 10 to the power of 17 operations per second. So quite a lot, but so do the supercomputers: they can also do 10 to the power of 17 operations per second, but they take 10 million watts to do it. That is quite a lot less efficient, and on top of that, what we see in the human brain is that we have a three-dimensional organization. It’s three-dimensional, not two-dimensional like silicon electronics. The brain also learns. It is an adaptive system which changes as you move along. This is not the case for the machine.
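
The efficiency gap quoted above can be worked out directly. The short Python sketch below uses only the orders of magnitude given in the talk (they are estimates, not measurements), and the variable names are purely illustrative.

```python
# Figures quoted in the talk (orders of magnitude, not measurements).
OPS_PER_SECOND = 1e17          # quoted for both the brain and the supercomputer
BRAIN_WATTS = (10.0, 25.0)     # low and high power estimates for the brain
SUPERCOMPUTER_WATTS = 1e7      # "10 million watts"

def ops_per_joule(ops_per_second, watts):
    """One watt is one joule per second, so (ops/s) / W gives operations per joule."""
    return ops_per_second / watts

supercomputer_eff = ops_per_joule(OPS_PER_SECOND, SUPERCOMPUTER_WATTS)
brain_eff_low = ops_per_joule(OPS_PER_SECOND, max(BRAIN_WATTS))    # brain at 25 W
brain_eff_high = ops_per_joule(OPS_PER_SECOND, min(BRAIN_WATTS))   # brain at 10 W

ratio_low = brain_eff_low / supercomputer_eff
ratio_high = brain_eff_high / supercomputer_eff
print(f"supercomputer: {supercomputer_eff:.1e} operations per joule")
print(f"brain: {brain_eff_low:.1e} to {brain_eff_high:.1e} operations per joule")
print(f"brain advantage: {ratio_low:,.0f}x to {ratio_high:,.0f}x")
```

With these figures the brain comes out at roughly 400,000 to 1,000,000 times more operations per joule than the supercomputer.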

One other limiting factor of silicon electronics is the way electronics are organized today, which we call the von Neumann configuration. All that means is you have memory on one side and processing on the other. It’s inefficient. You take a bit of data and store it in the memory, and to work on it, you send it to the processing. The processing works on it, and then it sends it back to the memory. It’s a very linear, slow process. To improve that link between memory and processing, we need to look to the human brain. When we manage to improve how memory and processing work together, we can use more powerful algorithms and move into the new paradigm of unsupervised and weakly supervised learning.

There is already some work done on this. I want to finish by saying there are projects like the Human Brain Project at EPFL, which I think is an absolutely fantastic project. It’s really going to the root of how things work, how the brain works. When you know how something works, then you can start to mimic it artificially. Here, I want to use the parallel of flight. At the very beginning of aviation, we looked at the birds, and we just tried to copy them. In the same way, you can imagine that we could try to build the brain: take 1.7 kilos of water and a bit of matter and stick it together. That is not going to get us what we need. What we need to do is understand fundamentally how the human brain is working, and then we can move on and adapt our silicon electronics. We’re already doing it a bit: we have dynamic memory which is being put into the processing, people are working on neuromorphic computers, and so on. I believe that this is going to be the future.

Thank you for your time! Thank you for listening!