– Bye, Steve. I'm going home and I suggest you do the same – said a remote voice hidden somewhere behind the desk divider. It was Henry, who had stayed late this evening hoping to finish his New Year's presentation. – Yeah, just 10 more minutes. Merry Christmas – answered a young-looking man, raising his head above the huge monitor. They shared a gentle smile. Steve followed Henry with his eyes until he disappeared behind the corner. A few seconds later he heard the familiar beep as Henry opened the corridor door with his employee badge, then a muffled metallic clap as he chose the stairs instead of the elevator. – I need to remember not to take the elevator today – Steve cringed at the idea of getting stuck in an elevator just before Christmas as he slowly raised his cup. He shamelessly slurped the last drop of coffee and went back to work.
The monitor was filled with windows, words and complicated mathematical formulas as Steve wrote the last lines of code. His bloodshot eyes were tired, but he couldn't blink while watching the compiler check the syntax. Seconds before the code finished compiling he opened a Word document with a small black title – "Project (A)L(I)SA – version 2731" – and added a new point at the very bottom: "I've added a survival rule. Code will either evolve or die." A moment later he opened a cloud console and uploaded a single file directly to their big-data cluster. All resources were free, as nobody was processing any data right now. Steve used his freshly acquired privileges to claim the maximum number of available machines. His muscles were tense, as if he were about to fight in the octagon, when he finally saw a green message confirming the data upload. – Let's see how powerful a cluster of 1000 machines really is. – With that, he started the processing. It took a while before all the worker servers were filled with the waterfall of data Steve had prepared for the code to process: thousands of books, encyclopedias, labeled pictures, mathematical equations, songs – everything he had been able to gather and transpose into a proper format. He watched as petabytes of data were processed by the machine learning script he had written. He monitored neural networks being created and killed as the code tried to find the right genome, generation after generation. An evolution that took millions of years was now accelerated by powerful computing. A few hours later the process was about to finish – but so were the resources. Steve had not left the desk for even a minute, afraid he would miss something crucial for further development. Just when the progress bar was about to hit 100 percent, something unexpected happened. The servers started to flush the data and free up resources, and the progress bar hung at 99%.
Steve opened the console to see what was going on and noticed that someone had escalated privileges on the server he was connected to. – Shit. Someone is working on Christmas. The sysadmin probably got an alert from the cloud provider and is checking what's going on. I need to kill it quickly – he whispered nervously to himself. He went back to the console and clicked the termination button, but it was unresponsive, as was the whole website. "The region is out of service" – the message appeared in his browser. – What? Impossible. – He checked the status and, to his surprise, the whole datacentre was gone. For a brief moment he entertained the idea that maybe he was the one who had killed it. – Nah, it's impossible. – He went back to his server terminal and noticed the connection was still alive. – OK. It's not that bad – he mumbled to himself, and wrote a command to shut down the main machine, but nothing happened. The connection stayed on. – Maybe someone else has logged in as root? – Steve wrote a command to display all connected users, and the output made him even more confused.
– What the heck? – Steve couldn't believe his own eyes. How had Alisa been able to escalate her own privileges in the system? Before he could do anything else, a single word appeared in the console.
At the same moment, the lights in the whole building went off. Steve's laptop was on battery, so he quickly opened the lid and checked whether the internet connection was still up – but what he saw terrified him.
What is Deep Learning
Deep learning is a subset of machine learning in which the computer learns to make predictions or decisions based on data provided by a human. The computer is aware of both the input data and the expected outcome, but has no idea (at the beginning) how to process the input to achieve that output. To simplify: think of deep learning as a framework that needs to be fed with data; based on that data, the program – without human help – tries to find the right pattern to correctly process any new information provided in the future. There is a lot of information and many studies on the subject, but I will use an example to make it more comprehensible. For now, you need to know that deep learning mimics the neural networks of the human brain.
Imagine that you're trying to teach a one-year-old child to correctly name the things you show him. – This is a spoon. A spoon. Spoon. You eat soup with a spoon – and you feed him delicious soup. – A knife. It's very sharp. You need to be careful – and you make a dangerous face showing how deadly a knife can be. – Fork – and you point at a fork. You get the idea. It takes weeks, sometimes months, to teach a baby to say a word correctly, but he almost instantly connects the things you show him with the given context or memory. When you make a dangerous face, he starts to develop a connection between a knife and danger. He connects delicious soup with a spoon, and sooner or later he will try to use it, imagining the soup will magically appear on the spoon. It takes him some time to understand how it really works, but someday he will be able to use the word "spoon" in context.
That's a basic cognitive function of our brain: to learn new things and adapt. Whenever something new appears, the brain creates a new connection between already existing neurons. This connection is called a synapse. For a kid everything is new, so the brain works like crazy to absorb as much information as it can, creating billions of connections every single day. The older we get, the more the rate at which those synapses are created slows down – but it's not only because we age. It's mainly because we already know a lot of things. We know that the bird sitting in the tree outside is a sparrow. When we hear a distant bark, we immediately imagine the neighbor's dog. When we see a red light at the street crossing, we know we must stop the car. We have been programmed since the very first day we arrived in this beautiful world, without even being aware of it. But can we make computers learn and adapt in the same way we do? Well, kind of. The branch of science dealing with this problem is called Artificial Intelligence, and it was formed as an academic discipline in 1956. So why is it such a big deal right now? I think the answer to this question comes with a new YouTube Originals series, which I strongly recommend.
How Neural Networks Work
After this slightly too long introduction, I hope you already know what deep learning is and what kind of problems it solves. But if knowing "Why?" is not enough for you, I encourage you to stay and read the rest, as I will try to answer the question "How?".
Deep learning is based on neural networks – we've already established that. Those networks are built from neurons and synapses. A specific neuron activates whenever given input data appears and passes this information through synapses to other neurons, which can activate too or stay neutral.
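This activate-or-stay-neutral behavior can be sketched in a few lines of Python. This is a toy illustration of my own, not code from any framework: a "neuron" fires when the weighted sum of its inputs crosses a threshold, and stays silent otherwise.

```python
def neuron(inputs, weights, threshold=0.5):
    """Return 1 if the weighted sum of inputs crosses the threshold, else 0."""
    activation = sum(i * w for i, w in zip(inputs, weights))
    return 1 if activation > threshold else 0

# Two inputs with equal weights: the neuron activates only
# when the combined incoming signal is strong enough.
print(neuron([1.0, 0.0], [0.4, 0.4]))  # 0.4 <= 0.5 -> stays neutral (0)
print(neuron([1.0, 1.0], [0.4, 0.4]))  # 0.8 >  0.5 -> activates (1)
```

In a real network the neuron's output would itself become an input to neurons in the next layer, which is exactly the "passing information through synapses" described above.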
Identify any object that lies around your desk or next to you. Look at this object and name it. Do you remember the first time you saw it? Probably not. The first thing I noticed on my desk was a pen, and I don't remember when I saw a pen for the first time in my life. But when I look at the picture above with Robert Downey Jr., my mind clearly rings an Iron Man bell, even though he has played in over 80 movies so far. In great simplification, my neurons for Robert Downey Jr. and Iron Man are connected with a strong bond, and I can't imagine anyone else playing Tony Stark. But for you, Robert Downey Jr. may be Sherlock Holmes, or Kirk Lazarus from Tropic Thunder, or just a middle-aged man if you don't recognize the guy at all. Somehow our mind makes those bonds in split seconds. We don't know exactly how our brain works, but what we do know made deep learning possible.
Closing the gap between what we know about the human brain and what we don’t is making artificial intelligence a reality.
Picture Robert as the input-layer data on the diagram above. Depending on whether you recognize him or not, specific neurons in your brain are activated, and this behavior is what we want computers to imitate. The computer doesn't know who Robert Downey Jr. is, but if you feed the input layer with thousands of pictures showing him in different roles, the computer should be able to find some similarities in those pictures. It should be easy, knowing that there are hundreds of good-quality pictures on the internet. What if you add photos of Scarlett Johansson, Brad Pitt, The Rock and yourself? That depends on what you want to achieve. But before we jump into image recognition of people's faces, let's first focus on something a little easier, like handwritten numbers.
As you may already know, computers don't speak the same language we do. What they know is data, structured as a flow of binary numbers – 1s and 0s. If you want to display a cow on the monitor, you need to provide a file in the correct format containing a very specific combination of 1s and 0s, corresponding to the pixels on your screen that need to fire up. In the same way, we need to provide data to our deep learning framework. Even more importantly, we need to structure the training data and label it for the computer to learn from. If, for example, we would like to train a computer to recognize handwritten numbers, we need to provide the largest sample set available. Below you can see an example array of 90 records (9 columns and 10 rows), but to properly train a deep learning framework you'll need many more. Honestly speaking, providing correctly formatted data is the greatest challenge of deep learning nowadays. The good news is that many sample databases like the one below are widely accessible and free of charge. You can find both image and label files here. If you want to test on your own how image recognition based on this database works in real life, I suggest you visit this website. You can write a digit yourself and see if the program guesses it.
Now, having the correct dataset with both images and labels, we need to feed our input neurons with this data. But wait – what information will be passed to our input neurons, and how many of them do we need? The answer will vary from one dataset to another, but in our case we need 784 input neurons. Why 784? A single image contains 28×28 pixels. Do the math 🙂 Does that mean every single record needs to be evaluated by the deep learning mechanism independently? Yes – 60,000 times, because that's the size of the MNIST dataset. Thanks to our fast processors, it will take only a few seconds. And what information will be passed to a single neuron? A value between 0.0 and 1.0, where 0 is a white pixel, 1 is black, and anything in between is a shade of gray. What we want is to activate a neuron at a certain threshold. For now, let's assume that any value higher than 0 activates the corresponding neuron and passes the information to the next layer.
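The 28×28-to-784 step above is just flattening plus rescaling. Here is a small NumPy sketch of it, using a fake image in place of a real MNIST record (the stroke pattern is made up for illustration):

```python
import numpy as np

# A fake image standing in for one MNIST record: 28x28 pixels, values 0-255.
image = np.zeros((28, 28), dtype=np.uint8)
image[10:18, 12:16] = 255  # a crude vertical stroke, vaguely like a "1"

# Flatten to 784 values and rescale intensities into the 0.0-1.0 range.
inputs = image.flatten() / 255.0

# With the simple threshold from the text, any value above 0 "activates"
# the corresponding input neuron.
active = inputs > 0

print(inputs.shape)   # (784,) -- one value per input neuron
print(active.sum())   # how many input neurons fired for this image
```

Real MNIST images carry grayscale shades too, so most inputs end up somewhere between 0.0 and 1.0 rather than exactly at the extremes.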
Having a structured dataset is very important for any machine learning technique, but it's only half the battle. To properly train a computer, we need some kind of oversight over the decisions and predictions it makes. Going back to teaching a kid how to distinguish the digit 8 from 9, we first need to show both numbers and then name them, so the kid's brain has a chance to make the right connections. Then we need to supervise his progress with numbers over a longer period of time and correct his errors – and we must do the same with a computer. Thanks to MNIST, instead of sitting in front of the monitor and checking whether our neural network has correctly guessed every single digit in a 60,000-record database, we can provide another dataset with labels assigned to every single record. This will be our trained output layer.
We don't know exactly how our brain makes this whole "thinking" process a reality. The brain is the most complicated organ in our body and, at the same time, the least understood one – surely because of its complicated nature, with billions of neurons and synapses and all that electrical noise that makes us self-aware creatures. Scientists admit that we know very little about brain structure. They are probably right, but what we know – or perhaps better, what we think we know – laid the foundation for the whole idea of hidden layers as a representation of the complicated connections in our brain. You can imagine hidden layers as synapses connecting different neurons: thousands of connections which can transport information and are able to trigger other neurons too. This is exactly why deep learning uses hidden layers. We know we need them, but we are not sure how they will behave under given circumstances. That means we need methods to validate and adjust how a piece of information from the input layer impacts the output layer. This is where weight and bias kick in.
Weight – the strength of a connection. If I increase the input, how much influence does it have on the output? Weights near zero mean changing this input will barely change the output.
Bias – how far off our predictions are from the real values; in other words, how wrong we are. A higher value represents a bigger error, so the weights on the synapses need to be adjusted accordingly. (A note on terminology: in most deep learning literature "bias" also refers to a constant offset added to a neuron's weighted sum; here I use the word in the sense of prediction error.)
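Putting the two definitions together, the standard computation inside a single neuron is a weighted sum plus a constant offset. This sketch uses my own illustrative numbers; note how the near-zero weight on the first input makes it almost irrelevant to the output:

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of inputs plus a constant bias offset."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# First weight is near zero: even a large change in the first input
# barely moves the output.
print(neuron_output([1.0, 1.0], [0.01, 0.9], bias=0.1))  # about 1.01
print(neuron_output([5.0, 1.0], [0.01, 0.9], bias=0.1))  # about 1.05
```

Training a network means finding values for exactly these weights and biases, across every connection in every layer.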
The process of changing weights after every iteration is called back-propagation, and it's the bread and butter of deep learning. It is the practice of fine-tuning the weights of a neural network based on the error rate (i.e. loss) obtained in the previous epoch (i.e. iteration). This is the part where the machine learns by itself, based on the input data and the given output labels. Unfortunately, fine-tuning is a relatively slow process that needs a vast amount of input data.
Imagine a situation where we provide the MNIST dataset as input data. The computer checks image after image. It doesn't know the difference between 1 and 2 yet, but it sees all those pixels, and they trigger the attached hidden-layer neurons. In the beginning the weights can be randomized, so the output will most probably be wrong. But that's not a problem, as long as we have correct labels and our deep learning framework is informed about its mistakes. Based on those mistakes, an error value is computed. If the value is high, the weights of the connected synapses need to be adjusted so that next time we won't make the same mistake, and so on. It takes thousands of iterations, as every epoch changes those values only a little. The important thing to remember is that when the error is big, the weight changes will be more significant than when the error is close to zero. Of course, this is another oversimplification, but if you want to learn more, I suggest checking what Gradient Descent is.
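The "big error, big adjustment; small error, small adjustment" loop described above is gradient descent in miniature. Here is a toy version for a single weight (the numbers and the learning rate are my own, illustrative choices): we want the weight that maps input 2.0 to target 3.0, i.e. 1.5.

```python
def train(x, target, weight=0.0, lr=0.1, epochs=100):
    """Nudge one weight toward the value that minimizes squared error."""
    for _ in range(epochs):
        prediction = weight * x
        error = prediction - target   # how wrong we currently are
        gradient = 2 * error * x      # derivative of the squared error
        weight -= lr * gradient       # big error -> big step, small error -> small step
    return weight

w = train(x=2.0, target=3.0)
print(round(w, 3))  # converges toward 1.5
```

Back-propagation in a real network does the same thing for millions of weights at once, using the chain rule to compute each weight's gradient from the final error.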
The idea behind this article was to build knowledge about deep learning from the ground up, so we can all better understand what artificial intelligence really is – and whether it's something we should fear and neglect, or something worth anticipating and learning. I've consciously avoided showing the math that hides behind the scenes, but this is the moment where we need at least a little of it. As you may already guess, adding all those weights and biases together can give us some pretty big numbers, both positive and negative. So how on earth can we estimate the threshold at which to activate a neuron, so it behaves a little more like a biological one? The answer is the sigmoid function, also called the logistic function.
The reason to use this function is to squeeze the possible values between 0 and 1, where negative values land closer to 0 and positive values closer to 1. Knowing that our minimum is 0 and our maximum is 1, it's much easier to set activation thresholds. Another function used in modern deep learning frameworks is ReLU, which stands for Rectified Linear Unit, but we won't talk about that now. When reading more sophisticated documents, you may also come across the term normalization – it refers to the same idea of rescaling values into a fixed range.
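The sigmoid itself is one line of code. A quick sketch to show the squashing in action:

```python
import math

def sigmoid(x):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0))    # 0.5 -- the midpoint
print(sigmoid(10))   # very close to 1 for large positive sums
print(sigmoid(-10))  # very close to 0 for large negative sums
```

However large the raw weighted sum gets, the neuron's output stays in a predictable range, which is exactly what makes threshold-based activation manageable.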
How to create a neural network
As complicated as it may seem, understanding the concept of neural networks and how they operate is not the greatest challenge. There are hundreds of different neural network models and a lot of open-source frameworks with all those mechanisms and mathematical functions already built in. That means, as a researcher in a specific field or a simple technology enthusiast, you don't need to study linear algebra or be a senior machine learning developer. You can write your own neural network with only 7 lines of code using, for example, NumPy. Another great tool is TensorFlow. With a few clicks you can build your model using Teachable Machine or TensorFlow Playground directly in the browser. There are plenty of tools that make deep learning accessible to almost everyone, even without any coding experience. You just need to provide structured input data and labels on which the model can be trained. But this is, in fact, the greatest challenge: how can we obtain a database in the proper format to feed our model? Data is the most valuable asset, and companies like Google or Microsoft are not willing to share their databases for free. Have you ever seen a Captcha where you need to identify a sign, a car, or simply transcribe a word? What do you think is the reason for that mechanism? Checking whether the people behind the screen are real human beings? Maybe. But there is another hidden motive. For years we've all been Google employees, helping teach their machine learning models by labeling the data they showed us. The funniest thing is that we did it for free.
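To show how little code a working network really needs, here is a compact NumPy sketch in the spirit of those "neural network in a few lines" tutorials (the dataset and details are my own illustration, not taken from any specific source). One layer of weights, sigmoid activation, and a back-propagation update in a loop:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny toy dataset: the label simply equals the first input column.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [0], [1], [1]], dtype=float)

weights = rng.normal(size=(3, 1))            # random starting weights
for _ in range(10000):
    output = 1 / (1 + np.exp(-X @ weights))  # forward pass through sigmoid
    error = y - output                       # how wrong we are
    # Back-propagation step: adjust weights in proportion to the error
    # and the sigmoid's slope at the current output.
    weights += X.T @ (error * output * (1 - output))

print(np.round(output.ravel(), 2))  # predictions close to [0, 0, 1, 1]
```

After training, the weight on the first input dominates the others – the network has discovered on its own that only the first column matters.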
If you want to create your own neural networks, I recommend starting by watching this other awesome YouTube short series. It will take you approximately an hour, but it's really worth the time, as it covers in detail the same handwritten-digit problem we scratched the surface of in this article.
Machine learning is changing the world around us. Tesla is building self-driving cars using deep learning. Google is teaching computers to understand petabytes of data so our search results fit us better. The advertisements we see on screen are tailored to the websites we browse. Netflix suggestions get better with time, as does iTunes with its Genius Mix. Big companies use artificial intelligence to predict customer behavior. Medical researchers use complicated models to make diagnoses or even predict a patient's future health problems. Banks can check in real time whether a given transaction was fraudulent. Security researchers and software developers use deep learning to study how malware and ransomware behave, and based on that knowledge they can stop future zero-day attacks. By the way, I'm planning to write an article about using AI in network security in the near future, so stay tuned.
People around the world are using artificial intelligence to solve all kinds of problems. Some are teaching computers to play games like Flappy Bird, Snake or Mario, but there is also a DeepMind project called AlphaStar, in which a neural network was taught to play one of the most competitive games in the world – StarCraft – and it's now better than 99.8% of all human players. That's ridiculous, considering how complicated the game is and how many strategies you can use. For me, it's simply mind-blowing how autonomous a program can be. But as always, every revolutionary idea comes with great risk – and I'm not even talking about a self-aware machine that considers humans the greatest threat, one that must be terminated. I think artificial intelligence will mainly change the way we work and live, in every aspect, and people will be forced to adapt. Some say that automation was the beginning of the 4th industrial revolution, but I don't think that's true. Automation needs a human to write a script, and we can only automate simple repetitive tasks that don't involve "thinking". In my opinion, with the rise of artificial intelligence, so-called automation will be pushed over the horizon, and the real revolution will come with computers being trained instead of programmed – computers capable of making decisions and predictions, capable of thinking. That's the future, and it's already here.
Writing this article was a real challenge, as explaining machine learning without using mathematical formulas or showing much code is pretty hard. Still, I'm extremely happy that after two weeks of intensive learning and writing, this article is now finished. It's far from perfect, and I've simplified many concepts to make it as easy to understand as possible, but I'm satisfied with how it looks right now. Here I want to say "thank you" to the authors of the following content. I strongly recommend checking the links below if you want to get a better grip on what machine and deep learning are.