Over the last year, I have had the opportunity to plunge into the world of AI (Artificial Intelligence). One could argue that a year is too short to say anything relevant, but it was certainly very interesting, and the various experiments were insightful.
Putting AI to work on the data generated as a by-product of our lives can give us a great advantage in our day-to-day challenges. AI, Machine Learning and Deep Learning aren’t the same thing, and it’s important to understand how each of them can be applied differently.
This is the amazing language of mathematics, statistics, logic and our fantasies about the future. AI refers to a machine that possesses the characteristics of human intelligence. This is the concept we think of as General AI. It’s defined as the capability of a computer program to perform tasks or reasoning processes that we usually associate with intelligence in a human being. A compelling argument against this definition is that it depends on what we consider intelligent in the behaviour of a human being, but it is nevertheless one of the official definitions. It also raises a lot of questions regarding ethical principles, but that’s a different topic. The idea of AI itself is very old: it is found in Greek myths, in the shape of a mechanical man designed to mimic our behaviour. Nowadays, we can divide AI into two groups: applied and general. Applied AI, sometimes also known as Narrow AI, represents the technologies that are able to perform specific tasks as well as we humans can, or even better (autonomous driving, face recognition, stock trading etc.). This is where some of the most exciting advancements are happening today. General AI, which could handle any task, for now exists only in theory.
Machine Learning is a term sometimes used interchangeably with Artificial Intelligence. Undeservedly so: Machine Learning is a part of AI, so it’s wrong to treat them as the same thing. The main idea is to create algorithms that can produce output values from input values with the help of statistical analysis. In general, it automates analytical model building. The task is to search for patterns in a large amount of data and to adjust the output actions accordingly. For example, when you shop online and are offered products related to your current purchase, or to your profile, that is, roughly speaking, the result of statistical analysis, though there is really a lot of math underneath.
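To make the idea of producing output values from input values via statistical analysis concrete, here is a minimal sketch: fitting a straight line to a handful of points with ordinary least squares. The data points are made up purely for illustration.

```python
def fit_line(xs, ys):
    """Find the slope and intercept that minimise the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Input values and the outputs observed for them (made-up data)
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # about 1.95 and 0.15

# The "learned" pattern can now produce an output for an unseen input:
print(slope * 6 + intercept)  # about 11.85
```

The pattern here is trivially simple, but the principle is the same one behind much larger models: the coefficients are derived from the data, not programmed by hand.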
We can distinguish two groups of Machine Learning: supervised and unsupervised. The distinction lies in whether labelled or unlabelled data is used during the learning (pattern-searching) process. There is also a mixture of the two, called Semi-Supervised Learning. More detailed explanations of this division and its uses will follow in future posts. For now, we will focus on Deep Learning, a subset of Machine Learning to which the groups mentioned above also apply.
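A tiny sketch of the labelled/unlabelled distinction, with made-up one-dimensional data. The supervised part copies the label of the nearest labelled example; the unsupervised part has no labels at all, so it must find structure on its own (here by simply splitting the points at the midpoint of their range).

```python
# Supervised: each training point carries a label.
labelled = [(1.0, "small"), (1.2, "small"), (7.8, "large"), (8.1, "large")]

# Unsupervised: the same values, but with no labels attached.
unlabelled = [1.0, 1.2, 7.8, 8.1]

def predict(x):
    """Supervised: return the label of the closest labelled example."""
    return min(labelled, key=lambda p: abs(p[0] - x))[1]

def cluster(points):
    """Unsupervised: split the points into two groups around the midpoint."""
    threshold = (min(points) + max(points)) / 2
    return [0 if p < threshold else 1 for p in points]

print(predict(1.5))         # small
print(predict(7.0))         # large
print(cluster(unlabelled))  # [0, 0, 1, 1]
```

In the supervised case the algorithm is told what the right answers look like; in the unsupervised case it can only report which points belong together, without naming the groups.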
To put Deep Learning into perspective, we can say the following: Machine Learning took some of the fundamental concepts of AI and steered them toward solving real-world everyday problems by using neural networks. This works similarly to how a human being perceives the world: first it observes, then it thinks about what it sees, and in the end a conclusion is drawn. The thinking part is learned from examples, and the more examples there are, the better the conclusion. Just like babies. When babies are born, their memories are a blank slate. When they see the faces of mom and dad, they slowly start to recognize them. In the beginning the baby makes mistakes, but as time passes it gets better and better at recognizing them, because of all the data it processes. The recognition goes further and becomes more efficient, until the baby also recognizes mom and dad by voice.
Deep Learning – Connecting the nodes
In order to achieve this, Deep Learning uses a layered structure of algorithms called an artificial neural network, which is inspired by the human brain. Neurons, axons and dendrites can be thought of as mathematical functions with adjustable coefficients that are fine-tuned during training. We can look at the network as a bunch of connected nodes, where each node is a mathematical function. When you define the input, it has to travel through the nodes in the network. Every node receives some value, processes it, and passes the result further down the path. The way a node treats the value it receives depends on the adjustable coefficients we mentioned. At the end there are nodes designated for the output values of the network, and their values are analysed in order to make a prediction. It sounds simple, but it’s not, because every piece of data needs a numerical representation, and the structure (if there is one) and the relations within more complex data need to be preserved during this translation. When everything is in place, the experiments can begin: tweaking the network architecture, the number of nodes, the functions and so on, in order to reach the desired output.
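The journey of a value through the connected nodes can be sketched in a few lines. This is a minimal, hand-wired network: two inputs, two hidden nodes, one output node, with the adjustable coefficients (the weights) set to made-up fixed numbers. In a real network those weights would be tuned from data during training rather than written down by hand.

```python
import math

def sigmoid(x):
    """Squash any value into the range (0, 1); a common node function."""
    return 1.0 / (1.0 + math.exp(-x))

# hidden_weights[i][j] connects input j to hidden node i (made-up values)
hidden_weights = [[0.5, -0.6], [0.1, 0.8]]
# output_weights[i] connects hidden node i to the single output node
output_weights = [1.0, -1.0]

def forward(inputs):
    """Pass the input values through every node to the output node."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
              for row in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

score = forward([1.0, 0.0])
print(score)  # a value between 0 and 1, to be read as the prediction
```

Each node does exactly what the paragraph above describes: it receives a value, transforms it according to its coefficients, and passes the result down the path until the output node is reached.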
In a nutshell
So, to sum up. The what? Everything we do every day. The how? Collecting the data and finding patterns. The why? Optimizing these processes, and making the most out of the time we have. On that note, I would like to close with something once said by Sherlock Holmes and mark his words as my conclusion: “The world is woven from billions of lives, every strand crossing every other. What we call premonition is just movement of the web. If you could attenuate to every strand of quivering data, the future would be entirely calculable. As inevitable as mathematics.”