[AI Story] Humans vs. Artificial Intelligence (5) AlphaGo, the AI That Imitated Human Intuition

2024-07-04

In this fifth installment of the series, we look at AlphaGo, which surpassed humans by imitating human intuition and reasoning. If you haven't seen the earlier installments yet, I recommend starting with them.

[AI Story] Artificial Intelligence Challenging Humans (1) Deep Blue

[AI Story] Humans vs. Artificial Intelligence (2) Watson

[AI Story] Humans vs. Artificial Intelligence (3) AlexNet, which opened the era of deep learning

[AI Story] Humans vs. Artificial Intelligence (4) Pepper, a robot that reads emotions

“We won. We landed on the moon.”

This was the reaction* of Google DeepMind Chief Executive Officer Demis Hassabis after AlphaGo won its match against Lee Sedol 9-dan. In that confrontation of the century, held in 2016, we witnessed a decisive moment comparable to the human moon landing of 1969. It made us realize that we are already living in the midst of tremendous technological innovation, and how great its impact will be.

Until then, however, it was thought that beating humans at Go would be far harder for machines than beating them at chess. Go is a complex game with vastly more possible cases than chess. In other words, the traditional approach, in which the computer calculates every possible case and derives a result from them, could no longer be expected to work.

So how did AlphaGo succeed at this seemingly impossible challenge? In this article we look at how AlphaGo came about and the secret of its success, and consider how its shocking appearance influenced the later direction of society and technological development.

Before AlphaGo

Even after the 21st century began, AI technology seemed to be at a standstill for a while. Although the public was not yet aware of it, AI technology was in fact steadily advancing toward a new heyday. Through trial and error, researchers found breakthroughs such as machine learning and deep learning, and were moving closer to AI that more closely resembled human thinking.

The rise of AlphaGo was the decisive moment that made the arrival of this new era official.

DeepMind CEO Hassabis and Lee Sedol 9-dan, sources

How did AlphaGo win?

The paper Google published in Nature, 'Mastering the Game of Go with Deep Neural Networks and Tree Search', begins with the words “All games of perfect information have an optimal value function, ...”.**

From this we can infer the direction AlphaGo's development took. After all, Go is a game of finding the best move within the limited space of the board. The problem was that the number of games that can unfold on a board of 19 horizontal and 19 vertical lines is enormous: “The number of Go cases is greater than the number of atoms in the universe.” ***

The key, then, was how to reduce Go's nearly infinite number of cases.
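
The scale of that number can be checked with a little arithmetic. The sketch below is only a rough comparison under two assumptions that are not taken from this article: each of the 361 intersections is treated as empty, black, or white (an upper bound that also counts illegal positions), and the number of atoms in the observable universe is taken as the commonly quoted order of magnitude of 10^80.

```python
# Rough size comparison; both figures are order-of-magnitude assumptions.
BOARD_POINTS = 19 * 19      # intersections on a 19x19 board
STATES_PER_POINT = 3        # empty, black, or white

# Upper bound on board configurations (illegal positions are included).
upper_bound = STATES_PER_POINT ** BOARD_POINTS

# Commonly quoted rough estimate, used here as an assumption.
ATOMS_IN_UNIVERSE = 10 ** 80

print(f"intersections: {BOARD_POINTS}")
print(f"3^{BOARD_POINTS} has {len(str(upper_bound))} decimal digits")
print(f"larger than 10^80: {upper_bound > ATOMS_IN_UNIVERSE}")
```

Even this count of static board configurations dwarfs the atom estimate, and the number of possible game sequences is larger still, which is why exhaustive enumeration is not an option.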

To this end, AlphaGo built deep neural networks (DNN) consisting of a policy network and a value network, and combined them with Monte Carlo Tree Search (MCTS). The policy network narrows the search by predicting next moves with a high chance of winning, the value network estimates the winner by calculating win rates, and MCTS combines the two to make the most advantageous choice. Training proceeds through three stages: supervised learning of the policy network, reinforcement learning of the policy network, and reinforcement learning of the value network. ****

Supervised learning of policy networks

Supervised learning on a huge amount of Go data was used to find the optimal move. Using convolutional neural networks (CNN), the policy network was trained to imitate human professional players by studying the vast number of game records accumulated over time. Thanks to this, move-prediction accuracy, which had previously been around 44%, rose to 57%.
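
The idea of learning to imitate recorded expert moves can be sketched in a few lines. This is a toy illustration, not AlphaGo's network: the four "positions", the three "moves", the `EXPERT_MOVE` table, and the learning rate are all hypothetical, and a small table of logits stands in for the CNN. Only the training signal is the same in spirit: raise the probability of the move the expert actually played.

```python
import math
import random

random.seed(0)
N_POSITIONS, N_MOVES = 4, 3      # toy sizes, nothing like a 19x19 board
EXPERT_MOVE = [2, 0, 1, 2]       # hypothetical expert reply for each position

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One row of logits per toy position stands in for the network's output.
logits = [[0.0] * N_MOVES for _ in range(N_POSITIONS)]

def accuracy():
    """Fraction of positions where the top-scoring move matches the expert."""
    hits = sum(
        max(range(N_MOVES), key=lambda a: logits[s][a]) == EXPERT_MOVE[s]
        for s in range(N_POSITIONS)
    )
    return hits / N_POSITIONS

LEARNING_RATE = 0.5
for _ in range(200):
    s = random.randrange(N_POSITIONS)    # sample a recorded position
    target = EXPERT_MOVE[s]              # the move the expert played there
    probs = softmax(logits[s])
    for a in range(N_MOVES):
        # Cross-entropy gradient for logit a is p_a - 1[a == target].
        grad = probs[a] - (1.0 if a == target else 0.0)
        logits[s][a] -= LEARNING_RATE * grad

print("prediction accuracy on the toy records:", accuracy())
```

A real policy network generalizes to positions it has never seen, which a lookup table like this cannot do; the sketch only shows the imitation objective.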

AlphaGo's neural network learning process, sources

Reinforcement learning of policy networks

The performance of the policy network is then improved further through repeated self-play, that is, reinforcement learning through practice games against itself. This made it possible to overcome the limitation of being optimized only for existing game records. Through reinforcement learning, in which the machine explores the most rewarding choices on its own, the policy network came to win more than 80% of its games against its earlier supervised-learning version.
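
Self-play reinforcement learning can be illustrated with a much smaller game. The following is a minimal sketch under stated assumptions, not AlphaGo's training code: the game is a hypothetical stone-taking game (take 1 or 2 stones from a pile of 5, whoever takes the last stone wins), the policy is a table of logits, the opponent is a periodically refreshed snapshot of the learner, and the update is a plain policy-gradient (REINFORCE) step driven only by the win or loss at the end of each game.

```python
import math
import random

random.seed(0)
START_PILE = 5      # toy game: take 1 or 2 stones; taking the last stone wins
MOVES = (1, 2)

def move_probs(logits, pile):
    """Softmax over the moves that are legal for this pile size."""
    legal = [m for m in MOVES if m <= pile]
    exps = [math.exp(logits[pile][m]) for m in legal]
    total = sum(exps)
    return legal, [e / total for e in exps]

def sample_move(logits, pile):
    legal, probs = move_probs(logits, pile)
    return random.choices(legal, weights=probs)[0]

def new_logits():
    return {pile: {m: 0.0 for m in MOVES} for pile in range(1, START_PILE + 1)}

learner = new_logits()
opponent = new_logits()     # frozen earlier copy of the learner
LEARNING_RATE = 0.1

for game in range(1, 3001):
    pile, learner_to_move, history = START_PILE, True, []
    while pile > 0:
        if learner_to_move:
            move = sample_move(learner, pile)
            history.append((pile, move))
        else:
            move = sample_move(opponent, pile)
        pile -= move
        learner_to_move = not learner_to_move
    # Whoever just moved took the last stone and won the game.
    reward = 1.0 if not learner_to_move else -1.0
    for pile_seen, move_taken in history:
        legal, probs = move_probs(learner, pile_seen)
        for m, p in zip(legal, probs):
            # REINFORCE: reward times the gradient of log pi(move_taken).
            grad = (1.0 if m == move_taken else 0.0) - p
            learner[pile_seen][m] += LEARNING_RATE * reward * grad
    if game % 500 == 0:
        # Refresh the opponent with a snapshot of the current learner.
        opponent = {p: dict(ms) for p, ms in learner.items()}

legal, probs = move_probs(learner, 2)
print("P(take 2 | pile of 2):", round(probs[legal.index(2)], 3))
```

With two stones left, taking both wins immediately, so the learned probability of that move should rise well above one half even though no one ever told the policy which move was correct.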

The structure of policy and value networks, sources

Reinforcement learning of value networks

The ability to evaluate positions is improved further by adjusting the weights based on the records accumulated from each game before moving on to the next one. Eventually, this makes it possible to find the optimal move that raises the probability of winning.
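
One way to picture this weight adjustment is as a regression from a position to the eventual game result. The sketch below is a toy under stated assumptions: the single feature `x` (a normalized lead), the synthetic outcomes, the `tanh` model, and the learning rate are all hypothetical, and a real value network is a deep CNN rather than two numbers. It only shows the principle of nudging weights after each recorded outcome so that predictions track who went on to win.

```python
import math
import random

random.seed(0)

def sample_position():
    """Hypothetical feature x in [-1, 1] and final outcome z in {-1, +1}."""
    x = random.uniform(-1.0, 1.0)    # e.g. a normalized lead for the player
    z = 1.0 if random.random() < (1.0 + x) / 2.0 else -1.0
    return x, z

w, b = 0.0, 0.0          # parameters of the toy value function
LEARNING_RATE = 0.02

def value(x):
    """Predicted outcome v(x), between -1 (loss) and +1 (win)."""
    return math.tanh(w * x + b)

for _ in range(10000):
    x, z = sample_position()
    v = value(x)
    # Gradient step on the squared error (z - v)^2 / 2.
    delta = (z - v) * (1.0 - v * v)
    w += LEARNING_RATE * delta * x
    b += LEARNING_RATE * delta

print("predicted value when behind:", round(value(-0.8), 2))
print("predicted value when ahead: ", round(value(0.8), 2))
```

After training, positions where the player is ahead should receive a higher predicted value than positions where the player is behind.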

Finding the optimal move

In summary, AlphaGo was able to select the best next move in a short time by combining the policy network and value network with the MCTS algorithm.

As shown below, the Nature paper explains the search as follows: 'a. Selection: the number of cases is reduced by judging whether a move is good or bad (policy network). b. Expansion, c. Evaluation: the next move is determined through faster prediction, and the value of the position is evaluated (rollout, value network). d. Backup: the final move is predicted and decided by combining the results of steps b and c.' *****

Searching with policy and value networks, sources
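
The four steps of selection, expansion, evaluation, and backup can be sketched on a toy game. This is an illustrative sketch, not AlphaGo's implementation: the game (take 1 or 2 stones, whoever takes the last stone wins), the uniform `policy_prior`, the constant `value_estimate`, and the constants `C_PUCT` and `MIX` are all stand-ins assumed for the example. A trained policy network and value network would replace the two stand-in functions.

```python
import math
import random

random.seed(0)
MOVES = (1, 2)      # toy game: take 1 or 2 stones; taking the last stone wins
C_PUCT = 1.5        # exploration constant (assumed value)
MIX = 0.5           # weight of the rollout result versus the value estimate

def legal_moves(pile):
    return [m for m in MOVES if m <= pile]

def policy_prior(pile):
    """Stand-in for the policy network: uniform prior over legal moves."""
    moves = legal_moves(pile)
    return {m: 1.0 / len(moves) for m in moves}

def value_estimate(pile):
    """Stand-in for the value network: no opinion about any position."""
    return 0.0

def rollout(pile):
    """Random playout; +1 if the player to move at `pile` ends up winning."""
    starter_to_move = True
    while pile > 0:
        pile -= random.choice(legal_moves(pile))
        starter_to_move = not starter_to_move
    return 1.0 if not starter_to_move else -1.0

class Node:
    def __init__(self, pile, prior):
        self.pile, self.prior = pile, prior
        self.children = {}                  # move -> Node
        self.visits, self.value_sum = 0, 0.0

    def q(self):
        # Mean value from the view of the player who moved into this node.
        return self.value_sum / self.visits if self.visits else 0.0

def score(child, total_visits):
    # The +1 keeps the bonus nonzero before any visit (a choice for this sketch).
    bonus = C_PUCT * child.prior * math.sqrt(total_visits + 1) / (1 + child.visits)
    return child.q() + bonus

def simulate(root):
    path, node = [root], root
    while node.children:                    # a. Selection
        total = sum(c.visits for c in node.children.values())
        node = max(node.children.values(), key=lambda c: score(c, total))
        path.append(node)
    for move, p in policy_prior(node.pile).items():     # b. Expansion
        node.children[move] = Node(node.pile - move, p)
    # c. Evaluation, from the view of the player to move at the leaf.
    leaf = (1 - MIX) * value_estimate(node.pile) + MIX * rollout(node.pile)
    value = -leaf                           # view of the player who moved in
    for n in reversed(path):                # d. Backup
        n.visits += 1
        n.value_sum += value
        value = -value

def best_move(pile, simulations=2000):
    root = Node(pile, prior=1.0)
    for _ in range(simulations):
        simulate(root)
    # Play the most visited move at the root.
    return max(root.children, key=lambda m: root.children[m].visits)

print("move chosen from a pile of 5:", best_move(5))
```

In this toy game, leaving the opponent a pile that is a multiple of 3 is the winning strategy, so the search should settle on taking 2 from a pile of 5 even though neither stand-in function knows anything about the game.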

In closing

AlphaGo later retired from the Go world with a brilliant official record of 12 wins and 1 loss in 13 games. In the end, Lee Sedol 9-dan's single win became the last victory for humans in a Go match against AI. Furthermore, AlphaGo went on to have a tremendous impact not only on the Go world but on society as a whole.

Since then, because its potential was not limited to Go, attempts have been made to apply AlphaGo's technology in fields such as medicine, physics, biology, and climate change. It went beyond the limits of a single-purpose AI such as the earlier Deep Blue. Before we knew it, AI technology had entered a process of evolution toward artificial general intelligence (AGI), comparable to human intellectual ability.

As a result, we were also able to begin full-scale discussions on preparing for a future in which we coexist with AI. It was a valuable opportunity to prepare for the shocks that faster-than-expected technological development could bring. Thanks to this, discussion of a future in which humans and AI coexist moved beyond vague expectations and fears and got into full swing.

We still don't know what kind of future this discussion will lead us to. But one thing is certain: since AlphaGo, we have been standing at a more important turning point than ever before.