Machine learning efficiency is surpassing Moore's Law

Posted Jun 16, 2020 (5 min read)

Introduction: Whether progress will continue unabated or, like Moore's Law, hit a wall in the next few years remains to be seen.

Eight years ago, a machine learning algorithm learned to recognize a cat, and it shocked the world.

A few years later, artificial intelligence could accurately translate between languages and defeat the world Go champion.

Now, machine learning has begun to excel at complex multiplayer video games such as "StarCraft" and "Dota 2", as well as subtle games such as poker. Artificial intelligence is developing rapidly.


But how fast is it, and what drives it? Although better computer chips are key, AI research organization OpenAI believes we should also measure how quickly the machine learning algorithms themselves are improving.

In a paper published on arXiv, OpenAI's Danny Hernandez and Tom Brown say they have started tracking and measuring the efficiency of machine learning, that is, doing more with fewer resources. They use this measure to show that artificial intelligence has been getting more efficient at an extremely fast rate.

Algorithmic efficiency improvements accelerate research

Generally speaking, three factors drive progress in AI: compute, data, and algorithmic innovation. Computing power is relatively easy to track, but improvements in algorithms are more elusive.

We can define a gain in algorithmic efficiency as a reduction in the amount of computation required to train a specific capability. Efficiency is the main measure of algorithmic progress in classical computer science. Efficiency gains on traditional problems (such as sorting) are easier to measure than in machine learning because task difficulty is more clearly defined. However, the efficiency lens can be applied to machine learning by holding performance constant.

Since 2012, the amount of computation required to train a neural network to the same performance on ImageNet classification has halved every 16 months. Compared to 2012, training a neural network to the level of AlexNet (a benchmark image recognition model) now takes 44 times less computation. The results suggest that for AI tasks that have recently received heavy investment, algorithmic progress has yielded greater gains than traditional hardware efficiency.
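As a rough consistency check on the two figures above, the sketch below derives the halving time implied by a 44-fold reduction and compares that reduction with what a Moore's Law style doubling would give over the same window. The 84-month window (2012 to 2019) and the 24-month hardware doubling period are assumptions made for illustration, and the function names are hypothetical, not taken from the paper.

```python
import math


def implied_halving_months(reduction_factor: float, elapsed_months: float) -> float:
    """Months per halving of training compute implied by an overall reduction."""
    return elapsed_months / math.log2(reduction_factor)


def gain_from_doubling(elapsed_months: float, doubling_months: float) -> float:
    """Total improvement factor if efficiency doubles every `doubling_months`."""
    return 2.0 ** (elapsed_months / doubling_months)


# Assumption: the 44x figure spans 2012 to 2019, i.e. 7 years.
ELAPSED_MONTHS = 7 * 12

# Figure from the article: 44x less compute to reach AlexNet-level performance.
algorithm_halving = implied_halving_months(44.0, ELAPSED_MONTHS)

# Assumption: hardware efficiency doubles every 24 months (Moore's Law).
hardware_gain = gain_from_doubling(ELAPSED_MONTHS, 24.0)

print(f"Implied halving time: {algorithm_halving:.1f} months")
print(f"Hardware-only gain over the same window: {hardware_gain:.1f}x")
```

Under these assumptions the implied halving time comes out a little over 15 months, close to the 16 months quoted, while hardware alone would account for roughly an 11-fold gain, well short of 44.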


Total amount of computation (in teraflops/s-days) used to train to AlexNet-level performance. The lowest-compute point at any given time is shown in blue; all measured points are shown in gray.

In other popular capabilities such as translation and games, improvement has been even faster over shorter time frames. In English-to-French translation, the Transformer surpassed the performance of seq2seq three years later with 61 times less training compute. In Go, DeepMind's AlphaZero matched AlphaGoZero just one year later with 8 times less compute. And in Dota 2, OpenAI Five Rerun surpassed the original OpenAI Five, which had defeated the world champions, only three months later with 5 times less training compute.

Improved algorithmic efficiency lets researchers run more interesting experiments for a given amount of time and money, accelerating future AI research.

Moore's Law of Machine Learning

Is there some kind of algorithmic Moore's Law in machine learning?

The researchers say there is not yet enough information to tell. Their work includes only a few data points (the original Moore's Law chart was likewise based on few observations), so any extrapolation is purely speculative. In addition, the research focuses on only a handful of popular capabilities and top-performing programs. It is unclear whether the observed trends generalize to other AI tasks.

In language, games, and other fields, large-scale compute remains very important to overall performance, so tracking efficiency matters all the more. Measuring long-term trends in both efficiency and overall performance will help quantify overall algorithmic progress. The researchers observe that hardware and algorithmic efficiency gains are multiplicative and can be of similar scale over meaningful time horizons, suggesting that a good model of AI progress should integrate both measures.
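To make the multiplicative relationship concrete, here is a minimal sketch. It assumes a 24-month doubling period for hardware efficiency and an 84-month window, neither of which is stated in the article; the 16-month period is the ImageNet figure quoted earlier. The function names are illustrative.

```python
def gain(elapsed_months: float, doubling_months: float) -> float:
    """Improvement factor after `elapsed_months` at a fixed doubling period."""
    return 2.0 ** (elapsed_months / doubling_months)


def combined_doubling_months(*doubling_periods: float) -> float:
    """Doubling period of the product of several exponential trends.

    Multiplying gains adds their exponents, so the rates (1 / period) add.
    """
    return 1.0 / sum(1.0 / p for p in doubling_periods)


ELAPSED_MONTHS = 84          # assumption: a 7-year window
HARDWARE_DOUBLING = 24.0     # assumption: Moore's Law style doubling
ALGORITHM_DOUBLING = 16.0    # article: compute halves every 16 months

hardware = gain(ELAPSED_MONTHS, HARDWARE_DOUBLING)
algorithm = gain(ELAPSED_MONTHS, ALGORITHM_DOUBLING)
combined = hardware * algorithm  # the gains multiply, they do not add

effective = combined_doubling_months(HARDWARE_DOUBLING, ALGORITHM_DOUBLING)
print(f"hardware {hardware:.1f}x * algorithm {algorithm:.1f}x = {combined:.0f}x")
print(f"effective doubling period: {effective:.1f} months")
```

Because the two trends compound, the effective doubling period under these assumptions is shorter than either trend alone, which is why a model that tracks only one of them would understate progress.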

The results also suggest that for AI tasks with high levels of investment (in researcher time and effort), gains from algorithmic efficiency may exceed gains from hardware efficiency (Moore's Law).


Moore's Law, proposed in 1965, holds that at constant price, the number of components that fit on an integrated circuit doubles every 18 to 24 months, and performance doubles with it.

At that time, an integrated circuit had only 64 transistors. Personal computers and smartphones followed (the iPhone 11 has 8.5 billion transistors). If the efficiency of AI algorithms were to improve exponentially for decades, what might that bring?
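The transistor counts above allow a quick sanity check of the doubling period. The sketch below assumes the 64-transistor figure dates from 1965 and the iPhone 11 figure from 2019; those years, and the function name, are assumptions added here for illustration.

```python
import math


def implied_doubling_months(start_count: float, end_count: float,
                            start_year: int, end_year: int) -> float:
    """Average doubling period, in months, implied by two data points."""
    doublings = math.log2(end_count / start_count)
    return (end_year - start_year) * 12 / doublings


# Counts from the article; the years are assumptions (1965 and 2019).
months = implied_doubling_months(64, 8.5e9, 1965, 2019)
print(f"Implied doubling period: {months:.1f} months")
```

With these inputs the result lands near the upper end of the 18 to 24 month range.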

For these reasons, the researchers have begun publicly tracking the best efficiency results, starting with vision and translation benchmarks (ImageNet and WMT14) and considering more benchmarks over time. Tracking a range of measures, including hardware measures, paints a more complete picture of progress and helps determine where future effort and investment will be most effective.

The future of artificial intelligence

It is worth noting that this research focuses on deep learning algorithms, currently the dominant approach in artificial intelligence. Whether deep learning can continue to make such huge progress is a subject of debate in the field. Some top researchers question its long-term potential to solve the field's biggest challenges.

OpenAI showed in an earlier paper that the latest high-profile AI systems require staggering amounts of computing power to train, and that the resources required are growing at an alarming rate. Before 2012, growth in the computing power used by AI programs largely followed Moore's Law; since 2012, the computing power used by machine learning algorithms has grown 7 times faster than Moore's Law.


This is why OpenAI is interested in tracking progress. If the cost of training machine learning algorithms keeps rising, for example, it becomes important to increase funding for academic researchers; and if efficiency trends prove consistent, it becomes easier to forecast future costs and plan investment accordingly.

Whether progress will continue unabated or, like Moore's Law, hit a wall in the next few years remains to be seen.

But as the authors write, if these trends continue, artificial intelligence will become more powerful, and perhaps sooner than we think.