Data Structures and Algorithms | Part One Lesson Three: Algorithm Complexity (Part 1)

Posted May 27, 20209 min read

Program = data structure + algorithm

Author Xie Enming, the public number " Programmer Union ".
Reprint please indicate the source.

"Data Structure and Algorithm" full series

brief introduction

  1. The correctness of the algorithm
  2. The complexity of the algorithm
  3. "Asymptotic" metric
  4. The first part of the fourth lesson

1 . The correctness of the algorithm

In the last lesson Data Structures and Algorithms | Part I, Lesson 2:Little Ducks Going to Travel , we told an interesting little story just to lead Algorithm complexity.

The complexity of the algorithm is very important, and there is a lot of content to talk about, so we are divided into two classes.

When programmers need to solve problems related to computer science, they(usually) write a program. This program contains an implementation, which means that the algorithm needs to be implemented in a programming language.

We know that an algorithm is just an accurate description of the steps to solve a problem. It does not depend on the programming language or working environment used by programmers.

We introduced a recipe for "Cooking Instant Noodles" in Data Structures and Algorithms | Part One Lesson 1:What are Data Structures and Algorithms , this recipe Although simple, it can be said to be an algorithm.

We describe this recipe in Chinese. If I now translate this recipe into a foreign language(such as English), but the content of the recipe remains the same, I just changed a language to explain it. Another Englishman will cook instant noodles according to this English recipe. It will be the same as what I made.

So, what do programmers need to do when they need to implement algorithms in a programming language?

Like the farmer Oscar, he must first verify that his algorithm is correct, which means that it produces the expected results and solves the problems involved. This is very important(if the algorithm is not correct, then we have no need to select and optimize it), and sometimes it is a very difficult step to verify the correctness of the algorithm.

Algorithm accuracy , English is Algorithm Correctness(algorithm means algorithm , correctness means correctness). There are many methods to prove the correctness of the algorithm, we will not elaborate on this course, because this is not our focus. If you are interested, you can search online.

Of course, if the algorithm is correct, there is no guarantee that the program implementing this algorithm will be free from bugs.

Once we have a correct algorithm, we use a programming language to implement it(by writing a program to execute it). But we may write a program with many small errors. These small errors may be related to the programming language used, and may be introduced in the process of writing the program. Because algorithms generally do not describe how to avoid making mistakes, such as how to manage program memory, how to check segmentation faults(Segmentation Fault), etc., but these are all issues that programmers need to consider when implementing algorithms.

2 . Algorithm complexity

Once the programmer is convinced that his algorithm is correct, he will try to evaluate its efficiency, for example, he will want to know "is this algorithm fast?"

One might think that the best way to know whether an algorithm is fast or not is to implement the algorithm and test it on a computer.

Interestingly, this is usually not the case. For example, if two programmers implement two different algorithms and test the speed of the program on their computers. Then the programmer with the faster computer may mistakenly think that his algorithm is faster, but it is not necessarily true.

Moreover, it is sometimes not easy to implement an algorithm and test it. Not to mention that sometimes it is difficult to write the implementation code based on an algorithm. If the algorithm to be implemented involves the launch of a rocket, is it possible to test the algorithm with a real rocket every time? We are not as rich and self-willed as Iron Man.

Iron Man

For these reasons, computer scientists have invented a very convenient and powerful concept, which is what we will learn next:the complexity of the algorithm.

"Complexity"(complexity, which means "complexity" is a noun. Its adjective is complex, which means "complex") is a bit misleading, because we do not emphasize "difficult to understand"(very complicated, Difficult), but refers to efficiency(efficiency).

"Complexity" does not mean "degree of complexity". Some algorithms are difficult to understand(very complicated), but their complexity can be very low.

If you want to explain the algorithm complexity in one sentence, it can be:
"If a program that implements this algorithm is given an input of size N, what is the function of the number of operations that the program will perform on the order of N(f(N))?"

The f in f(N) means function(function). I believe that everyone has learned it in the previous math class(such as y = f(x) which we used to have). f(N) means "a function of N".

The above sentence is based on the fact that the procedure for solving a problem depends on the starting conditions of the problem. If the starting conditions change, the program execution time will become longer or shorter.

Complexity can quantify(through mathematical formulas) the relationship between the starting conditions and the execution time of the algorithm.

The above sentences are a bit difficult to understand at first glance. What is operation? What is the number of operations and what is the magnitude of the number of operations?

Don't worry, we will explain slowly:

Some mathematical concepts are involved in the learning of complexity. So, learning mathematics is very helpful for programming, and English is also very important. If you are good at English and mathematics, learning programming will be easier. You can refer to my previous article: For programmers, why is English more important than mathematics? How to learn .

The order of magnitude is the level of the scale or size of the exponential quantity, with a fixed ratio between each level. The commonly used ratios are 10, 2, 1000, 1024, and e(Euler's number, approximately equal to the transcendence number of 2.71828182846, which is the base of the natural logarithm).
-Excerpt from Baidu Encyclopedia

In order to "count the number of operations", we must first define what is an operation. The operation is actually not difficult to understand. For example, in the first lesson of the first part, we talked about an algorithm of "cooking instant noodles".

  1. Pour the right amount of water into the pan
  2. Light a fire on the stove(if it's an induction cooker, you don't need a fire)
  3. Put the pot on the stove
  4. Wait for the water to boil and switch to medium heat
  5. Put the instant noodles into the pan
  6. Cook for half a minute
  7. Put in all seasoning packages
  8. Cook for 1 minute
  9. Out of the pot

You can think of each of the above steps as an operation(operation, "operation", this English word also means "operation").

But when calculating the number of operations, which operations should we choose? Even the most intelligent scientists may not be able to answer very clearly. Because which operation to choose for calculation depends on the problem(even the algorithm) considered.

We must choose some operations that the algorithm often performs, and use it as a benchmark to measure the complexity of the algorithm.

For example, to make fried eggs, three basic operations can be considered:

  • Break eggs
  • Shredding the eggs being fried
  • Fried eggs

Therefore, we can count the number of eggs in each fried egg recipe to understand the complexity of the recipe(algorithm)(fried eggs is a well-known dish, we can expect all fried egg recipes to have the same complexity:N eggs, we perform 3N operations). The operation of adding salt, spring onion, pepper or other condiments is very fast and does not need to be considered within the complexity of the recipe(algorithm).

Fried Egg

As another example, in the story of the Ducklings Going to Travel in the previous lesson, the big box containing the ducklings is N rows and N columns. Make N2(N squared) round trips; if you use the second new algorithm, you only need N round trips.

This is a good measure of complexity because farmer Oscar's back and forth between the pond and the truck is the longest of all the operations of the algorithm. Other operations are relatively less time-consuming, such as Oscar picking a duck from a large box, asking which pond the duck is going to go to, and so on.

Therefore, we can say that the total time of this algorithm is almost mainly spent on this operation back and forth between the pond and the truck, so it can be used as a measure of the efficiency of this algorithm.

If the concept of complexity is still a bit vague for you, please do n t worry, the following practical courses should make you suddenly cheerful.

3 . "Asymptotic" metric

Above we introduced the basic concept of complexity, and we continue to talk below.

We can say:complexity is a measure of the asymptotic behavior of the algorithm.

Asymptotic English is Asymptotic. So, what does this somewhat incomprehensible word "asymptotic behavior"(or just "progressive") mean?

This means "when the input becomes very large"(even "going to infinite"). The "input" here is a quantification of the initial conditions of the algorithm.

In the "Little Ducks Go Traveling" story, this means "when there are many rows/columns of little ducks in a big box", for example, N is 250. In computer science, the meaning of "many" is slightly different:for example, search engines will say "there are many pages", 1 trillion pages ...

"When the input becomes very large"(even "going to infinity") there are two results:

On the one hand, constant time(constant quantity) is not considered. Because "constant time" does not depend on input.

For example, before the farmer Oscar starts to take the ducklings to every pond, he will first open the back door of the truck. The time to open the door after the truck is considered "constant":whether he has 20 rows of 20 rows of ducklings or 50 rows of 50 rows of ducklings in his large box, it takes the same time to open the door. Since we have to consider the efficiency of the algorithm when the number of rows/columns of the large box is very large, compared with Oscar's back and forth time between the pond and the truck, the time after opening the truck door is negligible.

On the other hand, the "constant multiplication factor" is not considered:the complexity measure does not distinguish between algorithms that perform N, 2N, or 250N operations. why? We consider the following two algorithms that depend on N:

The first algorithm:

Do N times(Operation A)

The second algorithm:

Do N times(operation B then operation C)

In the first algorithm, we do N operations A; in the second algorithm we do N operations B and N operations C. Suppose these two algorithms solve the same problem(so they are both correct), and all operations are taken into account in the complexity measure:then the first algorithm does N operations and the second algorithm does 2N operating.

But can we say which algorithm is faster? Of course not. Because it depends on the time taken by the three operations A, B, and C:maybe operations B and C are both 4 times faster than operation A, then overall, the second algorithm with 2N operands is The first algorithm for N is faster because 1/4 + 1/4 = 1/2(operation B and operation C are both a quarter of operation A).

Therefore, constant multiplication factors that do not necessarily have an effect on the efficiency of the algorithm are not considered in the measurement of complexity.

This also allows us to answer the question at the beginning:if two programmers have two computers, one is five times faster than the other. 5 This constant multiplication factor will be ignored, so the two programmers can compare the complexity of the algorithm without problems.

It can be seen that in the measurement of the complexity of the algorithm, we ignore a lot of things, which allows us to have a fairly simple and universal concept. This generality makes complexity a useful tool, but it also has obvious shortcomings:in some very special cases, more complex algorithms can actually be completed in less time(for example, constant factors in practice Perhaps it can play a very crucial role:suppose that the farmer Oscar s truck rear door is stuck, he actually spent a whole day opening the rear door).

However, in most cases, complexity is still a reliable indicator of algorithm performance. Especially when the gap between the two complexities becomes larger and larger as the input increases.

An algorithm with N more time-consuming operations may take longer to run when N is 10 or 20 than another algorithm with N2(N squared) less time-consuming operations; but when N is When 20,000 or N is 5 million, the algorithm with lower complexity will definitely be faster.

4 . The first part of the fourth lesson

Today s lesson is a bit more difficult, you might as well read it several times. In the next lesson we continue to study the complexity of the algorithm. let's work hard together!

Next lesson: Data Structures and Algorithms | Part 1 Lesson 4:Algorithm Complexity(Part 2)

My name is , the public number "[Programmer League]( . "jpg)" operator, elite lecturer Ms. Oscar , a lifelong learner.
Love life, like swimming and have a little understanding of cooking.
Life motto:"Go straight to the benchmark"