Data Structures and Algorithms | Part One Lesson 5: Practice of Algorithm Complexity

Posted May 27, 202010 min read

Program = data structure + algorithm

Author Xie Enming, the public number " Programmer Union ".
Reprint please indicate the source.
Original: https://www.jianshu.com/p/060...


"Data Structure and Algorithm" full series

brief introduction


1 Introduction
2. Find the largest and smallest elements
3. Find unique elements
4. Find unique elements:another method
5. The first part of the sixth lesson

1 Introduction


After Data Structures and Algorithms | Part One Lesson Three:Algorithm Complexity(Part 1) and Data Structures and Algorithms | Part One Lesson Four:Algorithm Complexity Degree(below) , we have finished the algorithm complexity, it is time to do some practical exercises to consolidate what we have learned.


The complexity of the algorithm is a good knowledge point, but what does it have to do with our algorithm course? Let's take a look.

Algorithmics(Algorithmics) is the science of designing and researching algorithms. Its history is much longer than the history of computer science, but today algorithmics are almost all practiced by computer scientists.

Algorithms is a very broad field and requires a lot of mathematical knowledge. Of course, not all computer scientists need to be genius algorithmists. From an algorithmic point of view, the problem most programmers face is actually very simple.

But sometimes we need to implement something more complicated. In this case, the basic knowledge of the algorithm will be very useful. We do not require you to invent a revolutionary new algorithm and give specific proof of its complexity, but in order to be able to use those algorithms found on the network or in the software library accurately, it is still necessary to receive "basic training" of.

Knowing the algorithm will make you more efficient, better understand the problem you want to solve, and will not write non-standard code:although some code can run normally, it is unreasonable from the perspective of the algorithm . An inexperienced programmer may directly use these unqualified algorithms(he will think:"The code can run, so there should be no problem"), but because you understand the algorithm, you can quickly find code problems and rewrite A much better version.

Listening to this, you may be a little eager to try. The following are two simple research questions on algorithm complexity, they can let you understand the role of algorithm complexity more accurately.

2 . Find the largest and smallest elements


Question 1:
There is a list of positive integers, and we want to find the largest integer in the list.

The classic solution to this problem is as follows:traverse this list and keep the largest element found so far, called it the "current maximum value".

We can describe this algorithm as:

At the beginning, "current maximum" is equal to 0. We use each element in the list to compare with the "current maximum value". If the current traversed element is larger than the "current maximum value", then set the "current maximum value" to the value of the current element. After traversing the entire list, the "current maximum" is really "deserved".

Below we give an implementation of this algorithm, which is implemented using PHP, the "best programming language in the world"(of course, you can also use other programming languages):

<? php
function max($list) {
   $current_max = 0;
   foreach($list as $item)
       if($item> $current_max)
           $current_max = $item;
   return $current_max;
}
?>

We can quickly verify that the algorithm is correct:we only need to confirm that when this algorithm is executed, the "current maximum" is always equal to the largest value in the list elements traversed so far.

We also notice that this algorithm will "end", it will not fall into an infinite loop:this algorithm traverses the entire list, and then stops. This may seem like an unimportant detail, but there are actually some programming languages ​​that can represent an infinite list of elements:in this case, our algorithm is incorrect.

Let us now study the complexity of this algorithm. What operations should we consider? Obviously, most of the work is to compare the current element with the "current maximum"(after all, the initialization of the "current maximum" current \ _max(initialized to 0) does not take up much running time), so we calculate the "comparison operation "As the operand of this algorithm.

What parameters does the execution time of the algorithm depend on? As you can imagine, the execution time does not depend on the value of each element in the list(here, we assume that the comparison time of two integers is constant, regardless of their value). Therefore, we use the length N of the element list to quantify the input.

For a list of N elements, we have to perform N comparisons:each element is compared once with the "current maximum". Therefore, the time complexity of the algorithm is O(N):its execution time is linear and proportional to the number N of elements in the list.

So, what is the space complexity of this algorithm? This algorithm uses a list in which the elements occupy a certain amount of memory space. However, this list already existed before we searched for its largest element, and the memory space it occupied was not allocated by our algorithm, so we said that the number of elements in this list, N, would not be considered due to the complexity of the algorithm. In the calculation of degrees, we only consider the memory directly applied by our algorithm.

The memory space directly applied by our algorithm is almost negligible, because at most it takes up a temporary variable(current \ _max) to store the "current maximum value". Therefore, the memory space occupied by our algorithm does not depend on the length of the list:(We record the space complexity as O(1), indicating that it does not depend on N).

For our algorithm, there is only one small detail left to pay attention to:if our list is empty, then the maximum value returned will be 0. To say "the maximum value of an empty list is 0" is obviously not necessarily correct:in some cases, if the list is empty, it is better to return an error.

So we can improve our algorithm:instead of assigning an initial value of 0 to the "current maximum value", we take the first element of the list(if the list is empty, an error is returned) as the "current maximum value" "Initial value. Then, we start the comparison from the second element.

The improved algorithm performs N-1 comparisons(because we do n’t have to compare the first element with itself). However, this does not change the time complexity of the algorithm:the time difference between N and N-1 does not depend on N, it is constant, so we can ignore it:both algorithms have the same time complexity, they both It is linear in time(time complexity is O(N)).

Finally, we noticed that the second algorithm also works for negative numbers(if all elements of the list are negative, the first algorithm will return 0, which is obviously incorrect). Therefore, the improved second algorithm is more general and better.

Of course, the algorithm for finding the minimum value in the list is similar to that for finding the maximum value, so we wo n’t go into details.

3 . Find unique elements


Now we come to the second question.

Question 2:
There is a list 1, which contains duplicates(elements that appear multiple times):we want to build a list 2 that contains the same elements as list 1, but each element in list 2 appears only once.

For example, the following elements are in Listing 1:

AABCDBCA

Then List 2 will contain the following elements:

ABCD

Do you think of an algorithm to solve this problem? Before reading my solution, please think for yourself.

My solution


My algorithm is as follows:

For a given list L containing repeated elements, we want to construct a new list U(take the first letter of the English Unique("unique")), the list U is initially empty, we need to fill it element.
We traverse list L. For each element in list L, we confirm whether it exists in list U(you can use an algorithm similar to the previous one to find the largest element, after all, compare elements one by one).
If the traversed element in list L is not in list U, add this element to list U; if it already exists in list U, do not add it.
After traversing list L, list U has the same elements as list L, but these elements are not repeated.


Exercise: Use your favorite programming language to implement the above algorithm for extracting unique elements from the list.

the complexity


What is the complexity of this algorithm? If you fully understand the calculation of the complexity of the previous algorithm to find the maximum value of the list, then this should be very simple for you.

For each element in a given list L, we will perform the operation of traversing list U, so the number of operations performed is related to the number of elements contained in list U.

But the problem is:the size of the list U will change during the traversal of the given list L, because we will add elements to the list U. When we traverse to the first element in list L, list U is still empty(so we do not perform any comparison operations); when we traverse to the second element of list L, list U has 1 element, so We have to perform another comparison operation.

But when we traverse to the third element in list L, we become less certain:if the first two elements in list L are not the same, they are added to U, in this case Next we will perform 2 comparison operations(compare the third element in list L with the two elements in list U respectively); if the first two elements are the same, then the second element in list L will be It is not added to the list U, and only one comparison is performed.

As we have already said in our course, the calculation of complexity needs to be considered in the "worst case"(worst case):that is, the complexity when the number of operations performed is the largest. Therefore, we will assume that all elements of a given list L are different.

In the "worst case", we add all the elements of the given list L to the list U one by one. Suppose there are a total of N elements in the given list L. When traversing to the Nth element of the given list L, we have added(N-1) elements to the list U, so we need to do(N-1) Comparison operations.

So the total number of comparison operations we have to do is 0 + 1 + 2 + ... +(N-1). The number of operations at the beginning is small, and the more operations are performed later(a bit like life, the responsibility is relatively small at birth, slowly the responsibility is getting larger and larger, and the things to be handled are more and more, but it also shows that you are in Grow, after all, "the talented person works hard").

Adding the above series of numbers, the total operand obtained is N _(N-1)/2(this is not difficult, it is the sum formula of the arithmetic sequence in mathematics), because we consider when calculating the complexity is When N is large, the above result can be approximately equal to N_N/2, that is, N2/2 operations.

Therefore, our algorithm has O(N2) time complexity(we removed the constant factor 1/2). We can also call O(N2) "quadratic/square" complexity(just as we call O(N) "linear" complexity).

Compared with the previous algorithm for finding the largest element, this algorithm now has a higher spatial complexity in addition to its slower speed(higher time complexity):we constructed a list U that did not originally exist(so apply Memory space).

In the worst case, the list U also has as many elements as the given list L:therefore space will be allocated for N elements, which makes the space complexity O(N). The space complexity of the previous algorithm for finding the largest element is constant(O(1)), but now the space complexity of this algorithm is linear(O(N)).

The algorithm only needs to compare elements, so the operated elements do not have to be integers:we can use the same algorithm to eliminate repeated words in the word list, repeated floating point numbers, and so on. Therefore, many algorithms are independent of the specific type of elements used.

4 . Finding unique elements:another way


There is another algorithm for finding non-repeating elements(smart as you might think of it):we can first sort the elements in the given list L so that all the repeated elements are adjacent, so that the repeated elements are excluded It will become very simple.

For example, the given list L initially looks like this:

AABCDBCA

We can sort the list L before constructing the list U so that it becomes the following:

AAABBCCD

In this way, our algorithm for constructing list U is simple later.

The algorithm is as follows:
Just traverse the sorted list L, and remember the last element traversed. If the current element is the same as the previous element, the element is repeated, so do not include it in the list U of unrepeated elements.

If the repeated elements are not adjacent to each other, the above algorithm is no longer effective. So we must first sort the list.

What is the time complexity of this new algorithm? Elimination of repetition is done in a single traversal of the list, so it is linear(O(N)). But since we must sort the list first, the sorting operation in the first step must also be taken into account in the total complexity of this new algorithm.

Of course, the sorting of the list mentioned here is a little too early, because we will talk about the sorting algorithm in later courses.

Although we have not yet learned the sorting algorithms and their complexity, I still want to talk about the complexity of this new algorithm.

It turns out that the complexity of this algorithm depends on the complexity of sorting:because sorting basically performs N2 operations, which is far more than the N operations when we later build the list U, so the overall complexity is O(N2).

However, there are also higher-end sorting algorithms that, although still performing more than N operations, are much less than N2.

We will learn the sorting algorithm in the following courses. At present, you only need to know that this new algorithm with one more step of sorting is more effective and more "advanced" than the old algorithm.

"Searching for specific elements in the list" and "finding the elements with the largest/smallest value in the list" are very similar algorithms, both of which are linear time(the time complexity of the algorithm is O(N)), and the space complexity is O(1).

The algorithm for eliminating duplicate elements in the list is more complicated, because the simplest algorithm has a square time complexity(O(N2)) in time and a linear complexity(O(N)) in space.

I hope these more specific studies will convince you that algorithmics and algorithm complexity are still useful. Now you should also be accustomed to the basic concepts of "algorithm", "time complexity", "space complexity".

At the beginning of the next lesson, we have to learn the data structure, so that we can combine and integrate the data structure and the algorithm. After all, this pair of "live treasures" are closely related.

5 . The first part of the sixth lecture


Today's class is here, come on!

Next lesson: Data Structure and Algorithm | Lesson 6 of Part One:Arrays and Linked Lists(Part 1)


My name is 谢恩铭 , the public number "[Programmer League]( https://github.com/frogoscar/ProgrammerLeague/blob/master/qrcode . "jpg)" operator, elite lecturer Ms. Oscar , a lifelong learner.
Love life, like swimming and have a little understanding of cooking.
Life motto:"Go straight to the benchmark"