Ape Thinking Series 6: Transactions Are Just Like That

Posted Jun 5, 2020 · 8 min read

After reading the previous chapter, I believe you have fully mastered the routines of dynamic proxies. Ape Man Factory also knows that the content is still quite difficult for novices to digest. The other approaches mentioned in that article you can try to implement yourself; Ape Man Factory will not repeat them one by one. Today we open a new direction and think about some new problems. Note, though, that the previous chapter touched on compilation principles, the class loading mechanism, and a little JVM knowledge. These are very important, so please be sure to master those processes and concepts.

Ape Thinking is an original series of articles to help you quickly build up basic knowledge from scratch. A lot of basic knowledge lies in flexibility of thinking. For more content, follow the WeChat public account Ape Man Factory and click Ape Man Cultivation to get it!

Database transaction: generally speaking, the smallest, indivisible unit of work in database access logic; a group of operations on the database that forms a relatively complete logical unit. A database transaction can be a single SQL statement or a group of SQL statements.

Let's look at an example. Suppose account A transfers 100 yuan to account B: account A must decrease by 100 yuan, and account B must increase by 100 yuan. These two operations cannot be separated; together they form one atomic operation. It must not happen that A decreases without B increasing, or that B increases without A decreasing. The operations must succeed or fail together, and the result must be consistent. The operations on A and B must not be interfered with by other transactions in a way that corrupts the result, so concurrent transactions need to be isolated. And once the operation is performed, the results on accounts A and B must not silently change: the data must be recorded to disk, which is a durable state.

These are the four properties a database transaction must have: atomicity, consistency, isolation, and durability, commonly known as the ACID properties or the four principles.

There are several MySQL statements for operating transactions that you should also know:

Start transaction: start transaction; (explicitly opens a transaction)

Commit transaction: commit; (commits the transaction; after it executes, the data is solidified to disk and cannot be changed)

Rollback transaction: rollback; (rolls back the transaction, undoing the preceding DML operations)
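Putting these statements together, here is a minimal sketch of the transfer example above written as an explicit transaction. The account table and its columns are made up purely for illustration:

-- assumes a hypothetical table: account(account_id, balance)
start transaction;
update account set balance = balance - 100 where account_id = 'A';
update account set balance = balance + 100 where account_id = 'B';
commit;
-- if either update fails, issue rollback instead and neither change takes effect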

In many people's minds, mentioning database transactions means reciting the four ACID principles, but who actually guarantees those four principles? The answer, of course, is the database. Take MySQL: try running transactions on a MyISAM table and see how far you get. So there is a big precondition for database transactions: the database, or rather its storage engine, must support transactions.

If you are using an InnoDB table, autocommit is enabled by default: without a start transaction statement, every insert or update statement is committed automatically as its own transaction. Moreover, a single insert or update statement is itself atomic. It is not possible to write half a row, or to modify only half of an existing row. Understand this first, and you can use it freely in later studies.
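You can check and toggle this behavior yourself; autocommit is a real MySQL session variable, and the table below is the hypothetical account table from the earlier sketch:

select @@autocommit;  -- 1 means every statement commits on its own
set autocommit = 0;   -- changes now accumulate until an explicit commit or rollback
update account set balance = balance - 100 where account_id = 'A';
rollback;             -- with autocommit off, this undoes the update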

Well, the question of who guarantees these properties seems a bit difficult, so let's look at atomicity first: either everything executes, or nothing does. Reasoning about multiple statements is too complicated, so start simple. Suppose we want to set the phone number of the user with ID 1 to 13888888888:

update travel_user set travel_user_phone = '13888888888' where travel_user_id = 1;

Consider failure first: if execution fails, we naturally have to be able to restore the data. So the first thing to do is record the original data; with that original record, any problem during the modification can be undone. The second thing is to perform the modification itself; if it fails, restore the data from the original record. The third thing is to write the modified record to disk; if that fails, the original record is still there, so just restore it. If it succeeds, the operation is complete and the transaction finishes successfully.
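Just to make the record-first idea concrete, here is a toy version using a MySQL user variable. This is purely illustrative; InnoDB keeps the original values internally, not in user variables:

-- remember the original value before touching it
select travel_user_phone into @old_phone from travel_user where travel_user_id = 1;
update travel_user set travel_user_phone = '13888888888' where travel_user_id = 1;
-- if something goes wrong before the change is final, restore from the record
update travel_user set travel_user_phone = @old_phone where travel_user_id = 1;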

Think about it: where do these records need to be kept? Things written down in black and white are reassuring, so naturally they go into files. MySQL provides a pair of guys to do this job: the redo log, which records the forward changes so that a committed modification is never lost, and the undo log, which we will meet in a moment, for restoring the original data.

Of course, MySQL is not so stupid as to hit the disk on every write. The redo log is written to memory first, since memory operations are fast, and once the memory write is done, a background thread asynchronously flushes the data to disk. Of course, if data sits in memory too long and the machine room loses power, that data is gone. MySQL leaves this risk to the user to configure, letting you decide how eagerly the log is flushed to disk (in InnoDB this is the innodb_flush_log_at_trx_commit setting). And just as the forward operation is recorded, the reverse operation must be recorded too: for example, when an insert is recorded, a corresponding delete must be recorded for data recovery. That log is called the undo log.
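If you want to see or tune that durability trade-off, the knob is a real InnoDB variable:

show variables like 'innodb_flush_log_at_trx_commit';
-- 1 (the default): flush the redo log to disk at every commit, the safest setting
-- 0 or 2: flush roughly once per second, faster but the last second can be lost on a crash
set global innodb_flush_log_at_trx_commit = 1;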

Of course, there are other concerns, such as data synchronization within a cluster. For that, MySQL also provides something called the binlog, which records the logical operations on the data, that is, the executed SQL statements. This way, when you modify data on the master node, the master only needs to send the binlog to the other nodes for them to replay.
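A few real MySQL commands let you peek at the binlog setup on a server:

show variables like 'log_bin';        -- is binary logging enabled?
show variables like 'binlog_format';  -- STATEMENT, ROW, or MIXED
show binary logs;                     -- list the binlog files on this server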

Then a new problem appears: the content of the redo log must stay consistent with the binlog, or everything gets messy. How do we solve that?

This is easy to handle. Write the redo log first; if that write fails, we consider the transaction broken, roll it back, and never write the binlog. If the redo log is written successfully but the binlog write fails halfway, the transaction is rolled back and the invalid binlog is discarded (leaving it around without synchronizing it would cause trouble). If both the redo log and the binlog are written successfully, the transaction is complete, and a mark is added to indicate success. This process has a proper name: "two-phase commit".

prepare: the redo log is flushed to disk, and the transaction enters the prepare state.

commit: the binlog is flushed to disk, the transaction enters the commit state, and the record is marked with the transaction ID (log_xid). On crash recovery, a prepared transaction whose XID is found complete in the binlog is committed; otherwise it is rolled back.

Before discussing isolation, let's look at a few questions. What can go wrong when two transactions execute concurrently?

Lost update: one transaction overwrites data written by another transaction.

Dirty read: transaction A reads data d1; transaction B writes data but has not committed; A queries again and reads the uncommitted data.

Virtual read: transaction A reads d1; transaction B modifies d1 and commits; A reads d1 again and finds the two results inconsistent.

Phantom read: transaction A runs a query; transaction B inserts data d2 and commits; A runs another query (the statement can differ) and finds d2 in the result.

Both virtual reads and phantom reads are grouped under "non-repeatable reads".
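To see a dirty read with your own eyes, run two MySQL sessions side by side. A minimal sketch, reusing the travel_user table from earlier:

-- session 1:
set session transaction isolation level read uncommitted;
start transaction;
-- session 2:
start transaction;
update travel_user set travel_user_phone = '13999999999' where travel_user_id = 1;  -- not committed
-- session 1:
select travel_user_phone from travel_user where travel_user_id = 1;  -- sees 13999999999: a dirty read
-- session 2:
rollback;  -- session 1 has read data that never officially existed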

In order to solve these problems, the database defines four isolation levels:

Read uncommitted: transaction B can read data that transaction A has not yet committed; dirty reads are allowed, but lost updates are prevented.

Read committed: transaction B can only read data that transaction A has committed; dirty reads are not allowed, but non-repeatable reads can still happen.

Repeatable read: data committed by transaction A after transaction B has started is not visible to B, so B can read the same record repeatedly and get the same result. Virtual reads are not allowed, but phantom reads still are. This is MySQL's default level.

Serializable: transactions queue up and execute one after another; a transaction can only run after the previous one finishes. With transactions executing in a queue, you can imagine the performance; it is almost never used.
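Checking and switching the level is straightforward; the variable name below assumes MySQL 8.0 (older versions call it tx_isolation):

select @@transaction_isolation;  -- REPEATABLE-READ by default
set session transaction isolation level read committed;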

So how does MySQL guarantee these transaction isolation levels?

In fact, MySQL's design is rather sly. Every time a transaction is opened it is assigned a number, and every row of data carries three hidden columns: DB_TRX_ID records the ID of the last transaction that touched the row, DB_ROLL_PTR points to the undo log record for the row, and DB_ROW_ID is a row ID used for newly inserted data.

With these in place, the insert, update, delete, and select operations work as follows:

select:

Read only rows whose creation version (DB_TRX_ID) is less than or equal to the current transaction's version number, and whose deletion version is either empty or greater than the current transaction's version number. This ensures the record already existed when the transaction started and had not been deleted from its point of view.

insert:

Save the current transaction's version number into the row's DB_TRX_ID (its creation version).

update:

Insert a new row version whose creation version (DB_TRX_ID) is the current transaction's version number, and mark the original row version as deleted by writing the current transaction's version number into its deletion version.

delete:

Save the current transaction's version number into the row's deletion version.

MySQL keeps uncommitted transactions in a transaction list called trx_sys. It also maintains a thing called a ReadView, which contains several important pieces of information: ReadView{low_trx_id, up_trx_id, trx_ids}, where low_trx_id is the largest transaction ID in the transaction list, up_trx_id is the smallest transaction ID in the list, and trx_ids is the set of IDs of all transactions in the list. When is a ReadView created? That depends on the isolation level. Under READ COMMITTED, a new ReadView is generated every time data is read; under REPEATABLE READ, a ReadView is generated only the first time data is read.
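The live transaction list can actually be inspected through a real information_schema table, which is a handy way to see transaction IDs and states for yourself:

select trx_id, trx_state, trx_isolation_level
from information_schema.INNODB_TRX;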

With that, judging whether a row version is visible to a transaction becomes easy:

For rows whose DB_TRX_ID is less than up_trx_id, the transaction that modified the row committed before the current transaction started, so the change is visible to the current transaction. For rows whose DB_TRX_ID is greater than low_trx_id, the modifying transaction started after the current transaction's ReadView was created, so the change is not visible to the current transaction.

As for transactions whose IDs fall between up_trx_id and low_trx_id, visibility depends on the isolation level. Under READ COMMITTED, data committed by other transactions during the current transaction's execution is visible to it. Under REPEATABLE READ, all changes made by transactions that were already running when the transaction started remain invisible; at that level the ReadView is fixed at its first creation, so low_trx_id and up_trx_id effectively keep the values they had then.
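You can watch this snapshot behavior directly, again with two sessions and the travel_user table, under MySQL's default REPEATABLE READ level:

-- session A:
start transaction;
select travel_user_phone from travel_user where travel_user_id = 1;  -- say it returns the old number
-- session B (autocommit on):
update travel_user set travel_user_phone = '13777777777' where travel_user_id = 1;
-- session A:
select travel_user_phone from travel_user where travel_user_id = 1;  -- still the old number: same ReadView
commit;
select travel_user_phone from travel_user where travel_user_id = 1;  -- now the new value is visible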