Issue 05: Can foreign keys be used?

Posted May 27, 20209 min read

SQL tuning header.png

This article mainly describes table attributes-foreign keys.

The design intent of the foreign key is to ensure the consistency and integrity of the operation of the logically related table data on the database side.

Foreign keys will be directly avoided in the development specifications written by most companies! ** Foreign keys have advantages and disadvantages, and it does not mean that every scenario is not applicable, there is no need to have one size fits all. Can foreign keys work? The following will tell you the answer for different scenarios.

First, the advantages and disadvantages of foreign keys

Advantages:

  • Streamline related data and reduce data redundancy

    Avoid extra operation and maintenance operations for a large number of redundant processing later.

  • Reduce the complexity of application code and reduce additional exception handling

    Related data management is handled by the database side.

  • Increase the readability of the document

    Especially at the beginning of the table design, when drawing the ER diagram, the logic is simple and clear, and the readability is very strong.

Disadvantages:

  • Performance pressure

    Foreign keys generally have cascading functions, cascading updates, cascading deletes, etc. Massive data scenarios cause a lot of performance pressure. For example, if you insert a new record, if the table into which the record is inserted has 10 foreign keys, then it is necessary to check whether the inserted records are reasonable for the 10 related tables one by one, which delays the normal insertion of the record time. And the update of the parent table will be associated with the child table plus related locks.

  • Poor flexibility of other functions

    For example, the update of the table structure.

Second, the use of foreign keys

Foreign key reference action list:

  • CASCADE:Cascading, the child table follows the parent table to update the foreign key value
  • SET NULL:The child table is updated with the main table. The foreign key value is NULL
  • RESTRICT/NO ACTION:default, restrict parent table to change foreign key value
  • SET DEFAULT:The current effect is the same as RESTRICT.

Let's first briefly look at the usage of foreign keys in MySQL. MySQL foreign keys are only supported by two engines, InnoDB and NDB. Here we only focus on InnoDB.

This example MySQL version is the latest version 8.0.19

Example

Below f1 is the parent table, and f2, f3, and f6 respectively represent different types of foreign key tables, that is, child tables.

-Reference to the base table, which is the parent table
mysql-(ytt_fk/3305)-> create table f1(id int primary key,
    r1 int, r2 int, r3 int, key idx_r1(r1), key idx_u1(r2, r3));
Query OK, 0 rows affected(0.02 sec)

-With the reference table cascading to update the foreign key table, that is, if the parent table is updated, it will cascade update the foreign key of the child table
mysql-(ytt_fk/3305)-> create table f2(id int primary key,
    f1_r1 int, mark int, constraint f1_fk_r1 foreign key(f1_r1) references f1(r1) on update cascade);
Query OK, 0 rows affected(0.02 sec)


-As the reference table updates the foreign key value is NULL, that is, if the parent table is updated, the foreign key of the child table will be cascaded to be NULL
mysql-(ytt_fk/3305)-> create table f3(id int primary key,
    f1_id int, foreign key(f1_id) references f1(id) on update set null);
Query OK, 0 rows affected(0.02 sec)

-Multiple key-value foreign keys. The child table can refer to other keys of the parent table that are not primary keys
mysql-(ytt_fk/3305)-> create table f6(id int auto_increment primary key,
    f1_r2 int, f1_r3 int, foreign key(f1_r2, f1_r3) references f1(r2, r3));
Query OK, 0 rows affected(0.02 sec)
Scenario 1:A scenario that strongly requires data consistency, program weakening, database-side enhancement, small changes in table structure, and low concurrency.

Use one record to verify tables f2 and f6. From a functional point of view, the advantages of foreign keys are obvious, and data integrity verification is fully satisfied on the database side.

mysql-(ytt_fk/3305)-> insert into f1 values  (1,10,100,1000);
Query OK, 1 row affected(0.00 sec)

mysql-(ytt_fk/3305)-> insert into f2 values  (1,1);
Query OK, 1 row affected(0.01 sec)

mysql-(ytt_fk/3305)-> insert into f6 values  (1,100,1000);
Query OK, 1 row affected(0.00 sec)

-Update reference table f1
mysql-(ytt_fk/3305)-> update f1 set id = 2 where id = 1;
Query OK, 1 row affected(0.01 sec)
Rows matched:1 Changed:1 Warnings:0

-f2 also successfully cascaded update
mysql-(ytt_fk/3305)-> select * from f2;
+ ---- + ------- +
| id | f1_id |
+ ---- + ------- +
| 1 | 2 |
+ ---- + ------- +
1 row in set(0.00 sec)

-The reference table r2 field is not allowed to update because table f6 has a foreign key constraint on field r2.
mysql-(ytt_fk/3305)-> update f1 set r2 = 11;
ERROR 1451(23000):Cannot delete or update a parent row:a foreign key constraint fails(`ytt_fk`.`f6`, CONSTRAINT` f6_ibfk_1` FOREIGN KEY(`f1_r2`,` f1_r3`) REFERENCES `f1`(` r2 `,` r3`))
Scenario 2:Frequent data loading, but the database is also strictly required to ensure data consistency.

Here, only the table f6 is verified, and a new table f6 \ _no \ _fk is cloned at the same time. The table structure is the same as f6 except that there are no foreign keys. Import 400W sample data.

-Import f6, there are foreign keys, the time is more than 32 seconds.
mysql-(ytt_fk/3305)-> load data infile '/var/lib/mysql-files/f1_sub.dat' into table f6;
Query OK, 4000000 rows affected(32.57 sec)
Records:4000000 Deleted:0 Skipped:0 Warnings:0

-Import f6_no_fk, no foreign key, more than 25 seconds.
mysql-(ytt_fk/3305)-> load data infile '/var/lib/mysql-files/f1_sub.dat' into table f6_no_fk;
Query OK, 4000000 rows affected(25.95 sec)
Records:4000000 Deleted:0 Skipped:0 Warnings:0

As you can see from the above, a separate test imports 400W records, and tables with foreign keys have no advantage over non-foreign key tables. Under the optimization for the above scenario, close the foreign key to check the parameters, and then open it after the import is complete.

mysql-(ytt_fk/3305)-> truncate f6;
Query OK, 0 rows affected(0.04 sec)

-Turn off foreign key checking.
mysql-(ytt_fk/3305)-> set foreign_key_checks = 0;
Query OK, 0 rows affected(0.00 sec)

-Re-import, more than 28 seconds.
mysql-(ytt_fk/3305)-> load data infile '/var/lib/mysql-files/f1_sub.dat' into table f6;
Query OK, 4000000 rows affected(28.42 sec)
Records:4000000 Deleted:0 Skipped:0 Warnings:0

-Turn on foreign key checking.
mysql-(ytt_fk/3305)-> set foreign_key_checks = 1;
Query OK, 0 rows affected(0.00 sec)

It can be seen from the above results that after the foreign key check is turned off, the import time is almost the same as the table f6 \ _no \ _fk without foreign keys.

Scenario 3:Less concurrency, simple things.

Next, let's look at the simple way of submitting things. I simply wrote a stored procedure that commits every 500 records.

DELIMITER $$
CREATE DEFINER = `ytt` @` 127.0.0.1` PROCEDURE `sp_generate_data`(IN` tb_name` VARCHAR(64), IN `f_number` INT)
begin
declare i int default 0;
set @@ autocommit = 0;
while i <f_number DO

  set @stmt = concat("insert into", tb_name, "(f1_r2, f1_r3) values  (ceil(rand() * 10), ceil(rand() * 10))");
  prepare s1 from @stmt;
  execute s1;
  set i = i + 1;
  if mod(i, 500) = 0 THEN
    commit;
  end if;
end while;
drop prepare s1;
commit;
set @@ autocommit = 1;
end $$
DELIMITER;

Next insert 100W records,

-The total time to write the foreign key table is 1 minute 14 seconds
mysql> call sp_generate_data('f6', 1000000);
Query OK, 0 rows affected(1 min 14.14 sec)

-Non-foreign key table write time is 1 minute and 8 seconds
mysql> call sp_generate_data('f6_no_fk', 1000000);
Query OK, 0 rows affected(1 min 8.45 sec)

-Turn off foreign key checking
mysql> set foreign_key_checks = 0;
Query OK, 0 rows affected(0.00 sec)

-Time is 1 minute 4 seconds
mysql> call sp_generate_data('f6', 1000000);
Query OK, 0 rows affected(1 min 4.28 sec)

mysql> set foreign_key_checks = 1;
Query OK, 0 rows affected(0.00 sec)

From the test results, the retrieval time with and without foreign keys is almost the same in this scenario.

Scenario 4:The type of the foreign key reference field of the main table needs to be expanded, the original data overflows, and it is impossible to save a larger value.

For example, at this time, the data type defined by the field r2 is not suitable, and it needs to be changed to a larger one.

mysql-(ytt_fk/3305)-> alter table f1 change r2 r2 bigint;
ERROR 3780(HY000):Referencing column 'f1_r2' and referenced column 'r2' in foreign key constraint 'f6_ibfk_1' are incompatible.

mysql-(ytt_fk/3305)-> alter table f6 change f1_r2 f1_r2 bigint;
ERROR 3780(HY000):Referencing column 'f1_r2' and referenced column 'r2' in foreign key constraint 'f6_ibfk_1' are incompatible.

How to change it? Need to delete the foreign key first, modify the type, and add constraints. This scenario is not suitable for foreign keys.

mysql-(ytt_fk/3305)-> alter table f6 drop constraint f6_ibfk_1;
Query OK, 0 rows affected(0.00 sec)
Records:0 Duplicates:0 Warnings:0

mysql-(ytt_fk/3305)-> alter table f6 change f1_r2 f1_r2 bigint;
Query OK, 0 rows affected(0.04 sec)
Records:0 Duplicates:0 Warnings:0

mysql-(ytt_fk/3305)-> alter table f1 change r2 r2 bigint;
Query OK, 100000 rows affected(0.73 sec)
Records:100000 Duplicates:0 Warnings:0

mysql-(ytt_fk/3305)-> alter table f6 add foreign key(f1_r2, f1_r3) references f1(r2, r3);
Query OK, 0 rows affected(0.03 sec)
Records:0 Duplicates:0 Warnings:0
Scene 5:The child table has a trigger requirement to update the necessary fields.

Regarding this point, the trigger of the child table will not be cascaded with the update of the parent table, that is, the trigger is invalid at this time. For example, add a before update trigger to f2.

-Pre-update trigger
CREATE TRIGGER `tr_af_update` BEFORE UPDATE ON` f2`
 FOR EACH ROW set new.mark = new.f1_r1;

mysql-(ytt_fk/3305)-> insert into f2 values  (1,10,5);
Query OK, 1 row affected(0.00 sec)

mysql-(ytt_fk/3305)-> select * from f2;
+ ---- + ------- + ------ +
| id | f1_r1 | mark |
+ ---- + ------- + ------ +
| 1 | 10 | 5 |
+ ---- + ------- + ------ +
1 row in set(0.00 sec)

-Update the parent table,

mysql-(ytt_fk/3305)-> update f1 set r1 = 2 where r1 = 10;
Query OK, 5133 rows affected(0.15 sec)
Rows matched:5133 Changed:5133 Warnings:0

-The cascade corresponding to sub-table f2 has been changed, but the trigger action is not executed.
mysql-(ytt_fk/3305)-> select * from f2;
+ ---- + ------- + ------ +
| id | f1_r1 | mark |
+ ---- + ------- + ------ +
| 1 | 2 | 5 |
+ ---- + ------- + ------ +
1 row in set(0.00 sec)

-Normal operation should be like this
mysql-(ytt_fk/3305)-> update f2 set id = 2;
Query OK, 1 row affected(0.00 sec)
Rows matched:1 Changed:1 Warnings:0

-The clone corresponding to the mark field becomes the value of the f1_r1 field.
mysql-(ytt_fk/3305)-> select * from f2;
+ ---- + ------- + ------ +
| id | f1_r1 | mark |
+ ---- + ------- + ------ +
| 2 | 2 | 2 |
+ ---- + ------- + ------ +
1 row in set(0.00 sec)
Scene 6:The parent table is a partitioned table, and there is a need for foreign keys.

For the partition table, the foreign key of the child table with the partition table as the parent table is temporarily not supported.

mysql-(ytt_fk/3305)-> create table f1_partition like f1;
Query OK, 0 rows affected(0.02 sec)

mysql-(ytt_fk/3305)-> alter table f1_partition partition by key() partitions 4;
Query OK, 0 rows affected(0.10 sec)
Records:0 Duplicates:0 Warnings:0

mysql-(ytt_fk/3305)-> create table f7(id int primary key,
    f1_partition_id int, foreign key(f1_partition_id) references f1_partition(id));
ERROR 1506(HY000):Foreign keys are not yet supported in conjunction with partitioning

For a simple example, look at the situation where the foreign table is updated, the parent table is updated, and the child table is cascaded.

-SESSION 1
mysql-(ytt_fk/3305)-> begin;
Query OK, 0 rows affected(0.00 sec)

mysql-(ytt_fk/3305)-> update f1 set r2 = 101 where r2 = 100;
Query OK, 1 row affected(0.00 sec)
Rows matched:1 Changed:1 Warnings:0


mysql-(ytt_fk/3305)-> select sys.ps_thread_id(connection_id()) as cid;
+ ------ +
| cid |
+ ------ +
| 47 |
+ ------ +
1 row in set(0.00 sec)

There are 11 locks in total, and the Update is simply executed, and only one row is updated.

-SESSION 2
mysql-((none)/3305)-> select count(*) from performance_schema.data_locks where thread_id = 47;
+ ---------- +
| count(*) |
+ ---------- +
| 11 |
+ ---------- +
1 row in set(0.00 sec)

Looking at the refinement of the lock, the parent has f1 with 5 locks and the child table f6 has 6 locks.

This is mandatory for MySQL to ensure data consistency. This is definitely not suitable for scenarios with high TPS requirements.

mysql-((none)/3305)-> select object_name, lock_type, lock_mode, lock_status, lock_data from performance_schema.data_locks where thread_id = 47 order by object_name;
+ ------------- + ----------- + --------------- + ------- ------ + ------------------------ +
| object_name | lock_type | lock_mode | lock_status | lock_data |
+ ------------- + ----------- + --------------- + ------- ------ + ------------------------ +
| f1 | TABLE | IX | GRANTED | NULL |
| f1 | RECORD | X | GRANTED | supremum pseudo-record |
| f1 | RECORD | X | GRANTED | 100, 100, 1 |
| f1 | RECORD | X, REC_NOT_GAP | GRANTED | 1 |
| f1 | RECORD | X, GAP | GRANTED | 101, 100, 1 |
| f6 | TABLE | IS | GRANTED | NULL |
| f6 | RECORD | S, REC_NOT_GAP | GRANTED | 100, 100, 12 |
| f6 | TABLE | IX | GRANTED | NULL |
| f6 | RECORD | X, REC_NOT_GAP | GRANTED | 12 |
| f6 | RECORD | X, REC_NOT_GAP | GRANTED | 101, 100, 12 |
| f6 | RECORD | S, GAP | GRANTED | 101, 100, 12 |
+ ------------- + ----------- + --------------- + ------- ------ + ------------------------ +
11 rows in set(0.00 sec)

Three, foreign key restrictions:

  1. Only supported by InnoDB and NDB engines.
  2. Virtual columns are not supported.
  3. Temporary tables are not supported.
  4. The data types, character sets, and proofreading rules of foreign key columns and reference columns must be consistent.
  5. Foreign key columns and reference columns must be indexed.
  6. If the foreign key refers to multiple columns, the column order must be consistent.
  7. Large object fields cannot be used as reference columns.
  8. The constraint name must be unique within a single database.
  9. The foreign key cascade update operation will not trigger the trigger on the child table.
  10. Partition tables are not supported.

to sum up

This article mainly demonstrates whether and how foreign keys should be used from a few examples, so that you can understand the detailed requirements of foreign keys.

From the scenarios I described above, scenarios 1, 2 and 3 are very suitable for using foreign keys; scenarios 4, 5, 6, and 7 are not suitable for using foreign keys; you can implement the foreign key function outside the database .

Yang Taotao from the media.png