Tablespace fragmentation generally has two origins:
- File system fragmentation: the data file is physically spread across many non-contiguous locations on disk.
- Internal fragmentation: the data and index pages contain "holes"; this happens when rows are deleted or updated, especially at random.
Both hurt performance: the data takes more space on disk and in memory than it should, and the disk is busier than it needs to be.
File system fragmentation can be detected with the filefrag command on Linux (similar tools exist on other operating systems). With MyISAM, the MYI index files are usually much more fragmented on the file system than the MYD data files:
```
ls -al frag
total 883304
drwx------  2 aadant common      4096 Sep 30 18:41 .
drwxr-xr-x 17 aadant common      4096 Sep 30 18:59 ..
-rw-rw----  1 aadant common        65 Sep 30 18:40 db.opt
-rw-rw----  1 aadant common      8608 Sep 30 18:41 t.frm
-rw-rw----  1 aadant common 551944192 Sep 30 19:41 t.MYD
-rw-rw----  1 aadant common 426150912 Sep 30 19:41 t.MYI

filefrag frag/t.MYD
frag/t.MYD: 23 extents found
filefrag frag/t.MYI
frag/t.MYI: 4949 extents found
```
There is no general formula to measure internal InnoDB fragmentation. You can compute the theoretical size and compare it to the actual size on disk, but that only works for fixed-length data types. I have this trick, mentioned in 50 Tips for Boosting MySQL Performance [CON2655]: compare the fragmented table to a much smaller table that has the same table definition and contains sample data. It works well even for variable-length rows, as long as the average row length can be estimated from the sample data.
```sql
create table t_defrag like t;
insert into t_defrag select * from t limit 20000;
```
Here, 20000 rows (15-20 MB on disk) give a good idea. Inserting the sample data into the empty table effectively defragments the sample, since only non-empty rows are copied.
The table is likely internally fragmented if

Avg_row_length(t) > Avg_row_length(t_defrag)

where Avg_row_length comes from SHOW TABLE STATUS.
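The comparison can also be done in a single query against information_schema instead of reading two SHOW TABLE STATUS outputs. A minimal sketch, assuming the schema and table names of the example below:

```sql
-- Sketch: compare the two Avg_row_length values in one query.
-- Schema and table names are those of the test case that follows.
SELECT table_name, avg_row_length
FROM information_schema.tables
WHERE table_schema = 'test_case2'
  AND table_name IN ('t1', 't1_defrag');
```

Run ANALYZE TABLE on both tables first, since these statistics are estimates.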
Example: some random data is created with uniformly distributed random row lengths: 1M rows, 640 MB. Internal fragmentation is then introduced by deleting the even-id rows (one row out of two).
```sql
drop database test_case2;
create database test_case2;
use test_case2;
set global query_cache_size = 0;
set global innodb_flush_log_at_trx_commit = 2;
drop table if exists t1;
create table t1(
  id int primary key auto_increment,
  k varchar(50),
  value varchar(1000),
  unique key(k)) engine=InnoDB;
set @t1 =(select now());
set @i := 1;
insert into t1(k,value) values(@i:=@i+1,repeat('a',rand()*1000));
insert into t1(k,value) values(@i:=@i+1,repeat('a',rand()*1000));
replace into t1(k,value)
  select @i:=@i+1, repeat('a',rand()*1000)
  from t1 t1, t1 t2, t1 t3, t1 t4, t1 t5,
       t1 t6, t1 t7, t1 t8, t1 t9, t1 t10,
       t1 t11, t1 t12, t1 t13, t1 t14, t1 t15,
       t1 t16, t1 t17, t1 t18, t1 t19, t1 t20;
set @t2 =(select now());
select @@version,timediff(@t2,@t1) duration;
analyze table t1\G
show table status like 't1'\G
delete from t1 where mod(id,2) = 0;
show table status like 't1'\G
drop table if exists t1_defrag;
create table t1_defrag like t1;
insert into t1_defrag select * from t1 limit 20000;
analyze table t1\G
analyze table t1_defrag\G
show table status like 't1'\G
show table status like 't1_defrag'\G
```
```
mysql> show table status like 't1'\G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 476676
 Avg_row_length: 1293
    Data_length: 616562688
Max_data_length: 0
   Index_length: 37257216
      Data_free: 5242880
 Auto_increment: 1114098
    Create_time: 2013-09-30 15:19:01
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.01 sec)

mysql> show table status like 't1_defrag'\G
*************************** 1. row ***************************
           Name: t1_defrag
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 19773
 Avg_row_length: 610
    Data_length: 12075008
Max_data_length: 0
   Index_length: 1589248
      Data_free: 4194304
 Auto_increment: 40000
    Create_time: 2013-09-30 15:23:38
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.01 sec)
```
```
ls -al data/test_case2/t1*
-rw-rw---- 1 aadant common      8612 Sep 30 20:39 data/test_case2/t1_defrag.frm
-rw-rw---- 1 aadant common  22020096 Sep 30 20:39 data/test_case2/t1_defrag.ibd
-rw-rw---- 1 aadant common      8612 Sep 30 20:40 data/test_case2/t1.frm
-rw-rw---- 1 aadant common 671088640 Sep 30 20:44 data/test_case2/t1.ibd
```
As you can see, table t1 is fragmented: its average row length (1293) is about twice that of t1_defrag (610).
Fortunately, there is a command that fixes both file system and internal fragmentation for most storage engines:
```sql
optimize table t1;
```
For InnoDB, OPTIMIZE TABLE rebuilds the table and then analyzes it, so this command also works:
```sql
alter table t1 engine=InnoDB;
```
```
mysql> alter table t1 engine=InnoDB;
Query OK, 524289 rows affected (5 min 58.99 sec)
Records: 524289  Duplicates: 0  Warnings: 0

mysql> show table status like 't1'\G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 500516
 Avg_row_length: 611
    Data_length: 305971200
Max_data_length: 0
   Index_length: 18399232
      Data_free: 5242880
 Auto_increment: 1048578
    Create_time: 2013-09-30 19:53:47
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.26 sec)
```
After the table rebuild, the average row length is back to normal and the table size on disk is minimal: here it is 2 times smaller, 324 MB.
```
ls -al data/test_case2/t1*
-rw-rw---- 1 aadant common      8612 Sep 30 20:35 data/test_case2/t1.frm
-rw-rw---- 1 aadant common 339738624 Sep 30 20:37 data/test_case2/t1.ibd
```
Note that OPTIMIZE TABLE and ALTER TABLE ... ENGINE=InnoDB take an exclusive lock on the table.
Update: as of MySQL 5.6.17, OPTIMIZE TABLE and ALTER TABLE ... ENGINE=InnoDB are online operations. See this blog and the MySQL manual for more information.
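On 5.6.17 and later you can also request the online behaviour explicitly. A sketch: the ALGORITHM and LOCK clauses make the statement fail immediately if the online path cannot be used, instead of silently falling back to a blocking rebuild.

```sql
-- Explicitly request an online, in-place rebuild (MySQL 5.6.17+).
-- If InnoDB cannot honour LOCK=NONE, the statement returns an error
-- rather than taking an exclusive lock behind your back.
ALTER TABLE t1 ENGINE=InnoDB, ALGORITHM=INPLACE, LOCK=NONE;
```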
This means that before MySQL 5.6.17, reads, writes, and other DDL from your application were blocked during the table rebuild. So most of the time these commands were run in maintenance windows, or with special tricks such as pt-online-schema-change or Online Schema Change at Facebook.
In MySQL 5.6, this is no longer required in most cases thanks to InnoDB online DDL, a major 5.6 feature. Even though OPTIMIZE TABLE is not an online operation, in practice you can use this online DDL instead:
```sql
alter table t1 row_format = <row format>;
```
where <row format> is given by SHOW TABLE STATUS. Usually:
```sql
alter table t1 row_format = Compact;
```
Make sure that:
- you have enough disk space, otherwise you may run into Bug #68895 (Various assertions and crashes when running out of space);
- innodb_online_alter_log_max_size is large enough.

Also check innodb_sort_buffer_size, which can speed up online DDL.
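Both variables can be checked from the client, and the first one adjusted without a restart. A minimal sketch (the value shown is only an example):

```sql
-- Check the settings relevant to the online rebuild.
SHOW GLOBAL VARIABLES LIKE 'innodb_online_alter_log_max_size';
SHOW GLOBAL VARIABLES LIKE 'innodb_sort_buffer_size';

-- innodb_online_alter_log_max_size is dynamic and can be raised online
-- (1 GB here is an example value, not a recommendation):
SET GLOBAL innodb_online_alter_log_max_size = 1024 * 1024 * 1024;

-- innodb_sort_buffer_size is read-only: set it in my.cnf before startup.
```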
```
mysql> alter table t1 row_format=Compact;
Query OK, 0 rows affected (5 min 26.53 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show table status like 't1'\G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 502312
 Avg_row_length: 613
    Data_length: 308068352
Max_data_length: 0
   Index_length: 9977856
      Data_free: 0
 Auto_increment: 1114098
    Create_time: 2013-09-30 20:53:41
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: row_format=COMPACT
        Comment:
1 row in set (0.02 sec)
```
The disk usage, 320 MB, is even slightly smaller than after the classical rebuild, and the operation is also slightly faster.
```
ls -al data/test_case2/t1*
-rw-rw---- 1 aadant common      8612 Sep 30 20:51 data/test_case2/t1.frm
-rw-rw---- 1 aadant common 318767104 Sep 30 20:53 data/test_case2/t1.ibd
```
With innodb_sort_buffer_size = 32M, the alter is faster:
```
mysql> alter table t1 row_format=Compact;
Query OK, 0 rows affected (2 min 51.96 sec)
Records: 0  Duplicates: 0  Warnings: 0
```
If the InnoDB table has secondary indexes and fast index creation is not used during the rebuild, they will still be fragmented afterwards, unless the secondary key order matches the PK order, which is unlikely.
This is still open: fast index create is not used during "alter table foo engine=innodb":
http://bugs.mysql.com/bug.php?id=57583
And two feature requests are open to get mysqldump output to use fast index create on reload:
http://bugs.mysql.com/bug.php?id=49120
http://bugs.mysql.com/bug.php?id=64248
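Until Bug #57583 is fixed, one possible workaround for a fragmented secondary index is to rebuild just that index: dropping and re-adding an index goes through fast index creation. A sketch, using the unique key k from the example table t1 (check first that no running queries depend on the index while it is gone):

```sql
-- Rebuild only the secondary index via fast index creation.
ALTER TABLE t1 DROP KEY k, ADD UNIQUE KEY k (k);
```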
Hi Mark,
I commented here: http://bugs.mysql.com/bug.php?id=57583
alter table t1 row_format = Compact; does the trick in 5.6. It is an online DDL workaround and it leads to the minimal table size.
About the two mysqldump bugs: yes, this is a known issue, and I have asked whether we can do something about it.