Poor man’s Online Optimize in 5.6

Table space fragmentation has generally 2 origins :

  • File System fragmentation : the data file is spread physically on many non contiguous locations on the disk.
  • Internal Fragmentation : the data and index pages have “holes” : this happens when  rows are deleted or updated, especially at random.

As a result, performance is affected by table space fragmentation. Data typically takes more space on disk and in memory. The disk is more busy than it should.

File System fragmentation can be detected using the filefrag command on Linux (and similar on different OS). When using MyISAM, MYI files are usually very fragmented on the FS, much more than the MYD files.

To measure internal InnoDB fragmentation, there is no general formula. It is possible to compute the theoretical size and compare it to the actual size on disk. It only works on fixed length data types. I have this trick : it is mentioned in 50 Tips for Boosting MySQL Performance [CON2655]. The idea is to compare the fragmented table and a much smaller table having the same table definition and sample data.
It works well when the average row length can be estimated from the sample data even for variable length rows.

Here 20000 rows 15-20 Mb on disk gives a good idea. Inserting the sample data the same empty table actually defragments the sample, since it contains non empty rows.

The table is likely internally fragmented

if Avg_row_length(t) > Avg_row_length(t_defrag)

where Avg_row_length comes from show table status.

Example : some random data is created, uniformly distributed random row length : 1M rows, 640Mb. Internal fragmentation is created by deleting the even rows (one every two rows).

As you can see, the table t1 is fragmented since its average row length is 2 times larger than t1_defrag.

Fortunately, there is a command that fixes both FS and internal fragmentation on most storage engines :

For InnoDB, optimize does rebuild the table and analyze it. So this command also works :

After the table rebuild, the average row length is back to normal and the table size is minimal on disk, here 2 times smaller, 324Mb.

Note that the optimize and alter table engine=InnoDB commands take an exclusive lock on the table.

Update : from MySQL 5.6.17, optimize and alter table engine=InnoDB are online operations. See this blog and the MySQL manual for more information.

http://mysqlserverteam.com/mysql-5-6-17-improved-online-optimize-table-for-innodb-and-partitioned-innodb-tables/

It means that if you use these commands before MySQL 5.6.17, your application : reads, writes, other DDL will be blocked during the table rebuild. So most of the time these commands were run in maintenance windows. Or using special tricks such as pt-online-schema-change or Online Schema Change at Facebook.

In MySQL 5.6, this is no longer required in most cases thanks to InnoDB online DDL. Online DDL is a major 5.6 feature. Even though optimize table is not an online operation, in practice, you can use this online DDL :

where <row_format> is given by show table status. Usually :

Make sure :

Check also : innodb_sort_buffer_size that can speed up online DDL.

The disk usage is even slightly smaller, 320Mb, than the classical rebuild and it is also slightly faster.

With innodb_sort_buffer_size= 32M, the alter is faster :

 

This entry was posted in online DDL, tip. Bookmark the permalink.

2 Responses to Poor man’s Online Optimize in 5.6

  1. If the InnoDB table has secondary indexes and fast-index create is not used during the rebuild then they will still be fragmented after the rebuild unless the secondary key order matches the PK order — which is unlikely.

    This is still open – fast index create not used during “alter table foo engine=innodb
    http://bugs.mysql.com/bug.php?id=57583

    And two feature requests are open to get mysqldump output to use fast index create on reload:
    http://bugs.mysql.com/bug.php?id=49120
    http://bugs.mysql.com/bug.php?id=64248

  2. aadant says:

    hi Mark,

    I commented here : http://bugs.mysql.com/bug.php?id=57583

    alter table t1 row_format = Compact; does the trick in 5.6. It is an online DDL workaround and it leads to the minimal table size.

    About the 2 mysqldump bugs, yes, this is a known issue and I asked if we can do something.

Leave a Reply

Your email address will not be published. Required fields are marked *