
mk-find.1p - Man Page

Find MySQL tables and execute actions, like GNU find.


Synopsis

Usage: mk-find [OPTION...] [DATABASE...]

mk-find searches for MySQL tables and executes actions, like GNU find.  The default action is to print the database and table name.

Find all tables created more than a day ago, which use the MyISAM engine, and print their names:

  mk-find --ctime +1 --engine MyISAM

Find InnoDB tables that haven't been updated in a month, and convert them to MyISAM storage engine (data warehousing, anyone?):

  mk-find --mtime +30 --engine InnoDB --exec "ALTER TABLE %D.%N ENGINE=MyISAM"

Find tables created by a process that no longer exists, following the name_sid_pid naming convention, and remove them:

  mk-find --connection-id '\D_\d+_(\d+)$' --server-id '\D_(\d+)_\d+$' --exec-plus "DROP TABLE %s"

Find empty tables in the test and junk databases, and delete them:

  mk-find --empty junk test --exec-plus "DROP TABLE %s"

Find tables more than five gigabytes in total size:

  mk-find --tablesize +5G

Find all tables and print their total data and index size, sorted with the largest tables first (sort is a different program, by the way):

  mk-find --printf "%T\t%D.%N\n" | sort -rn

As above, but this time, insert the data back into the database for posterity:

  mk-find --noquote --exec "INSERT INTO sysdata.tblsize(db, tbl, size) VALUES('%D', '%N', %T)"


Risks

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool.  The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.

mk-find only reads and prints information by default, but "--exec" and "--exec-plus" can execute user-defined SQL.  You should be as careful with it as you are with any command-line tool that can execute queries against your database.

At the time of this release, we know of no bugs that could cause serious harm to users.

The authoritative source for updated information is always the online issue tracking system.  Issues that affect this tool will be marked as such.  You can see a list of such issues at the following URL: <http://www.maatkit.org/bugs/mk-find>.

See also "Bugs" for more information on filing bugs and getting help.


Description

mk-find looks for MySQL tables that pass the tests you specify, and executes the actions you specify.  The default action is to print the database and table name to STDOUT.

mk-find is simpler than GNU find.  It doesn't allow you to specify complicated expressions on the command line.

mk-find uses SHOW TABLES when possible, and SHOW TABLE STATUS when needed.

Option Types

There are three types of options: normal options, which determine some behavior or setting; tests, which determine whether a table should be included in the list of tables found; and actions, which do something to the tables mk-find finds.

mk-find uses standard Getopt::Long option parsing, so you should use double dashes in front of long option names, unlike GNU find.


Options

This tool accepts additional command-line arguments.  Refer to the "Synopsis" and usage information for details.


--ask-pass

Prompt for a password when connecting to MySQL.


--case-insensitive

Specifies that all regular expression searches are case-insensitive.


--charset

short form: -A; type: string

Default character set.  If the value is utf8, sets Perl's binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL.  Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.


--config

type: Array

Read this comma-separated list of config files; if specified, this must be the first option on the command line.


--day-start

Measure times (for "--mmin", etc) from the beginning of today rather than from the current time.


--defaults-file

short form: -F; type: string

Only read mysql options from the given file.  You must give an absolute pathname.


--help

Show help and exit.


--host

short form: -h; type: string

Connect to host.


--or

Combine tests with OR, not AND.

By default, tests are evaluated as though there were an AND between them.  This option switches it to OR.

Option parsing is not implemented by mk-find itself, so you cannot specify complicated expressions with parentheses and mixtures of OR and AND.
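The AND/OR behavior can be pictured with a small Python sketch.  This is only an illustration of the idea, not mk-find's code: the function table_matches, its use_or parameter, and the dictionary layout are all made up for the example.

```python
def table_matches(table, tests, use_or=False):
    """Return True if `table` passes the given tests.

    `tests` is a list of functions that each take a table (a dict here)
    and return True or False.  Hypothetical helper, for illustration only.
    """
    if not tests:
        # With no tests every table is found.  Treating the OR case the
        # same way is an assumption of this sketch.
        return True
    combine = any if use_or else all
    return combine(test(table) for test in tests)


# A MyISAM table with no rows, tested for "engine is InnoDB" and "is empty".
table = {"Engine": "MyISAM", "Rows": 0}
tests = [
    lambda t: t["Engine"] == "InnoDB",
    lambda t: t["Rows"] == 0,
]
print(table_matches(table, tests))               # AND: both must pass
print(table_matches(table, tests, use_or=True))  # OR: one is enough
```

There is no way to nest or group the tests in this model, which matches the limitation described above.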


--password

short form: -p; type: string

Password to use when connecting.


--pid

type: string

Create the given PID file.  The file contains the process ID of the script, and it is removed when the script exits.  Before starting, the script checks whether the PID file already exists:

If the file does not exist, the script creates it and writes its own PID to it.

If the file exists and contains a PID, and a process is running with that PID, the script dies.

If the file exists and contains a PID, but no process is running with that PID, the script overwrites the file with its own PID and starts.

If the file exists but contains no PID, the script dies.
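The PID file rules above can be sketched in Python as follows.  This is a hedged illustration of the described behavior, not the tool's actual Perl code; the function names are invented, and the process check relies on POSIX signal 0, so it is not meant for Windows.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID appears to exist (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def claim_pid_file(path):
    """Apply the rules above; raise RuntimeError where the script would die."""
    if os.path.exists(path):
        with open(path) as fh:
            content = fh.read().strip()
        if not content:
            raise RuntimeError(f"PID file {path} exists but contains no PID")
        if pid_is_running(int(content)):
            raise RuntimeError(f"PID file {path} exists and PID {content} is running")
    # Either the file did not exist or it held a stale PID: write our own.
    with open(path, "w") as fh:
        fh.write(f"{os.getpid()}\n")


def release_pid_file(path):
    """Remove the PID file when the script exits."""
    if os.path.exists(path):
        os.remove(path)
```

A caller would run claim_pid_file at startup and release_pid_file in a cleanup handler.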


--port

short form: -P; type: int

Port number to use for connection.


--[no]quote

default: yes

Quotes MySQL identifier names with MySQL's standard backtick character.

Quoting happens after tests are run, and before actions are run.


--set-vars

type: string; default: wait_timeout=10000

Set these MySQL variables.  Immediately after connecting to MySQL, this string will be appended to SET and executed.


--socket

short form: -S; type: string

Socket file to use for connection.


--user

short form: -u; type: string

User for login if not current user.


--version

Show version and exit.


Tests

Most tests check some criterion against a column of SHOW TABLE STATUS output. Numeric arguments can be specified as +n for greater than n, -n for less than n, and n for exactly n.  All numeric options can take an optional suffix multiplier of k, M or G (1_024, 1_048_576, and 1_073_741_824 respectively).  All patterns are Perl regular expressions (see 'man perlre') unless specified as SQL LIKE patterns.
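To make the numeric argument syntax concrete, here is a minimal Python sketch of how an argument such as +5G could be interpreted.  The helper names parse_size_test and passes are hypothetical, and the sketch ignores the special "NULL" value that some size tests accept; it is not mk-find's own parser.

```python
import re

# Multipliers quoted in the text above.
MULTIPLIERS = {"k": 1_024, "M": 1_048_576, "G": 1_073_741_824}


def parse_size_test(arg):
    """Split an argument like '+5G' into (comparison, value).

    Hypothetical helper for illustration; not mk-find's actual code.
    """
    m = re.fullmatch(r"([+-]?)(\d+)([kMG]?)", arg)
    if m is None:
        raise ValueError(f"not a numeric test argument: {arg!r}")
    sign, digits, suffix = m.groups()
    value = int(digits) * MULTIPLIERS.get(suffix, 1)
    comparison = {"+": ">", "-": "<", "": "=="}[sign]
    return comparison, value


def passes(arg, actual):
    """Return True if `actual` satisfies the test described by `arg`."""
    comparison, value = parse_size_test(arg)
    if comparison == ">":
        return actual > value
    if comparison == "<":
        return actual < value
    return actual == value
```

For example, passes("+5G", size) is true only when size is greater than 5 * 1_073_741_824 bytes.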

Dates and times are all measured relative to the same instant, when mk-find first asks the database server what time it is.  All date and time manipulation is done in SQL, so if you say to find tables modified 5 days ago, that translates to SELECT DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 5 DAY).  If you specify "--day-start", of course it's relative to CURRENT_DATE instead.
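As a rough illustration of the paragraph above, this Python sketch builds the kind of SQL expression described there.  The function name cutoff_sql and its parameters are made up for the example; only the DATE_SUB expression and the switch to CURRENT_DATE come from the text, and the real tool may assemble its SQL differently.

```python
def cutoff_sql(days, day_start=False):
    """Build the SQL that computes the instant `days` days ago.

    Hypothetical helper: with day_start=True the base is CURRENT_DATE,
    mirroring what the text says about --day-start.
    """
    base = "CURRENT_DATE" if day_start else "CURRENT_TIMESTAMP"
    return f"SELECT DATE_SUB({base}, INTERVAL {int(days)} DAY)"


print(cutoff_sql(5))
print(cutoff_sql(5, day_start=True))
```

Because the server evaluates the expression, the cutoff follows the server's clock rather than the clock of the machine running the tool.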

However, table sizes and other metrics are not consistent at an instant in time.  It can take some time for MySQL to process all the SHOW queries, and mk-find can't do anything about that.  These measurements are as of the time they're taken.

If you need some test that's not in this list, file a bug report and I'll enhance mk-find for you.  It's really easy.


--autoinc

type: string; group: Tests

Table's next AUTO_INCREMENT is n.  This tests the Auto_increment column.


--avgrowlen

type: size; group: Tests

Table average row length is n bytes.  This tests the Avg_row_length column. The specified size can be "NULL" to test where Avg_row_length IS NULL.


--checksum

type: string; group: Tests

Table checksum is n.  This tests the Checksum column.


--cmin

type: size; group: Tests

Table was created n minutes ago.  This tests the Create_time column.


--collation

type: string; group: Tests

Table collation matches pattern.  This tests the Collation column.


--column-name

type: string; group: Tests

A column name in the table matches pattern.


--column-type

type: string; group: Tests

A column in the table matches this type (case-insensitive).

Examples of types are: varchar, char, int, smallint, bigint, decimal, year, timestamp, text, enum.


--comment

type: string; group: Tests

Table comment matches pattern.  This tests the Comment column.


--connection-id

type: string; group: Tests

Table name has nonexistent MySQL connection ID.  This tests the table name for a pattern.  The argument to this test must be a Perl regular expression that captures digits like this: (\d+).  If the table name matches the pattern, these captured digits are taken to be the MySQL connection ID of some process. If the connection doesn't exist according to SHOW FULL PROCESSLIST, the test returns true.  If the connection ID is greater than mk-find's own connection ID, the test returns false for safety.

Why would you want to do this?  If you use MySQL statement-based replication, you probably know the trouble temporary tables can cause.  You might choose to work around this by creating real tables with unique names, instead of temporary tables.  One way to do this is to append your connection ID to the end of the table, thusly: scratch_table_12345.  This assures the table name is unique and lets you have a way to find which connection it was associated with.  And perhaps most importantly, if the connection no longer exists, you can assume the connection died without cleaning up its tables, and this table is a candidate for removal.

This is how I manage scratch tables, and that's why I included this test in mk-find.

The argument I use to "--connection-id" is "\D_(\d+)$".  That finds tables with a series of numbers at the end, preceded by an underscore and some non-number character (the latter criterion prevents me from examining tables with a date at the end, which people tend to do: baron_scratch_2007_05_07 for example).  It's better to keep the scratch tables separate of course.

If you do this, make sure the user mk-find runs as has the PROCESS privilege! Otherwise it will only see connections from the same user, and might think some tables are ready to remove when they're still in use.  For safety, mk-find checks this for you.
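The decision this test makes can be sketched in Python, whose \D and \d classes behave like Perl's for these patterns.  The function connection_id_test and its arguments are hypothetical, and live_ids stands in for the connection IDs reported by SHOW FULL PROCESSLIST; returning false when the name does not match the pattern is an assumption of the sketch.

```python
import re


def connection_id_test(table_name, pattern, live_ids, own_id):
    """Return True if the table looks abandoned by a dead connection.

    live_ids: set of connection IDs currently in the process list.
    own_id:   the connection ID of the session doing the check.
    """
    m = re.search(pattern, table_name)
    if m is None:
        return False  # name does not follow the convention (assumed behavior)
    conn_id = int(m.group(1))
    if conn_id > own_id:
        return False  # safety rule from the text: higher than our own ID
    return conn_id not in live_ids


pattern = r"\D_(\d+)$"
print(connection_id_test("scratch_table_12345", pattern, {20000, 30000}, 30000))
print(connection_id_test("baron_scratch_2007_05_07", pattern, {30000}, 30000))
```

The second call shows why the \D matters: the date-suffixed name ends in a digit followed by _07, so the pattern does not match and the table is left alone.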

See also "--server-id".


--createopts

type: string; group: Tests

Table create option matches pattern.  This tests the Create_options column.


--ctime

type: size; group: Tests

Table was created n days ago.  This tests the Create_time column.


--datafree

type: size; group: Tests

Table has n bytes of free space.  This tests the Data_free column. The specified size can be "NULL" to test where Data_free IS NULL.


--datasize

type: size; group: Tests

Table data uses n bytes of space.  This tests the Data_length column. The specified size can be "NULL" to test where Data_length IS NULL.


--dblike

type: string; group: Tests

Database name matches SQL LIKE pattern.


--dbregex

type: string; group: Tests

Database name matches this pattern.


--empty

group: Tests

Table has no rows.  This tests the Rows column.


--engine

type: string; group: Tests

Table storage engine matches this pattern.  This tests the Engine column, or in earlier versions of MySQL, the Type column.


--function

type: string; group: Tests

Function definition matches pattern.


--indexsize

type: size; group: Tests

Table indexes use n bytes of space.  This tests the Index_length column. The specified size can be "NULL" to test where Index_length IS NULL.


--kmin

type: size; group: Tests

Table was checked n minutes ago.  This tests the Check_time column.


--ktime

type: size; group: Tests

Table was checked n days ago.  This tests the Check_time column.


--mmin

type: size; group: Tests

Table was last modified n minutes ago.  This tests the Update_time column.


--mtime

type: size; group: Tests

Table was last modified n days ago.  This tests the Update_time column.


--procedure

type: string; group: Tests

Procedure definition matches pattern.


--rowformat

type: string; group: Tests

Table row format matches pattern.  This tests the Row_format column.


--rows

type: size; group: Tests

Table has n rows.  This tests the Rows column. The specified size can be "NULL" to test where Rows IS NULL.


--server-id

type: string; group: Tests

Table name contains the server ID.  If you create temporary tables with the naming convention explained in "--connection-id", but also add the server ID of the server on which the tables are created, then you can use this pattern match to ensure tables are dropped only on the server they're created on.  This prevents a table from being accidentally dropped on a slave while it's in use (provided that your server IDs are all unique, which they should be for replication to work).

For example, on the master (server ID 22) you create a table called scratch_table_22_12345.  If you see this table on the slave (server ID 23), you might think it can be dropped safely if there's no such connection 12345.  But if you also force the name to match the server ID with --server-id '\D_(\d+)_\d+$', the table won't be dropped on the slave.
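The server ID check in this example can be sketched in Python.  The function server_id_test is a hypothetical helper, not part of mk-find; it assumes the test passes only when the digits captured from the table name equal the ID of the server being examined.

```python
import re


def server_id_test(table_name, pattern, server_id):
    """Return True if the table name carries this server's ID."""
    m = re.search(pattern, table_name)
    return m is not None and int(m.group(1)) == server_id


pattern = r"\D_(\d+)_\d+$"
print(server_id_test("scratch_table_22_12345", pattern, 22))  # on the master
print(server_id_test("scratch_table_22_12345", pattern, 23))  # on the slave
```

Combined with the connection ID test under the default AND, the table only qualifies for removal on the server whose ID is embedded in its name.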


--tablesize

type: size; group: Tests

Table uses n bytes of space.  This tests the sum of the Data_length and Index_length columns.


--tbllike

type: string; group: Tests

Table name matches SQL LIKE pattern.


--tblregex

type: string; group: Tests

Table name matches this pattern.


--tableversion

type: size; group: Tests

Table version is n.  This tests the Version column.


--trigger

type: string; group: Tests

Trigger action statement matches pattern.


--trigger-table

type: string; group: Tests

"--trigger" is defined on table matching pattern.


--view

type: string; group: Tests

CREATE VIEW matches this pattern.


Actions

The "--exec-plus" action happens after everything else, but otherwise actions happen in an indeterminate order.  If you need determinism, file a bug report and I'll add this feature.


--exec

type: string; group: Actions

Execute this SQL with each item found.  The SQL can contain escapes and formatting directives (see "--printf").


--exec-dsn

type: string; group: Actions

Specify a DSN in key-value format to use when executing SQL with "--exec" and "--exec-plus".  Any values not specified are inherited from command-line arguments.


--exec-plus

type: string; group: Actions

Execute this SQL with all items at once.  This option is unlike "--exec".  There are no escaping or formatting directives; there is only one special placeholder for the list of database and table names, %s.  The list of tables found will be joined together with commas and substituted wherever you place %s.

You might use this, for example, to drop all the tables you found:

   DROP TABLE %s

This is sort of like GNU find's "-exec command {} +" syntax.  Only it's not totally cryptic.  And it doesn't require me to write a command-line parser.
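The substitution that "--exec-plus" performs can be pictured with this short Python sketch.  The function build_exec_plus is hypothetical, and the exact separator (a bare comma here) is an assumption; the text only says the names are joined with commas.

```python
def build_exec_plus(sql, tables):
    """Replace every %s in `sql` with the comma-joined list of table names."""
    return sql.replace("%s", ",".join(tables))


# Names are shown backtick-quoted, as they are by default.
found = ["`junk`.`a`", "`test`.`b`"]
print(build_exec_plus("DROP TABLE %s", found))
```

Because all names go into a single statement, the SQL runs once for the whole result set rather than once per table.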


--print

group: Actions

Print the database and table name, followed by a newline.  This is the default action if no other action is specified.


--printf

type: string; group: Actions

Print format on the standard output, interpreting '\' escapes and '%' directives.  Escapes are backslashed characters, like \n and \t.  Perl interprets these, so you can use any escapes Perl knows about.  Directives are replaced by %s, and as of this writing, you can't add any special formatting instructions, like field widths or alignment (though I'm musing over ways to do that).

Here is a list of the directives.  Note that most of them simply come from columns of SHOW TABLE STATUS.  If the column is NULL or doesn't exist, you get an empty string in the output.  A % character followed by any character not in the following list is discarded (but the other character is printed).

   CHAR DATA SOURCE        NOTES
   ---- ------------------ ------------------------------------------
   a    Auto_increment
   A    Avg_row_length
   c    Checksum
   C    Create_time
   D    Database           The database name in which the table lives
   d    Data_length
   E    Engine             In older versions of MySQL, this is Type
   F    Data_free
   f    Innodb_free        Parsed from the Comment field
   I    Index_length
   K    Check_time
   L    Collation
   M    Max_data_length
   N    Name
   O    Comment
   P    Create_options
   R    Row_format
   S    Rows
   T    Table_length       Data_length+Index_length
   U    Update_time
   V    Version
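The directive rules above can be sketched in Python.  This is an illustration under stated assumptions rather than the tool's implementation: format_table is a made-up name, the row is a plain dictionary keyed by the data source names in the table, and only the \n and \t escapes are handled (Perl understands many more).

```python
import re

# Directive character -> data source, copied from the table above.
DIRECTIVES = {
    "a": "Auto_increment", "A": "Avg_row_length", "c": "Checksum",
    "C": "Create_time", "D": "Database", "d": "Data_length",
    "E": "Engine", "F": "Data_free", "f": "Innodb_free",
    "I": "Index_length", "K": "Check_time", "L": "Collation",
    "M": "Max_data_length", "N": "Name", "O": "Comment",
    "P": "Create_options", "R": "Row_format", "S": "Rows",
    "T": "Table_length", "U": "Update_time", "V": "Version",
}


def format_table(fmt, row):
    """Expand % directives in `fmt` using values from `row`."""
    def expand(match):
        char = match.group(1)
        if char not in DIRECTIVES:
            return char  # unknown directive: drop the %, keep the character
        value = row.get(DIRECTIVES[char])
        return "" if value is None else str(value)  # NULL or missing -> ""

    # Interpret the two escapes named in the text before expanding directives.
    fmt = fmt.replace("\\n", "\n").replace("\\t", "\t")
    return re.sub(r"%(.)", expand, fmt, flags=re.DOTALL)


row = {"Database": "test", "Name": "t1", "Data_length": 100,
       "Index_length": 28, "Engine": None}
row["Table_length"] = row["Data_length"] + row["Index_length"]
print(format_table(r"%T\t%D.%N\n", row), end="")
```

Table_length is computed by the caller here because it is not a real SHOW TABLE STATUS column.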

DSN Options

These DSN options are used to create a DSN.  Each option is given like option=value.  The options are case-sensitive, so P and p are not the same option.  There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted.  DSN options are comma-separated.  See the maatkit manpage for full details.
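The key-value format can be sketched with a few lines of Python.  The parser below is a simplified, hypothetical illustration: the keys h, P and u are assumed to mirror the short option forms listed earlier, and values that themselves contain commas are not handled.  See the maatkit manpage for the real rules.

```python
def parse_dsn(dsn):
    """Parse a string like 'h=localhost,P=3306' into a dict.

    Keys are case-sensitive, so 'P' and 'p' stay distinct.
    """
    parts = {}
    for item in dsn.split(","):
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed DSN part: {item!r}")
        parts[key] = value
    return parts


print(parse_dsn("h=localhost,P=3306,u=root"))
```

Any key left out of the DSN would then fall back to the corresponding command-line option, as described under "--exec-dsn".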


Downloading

You can download Maatkit from Google Code at <http://code.google.com/p/maatkit/>, or you can get any of the tools easily with a command like the following:

   wget http://www.maatkit.org/get/toolname
   wget http://www.maatkit.org/trunk/toolname

Where toolname can be replaced with the name (or fragment of a name) of any of the Maatkit tools.  Once downloaded, they're ready to run; no installation is needed.  The first URL gets the latest released version of the tool, and the second gets the latest trunk code from Subversion.


Environment

The environment variable MKDEBUG enables verbose debugging output in all of the Maatkit tools:

   MKDEBUG=1 mk-....

System Requirements

You need the following Perl modules: DBI and DBD::mysql.


Bugs

For a list of known bugs see <http://www.maatkit.org/bugs/mk-find>.

Please use Google Code Issues and Groups to report bugs or request support: <http://code.google.com/p/maatkit/>.  You can also join #maatkit on Freenode to discuss Maatkit.

Please include the complete command-line used to reproduce the problem you are seeing, the version of all MySQL servers involved, the complete output of the tool when run with "--version", and if possible, debugging output produced by running with the MKDEBUG=1 environment variable.

Copyright, License and Warranty

This program is copyright 2007-2011 Baron Schwartz. Feedback and improvements are welcome (see "Bugs").


This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License.  On UNIX and similar systems, you can issue `man perlgpl' or `man perlartistic' to read these licenses.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA.


Author

Baron Schwartz

About Maatkit

This tool is part of Maatkit, a toolkit for power users of MySQL.  Maatkit was created by Baron Schwartz; Baron and Daniel Nichter are the primary code contributors.  Both are employed by Percona.  Financial support for Maatkit development is primarily provided by Percona and its clients.


Version

This manual page documents Ver 0.9.23 Distrib 7540 $Revision: 7477 $.


2024-01-25 perl v5.38.2 User Contributed Perl Documentation