MySQL 8.0 Reference Manual(讀書筆記56節--Optimizing Database Structure)

东山絮柳仔發表於2024-05-23

In your role as a database designer, look for the most efficient way to organize your schemas, tables, and columns. As when tuning application code, you minimize I/O, keep related items together, and plan ahead so that performance stays high as the data volume increases. Starting with an efficient database design makes it easier for team members to write high-performing application code, and makes the database likely to endure as applications evolve and are rewritten.

1 Optimizing Data Size

Design your tables to minimize their space on the disk. This can result in huge【hjuːdʒ 巨大的;非常成功的;極多的;走紅的;程度高的;】 improvements by reducing the amount of data written to and read from disk. Smaller tables normally require less main memory while their contents are being actively processed during query execution. Any space reduction for table data also results in smaller indexes that can be processed faster.

MySQL supports many different storage engines (table types) and row formats. For each table, you can decide which storage and indexing method to use. Choosing the proper table format for your application can give you a big performance gain.

You can get better performance for a table and minimize storage space by using the techniques listed here:

• Table Columns • Row Format • Indexes • Joins • Normalization

Table Columns

• Use the most efficient (smallest) data types possible. MySQL has many specialized【ˈspeʃəlaɪzd 專門的;專用的;專業的;】 types that save disk space and memory. For example, use the smaller integer types if possible to get smaller tables. MEDIUMINT is often a better choice than INT because a MEDIUMINT column uses 25% less space.

• Declare columns to be NOT NULL if possible. It makes SQL operations faster, by enabling better use of indexes and eliminating【ɪˈlɪmɪneɪtɪŋ 消除;排除;消滅,幹掉(尤指敵人或對手);清除;(比賽中)淘汰;】 overhead for testing whether each value is NULL. You also save some storage space, one bit per column. If you really need NULL values in your tables, use them. Just avoid the default setting that allows NULL values in every column.

Row Format

• InnoDB tables are created using the DYNAMIC row format by default. To use a row format other than DYNAMIC, configure innodb_default_row_format, or specify the ROW_FORMAT option explicitly【ɪkˈsplɪsətli 明確地;明白地;】 in a CREATE TABLE or ALTER TABLE statement.

The compact family of row formats, which includes COMPACT, DYNAMIC, and COMPRESSED, decreases row storage space at the cost of increasing CPU use for some operations. If your workload is a typical one that is limited by cache hit rates and disk speed it is likely to be faster. If it is a rare case that is limited by CPU speed, it might be slower.

The compact family of row formats also optimizes CHAR column storage when using a variable-length character set such as utf8mb3 or utf8mb4. With ROW_FORMAT=REDUNDANT, CHAR(N) occupies N × the maximum byte length of the character set. Many languages can be written primarily using single-byte utf8mb3or utf8mb4 characters, so a fixed storage length often wastes space. With the compact family of rows formats, InnoDB allocates a variable amount of storage in the range of N to N × the maximum byte length of the character set for these columns by stripping trailing spaces. The minimum storage length is N bytes to facilitate in-place updates in typical cases.

• To minimize space even further by storing table data in compressed form, specify ROW_FORMAT=COMPRESSED when creating InnoDB tables, or run the myisampack command on an existing MyISAM table. (InnoDB compressed tables are readable and writable, while MyISAM compressed tables are read-only.)

• For MyISAM tables, if you do not have any variable-length columns (VARCHAR, TEXT, or BLOB columns), a fixed-size row format is used. This is faster but may waste some space.You can hint that you want to have fixed length rows even if you have VARCHAR columns with the CREATE TABLE option ROW_FORMAT=FIXED.

Indexes

• The primary index of a table should be as short as possible. This makes identification【aɪˌdentɪfɪˈkeɪʃn 識別;鑑定;確定;確認;身份證明;辨認;密切關聯;強烈的同情感(或諒解、支援);】 of each row easy and efficient. For InnoDB tables, the primary key columns are duplicated in each secondary index entry, so a short primary key saves considerable space if you have many secondary indexes.

• Create only the indexes that you need to improve query performance. Indexes are good for retrieval【rɪˈtriːvl 檢索;取回;資料檢索;索回;】, but slow down insert and update operations. If you access a table mostly by searching on a combination of columns, create a single composite index on them rather than a separate index for each column. The first part of the index should be the column most used. If you always use many columns when selecting from the table, the first column in the index should be the one with the most duplicates, to obtain better compression of the index.

• If it is very likely that a long string column has a unique prefix on the first number of characters, it is better to index only this prefix, using MySQL's support for creating an index on the leftmost part of the column. Shorter indexes are faster, not only because they require less disk space, but because they also give you more hits in the index cache, and thus fewer disk seeks.

Joins

• In some circumstances, it can be beneficial【ˌbenɪˈfɪʃl 有益的;有利的;有用的;有裨益的;】 to split into two a table that is scanned very often. This is especially true if it is a dynamic-format table and it is possible to use a smaller static format table that can be used to find the relevant【ˈreləvənt 相關的;有意義的;有價值的;切題的;】 rows when scanning the table.

• Declare columns with identical【aɪˈdentɪkl 完全相同的;相同的;同一的;完全同樣的;】 information in different tables with identical data types, to speed up joins based on the corresponding【ˌkɔːrəˈspɑːndɪŋ 相應的;相關的;符合的;】 columns.

• Keep column names simple, so that you can use the same name across different tables and simplify join queries. For example, in a table named customer, use a column name of name instead of customer_name. To make your names portable to other SQL servers, consider keeping them shorter than 18 characters.

Normalization

• Normally, try to keep all data nonredundant【無餘度的;非多餘的;不重複的;】 (observing what is referred to in database theory as third normal form). Instead of repeating lengthy values such as names and addresses, assign them unique IDs, repeat these IDs as needed across multiple smaller tables, and join the tables in queries by referencing the IDs in the join clause.

• If speed is more important than disk space and the maintenance costs of keeping multiple copies of data, for example in a business intelligence scenario【səˈnærioʊ 方案;設想;預測;指令碼;(電影或戲劇的)劇情梗概;】 where you analyze all the data from large tables, you can relax the normalization rules, duplicating information or creating summary tables to gain more speed.

2 Optimizing MySQL Data Types

2.1 Optimizing for Numeric Data

• For unique IDs or other values that can be represented as either strings or numbers, prefer numeric columns to string columns. Since large numeric values can be stored in fewer bytes than the corresponding strings, it is faster and takes less memory to transfer and compare them.

• If you are using numeric data, it is faster in many cases to access information from a database (using a live connection) than to access a text file. Information in the database is likely to be stored in a more compact format than in the text file, so accessing it involves fewer disk accesses. You also save code in your application because you can avoid parsing the text file to find line and column boundaries.

2.2 Optimizing for Character and String Types

For character and string columns, follow these guidelines:

• Use binary collation order for fast comparison and sort operations, when you do not need languagespecific collation features. You can use the BINARY operator to use binary collation within a particular query.

• When comparing values from different columns, declare those columns with the same character set and collation wherever possible, to avoid string conversions while running the query.

• For column values less than 8KB in size, use binary VARCHAR instead of BLOB. The GROUP BY and ORDER BY clauses can generate temporary tables, and these temporary tables can use the MEMORY storage engine if the original table does not contain any BLOB columns.

• If a table contains string columns such as name and address, but many queries do not retrieve those columns, consider splitting the string columns into a separate table and using join queries with a foreign key when necessary. When MySQL retrieves any value from a row, it reads a data block containing all the columns of that row (and possibly other adjacent rows). Keeping each row small, with only the most frequently used columns, allows more rows to fit in each data block. Such compact tables reduce disk I/ O and memory usage for common queries.

• When you use a randomly generated value as a primary key in an InnoDB table, prefix it with an ascending value such as the current date and time if possible. When consecutive primary values are physically stored near each other, InnoDB can insert and retrieve them faster.

2.3 Optimizing for BLOB Types

• When storing a large blob containing textual【ˈtekstʃuəl 文字的;篇章的;】 data, consider compressing it first. Do not use this technique when the entire table is compressed by InnoDB or MyISAM.

• For a table with several columns, to reduce memory requirements for queries that do not use the BLOB column, consider splitting the BLOB column into a separate table and referencing it with a join query when needed.

• Since the performance requirements to retrieve and display a BLOB value might be very different from other data types, you could put the BLOB-specific table on a different storage device or even a separate database instance. For example, to retrieve a BLOB might require a large sequential disk read that is better suited to a traditional hard drive than to an SSD device.

• Rather than testing for equality against a very long text string, you can store a hash of the column value in a separate column, index that column, and test the hashed value in queries. (Use the MD5() or CRC32() function to produce the hash value.) Since hash functions can produce duplicate results for different inputs, you still include a clause AND blob_column = long_string_value in the query to guard against false matches; the performance benefit comes from the smaller, easily scanned index for the hashed values.

3 Optimizing for Many Tables

Some techniques for keeping individual queries fast involve splitting data across many tables. When the number of tables runs into the thousands or even millions, the overhead of dealing with all these tables becomes a new performance consideration.

3.1 How MySQL Opens and Closes Tables

When you execute a mysqladmin status command, you should see something like this:

Uptime: 426 Running threads: 1 Questions: 11082
Reloads: 1 Open tables: 12

The Open tables value of 12 can be somewhat puzzling【ˈpʌzlɪŋ 使為難的;費解的;莫名其妙的;】 if you have fewer than 12 tables.

MySQL is multithreaded, so there may be many clients issuing queries for a given table simultaneously【ˌsaɪməlˈteɪniəsli 同時;聯立;急切地;】. To minimize the problem with multiple client sessions having different states on the same table, the table is opened independently by each concurrent session. This uses additional memory but normally increases performance. With MyISAM tables, one extra file descriptor is required for the data file for each client that has the table open. (By contrast, the index file descriptor is shared between all sessions.)

The table_open_cache and max_connections system variables affect the maximum number of files the server keeps open. If you increase one or both of these values, you may run up against a limit imposed【ɪmˈpoʊzd 把…強加於;推行,採用(規章制度);迫使;強制實行;使(別人)接受自己的意見;】 by your operating system on the per-process number of open file descriptors. Many operating systems permit you to increase the open-files limit, although the method varies widely from system to system. Consult your operating system documentation to determine whether it is possible to increase the limit and how to do so.

table_open_cache is related to max_connections. For example, for 200 concurrent running connections, specify a table cache size of at least 200 * N, where N is the maximum number of tables per join in any of the queries which you execute. You must also reserve【rɪˈzɜːrv 保留,留出;擁有,保持,保留(某種權利);留存,儲備;預訂,預約(座位、席位、房間等);】 some extra file descriptors for temporary tables and files.

Make sure that your operating system can handle the number of open file descriptors implied by the table_open_cache setting. If table_open_cache is set too high, MySQL may run out of file descriptors and exhibit【ɪɡˈzɪbɪt 展覽;表現,顯示,顯出(感情、品質或能力);展出;】 symptoms such as refusing connections or failing to perform queries.

Also take into account that the MyISAM storage engine needs two file descriptors for each unique open table. To increase the number of file descriptors available to MySQL, set the open_files_limit system variable.

The cache of open tables is kept at a level of table_open_cache entries. The server autosizes the cache size at startup. To set the size explicitly, set the table_open_cache system variable at startup. MySQL may temporarily open more tables than this to execute queries, as described later in this section.

MySQL closes an unused table and removes it from the table cache under the following circumstances:

• When the cache is full and a thread tries to open a table that is not in the cache.

• When the cache contains more than table_open_cache entries and a table in the cache is no longer being used by any threads.

• When a table-flushing operation occurs. This happens when someone issues a FLUSH TABLES statement or executes a mysqladmin flush-tables or mysqladmin refresh command.

When the table cache fills up, the server uses the following procedure to locate a cache entry to use:

• Tables not currently in use are released, beginning with the table least recently used.

• If a new table must be opened, but the cache is full and no tables can be released, the cache is temporarily extended as necessary. When the cache is in a temporarily extended state and a table goes from a used to unused state, the table is closed and released from the cache.

A MyISAM table is opened for each concurrent access. This means the table needs to be opened twice if two threads access the same table or if a thread accesses the table twice in the same query (for example, by joining the table to itself). Each concurrent open requires an entry in the table cache. The first open of any MyISAM table takes two file descriptors: one for the data file and one for the index file. Each additional use of the table takes only one file descriptor for the data file. The index file descriptor is shared among all threads.

If you are opening a table with the HANDLER tbl_name OPEN statement, a dedicated【ˈdedɪkeɪtɪd 專用的;獻身的;專心致志的;一心一意的;】 table object is allocated for the thread. This table object is not shared by other threads and is not closed until the thread calls HANDLER tbl_name CLOSE or the thread terminates. When this happens, the table is put back in the table cache (if the cache is not full).

To determine whether your table cache is too small, check the Opened_tables status variable, which indicates the number of table-opening operations since the server started:

mysql> SHOW GLOBAL STATUS LIKE 'Opened_tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Opened_tables | 2741  |
+---------------+-------+

If the value is very large or increases rapidly, even when you have not issued many FLUSH TABLES statements, increase the table_open_cache value at server startup.

3.2 Disadvantages of Creating Many Tables in the Same Database

If you have many MyISAM tables in the same database directory, open, close, and create operations are slow. If you execute SELECT statements on many different tables, there is a little overhead when the table cache is full, because for every table that has to be opened, another must be closed. You can reduce this overhead by increasing the number of entries permitted in the table cache.

4 Internal Temporary Table Use in MySQL

In some cases, the server creates internal temporary tables while processing statements. Users have no direct control over when this occurs.

The server creates temporary tables under conditions such as these:

• Evaluation【ɪˌvæljuˈeɪʃn 評價;評審;估計;定值;計值;】 of UNION statements, with some exceptions described later.

• Evaluation of some views, such those that use the TEMPTABLE algorithm, UNION, or aggregation.

• Evaluation of derived tables.

• Evaluation of common table expressions.

• Tables created for subquery or semijoin materialization.

• Evaluation of statements that contain an ORDER BY clause and a different GROUP BY clause, or for which the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue.

• Evaluation of DISTINCT combined with ORDER BY may require a temporary table.

• For queries that use the SQL_SMALL_RESULT modifier, MySQL uses an in-memory temporary table, unless the query also contains elements (described later) that require on-disk storage.

• To evaluate INSERT ... SELECT statements that select from and insert into the same table, MySQL creates an internal temporary table to hold the rows from the SELECT, then inserts those rows into the target table.

• Evaluation of multiple-table UPDATE statements.

• Evaluation of GROUP_CONCAT() or COUNT(DISTINCT) expressions.

• Evaluation of window functions uses temporary tables as necessary.

To determine whether a statement requires a temporary table, use EXPLAIN and check the Extra column to see whether it says Using temporary. EXPLAIN does not necessarily say Using temporary for derived or materialized temporary tables. For statements that use window functions, EXPLAIN with FORMAT=JSON always provides information about the windowing steps. If the windowing functions use temporary tables, it is indicated for each step.

Some query conditions prevent the use of an in-memory temporary table, in which case the server uses an on-disk table instead:

• Presence of a BLOB or TEXT column in the table. However, the TempTable storage engine, which is the default storage engine for in-memory internal temporary tables in MySQL 8.0, supports binary large object types as of MySQL 8.0.13.

• Presence of any string column with a maximum length larger than 512 (bytes for binary strings, characters for nonbinary strings) in the SELECT list, if UNION or UNION ALL is used.

• The SHOW COLUMNS and DESCRIBE statements use BLOB as the type for some columns, thus the temporary table used for the results is an on-disk table.

The server does not use a temporary table for UNION statements that meet certain qualifications. Instead, it retains from temporary table creation only the data structures necessary to perform result column typecasting. The table is not fully instantiated and no rows are written to or read from it; rows are sent directly to the client. The result is reduced memory and disk requirements, and smaller delay before the first row is sent to the client because the server need not wait until the last query block is executed. EXPLAIN and optimizer trace output reflects this execution strategy: The UNION RESULT query block is not present because that block corresponds to the part that reads from the temporary table.

These conditions qualify a UNION for evaluation without a temporary table:

• The union is UNION ALL, not UNION or UNION DISTINCT.

• There is no global ORDER BY clause.

• The union is not the top-level query block of an {INSERT | REPLACE} ... SELECT ... statement.

Internal Temporary Table Storage Engine

An internal temporary table can be held in memory and processed by the TempTable or MEMORY storage engine, or stored on disk by the InnoDB storage engine.

Storage Engine for In-Memory Internal Temporary Tables

The internal_tmp_mem_storage_engine variable defines the storage engine used for in-memory internal temporary tables. Permitted values are TempTable (the default) and MEMORY.

【As of MySQL 8.0.27, configuring a session setting for internal_tmp_mem_storage_engine requires the SESSION_VARIABLES_ADMIN or SYSTEM_VARIABLES_ADMIN privilege.】

The TempTable storage engine provides efficient storage for VARCHAR and VARBINARY columns, and other binary large object types as of MySQL 8.0.13.

The following variables control TempTable storage engine limits and behavior:

• tmp_table_size: From MySQL 8.0.28, tmp_table_size defines the maximum size of any individual in-memory internal temporary table created by the TempTable storage engine. When the tmp_table_size limit is reached, MySQL automatically converts the in-memory internal temporary table to an InnoDB on-disk internal temporary table. The default tmp_table_size setting is 16777216 bytes (16 MiB).

The tmp_table_size limit is intended to prevent individual queries from consuming an inordinate【ɪnˈɔːrdɪnət 過度的;過分的;超乎預料的;】 amount global TempTable resources, which can affect the performance of concurrent queries that require TempTable resources. Global TempTable resources are controlled by the temptable_max_ram and temptable_max_mmap settings.

If the tmp_table_size limit is less than the temptable_max_ram limit, it is not possible for an in-memory temporary table to contain more data than permitted by the tmp_table_size limit. If the tmp_table_size limit is greater than the sum of the temptable_max_ram and temptable_max_mmap limits, it is not possible for an in-memory temporary table to contain more than the sum of the temptable_max_ram and temptable_max_mmap limits.

• temptable_max_ram: Defines the maximum amount of RAM that can be used by the TempTable storage engine before it starts allocating space from memory-mapped files or before MySQL starts using InnoDB on-disk internal temporary tables, depending on your configuration. The default temptable_max_ram setting is 1073741824 bytes (1GiB).

【The temptable_max_ram setting does not account for the thread-local memory block allocated to each thread that uses the TempTable storage engine. The size of the thread-local memory block depends on the size of the thread's first memory allocation request. If the request is less than 1MB, which it is in most cases, the thread-local memory block size is 1MB. If the request is greater than 1MB, the thread-local memory block is approximately the same size as the initial memory request. The thread-local memory block is held in thread-local storage until thread exit.】

• temptable_use_mmap: Controls whether the TempTable storage engine allocates space from memory-mapped files or MySQL uses InnoDB on-disk internal temporary tables when the temptable_max_ram limit is exceeded. The default setting is temptable_use_mmap=ON.

【The temptable_use_mmap variable was introduced in MySQL 8.0.16 and deprecated in MySQL 8.0.26; expect support for it to be removed in a future version of MySQL. Setting temptable_max_mmap=0 is equivalent to setting temptable_use_mmap=OFF.】

• temptable_max_mmap: Introduced in MySQL 8.0.23. Defines the maximum amount of memory the TempTable storage engine is permitted to allocate from memory-mapped files before MySQL starts using InnoDB on-disk internal temporary tables. The default setting is 1073741824 bytes (1GiB). The limit is intended to address the risk of memory mapped files using too much space in the temporary directory (tmpdir). A temptable_max_mmap=0 setting disables allocation from memory-mapped files, effectively disabling their use, regardless of the temptable_use_mmap setting.

Use of memory-mapped files by the TempTable storage engine is governed by these rules;

• Temporary files are created in the directory defined by the tmpdir variable.

• Temporary files are deleted immediately after they are created and opened, and therefore do not remain visible in the tmpdir directory. The space occupied by temporary files is held by the operating system while temporary files are open. The space is reclaimed【rɪˈkleɪmd 開墾,利用,改造(荒地);取回;要求歸還;拿回;沙化;荒漠化;重新變為沙漠(或森林等);拋荒;】 when temporary files are closed by the TempTable storage engine, or when the mysqld process is shut down.

• Data is never moved between RAM and temporary files, within RAM, or between temporary files.

• New data is stored in RAM if space becomes available within the limit defined by temptable_max_ram. Otherwise, new data is stored in temporary files.

• If space becomes available in RAM after some of the data for a table is written to temporary files, it is possible for the remaining table data to be stored in RAM.

When using the MEMORY storage engine for in-memory temporary tables (internal_tmp_mem_storage_engine=MEMORY), MySQL automatically converts an in-memory temporary table to an on-disk table if it becomes too large. The maximum size of an in-memory temporary table is defined by the tmp_table_size or max_heap_table_size value, whichever is smaller. This differs from MEMORY tables explicitly created with CREATE TABLE. For such tables, only the max_heap_table_size variable determines how large a table can grow, and there is no conversion to on-disk format.

Storage Engine for On-Disk Internal Temporary Tables

In MySQL 8.0.15 and earlier, the internal_tmp_disk_storage_engine variable defined the storage engine used for on-disk internal temporary tables. Supported storage engines were InnoDB and MyISAM.

From MySQL 8.0.16, MySQL uses only the InnoDB storage engine for on-disk internal temporary tables. The MYISAM storage engine is no longer supported for this purpose.

InnoDB on-disk internal temporary tables are created in session temporary tablespaces that reside in the data directory by default.

In MySQL 8.0.15 and earlier:

• For common table expressions (CTEs), the storage engine used for on-disk internal temporary tables cannot be MyISAM. If internal_tmp_disk_storage_engine=MYISAM, an error occurs for any attempt to materialize a CTE using an on-disk temporary table.

• When using internal_tmp_disk_storage_engine=INNODB, queries that generate on-disk internal temporary tables that exceed InnoDB row or column limits return Row size too large or Too many columns errors. The workaround is to set internal_tmp_disk_storage_engine to MYISAM.

Internal Temporary Table Storage Format

When in-memory internal temporary tables are managed by the TempTable storage engine, rows that include VARCHAR columns, VARBINARY columns, and other binary large object type columns (supported as of MySQL 8.0.13) are represented in memory by an array of cells, with each cell containing a NULL flag, the data length, and a data pointer. Column values are placed in consecutive【kənˈsekjətɪv 連續的;連續不斷的;】 order after the array, in a single region of memory, without padding. Each cell in the array uses 16 bytes of storage. The same storage format applies when the TempTable storage engine allocates space from memory-mapped files.

When in-memory internal temporary tables are managed by the MEMORY storage engine, fixed-length row format is used. VARCHAR and VARBINARY column values are padded to the maximum column length, in effect storing them as CHAR and BINARY columns.

Prior to MySQL 8.0.16, on-disk internal temporary tables were managed by the InnoDB or MyISAM storage engine (depending on the internal_tmp_disk_storage_engine setting). Both engines store internal temporary tables using dynamic-width row format. Columns take only as much storage as needed, which reduces disk I/O, space requirements, and processing time compared to on-disk tables that use fixedlength rows. Beginning with MySQL 8.0.16, internal_tmp_disk_storage_engine is not supported, and internal temporary tables on disk are always managed by InnoDB.

When using the MEMORY storage engine, statements can initially create an in-memory internal temporary table and then convert it to an on-disk table if the table becomes too large. In such cases, better performance might be achieved by skipping the conversion and creating the internal temporary table on disk to begin with. The big_tables variable can be used to force disk storage of internal temporary tables.

Monitoring Internal Temporary Table Creation

When an internal temporary table is created in memory or on disk, the server increments the Created_tmp_tables value. When an internal temporary table is created on disk, the server increments the Created_tmp_disk_tables value. If too many internal temporary tables are created on disk, consider adjusting the engine-specific limits.

【Due to a known limitation, Created_tmp_disk_tables does not count on-disk temporary tables created in memory-mapped files. By default, the TempTable storage engine overflow mechanism creates internal temporary tables in memorymapped files. 】

The memory/temptable/physical_ram and memory/temptable/physical_disk Performance Schema instruments can be used to monitor TempTable space allocation from memory and disk. memory/temptable/physical_ram reports the amount of allocated RAM. memory/temptable/ physical_disk reports the amount of space allocated from disk when memory-mapped files are used as the TempTable overflow mechanism. If the physical_disk instrument reports a value other than 0 and memory-mapped files are used as the TempTable overflow mechanism, a TempTable memory limit was reached at some point. Data can be queried in Performance Schema memory summary tables such as memory_summary_global_by_event_name.

5 Limits on Number of Databases and Tables

MySQL has no limit on the number of databases. The underlying file system may have a limit on the number of directories.

MySQL has no limit on the number of tables. The underlying file system may have a limit on the number of files that represent tables. Individual storage engines may impose engine-specific constraints. InnoDB permits up to 4 billion tables.

6 Limits on Table Size

The effective maximum table size for MySQL databases is usually determined by operating system constraints on file sizes, not by MySQL internal limits. For up-to-date information operating system file size limits, refer to the documentation specific to your operating system.

Windows users, please note that FAT and VFAT (FAT32) are not considered suitable for production use with MySQL. Use NTFS instead.

If you encounter a full-table error, there are several reasons why it might have occurred:

• The disk might be full.

• You are using InnoDB tables and have run out of room in an InnoDB tablespace file. The maximum tablespace size is also the maximum size for a table.

Generally, partitioning of tables into multiple tablespace files is recommended for tables larger than 1TB in size.

• You have hit an operating system file size limit. For example, you are using MyISAM tables on an operating system that supports files only up to 2GB in size and you have hit this limit for the data file or index file.

• You are using a MyISAM table and the space required for the table exceeds what is permitted by the internal pointer size. MyISAM permits data and index files to grow up to 256TB by default, but this limit can be changed up to the maximum permissible size of 65,536TB (2567 − 1 bytes).

If you need a MyISAM table that is larger than the default limit and your operating system supports large files, the CREATE TABLE statement supports AVG_ROW_LENGTH and MAX_ROWS options.The server uses these options to determine how large a table to permit.

If the pointer size is too small for an existing table, you can change the options with ALTER TABLE to increase a table's maximum permissible size.

ALTER TABLE tbl_name MAX_ROWS=1000000000 AVG_ROW_LENGTH=nnn;

You have to specify AVG_ROW_LENGTH only for tables with BLOB or TEXT columns; in this case, MySQL cannot optimize the space required based only on the number of rows.

To change the default size limit for MyISAM tables, set the myisam_data_pointer_size, which sets the number of bytes used for internal row pointers. The value is used to set the pointer size for new tables if you do not specify the MAX_ROWS option. The value of myisam_data_pointer_size can be from 2 to 7. For example, for tables that use the dynamic storage format, a value of 4 permits tables up to 4GB; a value of 6 permits tables up to 256TB. Tables that use the fixed storage format have a larger maximum data length.

You can check the maximum data and index sizes by using this statement:

SHOW TABLE STATUS FROM db_name LIKE 'tbl_name';

You also can use myisamchk -dv /path/to/table-index-file.

Other ways to work around【əˈraʊnd 周圍;圍繞;大約;四周;環繞;出現;到處;轉彎;閒散地;】 file-size limits for MyISAM tables are as follows:

• If your large table is read only, you can use myisampack to compress it. myisampack usually compresses a table by at least 50%, so you can have, in effect, much bigger tables. myisampack also can merge multiple tables into a single table.

• MySQL includes a MERGE library that enables you to handle a collection of MyISAM tables that have identical structure as a single MERGE table.

• You are using the MEMORY (HEAP) storage engine; in this case you need to increase the value of the max_heap_table_size system variable.

7 Limits on Table Column Count and Row Size

This section describes limits on the number of columns in tables and the size of individual rows.

• Column Count Limits • Row Size Limits

Column Count Limits

MySQL has hard limit of 4096 columns per table, but the effective maximum may be less for a given table. The exact column limit depends on several factors:

• The maximum row size for a table constrains【kənˈstreɪnz 約束;限制;限定;強制;強迫;迫使;】 the number (and possibly size) of columns because the total length of all columns cannot exceed this size.

• The storage requirements of individual columns constrain the number of columns that fit within a given maximum row size. Storage requirements for some data types depend on factors such as storage engine, storage format, and character set.

• Storage engines may impose additional restrictions that limit table column count. For example, InnoDB has a limit of 1017 columns per table.

• Functional key parts are implemented as hidden virtual generated stored columns, so each functional key part in a table index counts against the table total column limit.

Row Size Limits

The maximum row size for a given table is determined by several factors:

• The internal representation of a MySQL table has a maximum row size limit of 65,535 bytes, even if the storage engine is capable of supporting larger rows. BLOB and TEXT columns only contribute 9 to 12 bytes toward the row size limit because their contents are stored separately from the rest of the row.

• The maximum row size for an InnoDB table, which applies to data stored locally within a database page, is slightly【ˈslaɪtli 稍微;略微;輕微地;身材瘦小的;】 less than half a page for 4KB, 8KB, 16KB, and 32KB innodb_page_size settings. For example, the maximum row size is slightly less than 8KB for the default 16KB InnoDB page size. For 64KB pages, the maximum row size is slightly less than 16KB.

If a row containing variable-length columns exceeds the InnoDB maximum row size, InnoDB selects variable-length columns for external off-page storage until the row fits within the InnoDB row size limit. The amount of data stored locally for variable-length columns that are stored off-page differs by row format.

• Different storage formats use different amounts of page header and trailer data, which affects the amount of storage available for rows.

Row Size Limit Examples

• The MySQL maximum row size limit of 65,535 bytes is demonstrated in the following InnoDB and MyISAM examples. The limit is enforced regardless of storage engine, even though the storage engine may be capable of supporting larger rows.

mysql> CREATE TABLE t (a VARCHAR(10000), b VARCHAR(10000),
 c VARCHAR(10000), d VARCHAR(10000), e VARCHAR(10000),
 f VARCHAR(10000), g VARCHAR(6000)) ENGINE=InnoDB CHARACTER SET latin1;
ERROR 1118 (42000): Row size too large. The maximum row size for the used
table type, not counting BLOBs, is 65535. This includes storage overhead,
check the manual. You have to change some columns to TEXT or BLOBs

In the following MyISAM example, changing a column to TEXT avoids the 65,535-byte row size limit and permits the operation to succeed because BLOB and TEXT columns only contribute 9 to 12 bytes toward the row size.

mysql> CREATE TABLE t (a VARCHAR(10000), b VARCHAR(10000),
 c VARCHAR(10000), d VARCHAR(10000), e VARCHAR(10000),
 f VARCHAR(10000), g TEXT(6000)) ENGINE=MyISAM CHARACTER SET latin1;
Query OK, 0 rows affected (0.02 sec)

The operation succeeds for an InnoDB table because changing a column to TEXT avoids the MySQL 65,535-byte row size limit, and InnoDB off-page storage of variable-length columns avoids the InnoDB row size limit.

mysql> CREATE TABLE t (a VARCHAR(10000), b VARCHAR(10000),
 c VARCHAR(10000), d VARCHAR(10000), e VARCHAR(10000),
 f VARCHAR(10000), g TEXT(6000)) ENGINE=InnoDB CHARACTER SET latin1;
Query OK, 0 rows affected (0.02 sec)

• Storage for variable-length columns includes length bytes, which are counted toward the row size. For example, a VARCHAR(255) CHARACTER SET utf8mb3 column takes two bytes to store the length of the value, so each value can take up to 767 bytes.

The statement to create table t1 succeeds because the columns require 32,765 + 2 bytes and 32,766 + 2 bytes, which falls within the maximum row size of 65,535 bytes:

mysql> CREATE TABLE t1
 (c1 VARCHAR(32765) NOT NULL, c2 VARCHAR(32766) NOT NULL)
 ENGINE = InnoDB CHARACTER SET latin1;
Query OK, 0 rows affected (0.02 sec)

The statement to create table t2 fails because, although the column length is within the maximum length of 65,535 bytes, two additional bytes are required to record the length, which causes the row size to exceed 65,535 bytes:

mysql> CREATE TABLE t2
 (c1 VARCHAR(65535) NOT NULL)
 ENGINE = InnoDB CHARACTER SET latin1;
ERROR 1118 (42000): Row size too large. The maximum row size for the used
table type, not counting BLOBs, is 65535. This includes storage overhead,
check the manual. You have to change some columns to TEXT or BLOBs

Reducing the column length to 65,533 or less permits the statement to succeed.

mysql> CREATE TABLE t2
 (c1 VARCHAR(65533) NOT NULL)
 ENGINE = InnoDB CHARACTER SET latin1;
Query OK, 0 rows affected (0.01 sec)

• For MyISAM tables, NULL columns require additional space in the row to record whether their values are NULL. Each NULL column takes one bit extra, rounded up to the nearest byte.

The statement to create table t3 fails because MyISAM requires space for NULL columns in addition to the space required for variable-length column length bytes, causing the row size to exceed 65,535 bytes:

mysql> CREATE TABLE t3
 (c1 VARCHAR(32765) NULL, c2 VARCHAR(32766) NULL)
 ENGINE = MyISAM CHARACTER SET latin1;
ERROR 1118 (42000): Row size too large. The maximum row size for the used
table type, not counting BLOBs, is 65535. This includes storage overhead,
check the manual. You have to change some columns to TEXT or BLOBs

• InnoDB restricts row size (for data stored locally within the database page) to slightly less than half a database page for 4KB, 8KB, 16KB, and 32KB innodb_page_size settings, and to slightly less than 16KB for 64KB pages.

The statement to create table t4 fails because the defined columns exceed the row size limit for a 16KB InnoDB page.

mysql> CREATE TABLE t4 (
 c1 CHAR(255),c2 CHAR(255),c3 CHAR(255),
 c4 CHAR(255),c5 CHAR(255),c6 CHAR(255),
 c7 CHAR(255),c8 CHAR(255),c9 CHAR(255),
 c10 CHAR(255),c11 CHAR(255),c12 CHAR(255),
 c13 CHAR(255),c14 CHAR(255),c15 CHAR(255),
 c16 CHAR(255),c17 CHAR(255),c18 CHAR(255),
 c19 CHAR(255),c20 CHAR(255),c21 CHAR(255),
 c22 CHAR(255),c23 CHAR(255),c24 CHAR(255),
 c25 CHAR(255),c26 CHAR(255),c27 CHAR(255),
 c28 CHAR(255),c29 CHAR(255),c30 CHAR(255),
 c31 CHAR(255),c32 CHAR(255),c33 CHAR(255)
 ) ENGINE=InnoDB ROW_FORMAT=DYNAMIC DEFAULT CHARSET latin1;
ERROR 1118 (42000): Row size too large (> 8126). Changing some columns to TEXT or BLOB may help.
In current row format, BLOB prefix of 0 bytes is stored inline.

相關文章