innodb 頁儲存管理的實驗

Steven1981發表於2010-11-08

MYSQL 5.0.68 on RHEL 5u4

本實驗主要是想看一下,INNODB 頁何時被回收,何時被重新利用; 資料檔案何時擴充套件,按多少比例擴充套件等;

[@more@]

MYSQL 5.0.68 on RHEL 5u4

Innodb Record 組成 :
---------------------------
Field Start Offsets :(F*1) or (F*2) bytes ,F=欄位個數
Extra Bytes :6 bytes
Field Contents :三個系統欄位(ROWID,TXID(6B),PID(7B) ) + 資料內容;
如果有主鍵,ROWID=PK;

Innodb PAGE 組成 (16K):
---------------------------
Fil Header : 38B
Page Header: 56B
The Infimum + Supremum Records : 15B
USER RECORD: (所有欄位長度+系統欄位)*行數
Free Space : ???
Page Directory : (主鍵長度+頁內指標6B)*行數
Fil Trailer : 8B


PAGE 儲存管理
---------------------
All InnoDB indexes are B-trees where the index records are stored in the leaf pages of the tree.
The default size of an index page is 16KB.
When new records are inserted, InnoDB tries to leave 1/16 of the page free for future insertions and updates of the index records.

If index records are inserted in a sequential order (ascending or descending),
the resulting index pages are about 15/16 full.
If records are inserted in a random order, the pages are from 1/2 to 15/16 full.

If the fill factor of an index page drops below 1/2, InnoDB tries to contract the index tree to free the page.
這裡意思是INNODB會在DELETE後自動回收少於1/2滿的頁;


每個PAGE能存多少行資料?
---------------------------
假設一個表10個欄位,每個欄位字長100位元組(主鍵也是100B)
所有欄位佔用1K位元組;

我們根據以上的資訊:

1)每行所佔用空間:
 10*1B + 6B + ( 13B+1000 ) = 1029B

2)Page Directory 每行佔用:
 100B + 6B = 106B

3) 每個頁還要保留1/16 ,以作UPDATE操作;

MAX 行數 = [ (15*1024-38-56-15-8) *15/16 ] / (1029 + 106 ) ~= 13.4

新建一個表,看看相關SIZE資料:
---------------------------
create table h1 (
c0 char(100) primary key,
c1 char(100),
c2 char(100),
c3 char(100),
c4 char(100),
c5 char(100),
c6 char(100),
c7 char(100),
c8 char(100),
c9 char(100)
) engine=innodb default charset=latin1 ;


: test 14:55:10> show table status like 'h1' G
*************************** 1. row ***************************
Name: h1
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 0
Avg_row_length: 0
Data_length: 16384 ### 16K
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: NULL
Create_time: 2010-11-04 14:55:00
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment: InnoDB free: 0 kB
1 row in set (0.00 sec)

[root@test1 test]# ll
total 112
-rw-rw---- 1 mysql mysql 8790 Nov 4 14:55 h1.frm
-rw-rw---- 1 mysql mysql 98304 Nov 4 14:55 h1.ibd #### 96K


建表一個表,資料檔案大小96K,但INNODB顯示資料佔用空間:16K;
前面的80K應該是資料檔案的頭部資料 ??
不對,經過下面的資料你會發現,INNODB資料狀態顯示空間使用超過48才進行資料檔案的擴充套件 ;
所以這裡猜測,資料檔案的頭部只佔48K(3 pages);


測試指令碼,觀察INNODB SIZE 變化
--------------------------
mysql -uroot -h127.0.0.1 test <drop table if exists h1 ;
create table h1 (
c0 char(100) primary key,
c1 char(100),
c2 char(100),
c3 char(100),
c4 char(100),
c5 char(100),
c6 char(100),
c7 char(100),
c8 char(100),
c9 char(100)
) engine=innodb default charset=latin1 ;
EOF

LOGFILE=/tmp/size.log
rm -f $LOGFILE
for SEQQ in $(seq -w 1 100000 );
do
VAL="aaa$SEQQ"
mysql -uroot -h127.0.0.1 test -e "insert into h1 values ('$VAL','$VAL','$VAL','$VAL','$VAL','v','$VAL','$VAL','$VAL','$VAL');"
MDL=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Data_length" | tr -d ' ')
MFREE=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Comment" | awk -F ":" '{print $NF}')
ODL=$(ls /home/mysql/data/mysql/test/h1.ibd -l | awk '{print $5}')
if [ "$PRE_MDL" != "$MDL" -o "$PRE_ODL" != "$ODL" ] ; then
printf "%8s ==== %-30s === os_size: %-20s === FREE:%-20s n" "$VAL" "$MDL" "$ODL" "$MFREE" >> $LOGFILE
fi
PRE_MDL=$MDL
PRE_ODL=$ODL
done

INNODB 檔案管理,如何擴充套件
-----------------------
文件上說:
The tablespace consists of database pages with a default size of 16KB. The pages are grouped into extents of 64 consecutive pages.

When a segment grows inside the tablespace, InnoDB allocates the first 32 pages to it individually.
After that InnoDB starts to allocate whole extents to the segment
InnoDB can add to a large segment up to 4 extents at a time to ensure good sequentiality of data.

透過實驗資料:
我們發現,資料檔案的擴充套件按以下規則進行:
1) 按16K速度擴充套件,擴充套件32個16K (其中第1,2個16K一次擴充套件完成)
2) 按1M的速度擴充套件,擴充套件32個1M (其中第1,2次擴充套件有點奇怪?)
3) 按4M的速度擴充套件 ... ...

刪除空間何時REUSE
------------------------
文件上說:
When you delete data from a table, InnoDB contracts the corresponding B-tree indexes.
Whether the freed space becomes available for other users depends on whether the pattern of deletes frees individual pages or extents to the tablespace.
Dropping a table or deleting all rows from it is guaranteed to release the space to other users,
but remember that deleted rows are physically removed only in an (automatic) purge operation after they are no longer needed for transaction rollbacks or consistent reads.
(See Section 12.2.12, “Implementation of Multi-Versioning”.)

這個問題要證明,思路是:
1)把一頁灌滿,讓他擴充套件到第二頁;
2)刪除第一頁的大部分資料,和第二頁的前部分資料;
3)繼續插看第二頁滿為止,看是不是會重複利用第一頁資料;


mysql -uroot -h127.0.0.1 test <drop table if exists h1 ;
create table h1 (
c0 char(100) primary key,
c1 char(100),
c2 char(100),
c3 char(100),
c4 char(100),
c5 char(100),
c6 char(100),
c7 char(100),
c8 char(100),
c9 char(100)
) engine=innodb default charset=latin1 ;
EOF

LOGFILE=/tmp/size.log
rm -f $LOGFILE
for SEQQ in $(seq -w 1 66 );
do
VAL="aaa$SEQQ"
mysql -uroot -h127.0.0.1 test -e "insert into h1 values ('$VAL','$VAL','$VAL','$VAL','$VAL','v','$VAL','$VAL','$VAL','$VAL');"
MDL=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Data_length" | tr -d ' ')
MFREE=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Comment" | awk -F ":" '{print $NF}')
ODL=$(ls /home/mysql/data/mysql/test/h1.ibd -l | awk '{print $5}')
if [ "$PRE_MDL" != "$MDL" -o "$PRE_ODL" != "$ODL" ] ; then
printf "%8s ==== %-30s === os_size: %-20s === FREE:%-20s n" "$VAL" "$MDL" "$ODL" "$MFREE" >> $LOGFILE
fi
PRE_MDL=$MDL
PRE_ODL=$ODL
done

##### aaa01 ==== Data_length:16384 === os_size: 98304 === FREE: 0 kB
##### aaa15 ==== Data_length:49152 === os_size: 98304 === FREE: 0 kB
##### aaa22 ==== Data_length:65536 === os_size: 131072 === FREE: 0 kB
##### aaa36 ==== Data_length:81920 === os_size: 147456 === FREE: 0 kB
##### aaa50 ==== Data_length:98304 === os_size: 163840 === FREE: 0 kB
##### aaa64 ==== Data_length:114688 === os_size: 180224 === FREE: 0 kB

# DUMP(0) 顯示資料分佈如下:
  PAGE4 == rec1 -- rec14
  PAGE5 == rec08 -- rec21 (08--14 為什麼存兩次,不明白 ?)
  PAGE6 == rec22 -- rec35
  PAGE7 == rec36 -- rec49
  PAGE8 == rec50 -- rec63
  PAGE9 == rec64 -- rec66


在下列操作過程中,有不斷的DUMP PAGES:
delete from h1 where c0='aaa36';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa37';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa38';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa39';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa40';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa41';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa42';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa43';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa44';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa45';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa46';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa47';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
delete from h1 where c0='aaa48';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
# DUMP(1)
# 在這裡我們發現,在PAGE6中,出現了aaa49這條記錄,這說明INNODB已經開始合併頁了
# DUMP(1) 顯示資料分佈如下:
  #PAGE4 == rec1 -- rec14
  #PAGE5 == rec08 -- rec21
  #PAGE6 == rec22 -- rec35rec49
  #PAGE7 == rec49
  #PAGE8 == rec50 -- rec63
  #PAGE9 == rec64 -- rec66

delete from h1 where c0='aaa49';select data_length,table_comment from information_schema.tables where table_name='h1';system ls -l /home/mysql/data/mysql/test/h1.ibd;
# DUMP(2)
#刪除在這裡時,發現DATAFILE擴充套件了16K,不知為何用。 data_length 縮小了16K
#而且PAGE7上的REC49沒有消失(應該是沒有被標誌為刪除), 在PAGE6上的REC49上消失了,
#在這裡我們可以猜測,PAGE7 已經被置為空閒頁了;但行沒有被標誌為刪除;
# DUMP(2) 顯示資料分佈如下:
  #PAGE4 == rec1 -- rec14
  #PAGE5 == rec08 -- rec21
  #PAGE6 == rec22 -- rec35
  #PAGE7 == rec49 這一行為什麼一直留著?
  #PAGE8 == rec50 -- rec63
  #PAGE9 == rec64 -- rec66


#再插一些資料,讓最後一頁滿,並滿出一點; 看是不是會重用剛才空出來的PAGE7;
LOGFILE=/tmp/size.log
for SEQQ in $(seq -w 67 78 );
do
VAL="aaa$SEQQ"
mysql -uroot -h127.0.0.1 test -e "insert into h1 values ('$VAL','$VAL','$VAL','$VAL','$VAL','v','$VAL','$VAL','$VAL','$VAL');"
MDL=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Data_length" | tr -d ' ')
MFREE=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Comment" | awk -F ":" '{print $NF}')
ODL=$(ls /home/mysql/data/mysql/test/h1.ibd -l | awk '{print $5}')
if [ "$PRE_MDL" != "$MDL" -o "$PRE_ODL" != "$ODL" ] ; then
printf "%8s ==== %-30s === os_size: %-20s === FREE:%-20s n" "$VAL" "$MDL" "$ODL" "$MFREE" >> $LOGFILE
fi
PRE_MDL=$MDL
PRE_ODL=$ODL
done

### aaa64 ==== Data_length:114688 === os_size: 180224 === FREE: 0 kB
### aaa67 ==== Data_length:98304 === os_size: 196608 === FREE: 0 kB
### aaa78 ==== Data_length:114688 === os_size: 196608 === FREE: 0 kB
### 我們發現 ,Data_length 又增加了16K ,
### Dump(3) 發現多了一個新的PAGE:10 , 裡面記錄了剛剛滿出的REC78
### 而PAGE7 裡面的REC49 依舊存在;
# DUMP(3) 顯示資料分佈如下:
  #PAGE4 == rec1 -- rec14
  #PAGE5 == rec08 -- rec21
  #PAGE6 == rec22 -- rec35
  #PAGE7 == rec49
  #PAGE8 == rec50 -- rec63
  #PAGE9 == rec64 -- rec66 rec67-rec77
#PAGE10== rec78

flush tables ;
show innodb status G
---
LOG
---
Log sequence number 0 94049458
Log flushed up to 0 94049458
Last checkpoint at 0 94049458
###髒塊已經全部刷下去了; 卻看塊資訊,依然沒有回收;

###再插一些資料,直到DATASIZE需要擴充套件,看是不是會重用剛才空出來的PAGE7;?
LOGFILE=/tmp/size.log
for SEQQ in $(seq -w 1 14 );
do
VAL="bbb$SEQQ"
mysql -uroot -h127.0.0.1 test -e "insert into h1 values ('$VAL','$VAL','$VAL','$VAL','$VAL','v','$VAL','$VAL','$VAL','$VAL');"
MDL=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Data_length" | tr -d ' ')
MFREE=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Comment" | awk -F ":" '{print $NF}')
ODL=$(ls /home/mysql/data/mysql/test/h1.ibd -l | awk '{print $5}')
if [ "$PRE_MDL" != "$MDL" -o "$PRE_ODL" != "$ODL" ] ; then
printf "%8s ==== %-30s === os_size: %-20s === FREE:%-20s n" "$VAL" "$MDL" "$ODL" "$MFREE" >> $LOGFILE
fi
PRE_MDL=$MDL
PRE_ODL=$ODL
done

### 又擴充套件了一個頁: PAGE11,
# DUMP(4) 顯示資料分佈如下:
  #PAGE4 == rec1 -- rec14
  #PAGE5 == rec08 -- rec21
  #PAGE6 == rec22 -- rec35
  #PAGE7 == rec49
  #PAGE8 == rec50 -- rec63
  #PAGE9 == rec64 -- rec66 rec67-rec77
#PAGE10== rec78 rec b1-b13
#PAGE11== recb14

LOGFILE=/tmp/size.log
for SEQQ in $(seq -w 15 28 );
do
VAL="bbb$SEQQ"
mysql -uroot -h127.0.0.1 test -e "insert into h1 values ('$VAL','$VAL','$VAL','$VAL','$VAL','v','$VAL','$VAL','$VAL','$VAL');"
MDL=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Data_length" | tr -d ' ')
MFREE=$(mysql -uroot -h127.0.0.1 test -e "show table status like 'h1' G " | grep "Comment" | awk -F ":" '{print $NF}')
ODL=$(ls /home/mysql/data/mysql/test/h1.ibd -l | awk '{print $5}')
if [ "$PRE_MDL" != "$MDL" -o "$PRE_ODL" != "$ODL" ] ; then
printf "%8s ==== %-30s === os_size: %-20s === FREE:%-20s n" "$VAL" "$MDL" "$ODL" "$MFREE" >> $LOGFILE
fi
PRE_MDL=$MDL
PRE_ODL=$ODL
done

### DATASIZE 擴充套件了: 212992 , data_length 也擴充套件了: 147456
### 又多了一個PAGE:12
# DUMP(5) 顯示資料分佈如下:
  #PAGE4 == rec1 -- rec14
  #PAGE5 == rec08 -- rec21
  #PAGE6 == rec22 -- rec35
  #PAGE7 == rec49
  #PAGE8 == rec50 -- rec63
  #PAGE9 == rec64 -- rec66 rec67-rec77
#PAGE10== rec78 rec b1-b13
#PAGE11== recb14-- recb27
#PAGE12== recb28

### 從當前的測試來看,MYSQL並沒有馬上回收並重複利用“記錄被全部刪除”的PAGE;而是選擇了新擴充套件;
### 或者說PAGE7還沒有被全部釋放 ;還沒有在PAGE FREE LIST中;


### 插兩條與REC49相近的記錄,看是不是馬上重新利用PAGE7
for SEQQ in $(seq -w 47 48 );
do
VAL="aaa$SEQQ"
mysql -uroot -h127.0.0.1 test -e "insert into h1 values ('$VAL','$VAL','$VAL','$VAL','$VAL','v','$VAL','$VAL','$VAL','$VAL');"
done

# DUMP(6) 顯示資料分佈如下:
  #PAGE4 == reca1 -- reca14
  #PAGE5 == reca08 -- reca21
  #PAGE6 == reca22 -- reca35 reca47 ### 只插一條,MYSQL先在左側頁看有沒有空間;如果有插入;
  #PAGE7 == reca48 ### 已經重新利用;代表著PAGE7從未被回收過? 還是又從FREE LIST 拿回來用了
  #PAGE8 == reca50 -- reca63
  #PAGE9 == reca64 -- reca66 reca67-reca77
#PAGE10== reca78 recb1-b13
#PAGE11== recb14 -- recb27
#PAGE12== recb28



### 測試到這裡,還是留下了不少的疑問:
1) rec49已經被刪除,為什麼不置標誌位?  
2) PAGE7裡的記錄被刪除完了,為何沒有被置為空閒? 是因為REC49還存在的緣故?
3) 部分資料為什麼要儲存兩次,比如 a08-a14,有兩個塊(PAGE4,PAGE5)中都有?
這些問題先留著,等待解決;

### 附:Data_length 與 os_size 的變化情況;
aaa01 ==== Data_length:16384 === os_size: 98304 === FREE: 0 kB
aaa15 ==== Data_length:49152 === os_size: 98304 === FREE: 0 kB
aaa22 ==== Data_length:65536 === os_size: 131072 === FREE: 0 kB
aaa36 ==== Data_length:81920 === os_size: 147456 === FREE: 0 kB
aaa50 ==== Data_length:98304 === os_size: 163840 === FREE: 0 kB
aaa64 ==== Data_length:114688 === os_size: 180224 === FREE: 0 kB
aaa67 ==== Data_length:98304 === os_size: 196608 === FREE: 0 kB
aaa78 ==== Data_length:114688 === os_size: 196608 === FREE: 0 kB
bbb14 ==== Data_length:131072 === os_size: 196608 === FREE: 0 kB
bbb28 ==== Data_length:147456 === os_size: 212992 === FREE: 0 kB

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/703656/viewspace-1041141/,如需轉載,請註明出處,否則將追究法律責任。

相關文章