[20240529] A simple look at the FREE LISTS.txt

Posted by lfree on 2024-06-11

--//A simple look at the FREE LISTS of the shared pool.

1. Environment:
SYS@test> @ver1
PORT_STRING VERSION BANNER CON_ID
-------------------- ---------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0 12.2.0.1.0 Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production 0

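--//Shut the database down and start it in mount state. This keeps the shared pool quiet; otherwise ongoing changes would disturb the observations.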
SYS@test> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

SYS@test> startup mount
ORACLE instance started.
Total System Global Area 805306368 bytes
Fixed Size 8924064 bytes
Variable Size 297796704 bytes
Database Buffers 490733568 bytes
Redo Buffers 7852032 bytes
Database mounted.

2. Take a heap dump:

SYS@test> @ init
SYS@test> alter session set events 'immediate trace name heapdump level 2';
Session altered.

SYS@test> @ t
TRACEFILE
-----------------------------------------------------------------
D:\APP\ORACLE\diag\rdbms\test\test\trace\test_ora_3708.trc

--//Look at the chunks on one bucket:
--//Bucket 3 has size=56; the second Bucket 3/Bucket 4 pair in the output comes from the RESERVED FREE LISTS:
$ sed -n "/^ Bucket 3 /,/^ Bucket 4 /p" test_ora_3708.trc
Bucket 3 size=56
Chunk 7ff0197df68 sz= 56 free " "
Chunk 7ff02b8b4a8 sz= 56 free " "
Chunk 7ff037a8368 sz= 56 free " "
Bucket 4 size=64
Bucket 3 size=56
Bucket 4 size=64

--//For reference, here is the FREE LISTS section of the dump with the instance only mounted.
Total heap size =176156736
FREE LISTS:
Bucket 0 size=32
Chunk 7ff0b000088 sz= 0 kghdsx
Bucket 1 size=40
Bucket 2 size=48
Chunk 7ff01ffdf90 sz= 48 free " "
Chunk 7ff033310c0 sz= 48 free " "
Bucket 3 size=56
Chunk 7ff0197df68 sz= 56 free " "
Chunk 7ff02b8b4a8 sz= 56 free " "
Chunk 7ff037a8368 sz= 56 free " "
Bucket 4 size=64
Chunk 7ff013bb058 sz= 64 free " "
Bucket 5 size=72
...
Bucket 30 size=272
Chunk 7ff013bd300 sz= 272 free " "
Bucket 31 size=280
...
Bucket 254 size=65560
Chunk 7ff00834000 sz= 3898152 free " "
Total free space = 3898752
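--//With the instance only mounted, almost all of the free space (3898152 of 3898752 bytes) sits in the last bucket, 254; the other buckets hold just a few free chunks, which makes the analysis easier.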

$ sed -n "/^ Bucket 3 /,/^ Bucket 4 /p" test_ora_3708.trc | awk '/Chunk/{print "oradebug peek 0x" toupper($2),32, 1 }'
oradebug peek 0x7FF0197DF68 32 1
oradebug peek 0x7FF02B8B4A8 32 1
oradebug peek 0x7FF037A8368 32 1

--//Log out, log in again, and run the following:
SYS@test> @ t
TRACEFILE
-----------------------------------------------------------------
D:\APP\ORACLE\diag\rdbms\test\test\trace\test_ora_7324.trc

SYS@test> oradebug setmypid
Statement processed.

SYS@test> oradebug peek 0x7FF0197DF68 32 1
[7FF0197DF68, 7FF0197DF88) = 00000039 C0B38F00 0197DF28 000007FF 0AC5DD48 00000000 02B8B4B8 000007FF
~~~~~~~~~~~~~~~~~ ++++++++++++++++
SYS@test> oradebug peek 0x7FF02B8B4A8 32 1
[7FF02B8B4A8, 7FF02B8B4C8) = 00000039 C0B38F00 02B7B9D0 000007FF 0197DF78 000007FF 037A8378 000007FF
~~~~~~~~~~~~~~~~~ ++++++++++++++++
SYS@test> oradebug peek 0x7FF037A8368 32 1
[7FF037A8368, 7FF037A8388) = 00000039 C0B38F00 03434000 000007FF 02B8B4B8 000007FF 0AC5DD48 00000000
~~~~~~~~~~~~~~~~~ ++++++++++++++++
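--//0x39 = 57, i.e. chunk size + 1 (56 + 1).
--//Look at the parts marked with ~~~ and +++, and remember that Intel CPUs are little-endian: each 8-byte value is printed as two 4-byte words, low word first.
--//0AC5DD48 00000000 appears at both ends: as the first pointer of the first chunk and as the second pointer of the last chunk.
--//02B8B4B8 000007FF swapped is 000007FF 02B8B4B8; minus 0x10 gives 000007FF 02B8B4A8, which matches Chunk 7ff02b8b4a8.
--//0197DF78 000007FF swapped is 000007FF 0197DF78; minus 0x10 gives 000007FF 0197DF68, which matches Chunk 7ff0197df68.
--//037A8378 000007FF swapped is 000007FF 037A8378; minus 0x10 gives 000007FF 037A8368, which matches Chunk 7ff037a8368.
--//The first pointer is the previous entry and the second is the next entry, so the chunks form a ring.
--//This shows clearly how the addresses of the free chunks on one free list bucket are linked together.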
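
--//The linkage can also be checked mechanically. Below is a minimal Python sketch, not an Oracle tool: it decodes the three
--//peek lines shown above and follows the "next" pointers. The names PEEKS, qword, decode and walk are my own, and the 0x10
--//link offset is only the value inferred from this dump.

```python
# Words exactly as printed by "oradebug peek <addr> 32 1" for the three chunks of Bucket 3.
PEEKS = {
    0x7FF0197DF68: "00000039 C0B38F00 0197DF28 000007FF 0AC5DD48 00000000 02B8B4B8 000007FF",
    0x7FF02B8B4A8: "00000039 C0B38F00 02B7B9D0 000007FF 0197DF78 000007FF 037A8378 000007FF",
    0x7FF037A8368: "00000039 C0B38F00 03434000 000007FF 02B8B4B8 000007FF 0AC5DD48 00000000",
}
LINK_OFFSET = 0x10          # assumption inferred from the dump: links point 0x10 past the chunk start
BUCKET_HEADER = 0x0AC5DD48  # the non-chunk address seen at both ends of the list


def qword(words, i):
    """Combine two 32-bit words (low word printed first) into one 64-bit value."""
    return (int(words[i + 1], 16) << 32) | int(words[i], 16)


def decode(peek_line):
    """Return size, previous link and next link of one chunk."""
    words = peek_line.split()
    return {
        "size": int(words[0], 16) - 1,  # first word is chunk size + 1 (0x39 -> 56)
        "prev": qword(words, 4),
        "next": qword(words, 6),
    }


def walk(first_chunk):
    """Follow the next links from first_chunk until the bucket header is reached."""
    order, addr = [], first_chunk
    while addr in PEEKS:
        order.append(addr)
        nxt = decode(PEEKS[addr])["next"]
        if nxt == BUCKET_HEADER:
            break
        addr = nxt - LINK_OFFSET
    return order


for chunk in walk(0x7FF0197DF68):
    print(hex(chunk), decode(PEEKS[chunk])["size"])
```

--//The walk visits the chunks in the same order as the heap dump lists them and stops when the next pointer returns to 0x0AC5DD48.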

--//My guess is that there is a separate free list area that contains address 0x0AC5DD48.
SYS@test> oradebug peek 0x0AC5DD48 32
[00AC5DD48, 00AC5DD68) = 037A8378 000007FF 0197DF78 000007FF 00000040 00000000 013BB068 000007FF
~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
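--//Swapping the words gives 000007FF037A8378 and 000007FF0197DF78; minus 0x10 these are the tail and the head of the list seen in the dump.
--//Next, work out how much space one bucket takes in the free list area. The output above suggests 24 bytes, because 00000040 00000000 cannot be an address.
--//It looks like some kind of length (0x40 = 64), but that is only a guess.
--//013BB068 000007FF swapped is 000007FF 013BB068; minus 0x10 gives 000007FF 013BB058. Compare with the dump: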
Bucket 4 size=64
Chunk 7ff013bb058 sz= 64 free " "
--//That is exactly the content of bucket 4, so one bucket takes 24 bytes in the free list area.

--//The next bucket in the trace file, bucket 4, holds a single chunk.
Bucket 4 size=64
Chunk 7ff013bb058 sz= 64 free " "

SYS@test> oradebug peek 0x7ff013bb058 32
[7FF013BB058, 7FF013BB078) = 00000041 C0B38F00 013B7858 000007FF 0AC5DD60 00000000 0AC5DD60 00000000
~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~
--//0x41 = 65 = chunk size + 1. Bytes 4-7 are again C0B38F00. I do not know what bytes 8-11 and 12-15 represent and cannot guess.

SYS@test> oradebug peek 0x0AC5DD60 24
[00AC5DD60, 00AC5DD78) = 013BB068 000007FF 013BB068 000007FF 00000048 00000000
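--//000007FF 013BB068 - 0x10 = 0x7ff013bb058. At first I could not work out the trailing 00000048 00000000; it seems to grow by 8 with each bucket.
--//Each bucket takes 0x0AC5DD60 - 0x0AC5DD48 = 0x18 = 24 bytes.
--//0x48 = 72. I think I see it now: this value means that the largest chunk size on this free list bucket is 72 - 1 = 71. This is verified separately below.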
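
--//To make the 24-byte layout concrete, here is a small Python sketch that splits the six peeked words of one bucket entry
--//into three 8-byte fields. The field names tail_link, head_link and upper_bound are my own labels for what was inferred
--//above, not Oracle's names.

```python
LINK_OFFSET = 0x10  # inferred above: a link points 0x10 past the start of the chunk


def decode_bucket_entry(peek_words):
    """Split six 32-bit words (low word first) into the three 64-bit fields of a bucket entry."""
    w = [int(x, 16) for x in peek_words.split()]
    fields = [(w[i + 1] << 32) | w[i] for i in (0, 2, 4)]
    return {"tail_link": fields[0], "head_link": fields[1], "upper_bound": fields[2]}


# Words copied from the peeks of 0x0AC5DD48 (bucket 3) and 0x0AC5DD60 (bucket 4).
bucket3 = decode_bucket_entry("037A8378 000007FF 0197DF78 000007FF 00000040 00000000")
bucket4 = decode_bucket_entry("013BB068 000007FF 013BB068 000007FF 00000048 00000000")

for name, b in (("bucket 3", bucket3), ("bucket 4", bucket4)):
    print(name,
          "head chunk", hex(b["head_link"] - LINK_OFFSET),
          "tail chunk", hex(b["tail_link"] - LINK_OFFSET),
          "bound", b["upper_bound"])
```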

SYS@test> @ fchaz 0x0AC5DD60
no rows selected
--//Odd, no output!

--//Find out where address 0x0AC5DD60 lives:
SYS@test> oradebug ipc
IPC information written to the trace file
SYS@test> @ t
TRACEFILE
-----------------------------------------------------------------
D:\APP\ORACLE\diag\rdbms\test\test\trace\test_ora_7324.trc

*** 2024-05-30T20:44:35.830155+08:00 (CDB$ROOT(1))
Processing Oradebug command 'ipc'
Dump of Windows skgm context
areaflags 000017ff
realmflags 00001b01
maxtotalrealmsize 936d0000
VMpagesize 00001000
VMallocgranularity 00010000
minappaddress 0000000000010000
maxappaddress 000007FFFFFEFFFF
stacklimit 0000000023981000
magic acc01ade
Handle: 0000000001260060 `sga_test'
Dump of Windows realm handle `sga_test', flags = 00000000
Area #0 `Fixed Size' containing Subareas 0-0
Total size 0000000000882ba0 Minimum Subarea size 00000000
Area Subarea Start Addr
0 0 000000000AB10000
Subarea size
08925184
[
Base = 000000000AB10000 Prot = RW Size = 8925184 State = COM
]
--//0xAB10000 = 179372032
--//179372032 + 8925184 = 188297216
--//188297216 = 0xb393000
--//0x0AC5DD60 lies between 0xAB10000 and 0xb393000, so the free list table is in the fixed area.
Area #1 `Variable Size' containing Subareas 2-2
Total size 000000002f000000 Minimum Subarea size 00400000
Area Subarea Start Addr
1 2 000007FF00000000
Subarea size
788529152
[
Base = 000007FF00000000 Prot = RW Size = 788529152 State = COM
]
Area #2 `Redo Buffers' containing Subareas 1-1
Total size 000000000077d000 Minimum Subarea size 00001000
Area Subarea Start Addr
2 1 000000000C090000
Subarea size
07852032
[
Base = 000000000C090000 Prot = RW Size = 7852032 State = COM
]
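
--//The range check done by hand inside the dump can be written as a few lines of Python. This is only a sketch: the base
--//addresses and sizes are copied from the "oradebug ipc" output above, and area_of is a hypothetical helper name.

```python
# (base address, size in bytes) for each area, copied from the ipc dump.
AREAS = {
    "Fixed Size": (0x000000000AB10000, 8925184),
    "Variable Size": (0x000007FF00000000, 788529152),
    "Redo Buffers": (0x000000000C090000, 7852032),
}


def area_of(addr):
    """Return the name of the area whose [base, base + size) range contains addr, or None."""
    for name, (base, size) in AREAS.items():
        if base <= addr < base + size:
            return name
    return None


print(area_of(0x0AC5DD60))     # bucket 4 entry of the free list table
print(area_of(0x7FF013BB058))  # the free chunk linked on bucket 4
```

--//The bucket entry falls in the fixed area, while the chunk it points to falls in the variable area.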


--//Look at the first bucket, Bucket 0.
FREE LISTS:
Bucket 0 size=32
Chunk 7ff0b000088 sz= 0 kghdsx
~~~~~~~~~~~~~

SYS@test> oradebug peek 0x7ff0b000088 32
[7FF0B000088, 7FF0B0000A8) = 00000001 C0B38F00 00000000 00000000 0AC5DD00 00000000 0AC5DD00 00000000

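--//00000001 = 1 = chunk size + 1, so this chunk's size is 0, and the dump does record sz=0 (see the underlined part).
--//How the rest of the information is stored, and what kghdsx stands for, I do not know. This chunk looks very special; it seems to mark the start.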

SYS@test> oradebug peek 0x0AC5DD00 24
[00AC5DD00, 00AC5DD18) = 0B000098 000007FF 0B000098 000007FF 00000028 00000000

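--//0x0AC5DD60 - 0x0AC5DD00 = 0x60 = 96
--//96/4 = 24 (bucket 4 is 4 entries after bucket 0), which confirms from the other direction that each free list bucket takes 24 bytes.
--//0x28 = 40, which fits my earlier reading: the chunk size on this bucket must stay below 40.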

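--//With the instance only mounted, most free lists are empty. Check Bucket 30 to confirm the layout.
--//One detail: run as few SQL statements as possible during the test, to keep chunk usage to a minimum.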
Bucket 30 size=272
Chunk 7ff013bd300 sz= 272 free " "

--//30 = 0x1e,24=0x18
--//0AC5DD00 + 0x18 * 0x1e = 0xac5dfd0
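
--//The same address arithmetic as a Python sketch, in both directions. The base address 0x0AC5DD00 and the 24-byte entry size
--//are the values observed in this particular instance; bucket_addr and bucket_index are hypothetical helper names.

```python
BUCKET0_ADDR = 0x0AC5DD00   # address of bucket 0's entry in this run
BUCKET_ENTRY_SIZE = 24      # 0x18 bytes per bucket, as derived above


def bucket_addr(n, base=BUCKET0_ADDR):
    """Address of the entry for bucket n."""
    return base + n * BUCKET_ENTRY_SIZE


def bucket_index(addr, base=BUCKET0_ADDR):
    """Bucket number for an entry address; raises if addr is not on an entry boundary."""
    offset = addr - base
    if offset < 0 or offset % BUCKET_ENTRY_SIZE:
        raise ValueError(f"{addr:#x} is not the start of a bucket entry")
    return offset // BUCKET_ENTRY_SIZE


print(hex(bucket_addr(30)))      # 0xac5dfd0
print(bucket_index(0x0AC5DD48))  # 3
print(bucket_index(0x0AC5DD60))  # 4
```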

SYS@test> oradebug peek 0xac5dfd0 24
[00AC5DFD0, 00AC5DFF0) = 013BD310 000007FF 013BD310 000007FF 00000118
~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
--//000007FF013BD310 - 0x10 = 0x7FF013BD300, an exact match.

--//0x118 = 280 (bucket 30), 0x48 = 72 (bucket 4), 0x28 = 40 (bucket 0)
--//280-72 = 208, 208/8 = 26 = 30 - 4
--//280-40 = 240, 240/8 = 30 = 30 - 0
--//So in this range bytes 16-19 do grow by 8 per bucket.

--//ac5dfd0-0x18 = 0xac5dfb8,bucket 29
SYS@test> oradebug peek 0xac5dfb8 24
[00AC5DFB8, 00AC5DFD0) = 0AC5DFB8 00000000 0AC5DFB8 00000000 00000110 00000000

--//ac5dfd0+0x18 = 0xac5dfe8, bucket 31
SYS@test> oradebug peek 0xac5dfe8 24
[00AC5DFE8, 00AC5E000) = 0AC5DFE8 00000000 0AC5DFE8 00000000 00000120 00000000

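--//Both of these empty buckets point back to their own entry address. Bytes 16-19 almost certainly mean that the largest chunk size on the bucket is that value minus 1.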

3. Start up to open and dump again (steps omitted):

Bucket 251 size=12376
Chunk 7ff00188010 sz= 16384 free " "
Chunk 7feff84ac88 sz= 13560 free " "
Chunk 7ff003a1000 sz= 16384 free " "
Bucket 252 size=16408

--//So Bucket 251 links chunks with chunk size >= 12376 and <= 16408-1.
--//By the same rule the smallest chunk size is 32 bytes: bucket 0 holds chunks with size >= 32 and <= 40-1.
Bucket 0 size=32
Chunk 7ff0b000088 sz= 0 kghdsx
--//sz=0 presumably has some special meaning that I do not know.

4. Verify that bytes 16-19 give the bucket's largest chunk size as that value minus 1.

--//Restart to mount state and dump again (steps omitted).
--//Save the free list part of the dump to a separate file, a.txt.

$ grep Bucket a.txt | cut -d= -f2 | awk 'NR==1 {a=$1} NR>1 {print $1-a;a=$1}'| uniq -c
179 8
10 16
50 48
1 72
1 8
1 16
1 4096
1 536
3 8
1 608
1 8
1 2976
1 8
1 4032
1 16384
1 32768
--//Sum = 254

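--//A short walk-through of the pipeline, so that I do not forget it later.
--//grep Bucket a.txt | cut -d= -f2 keeps the lines containing Bucket, then cut splits on = and takes the 2nd field, the value after size=.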
D:\>grep Bucket a.txt | head -4
Bucket 0 size=32
Bucket 1 size=40
Bucket 2 size=48
Bucket 3 size=56

D:\>grep Bucket a.txt | cut -d= -f2| head -4
32
40
48
56
--//awk 'NR==1 {a=$1} NR>1 {print $1-a;a=$1}' subtracts each line from the one after it.
--//uniq -c: -c means "prefix lines by the number of occurrences".
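
--//For readers without grep/cut/awk/uniq at hand, the same pipeline can be sketched in Python. This is an equivalent written
--//for illustration, not what I ran; size_steps is a hypothetical name and the sample is the four lines shown above.

```python
from itertools import groupby


def size_steps(lines):
    """Return (count, step) pairs: runs of equal differences between successive bucket sizes."""
    # Same as grep Bucket | cut -d= -f2: keep Bucket lines, take the number after "size=".
    sizes = [int(line.split("=")[1].split()[0]) for line in lines if "Bucket" in line]
    # Same as the awk step: difference between each line and the previous one.
    steps = [b - a for a, b in zip(sizes, sizes[1:])]
    # Same as uniq -c: collapse consecutive equal values and count them.
    return [(len(list(run)), step) for step, run in groupby(steps)]


sample = ["Bucket 0 size=32", "Bucket 1 size=40", "Bucket 2 size=48", "Bucket 3 size=56"]
print(size_steps(sample))  # [(3, 8)]
```

--//Passing the lines of a.txt (for example open("a.txt").read().splitlines()) should give the same counts as the uniq -c output.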

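--//Looking at the output above I first thought I had miscalculated, but a careful check found no mistake.
--//The bucket size does not grow linearly; in several places the step drops back to only 8 bytes. My reading is that Oracle
--//keeps tuning this layout to avoid wasting memory, or in other words to avoid ORA-04031 errors.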

d:\tmp> oerr ora 4031
04031, 00000, "unable to allocate %s bytes of shared memory (\"%s\",\"%s\",\"%s\",\"%s\")"
// *Cause: More shared memory is needed than was allocated in the shared
// pool or Streams pool.
// *Action: If the shared pool is out of memory, either use the
// DBMS_SHARED_POOL package to pin large packages,
// reduce your use of shared memory, or increase the amount of
// available shared memory by increasing the value of the
// initialization parameters SHARED_POOL_RESERVED_SIZE and
// SHARED_POOL_SIZE.
// If the large pool is out of memory, increase the initialization
// parameter LARGE_POOL_SIZE.
// If the error is issued from an Oracle Streams or XStream process,
// increase the initialization parameter STREAMS_POOL_SIZE or increase
// the capture or apply parameter MAX_SGA_SIZE.

--//179+10+50+1 = 240, so the irregular steps start around bucket 240. The relevant section:
Bucket 238 size=3976
Bucket 239 size=4024
Bucket 240 size=4096
Bucket 241 size=4104 --//step of only 8 from the previous bucket.
Bucket 242 size=4120
Bucket 243 size=8216
Bucket 244 size=8752
Bucket 245 size=8760
Bucket 246 size=8768
Bucket 247 size=8776 --//step of only 8 from the previous bucket.
Bucket 248 size=9384
Bucket 249 size=9392
Bucket 250 size=12368
Bucket 251 size=12376 --//step of only 8 from the previous bucket.
Bucket 252 size=16408
Bucket 253 size=32792
Bucket 254 size=65560
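--//The steps here are 48, 72, 8, 16, ...
--//Presumably many chunks in real workloads cluster around size=4024, 4096, 4104 and 4120; otherwise Oracle would not have refined the buckets like this.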
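
--//Given these boundaries, the bucket for a free chunk of a given size is the last bucket whose size= value is not larger than
--//the chunk. A Python sketch of that lookup using bisect follows; it is my reconstruction of the rule from the dumps, not
--//Oracle's code, and it only covers buckets 238-254 listed above. bucket_for is a hypothetical name.

```python
from bisect import bisect_right

FIRST_BUCKET = 238
# size= values of buckets 238..254, copied from the dump section above.
SIZES = [3976, 4024, 4096, 4104, 4120, 8216, 8752, 8760, 8768,
         8776, 9384, 9392, 12368, 12376, 16408, 32792, 65560]


def bucket_for(chunk_size):
    """Bucket whose range [size, next size) contains chunk_size; the last bucket is open-ended."""
    if chunk_size < SIZES[0]:
        raise ValueError("chunk size is below the range covered by this table")
    return FIRST_BUCKET + bisect_right(SIZES, chunk_size) - 1


for size in (4096, 4120, 13560, 16384, 3898152):
    print(size, "-> bucket", bucket_for(size))
```

--//This agrees with the dumps: the chunks of 13560 and 16384 bytes were listed under Bucket 251, and the 3898152-byte chunk under Bucket 254.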

--//Verify it another way:
--//In the free list area, bucket 0 is at 00000000 0AC5DD00.

SYS@test> oradebug peek 0x0AC5DD00 24
[00AC5DD00, 00AC5DD18) = 0B000098 000007FF 0B000098 000007FF 00000028 00000000

--//The address 0x0AC5DD00 + 0x10 stores the chunk size bound.

SYS@test> oradebug peek 0x0AC5DD10 8
[00AC5DD10, 00AC5DD18) = 00000028 00000000

--//Continuing yesterday's test: after starting to mount today, the start address of the free list had changed.

SYS@test> oradebug peek 0x7ff0ac00088 32
[7FF0AC00088, 7FF0AC000A8) = 00000001 C0B38F00 00000000 00000000 0AC4DD00 00000000 0AC4DD00 00000000

--//The list now starts at 0x0AC4DD00.

SYS@test> oradebug peek 0x0AC4DD10 8
[00AC4DD10, 00AC4DD18) = 00000028 00000000

--//0AC4DD10 = 180673808

$ seq 0 1 254 | tr -d '\r' | xargs -IQ echo "obase=16;180673808+Q*24"| bc | tr -d '\r'| awk '{print "oradebug peek 0x" toupper($1),4}'
--//Output omitted. tr -d '\r' is needed because the output of seq and bc under cygwin carries an extra \r character.

$ seq 0 1 254 | tr -d '\r' | xargs -IQ echo "obase=16;180673808+Q*24"| bc | tr -d '\r'| awk '{print "oradebug peek 0x" toupper($1),4}' > b.txt
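
--//The seq/xargs/bc/awk chain only generates 255 peek commands, one per bucket. A Python sketch that builds the same command
--//lines without the \r problem is shown below. It is an alternative I did not use in the test; peek_commands is a hypothetical name.

```python
BOUND0_ADDR = 0x0AC4DD10  # bucket 0 entry (0x0AC4DD00) + 0x10, i.e. 180673808
ENTRY_SIZE = 24           # bytes per bucket entry


def peek_commands(first=BOUND0_ADDR, buckets=255, nbytes=4):
    """One 'oradebug peek' command per bucket, reading nbytes of the bound field."""
    return [f"oradebug peek 0x{first + n * ENTRY_SIZE:X} {nbytes}" for n in range(buckets)]


commands = peek_commands()
print(commands[0])   # oradebug peek 0xAC4DD10 4
print(commands[-1])  # oradebug peek 0xAC4F4E0 4
```

--//Writing "\n".join(commands) to b.txt gives a script that can be run with @ b.txt as below.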

SYS@test> oradebug setmypid
Statement processed.
SYS@test> @ b.txt
[00AC4DD10, 00AC4DD14) = 00000028
[00AC4DD28, 00AC4DD2C) = 00000030
[00AC4DD40, 00AC4DD44) = 00000038
[00AC4DD58, 00AC4DD5C) = 00000040
...
[00AC4F360, 00AC4F364) = 00000FB8
[00AC4F378, 00AC4F37C) = 00001000
[00AC4F390, 00AC4F394) = 00001008
[00AC4F3A8, 00AC4F3AC) = 00001018
[00AC4F3C0, 00AC4F3C4) = 00002018
[00AC4F3D8, 00AC4F3DC) = 00002230
[00AC4F3F0, 00AC4F3F4) = 00002238
[00AC4F408, 00AC4F40C) = 00002240
[00AC4F420, 00AC4F424) = 00002248
[00AC4F438, 00AC4F43C) = 000024A8
[00AC4F450, 00AC4F454) = 000024B0
[00AC4F468, 00AC4F46C) = 00003050
[00AC4F480, 00AC4F484) = 00003058
[00AC4F498, 00AC4F49C) = 00004018
[00AC4F4B0, 00AC4F4B4) = 00008018
[00AC4F4C8, 00AC4F4CC) = 00010018
[00AC4F4E0, 00AC4F4E4) = 00000000
--//The last one is 00000000.

SYS@test> @ t
TRACEFILE
-----------------------------------------------------------------
D:\APP\ORACLE\diag\rdbms\test\test\trace\test_ora_900.trc

$ grep "^\[" test_ora_900.trc | cut -d= -f2 | sed -n -e '1,$s/^ //p' | awk '{print strtonum("0x"$0)}' | awk 'NR==1 {a=$1} NR>1 {print $1-a;a=$1}'| uniq -c
178 8
10 16
50 48
1 72
1 8
1 16
1 4096
1 536
3 8
1 608
1 8
1 2976
1 8
1 4032
1 16384
1 32768
1 -65560
--//Sum = 254
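--//Seeing 178 steps of 8 (instead of 179) is expected, because this run uses the upper bounds: the series starts at 40 rather than 32.
--//In short: grep "^\[" test_ora_900.trc | cut -d= -f2 | sed -n -e '1,$s/^ //p' keeps the lines starting with [, takes the 2nd
--//field split on =, and strips the leading space.
--//awk '{print strtonum("0x"$0)}' converts the value to decimal (strtonum is a gawk extension).
--//awk 'NR==1 {a=$1} NR>1 {print $1-a;a=$1}' subtracts each line from the one after it.
--//uniq -c collapses the runs and counts them.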
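
--//A more direct check is to compare the peeked bound of bucket n with the size= value of bucket n+1 in the heap dump. The
--//Python sketch below does that for a few lines copied from the outputs above; parse_bounds is a hypothetical name and the
--//regular expression assumes the peek output format shown in this post.

```python
import re

BOUND0_ADDR = 0x0AC4DD10  # address of bucket 0's bound field in this run
ENTRY_SIZE = 24           # bytes per bucket entry

# A few lines copied from the "@ b.txt" output.
PEEK_OUTPUT = """\
[00AC4F360, 00AC4F364) = 00000FB8
[00AC4F378, 00AC4F37C) = 00001000
[00AC4F390, 00AC4F394) = 00001008
[00AC4F3A8, 00AC4F3AC) = 00001018
[00AC4F3C0, 00AC4F3C4) = 00002018
"""

# "Bucket n size=" values copied from the heap dump.
DUMP_SIZES = {239: 4024, 240: 4096, 241: 4104, 242: 4120, 243: 8216}

LINE = re.compile(r"^\[([0-9A-F]+), [0-9A-F]+\) = ([0-9A-F]+)$")


def parse_bounds(text):
    """Map bucket number -> bound value, using the peek address to recover the bucket number."""
    bounds = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            addr, value = int(m.group(1), 16), int(m.group(2), 16)
            bounds[(addr - BOUND0_ADDR) // ENTRY_SIZE] = value
    return bounds


bounds = parse_bounds(PEEK_OUTPUT)
for n, bound in sorted(bounds.items()):
    print(f"bucket {n}: bound {bound}, size of bucket {n + 1}: {DUMP_SIZES[n + 1]}")
```

--//For these buckets the bound stored in the entry of bucket n equals the size= value of bucket n+1, which is what the value-minus-1 reading predicts.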

Bucket 239 size=4024
Bucket 240 size=4096
Bucket 241 size=4104 --//step of only 8 from the previous bucket.
Bucket 242 size=4120

$ seq 100000 | xargs -IQ echo "select Q from dual;" | sqlplus / as sysdba > /dev/null
SYS@test> select count(*) ,ksmchsiz from x$ksmsp where ksmchsiz between 4000 and 4120 group by ksmchsiz order by 2;
COUNT(*) KSMCHSIZ
---------- ----------
11 4000
63 4008
1 4024
3 4032
2 4040
2 4048
2 4056
39 4072
14 4080
1 4088
5957 4096
1 4104
4 4112
145 4120
14 rows selected.
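--//My test environment runs too few statements, but even so there are clearly many chunks with KSMCHSIZ=4120 (and far more with 4096). In a sense this suggests Oracle sets the boundaries this way on purpose.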

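5. Summary:
--//The sections above show how a free list is linked. The stored links differ from the chunk addresses by only 0x10 bytes.

--//The free list table is in the fixed area and each bucket takes 24 bytes: bytes 0-7 point to the tail of the free list,
--//bytes 8-15 point to the head, and bytes 16-23 hold a bound such that the largest chunk size linked on that bucket is the
--//value minus 1. Guessing the meaning of bytes 16-23 cost me a little time at first.

--//In 12c the free list chunk size boundaries have been refined: the bucket size does not grow linearly, and in several places
--//the step is only 8 bytes. My reading is that Oracle keeps tuning this layout to avoid wasting memory, or in other words to
--//avoid ORA-04031 errors. Presumably many chunks in real workloads cluster around size=4024, 4096, 4104 and 4120; otherwise
--//Oracle would not have made this change. Addendum: I also tested on 11g and found the same design there.

--//The test was done in mount state, but that should not affect the conclusions.

--//Testing on Windows is a hassle and wasted a lot of time; cygwin is not flexible enough, and I ran into the \r output problem.

--//This write-up is messy; it mainly records my thinking and analysis at the time. The whole analysis is mixed with many
--//guesses, and some of the terminology is probably wrong because I did not know how to express it.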
