A good article introducing ASM logical metadata
I had long been puzzled by the mapping between the extents of an ASM file and the AUs (allocation units) of an ASM disk; the article below explains this relationship in detail.
The diagram in the linked article is also a good one, so I am collecting it here as well.
A Peek Inside Oracle ASM Metadata
Luca Canali, CERN, Jan 2006.
Overview
Oracle ASM (Automatic Storage Management) is a new feature of Oracle 10g to streamline storage management and provisioning. ASM provides volume and (cluster) filesystem management where the IO subsystem is directly handled by the Oracle kernel [Ref 1,2]. With Oracle 10g and ASM it is possible to build a scalable and highly available storage infrastructure on low-cost hardware [Ref 3]. A typical example is using SATA hard disks with Fibre Channel controllers arranged in a SAN. A scalable architecture built on low-cost hardware is deployed at CERN, combining ASM and Oracle 10g RAC on Linux [Ref 4].
Oracle documentation and whitepapers [Ref 1,2,3,5] provide the necessary information to set up Oracle ASM instances and configure the storage with it. Configuration details and performance metrics are exposed via a few V$ views. Other possibilities are the command line interface, asmcmd (10g R2) and the graphical interface of OEM.
The metadata, however, are partially hidden from the end user: the mapping between physical storage, ASM allocation units, and database files is not completely exposed via V$ views. The author has found that it is nevertheless possible to query this information via undocumented X$ tables. For example, it is possible to determine the exact physical location on disk of each extent (or mirror copies of extents) for each file allocated on ASM (and, if needed, access the data directly via the OS). This kind of information can be put to use by Oracle practitioners who want to extend their knowledge of the inner workings of ASM or who want to diagnose hotspots and ASM rebalancing issues [Ref 6].
Direct access to ASM files (data peeking), possibly automated in a small 'utility', for didactic purposes or for emergency data rescue, is a possible extension of the findings documented here.
Oracle ASM Recap
Database storage is organized by ASM using the following components: each LUN is mapped as a disk (possibly using asmlib on Linux). Disks are grouped together into disk groups. Each disk group can be segmented into one or more fail groups. Typically 2 fail groups are created for a normal-redundancy disk group; ASM then takes care of writing mirrored data as two copies allocated in different fail groups [Ref 5]. Oracle files are allocated by ASM from the pool of storage defined by the disk groups, as specified by the DBA. ASM takes care of mirroring and striping the data, following the S.A.M.E. concept [Ref 7].
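The extent-level mirroring described above can be sketched as a toy model (this is not Oracle's actual allocation algorithm, just an illustration of the constraint that a mirror copy must land in a different fail group than its primary; the disk numbers match the test system used later in the paper):

```python
# Toy model of extent-level mirroring across two fail groups.
# Each 1 MB extent gets a primary copy and a mirror copy; the mirror is
# always placed on a disk belonging to the OTHER fail group.
from itertools import cycle

def allocate_extents(n_extents, fail_groups):
    """Return [(primary_disk, mirror_disk), ...] for n_extents extents.

    fail_groups: two lists of disk numbers, e.g. [[20, 24], [7, 8]].
    Primary copies rotate round-robin over all disks; the mirror copy is
    drawn round-robin from the opposite fail group.
    """
    fg_of = {d: i for i, fg in enumerate(fail_groups) for d in fg}
    all_disks = cycle(fail_groups[0] + fail_groups[1])
    mirror_cycles = [cycle(fg) for fg in fail_groups]
    plan = []
    for _ in range(n_extents):
        primary = next(all_disks)
        other_fg = 1 - fg_of[primary]          # mirror goes to the other fail group
        plan.append((primary, next(mirror_cycles[other_fg])))
    return plan
```

With `allocate_extents(4, [[20, 24], [7, 8]])`, every pair ends up with its two copies in different fail groups and the load spread evenly, which is the behavior the X$KFFXP queries below confirm on a real system.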
A typical configuration [Ref 1-5] uses 2 disk groups: a data disk group (datafiles, redo logs, spfile, tempfiles and controlfile) and a recovery disk group for the flash recovery area (archive logs, disk backups, multiplexed redo log and controlfile members). Configuration of the disk groups is performed via SQL commands [Ref 4].
The following V$ views are available to expose the configuration and some usage statistics, such as the number of physical reads and writes. All these views are accessible from the ASM instance (while some of them display no records when queried from the database instance).
View Name | Based on | Description |
V$ASM_DISKGROUP | X$KFGRP | performs disk discovery and lists disk groups |
V$ASM_DISKGROUP_STAT | X$KFGRP_STAT | lists disk groups |
V$ASM_DISK | X$KFDSK, X$KFKID | performs disk discovery and lists disks + usage metrics |
V$ASM_DISK_STAT | X$KFDSK_STAT, X$KFKID | lists disks + usage metrics |
V$ASM_FILE | X$KFFIL | lists ASM files (1 row per file) |
V$ASM_ALIAS | X$KFALS | lists ASM aliases (files, directories) |
V$ASM_CLIENT | X$KFTMTA | lists DB instances connected to ASM |
V$ASM_OPERATION | X$KFGMG | lists running rebalancing operations |
N.A. | X$KFFXP | extent mapping table for ASM files |
From the table above we can see that the V$ASM_* views are based on X$KF* tables (i.e. X$ tables with KF as a prefix). There are more such tables that are not used to build V$ASM_* views: X$KFFXP, X$KFDAT, X$KFCBH, X$KFCCE, X$KFBH, X$KFDPARTNER, X$KFCLLE. Note: the findings reported here are based on querying the documented dictionary views V$FIXED_VIEW_DEFINITION and V$FIXED_TABLE.
By querying the undocumented X$ tables listed above, the author has found that the extent mapping table for ASM is contained in X$KFFXP (see also Refs 7 and 8).
ASM Allocation table
X$KFFXP contains the physical allocation table for each ASM file, that is, the mapping between ASM files (identified by the columns NUMBER_KFFXP and COMPOUND_KFFXP) and their location on disk. Space on disk (identified by GROUP_KFFXP and DISK_KFFXP) is segmented into 1 MB chunks called allocation units (column AU_KFFXP). ASM files are correspondingly allocated in extents that are mapped to disk allocation units. When mirroring is used, each extent is allocated to 2 or 3 allocation units (2-way or 3-way mirroring).
X$KFFXP description
By querying X$KFFXP on a test database running ASM 10g R2 and RAC, the following description of X$KFFXP has been inferred:
Column Name | Description |
ADDR | table address/identifier |
INDX | row identifier |
INST_ID | instance number (RAC) |
NUMBER_KFFXP | ASM file number. Join with v$asm_file and v$asm_alias |
COMPOUND_KFFXP | File identifier. Join with compound_index in v$asm_file |
INCARN_KFFXP | File incarnation id. Join with incarnation in v$asm_file |
PXN_KFFXP | Extent number per file |
XNUM_KFFXP | Logical extent number per file (mirrored extents have the same value) |
GROUP_KFFXP | ASM disk group number. Join with v$asm_disk and v$asm_diskgroup |
DISK_KFFXP | Disk number where the extent is allocated. Join with v$asm_disk |
AU_KFFXP | Relative position of the allocation unit from the beginning of the disk. The allocation unit size (1 MB) is reported in v$asm_diskgroup |
LXN_KFFXP | 0,1 used to identify primary/mirror extent, 2 identifies file header allocation unit (hypothesis) |
FLAGS_KFFXP | N.K. |
CHK_KFFXP | N.K. |
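The column descriptions above can be captured in a small record type. This is an illustrative model only: the sentinel values it flags (XNUM_KFFXP = 2147483648, disk 65534, AU 4294967294) are taken from the query output shown later in Example 2, and their interpretation as metadata/unallocated markers is the paper's hypothesis, not documented behavior:

```python
# Illustrative model of one X$KFFXP row, with the special values observed
# in the examples below flagged explicitly.
from dataclasses import dataclass

METADATA_XNUM = 2147483648    # 0x80000000: seen on file-header/metadata AUs (hypothesis)
UNALLOCATED_DISK = 65534
UNALLOCATED_AU = 4294967294   # 0xFFFFFFFE

@dataclass
class ExtentMapRow:
    group: int   # GROUP_KFFXP: disk group number
    disk: int    # DISK_KFFXP: disk where the extent is allocated
    au: int      # AU_KFFXP: AU position from the beginning of the disk
    pxn: int     # PXN_KFFXP: physical extent number per file
    xnum: int    # XNUM_KFFXP: logical extent number (same for mirror copies)
    lxn: int     # LXN_KFFXP: 0 = primary, 1 = mirror, 2 = header (hypothesis)

    @property
    def is_metadata(self) -> bool:
        return self.xnum == METADATA_XNUM

    @property
    def is_unallocated(self) -> bool:
        return self.disk == UNALLOCATED_DISK and self.au == UNALLOCATED_AU
```

For instance, the last row of the Example 2 output below (disk 65534, AU 4294967294, XNUM 2147483648) would be classified as both metadata and unallocated by this model.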
Example 1 – Direct File Retrieval from ASM Disks
The server parameter file (spfile) for the database instance is stored in an ASM diskgroup. This is a small file, useful to illustrate how to use the X$KFFXP table.
1. We find the disk group and file number of the database spfile:
sys@+ASM1> select GROUP_NUMBER,FILE_NUMBER,BYTES from v$asm_file where type='PARAMETERFILE';
GROUP_NUMBER FILE_NUMBER BYTES
------------ ----------- ----------
1 267 3584
2. We find the number and location of the extents where the spfile is written:
sys@+ASM1> select DISK_KFFXP,AU_KFFXP,PXN_KFFXP,XNUM_KFFXP,LXN_KFFXP from x$kffxp where GROUP_KFFXP=1 and NUMBER_KFFXP=267;
DISK_KFFXP AU_KFFXP PXN_KFFXP XNUM_KFFXP LXN_KFFXP
---------- ---------- ---------- ---------- ----------
24 3820 0 0 0
0 176 1 0 1
3. From steps 1 and 2 above we know that the spfile is 3584 bytes long and is stored in 2 mirrored extents: one on disk 0, the other on disk 24 (in disk group 1). We can find the OS path of the disks with the following query (note that the test system was on Linux using asmlib):
sys@+ASM1> select failgroup,disk_number,path from v$asm_disk where GROUP_NUMBER=1 and DISK_NUMBER in (0,24);
FAILGROUP DISK_NUMBER PATH
---------- ----------- --------------------
FG1 24 ORCL:ITSTOR08_2_EXT
FG2 0 ORCL:ITSTOR11_10_EXT
4. We can now confirm with OS commands that the mapping is correct. 'dd' allows the sysadmin to read the disks directly (bs=1M sets the block size, while skip=176 makes the command start reading at an offset of 176 MB). Using the disk name found in step 3 (only disk 0 is demonstrated here) and the offsets found in step 2, we can confirm that the spfile data is at the expected physical location.
$ dd if=/dev/oracleasm/disks/ITSTOR11_10_EXT bs=1M count=1 skip=176|strings|head -4
test12.__db_cache_size=1476395008
test11.__db_cache_size=1476395008
test12.__java_pool_size=16777216
test11.__java_pool_size=16777216
We can see the first 4 lines of the spfile are printed out, as expected.
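The offset arithmetic behind the dd invocation above is simple enough to sketch. This helper is an assumption-laden illustration, not a supported tool: it hard-codes the 1 MB AU size from the paper and merely builds the equivalent dd command line from an AU number and device path:

```python
# Sketch: turn an X$KFFXP row (AU number + device path) into a byte offset
# and an equivalent dd command. Assumes the 1 MB AU size used in the paper.
AU_SIZE = 1024 * 1024  # 1 MB allocation units, per v$asm_diskgroup

def au_offset(au_number: int, au_size: int = AU_SIZE) -> int:
    """Byte offset of an allocation unit from the start of the disk."""
    return au_number * au_size

def dd_command(device: str, au_number: int) -> str:
    """Equivalent dd invocation: read exactly one AU at the AU boundary."""
    return f"dd if={device} bs={AU_SIZE} count=1 skip={au_number}"

# The spfile extent found in step 2: AU 176 on disk 0.
cmd = dd_command("/dev/oracleasm/disks/ITSTOR11_10_EXT", 176)
```

Here `au_offset(176)` gives 184549376 bytes (176 MB), which is the physical location dd reads from in step 4.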
Example 2 – 100 MB Datafile Extent Allocation
This example demonstrates how to query the datafile allocation map on ASM for a disk group with normal redundancy (2-way mirroring). A tablespace with one datafile of size 100 MB has been created in an ASM disk group composed of 4 disks arranged in two fail groups: disks 20 and 24 (fail group FG1) and disks 7 and 8 (fail group FG2). The space allocation mapping (mapping between ASM file extents and allocation units on disk) for such a file can be queried like this:
select DISK_KFFXP,AU_KFFXP,PXN_KFFXP,XNUM_KFFXP,LXN_KFFXP from x$kffxp where GROUP_KFFXP=1 and NUMBER_KFFXP=271;
DISK_KFFXP AU_KFFXP PXN_KFFXP XNUM_KFFXP LXN_KFFXP
---------- ---------- ---------- ---------- ----------
20 13682 0 0 0
7 14358 1 0 1
24 13699 2 1 0
8 14364 3 1 1
8 14366 4 2 0
24 13678 5 2 1
. . .
. . .
8 14490 198 99 0
24 13750 199 99 1
20 13791 200 100 0
7 14493 201 100 1
7 14384 0 2147483648 0
24 13714 1 2147483648 1
65534 4294967294 2 2147483648 2
205 rows selected.
We can see that 205 allocation units of 1 MB each are listed for the tablespace created with a datafile of 100 MB nominal size. The actual datafile size is 101 MB (the extra megabyte is allocated by Oracle for tablespace internal structures, independently of the use of ASM). The disk group used for this test has 'normal redundancy', i.e. 2-way mirroring, which accounts for 101x2=202 allocation units. The remaining 3 allocation units are listed in the last 3 rows of the x$kffxp output and are identifiable by having XNUM_KFFXP=2147483648. These are most likely file headers and/or metadata (this conclusion still has to be investigated further).
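The accounting in the paragraph above can be written out explicitly as a quick sanity check of the row count:

```python
# Reproducing the allocation-unit count from the query output above:
# a 100 MB datafile plus 1 MB of tablespace overhead, each extent
# mirrored 2-way, plus the 3 extra metadata AUs listed by X$KFFXP.
AU_MB = 1            # allocation unit size in MB
file_mb = 100        # nominal datafile size
overhead_mb = 1      # tablespace internal structures
mirror_copies = 2    # normal redundancy = 2-way mirroring
metadata_aus = 3     # rows with XNUM_KFFXP = 2147483648

total_aus = (file_mb + overhead_mb) // AU_MB * mirror_copies + metadata_aus
# (100 + 1) * 2 + 3 = 205, matching "205 rows selected."
```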
One of the goals of ASM is providing uniform allocation of datafiles across the available spindles. This is a key feature of ASM and aims to maximize performance (throughput, IOPS), lower latency, and increase availability. The following query investigates how the extents are allocated across the four available disks in the test disk group used for this example.
select DISK_KFFXP, count(*)
from x$kffxp
where GROUP_KFFXP=1
and NUMBER_KFFXP=271
and XNUM_KFFXP!=2147483648
group by DISK_KFFXP ;
DISK_KFFXP COUNT(*)
---------- ----------
7 51
8 50
20 51
24 50
We can see that space is allocated in a uniform way between the different disks (and fail groups). With the following query we can see that the primary and mirror extents (identified by the value of LXN_KFFXP) are mixed uniformly for each disk.
Note: Oracle only needs to read the primary extent of a mirrored pair of ASM-allocated extents for read operations (while both extents need to be accessed for a write operation). Therefore, a uniform primary/mirror extent allocation provides better performance for read operations.
sys@+ASM1>
select DISK_KFFXP, LXN_KFFXP, count(*)
from x$kffxp
where GROUP_KFFXP=1
and NUMBER_KFFXP=271
and XNUM_KFFXP!=2147483648
group by
DISK_KFFXP, LXN_KFFXP
order by
DISK_KFFXP, LXN_KFFXP;
DISK_KFFXP LXN_KFFXP COUNT(*)
---------- ---------- ----------
7 0 25
7 1 26
8 0 25
8 1 25
20 0 26
20 1 25
24 0 25
24 1 25
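The two GROUP BY queries of this example can be mimicked in a few lines on the sample rows from the extent map output. This is just an illustration of the aggregation, applied to the first six rows shown earlier in this example:

```python
# Reproducing the SQL aggregations of Example 2 on the first few
# (DISK_KFFXP, LXN_KFFXP) pairs from the extent map output.
from collections import Counter

rows = [(20, 0), (7, 1), (24, 0), (8, 1), (8, 0), (24, 1)]

per_disk_and_role = Counter(rows)               # ~ GROUP BY disk_kffxp, lxn_kffxp
per_disk = Counter(disk for disk, _ in rows)    # ~ GROUP BY disk_kffxp
```

Even on this tiny sample, disks 24 and 8 each carry one primary (LXN 0) and one mirror (LXN 1) copy, matching the uniform primary/mirror mix the full query output shows for all four disks.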
Example 3 – Disk Group Rebalancing
This example illustrates the outcome of an 'online' disk group enlargement operation: four additional disks (disks 2, 3, 0, and 1) have been added to the disk group used in Example 2 above, and the disk group has then been rebalanced (see SQL below). After this operation the ASM space allocation map for the test datafile has been queried again. The outcome, shown below, is that the file is again uniformly spread over all the disks available in the disk group, as expected.
alter diskgroup test1_datadg1
add
failgroup fg1 disk 'ORCL:ITSTOR08_1_EXT','ORCL:ITSTOR08_3_EXT'
failgroup fg2 disk 'ORCL:ITSTOR11_3_EXT','ORCL:ITSTOR11_4_EXT';
alter diskgroup test1_datadg1 rebalance power 4;
select a.disk_number, a.failgroup, count(*) N#_AU
from x$kffxp x join v$asm_disk a
on x.disk_kffxp=a.disk_number and x.GROUP_KFFXP=a.group_number
where GROUP_KFFXP=1
and NUMBER_KFFXP=271
and XNUM_KFFXP!=2147483648
group by a.failgroup, a.disk_number
order by a.failgroup, a.disk_number;
DISK_NUMBER FAILGROUP N#_AU
----------- ---------- ----------
2 FG1 25
3 FG1 25
20 FG1 26
24 FG1 25
0 FG2 25
1 FG2 26
7 FG2 25
8 FG2 25
We can see that the datafile, after the online redefinition of the disk group, is spread uniformly across the disks and correctly mirrored across the two fail groups, as expected.
Summary
Oracle ASM is a powerful and easy-to-use volume manager and filesystem for Oracle 10g and 10g RAC. Configuration details and performance metrics of the configured ASM disks and disk groups are exposed via V$ASM_* views. However, the space allocation mapping (ASM file extents to disk allocation units) is not fully documented. This paper details how queries on the X$KFFXP internal table can be used to work around this limitation. A set of working examples has been discussed to demonstrate the findings and to directly explore some inner workings of ASM. As expected, it was found that datafiles are automatically spread over the available disks in a disk group; that mirroring is handled by ASM at the extent level (as opposed to the volume-level mirroring found in many other volume managers); and that 'online' disk additions to a disk group spread datafiles uniformly over a larger number of spindles in a transparent way, which can be used to improve performance and possibly reduce the impact of 'hot spots'.
A few open points remain to be investigated, such as the role of the 3 'extra' allocation units allocated for each datafile, documented in Example 2 (see rows where XNUM_KFFXP=2147483648). Rebalancing operations have been shown to (re)distribute datafile extents uniformly over the available disks. The ASM rebalancing algorithm apparently does not use workload metrics (from v$asm_disk_stat) to spread datafiles (for example, to spread apart hot parts of the datafiles) but seems to use a simpler 'round robin' algorithm. A few additional X$KF* tables have been identified (see above), but their purposes have not yet been documented.
References
1. N. Vengurlekar, 2005
2. A. Shakian, OOW 2005, Take the Guesswork out of DB Tuning
3. J. Loaiza and S. Lee, OTN 2005
4. L. Canali, Scalable Oracle 10g Architecture
5. Oracle 10g Administrator's Guide, "Using Automatic Storage Management"
6. Metalink, Bug N. 4306135
7. J. Loaiza, OTN 1999, Optimal Storage Configuration Made Easy
8. S. Rognes, 2004, posting to the oracle-l mailing list
From the ITPUB blog; link: http://blog.itpub.net/19602/viewspace-1032772/. Please credit the source when reposting; otherwise legal responsibility may be pursued.