Configuring Linux HugePages to Optimize Oracle Memory

Posted by weixin_34221276 on 2015-04-12

Configuring HugePages on Linux

 

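Linux may not be as heavyweight as AIX or HP-UX, but it is an excellent operating system, and it uses a number of
I/O and memory scheduling mechanisms to improve performance. Linux manages memory as virtual memory (VM): it combines
physical RAM and swap into one virtual memory space for applications. As a result, a process that appears to be using
memory may actually be using disk. So how can we avoid using swap space on disk?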

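Linux manages memory in units of pages, normally 4 KB each. With a large amount of memory (more than 8 GB), managing
that many pages puts a heavy load on the system, and frequent page-in/page-out activity on top of that can become a bottleneck.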
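
To make the scale concrete, the following is a minimal sketch that counts how many pages are needed to map a given amount of memory. The 16 GB figure is an assumed example, and the function name pages_needed is made up for illustration; 2048 KB is the Hugepagesize reported by /proc/meminfo later in this post.

```shell
#!/bin/bash
# pages_needed <memory_in_GB> <page_size_in_KB>
# Prints the number of pages required to map that much memory.
pages_needed() {
    mem_gb=$1
    page_kb=$2
    echo $(( mem_gb * 1024 * 1024 / page_kb ))
}

# Assumed example: a 16 GB machine.
echo "16 GB with 4 KB pages: $(pages_needed 16 4) pages"
echo "16 GB with 2 MB pages: $(pages_needed 16 2048) pages"
```

With 4 KB pages the kernel has to track about 4.2 million pages, while with 2 MB pages the same memory needs only 8192.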

 

1. Introduction to HugePages
2. Configuration in practice


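1. Introduction to HugePages
HugePages were introduced in the Linux 2.6 kernel. They let you choose between the regular 4 KB page and a larger page size.

When a process accesses memory, the address is first looked up in the page table, and Linux then uses the page table
mapping to reach the actual memory (RAM + swap). To speed this up, the CPU keeps a fixed-size buffer called the TLB
(Translation Lookaside Buffer), which caches part of the page table. This follows the principle of keeping data as
close to the CPU as possible. Entries in the TLB point to hugepages through hugetlb. The allocated hugepages are made
available to processes as an in-memory filesystem, hugetlbfs (similar to tmpfs).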

[Figure: regular 4 KB pages]

[Figure: HugePages enabled]

 

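Characteristics of HugePages
HugePages are allocated and reserved when Linux boots, and they are never paged in or out unless someone intervenes,
for example by changing the HugePages configuration. Depending on the kernel version and the hardware architecture,
the hugepage size ranges from 2 MB to 256 MB. Because the pages are larger, there are fewer of them, which reduces
the management load on the TLB and the page table.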

Why use HugePages

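With large memory (more than 8 GB), HugePages help considerably with Oracle performance on Linux:
 1) Larger Page Size and Fewer Pages: fewer, larger pages reduce the workload on the TLB.
 2) No Page Table Lookups: because hugepages are not swappable, they are never candidates for replacement, so the page table lookups that replacement requires are avoided.
 3) No Swapping: on Linux, hugepages cannot be swapped out.
 4) No 'kswapd' Operations: on Linux, the kswapd process manages swapping. With large memory the number of regular pages is very large, so kswapd is invoked frequently, which hurts performance. Hugepages are not swappable, so kswapd does not have to handle them.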

(0) Check the kernel version with uname -r

2.6.18-128.el5
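
The kernel version matters because, as the sizing script later in this post notes, 2.4 kernels use vm.hugetlb_pool while 2.6 kernels use vm.nr_hugepages. Here is a minimal sketch of that mapping, reusing the awk expression from the script; the function name hugepage_sysctl_for is made up for illustration.

```shell
#!/bin/bash
# hugepage_sysctl_for <kernel_release>
# Prints the sysctl parameter that sizes the hugepage pool for that kernel.
hugepage_sysctl_for() {
    release="$1"
    # Reduce e.g. "2.6.18-128.el5" to "2.6".
    kern=$(echo "$release" | awk -F. '{ printf("%d.%d\n", $1, $2); }')
    case $kern in
        2.4) echo "vm.hugetlb_pool" ;;
        2.6) echo "vm.nr_hugepages" ;;
        *)   echo "unknown" ;;
    esac
}

hugepage_sysctl_for "2.6.18-128.el5"
```

On a live system the argument would be "$(uname -r)".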

[root@node2 ~]# ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 32768      gdm       600        393216     2          dest
0x7a1b43dc 98305      grid       660        4096       0
0x596be9dc 622594     oracle    660        4833935360 32


(1) Before configuration

[oracle@db101 ~]$ grep HugePages /proc/meminfo
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB


(2) First, edit limits.conf
[root@db101 ~]# vi /etc/security/limits.conf  

## Set memlock to SGA_MAX_SIZE; the values below are in KB and lock about 15 GB of memory

Use free -t to get the total system memory

##zengmuansha add  0122
oracle   soft   memlock    15826672
oracle   hard   memlock    15826672


As the oracle user, check the limit with ulimit -l
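
The memlock value has to be at least as large as the SGA, expressed in KB. Below is a minimal sketch of that arithmetic, using the 15455 MB SGA configured in the next step; both function names are made up for illustration.

```shell
#!/bin/bash
# sga_mb_to_memlock_kb <sga_in_MB>
# Prints the minimum memlock value (in KB) needed to lock the whole SGA.
sga_mb_to_memlock_kb() {
    echo $(( $1 * 1024 ))
}

# memlock_covers_sga <memlock_limit_in_KB_or_unlimited> <sga_in_MB>
# Succeeds when the limit is large enough for the SGA.
memlock_covers_sga() {
    limit_kb="$1"
    sga_mb="$2"
    [ "$limit_kb" = "unlimited" ] && return 0
    [ "$limit_kb" -ge "$(sga_mb_to_memlock_kb "$sga_mb")" ]
}

echo "Minimum memlock for a 15455 MB SGA: $(sga_mb_to_memlock_kb 15455) KB"
if memlock_covers_sga 15826672 15455; then
    echo "memlock of 15826672 KB covers the SGA"
fi
```

As the oracle user, the live limit could be checked with memlock_covers_sga "$(ulimit -l)" 15455, since ulimit -l reports the maximum locked memory in KB.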

(3) [Oracle 11g] AMM (Automatic Memory Management) must be disabled before HugePages can be used
Set the following initialization parameters:
ALTER SYSTEM SET sga_max_size=15455M SCOPE=SPFILE;
ALTER SYSTEM SET sga_target=15455M SCOPE=SPFILE;
ALTER SYSTEM SET PGA_AGGREGATE_TARGET=2048M SCOPE=SPFILE;

ALTER SYSTEM SET memory_target=0 SCOPE=SPFILE;
ALTER SYSTEM SET memory_max_target=0 SCOPE=SPFILE;

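In version 11.2.0.1, setting MEMORY_TARGET=0 does not take effect. The parameter has to be removed in an init.ora file, and the SPFILE then regenerated from that file.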
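
One possible way to do the init.ora workaround is to dump the SPFILE with CREATE PFILE FROM SPFILE, drop the AMM lines, and rebuild it with CREATE SPFILE FROM PFILE. The filtering step might look like the sketch below. The function name strip_amm_params and the sample parameter lines are assumptions for illustration, not taken from the original post.

```shell
#!/bin/bash
# strip_amm_params
# Reads init.ora text on stdin and drops the memory_target and
# memory_max_target lines, leaving every other parameter untouched.
strip_amm_params() {
    grep -v -i -E '^[^#=]*memory_(max_)?target[[:space:]]*='
}

# Sample input (assumed): a few lines as they might appear in a pfile.
printf '%s\n' \
    '*.memory_max_target=0' \
    '*.memory_target=0' \
    '*.pga_aggregate_target=2048M' \
    '*.sga_target=15455M' | strip_amm_params
```

In practice the function would be run against the dumped pfile, with the result written to a new file that is then used to recreate the SPFILE.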


(4) Configure the number of hugepages to allocate
Formula for nr_hugepages: nr_hugepages >= SGA (MB) / Hugepagesize (MB)
echo "vm.nr_hugepages=3872" >> /etc/sysctl.conf
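
The formula can be sketched as a small calculation that rounds up, so that the ">=" condition holds. The example uses the 4833935360-byte (4610 MB) oracle segment from the ipcs -m output earlier and the 2048 KB Hugepagesize; the function name nr_hugepages_for is made up for illustration.

```shell
#!/bin/bash
# nr_hugepages_for <sga_in_MB> <hugepage_size_in_KB>
# Prints the smallest page count that covers the SGA (ceiling division).
nr_hugepages_for() {
    sga_kb=$(( $1 * 1024 ))
    hpg_kb=$2
    echo $(( (sga_kb + hpg_kb - 1) / hpg_kb ))
}

# 4610 MB segment from the ipcs output, 2048 KB hugepages.
nr_hugepages_for 4610 2048
```

This is a lower bound only. The script that follows adds one extra page per shared memory segment as headroom.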

Run the script as the oracle account, with all instances started and AMM disabled

hugepages_settings.sh

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
# Check for the kernel version
# Check the Linux kernel version, because 2.4 and 2.6 use different hugepages parameters:
# 2.4 uses vm.hugetlb_pool, while 2.6 uses vm.nr_hugepages.
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
# Hugepagesize is 4096 on x86 without PAE and 2048 on x86 with PAE and on x86_64; the unit is KB.
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk {'print $2'}`
# Start from 1 pages to be on the safe side and guarantee 1 free HugePage
# Counting starts from 1 so that at least one page is guaranteed; the script in MOS Doc ID 401749.1 starts from 0.
NUM_PG=1
# Cumulative number of pages required to handle the running shared memory segments
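# Loop over the segments and total the hugepages required.
# ipcs -m | awk {'print $5'} | grep "[0-9][0-9]*" lists the size of every shared memory
# segment in bytes; echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q divides one segment by the
# size of a single page to get the number of hugepages that segment needs. Summing over
# all shared memory segments gives the total number of hugepages.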
for SEG_BYTES in `ipcs -m | awk {'print $5'} | grep "[0-9][0-9]*"`
do
   MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
   if [ $MIN_PG -gt 0 ]; then
      NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
   fi
done
# Finish with results
# Print the hugepages parameter that matches the kernel version
case $KERN in
   '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
          echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
   '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac
# End

Full version of the script, from My Oracle Support Doc ID 401749.1:

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support 
# http://support.oracle.com


# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support 
(http://support.oracle.com) where it is intended to compute values for 
the recommended HugePages/HugeTLB configuration for the current shared 
memory segments. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and 
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size, 
   as the new SGA will not fit in the previous HugePages configuration, 
   it had better disable the whole HugePages, 
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup 
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m




Press Enter to proceed..."


read


# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`


# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi


# Initialize the counter
NUM_PG=0


# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done


RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`


# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for 
HugePages configuration. HugePages can only be used for shared memory segments 
that you can list by command:


    # ipcs -m


of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running 
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi


# Finish with results
case $KERN in
    '2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac


# End
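
To see what the script's loop produces, here is a worked sketch of the same per-segment sum, fed with the three segment sizes from the ipcs -m output earlier in this post. It uses shell arithmetic instead of bc, and the function name sum_hugepages is made up for illustration.

```shell
#!/bin/bash
# sum_hugepages <hugepage_size_in_KB>
# Reads segment sizes in bytes on stdin, one per line, and applies the
# script's rule: segments smaller than one hugepage are skipped, and every
# other segment contributes its whole pages plus one extra page.
sum_hugepages() {
    hpg_kb=$1
    num_pg=0
    while read -r seg_bytes; do
        min_pg=$(( seg_bytes / (hpg_kb * 1024) ))
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# Segment sizes taken from the ipcs -m listing: gdm, grid, oracle.
printf '%s\n' 393216 4096 4833935360 | sum_hugepages 2048
```

The two small segments yield 0 whole pages and are skipped. The 4833935360-byte segment yields 2305 pages plus one extra, so the total is 2306.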

(5) Reboot the system

reboot

(6) Start the database

sqlplus / as sysdba

startup

(7) Verify that the setting took effect

root@db101:[/root]grep HugePages  /proc/meminfo
HugePages_Total:  3890
HugePages_Free:     17
HugePages_Rsvd:      0

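For the HugePages configuration to be valid, HugePages_Free should be smaller than HugePages_Total, which shows that pages are actually in use. HugePages_Rsvd counts free pages that are already reserved for an SGA but not yet touched, so it is included in HugePages_Free.

HugePages_Free and HugePages_Rsvd should both be smaller than the number of pages allocated for the SGA.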
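
A quick way to check this is to compute how many hugepages are in use, which is HugePages_Total minus HugePages_Free. Here is a minimal sketch that parses /proc/meminfo-style text, fed with the values shown above; the function name hugepages_in_use is made up for illustration.

```shell
#!/bin/bash
# hugepages_in_use
# Reads /proc/meminfo-style text on stdin and prints
# HugePages_Total - HugePages_Free.
hugepages_in_use() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^HugePages_Free:/  { free = $2 }
         END { print total - free }'
}

# Values from the grep output above.
printf '%s\n' \
    'HugePages_Total:  3890' \
    'HugePages_Free:     17' \
    'HugePages_Rsvd:      0' | hugepages_in_use
```

On a live system the input would be /proc/meminfo itself: hugepages_in_use < /proc/meminfo. A result of 0 means no instance is using the pool.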

 

Before 11.2.0.2, a database's SGA could either use hugepages for all of its memory or not use them at all. Starting with 11.2.0.2, Oracle added a new parameter, USE_LARGE_PAGES, that controls how the database uses hugepages.

2.8 Troubleshooting

Some common problems are listed below:

Symptom: System is running out of memory or swapping
Possible Cause: Not enough HugePages to cover the SGA(s), so the area reserved for HugePages is wasted while the SGAs are allocated through regular pages
Troubleshooting Action: Review your HugePages configuration to make sure that all SGA(s) are covered.

Symptom: Databases fail to start
Possible Cause: memlock limits are not set properly
Troubleshooting Action: Make sure the settings in limits.conf apply to the database owner account.

Symptom: One of the databases fails to start while another is up
Possible Cause: The SGA of the specific database could not find available HugePages and the remaining RAM is not enough.
Troubleshooting Action: Make sure that the RAM and HugePages are enough to cover all your database SGAs.

Symptom: Cluster Ready Services (CRS) fail to start
Possible Cause: HugePages configured too large (maybe larger than installed RAM)
Troubleshooting Action: Make sure the total SGA is less than the installed RAM and re-calculate HugePages.

Symptom: HugePages_Total = HugePages_Free
Possible Cause: HugePages are not used at all. No database instances are up, or they are using AMM.
Troubleshooting Action: Disable AMM and make sure that the database instances are up.

Symptom: Database started successfully but performance is slow
Possible Cause: The SGA of the specific database could not find available HugePages, so the SGA is handled by regular pages, which leads to slow performance
Troubleshooting Action: Make sure that there are enough HugePages to cover all your database SGAs.

 
