[20190409]latch get 引數where and why測試.txt

lfree發表於2019-04-10

[20190409]latch get 引數where and why測試.txt


--//連結http://blog.itpub.net/267265/viewspace-2640890/大概瞭解哦latch get引數.做一個測試:


1.環境:

SYS@book> @ ver1

PORT_STRING                    VERSION        BANNER

------------------------------ -------------- --------------------------------------------------------------------------------

x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production


SYS@book> select addr, name from v$latch_parent where lower(name) like '%'||lower('test excl. parent2 l0')||'%';

ADDR             NAME

---------------- ----------------------------------------

0000000060009978 test excl. parent2 l0


--//拿這個拴鎖作為測試.

$ cat lwhere.sql

column "where" format a36

column location format a36

column PARENT_NAME format a30

SELECT t1.inst_id inst_id

      ,t1.indx indx

      ,t1.ksllasnam PARENT_NAME

      ,t2.ksllwnam "WHERE"

      ,t1.kslnowtf NWFAIL_COUNT

      ,t1.kslsleep SLEEP_COUNT

      ,t1.kslwscwsl WTR_SLP_COUNT

      ,t1.kslwsclthg LONGHOLD_COUNT

      ,t2.ksllwnam LOCATION

  FROM x$ksllw t2, x$kslwsc t1

 WHERE t2.indx = t1.indx AND lower(t1.ksllasnam) LIKE lower('%&&1%');


SYS@book> @ lwhere 'test excl. parent2 l0'

no rows selected

--//沒有查詢到,順便選一個數值作為測試,我測試好像這些引數oracle並不做檢查.


2.測試:

--//session 1:

SYS@book> @ spidx


       SID    SERIAL# PROCESS                  SERVER    SPID       PID  P_SERIAL# C50

---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------

       295          5 65437                    DEDICATED 65438       21          3 alter system kill session '295,5' immediate;


PRO

-----------------------------------

SYS@book(295.5 spid=65438 pid=21)>


SYS@book(295.5 spid=65438 pid=21)>  oradebug setmypid

Statement processed.


SYS@book(295.5 spid=65438 pid=21)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 00000000


SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslgetl 0x0000000060009978 1 2 3

Function returned 1


SYS@book(295.5 spid=65438 pid=21)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 00000015

--//持有後寫入該會話的PID號,奇怪拴鎖寫入pid,而mutext寫入sid,不知道為什麼.


--//session 2:

SYS@book> @ spidx


       SID    SERIAL# PROCESS                  SERVER    SPID       PID  P_SERIAL# C50

---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------

        44         71 65440                    DEDICATED 65441       27         26 alter system kill session '44,71' immediate;


PRO

-----------------------------------

SYS@book(44.71 spid=65441 pid=27)>


$ cat ll.sql

select ksuprpid pid,ksuprsid sid ,ksuprlat laddr,ksuprlnm name, ksuprlmd,ksulawhy,ksulawhr, ksulagts gets from x$ksuprlat;

--//連結有一些介紹,要獲得where,why資訊,要查詢X$檢視才行.

--//這個檢視是x$ksupr的v$process的基表.x$ksuprlat僅僅包含latch的資訊.


SYS@book(44.71 spid=65441 pid=27)>  @ll

    PID        SID LADDR            NAME                                     KSUPRLMD   KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- ---------- -------- ---------- ----------

     21        295 0000000060009978 test excl. parent2 l0                    EXCLUSIVE         2          3         13


SYS@book(44.71 spid=65441 pid=27)>  oradebug setmypid

Statement processed.


SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 4

--//掛起!!


--//session 3:

SYS@book> @ wait

P1RAW            P2RAW            P3RAW         P1         P2         P3        SID    SERIAL#       SEQ# EVENT      STATUS   STATE               WAIT_TIME_MICRO SECONDS_IN_WAIT WAIT_CLASS

---------------- ---------------- ----- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- ------------------- --------------- --------------- -----------

0000000060009978 0000000000000005 00    1610652024          5          0         44         71         29 latch free INACTIVE WAITING                    88607826              89 Other


--//session 1:

SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslfre 0x0000000060009978

Function returned 0


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 4

Function returned 1

--//現在成功返回.


SYS@book(44.71 spid=65441 pid=27)>  @ ll

    PID        SID LADDR            NAME                                     KSUPRLMD   KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- ---------- -------- ---------- ----------

     27         44 0000000060009978 test excl. parent2 l0                    EXCLUSIVE         2          4         14


--//現在是session 2 持有該拴鎖.

SYS@book(44.71 spid=65441 pid=27)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 0000001B


--//0x1B=27,與當前會話的pid=27一直.

--//連結:

Oracle 11g used PID value from v$process to indicate that exclusive latch is busy. Notice that Oracle latch is hold by

process, not session.  On the contrary mutexes are hold by sessions.


--//Oracle 11g使用v$process中的PID值來指示獨佔鎖存器正處於繁忙狀態。注意,Oracle閂鎖是由程式而不是會話儲存的。相反,互斥

--//是透過會話進行的。


Oracle not always used PID for this purpose in the past. In fact busy exclusive latch had the value of:


--//Oracle過去並不總是為此目的使用PID。事實上,忙碌的獨佔鎖存器的值是:


1 in version 7

0xFF in versions 8.0-10.2

PID in version 11g


3.繼續測試:

--//使用strace 跟蹤ipcs也就是訊號看看.

 $ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit


$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit


--//當前是session 2持有該拴鎖.

SYS@book(44.71 spid=65441 pid=27)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 0000001B


--//session 1:

SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslgetl 0x0000000060009978 1 2 5

--//掛起!!


--//跟蹤程式看到阻塞後執行semop睡眠等待.

$ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit

semop(310214656, 0x7fff12c040b0, 1


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslfre 0x0000000060009978

Function returned 0


--//跟蹤程式看到釋放拴鎖時發出資訊.session 2程式:

$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit

semctl(310214656, 25, SETVAL, 0x1)      = 0


--//session 1程式:

$ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit

semop(310214656, 0x7fff12c040b0, 1)     = 0


--//執行成功!!也就是oracle是透過semctl控制訊號操作.


4.繼續測試:

--//也就是如果使用oradebug poke修改資訊應該是無效的.

--//當前是session 1持有該拴鎖.

SYS@book(295.5 spid=65438 pid=21)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 00000015


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 6


--//掛起!!


--//session 2:

$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit

semctl(310214656, 25, SETVAL, 0x1)      = 0

semop(310214656, 0x7fffd5533a10, 1


--//session 1,人為取消看看,使用oradebug poke:

SYS@book(295.5 spid=65438 pid=21)>  oradebug poke 0x0000000060009978 4 0x00000000

BEFORE: [060009978, 06000997C) = 00000015

AFTER:  [060009978, 06000997C) = 00000000


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 6

Function returned 1


--//session 2:

$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit

semctl(310214656, 25, SETVAL, 0x1)      = 0

semop(310214656, 0x7fffd5533a10, 1)     = 0


--//session 1:

$ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit

semop(310214656, 0x7fff12c040b0, 1)     = 0


--//session 1沒有執行semctl.不過這樣操作後:


SYS@book>  @ ll

    PID        SID LADDR            NAME                                     KSUPRLMD  KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- --------- -------- ---------- ----------

     21        295 0000000060009978 test excl. parent2 l0                    EXCLUSIVE        2          6         16

     27         44 0000000060009978 test excl. parent2 l0                    EXCLUSIVE        2          6         16


--//2個會話都持有該拴鎖.這樣是有問題的.

--//session 1:

SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslfre 0x0000000060009978

Function returned 0


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  @ ll

    PID        SID LADDR            NAME                                     KSUPRLMD   KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- ---------- -------- ---------- ----------

     27         44 0000000060009978 test excl. parent2 l0                    EXCLUSIVE         2          6         16


SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslfre 0x0000000060009978

ORA-03113: end-of-file on communication channel

ORA-24323: value not allowed

--//會話已經中斷了.


總結:

--//看了連結許多文章,才知道10g,11g關於latch發生很大變化,不能再按照以前spin and sleep

--//方式操作.摘抄一段的內容:


This is how the gdb output looks like when the latch get in willing to wait mode is executed:


kslgetl laddr:60023a80, willing:1, where:0, why:2442

kslges 60023a80, 0, 1, 0

kslwlmod 13311368, -1780327896, 1610758784, 1

skgpwwait 13311608, -1767360, -1780326976, 0

sskgpwwait 13311608, -1767360, -1780326976, 0

semop 360451, 13310840, 1, -1


Interestingly, if the latch is not taken, this is how the latch get sequence looks like:


kslgetl laddr:60023a80, willing:1, where:0, why:2442


In other words, for getting a non shared latch in willing to wait mode:


1-the function kslgetl is called, which tries to fetch the latch.

If the latch can be taken, the function returns, if not:

2-the function kslges (kernel service latch get spinning) is called, which supposedly also tries to take the same latch.

If the latch still can not be taken, the next function is:

3-the function kslwlmod (kernel service latch waiting list modify) is entered.

In this function the process registers itself as waiting in the post/wait list.

4-the function skgpwwait (system kernel generic post/wait wait) is entered.

This function sets up the waiting for the process so it can be posted.

5-the function sskgpwwait (system system kernel generic post/wait wait)

My current understanding is the 'ss' function contain the platform specific code for database functions.

6-the (operating system) function semop (semaphore operation) is called.


This will make the process sleep waiting on a semaphore (operating system signalling mechanism). This way, the process

will not be runnable on the CPU unless the semaphore is posed.


--//這將使程式休眠等待訊號量(作業系統信令機制)。這樣,該方法除非發出訊號量,否則將無法在CPU上執行。

--//順便問一下里面提到的地址如何獲得的.

Some searching around revealed that a CPU register reveals this information. Add this to the above gdb script:


break *0xc29b51

  commands

    silent

    printf " kslges loop: %d\n", $ecx

    c

  end


--//有時間再看看.


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2640940/,如需轉載,請註明出處,否則將追究法律責任。

相關文章