0824SQL/Net message from client 丟包模擬

lfree發表於2017-08-24

[20170824]SQL/Net message from client與網路丟包模擬.txt

--//以前做的測試:http://blog.itpub.net/267265/viewspace-2130032/
--//我本來的意思是測試網路很慢的情況下,會發生什麼情況?
--//我的測試SQL/Net message from client等待事件很長,但是SQL*Net message to client 的ela確很小.
--//也就是給人的感覺當網路很慢的情況下,要注意SQL/Net message from client等待事件.
--//我發現我當時做了流量限制,當時並不會丟包,僅僅慢一些.想起以前模擬參考連結,模擬出現丟包的情況,會發生什麼
--//

--//先看看官方手冊:docs.oracle.com/cd/E11882_01/server.112/e40402/waitevents003.htm#REFRN00634
SQL*Net message from client
The server process (foreground process) waits for a message from the client process to arrive.
Wait Time: The time it took for a message to arrive from the client since the last message was sent to the client
Parameter     Description
driver id     See "driver id"
#bytes     The number of bytes received by the server (foreground process) from the client.

SQL*Net message to client
The server (foreground process) is sending a message to the client.
Wait Time: The actual time the send takes
Parameter     Description
driver id     See "driver id"
#bytes     The number of bytes sent by the server process to the client

1.環境:
SYS@book> @ &r/ver1

PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

--//在服務端執行:

# tc qdisc show dev eth0
qdisc mq 0: root

# tc qdisc add dev eth0 root netem loss 40% 25%
--//# tc qdisc delete dev eth0 root netem loss 40% 25%
--//# 這樣丟包很厲害,遠端ssh連線都很慢,事先要寫好取消命令,到時候copy & paste就ok了.(打命令很困難)
 
# tc qdisc show dev eth0
qdisc netem 8001: root refcnt 6 limit 1000 loss 40% 25%

--// This will cause 40 of packets to be lost, and each successive probability depends by a quarter on the last one.
--//Probn = .25 * Probn-1 + .75 * Random

--//從client ping伺服器,發現丟包.
# ping -s 8192 192.168.100.78 -c 20
PING 192.168.100.78 (192.168.100.78) 8192(8220) bytes of data.
8200 bytes from 192.168.100.78: icmp_seq=3 ttl=64 time=0.393 ms
8200 bytes from 192.168.100.78: icmp_seq=4 ttl=64 time=0.400 ms
8200 bytes from 192.168.100.78: icmp_seq=6 ttl=64 time=0.351 ms
8200 bytes from 192.168.100.78: icmp_seq=7 ttl=64 time=0.396 ms
8200 bytes from 192.168.100.78: icmp_seq=9 ttl=64 time=0.426 ms
8200 bytes from 192.168.100.78: icmp_seq=10 ttl=64 time=0.332 ms
8200 bytes from 192.168.100.78: icmp_seq=12 ttl=64 time=0.334 ms
8200 bytes from 192.168.100.78: icmp_seq=13 ttl=64 time=0.395 ms
8200 bytes from 192.168.100.78: icmp_seq=15 ttl=64 time=0.399 ms
8200 bytes from 192.168.100.78: icmp_seq=17 ttl=64 time=0.404 ms
8200 bytes from 192.168.100.78: icmp_seq=18 ttl=64 time=0.344 ms
8200 bytes from 192.168.100.78: icmp_seq=19 ttl=64 time=0.385 ms
--- 192.168.100.78 ping statistics ---
20 packets transmitted, 12 received, 40% packet loss, time 19014ms
rtt min/avg/max/mdev = 0.332/0.379/0.426/0.039 ms, pipe 2

2.測試:
$ cat aa.sql
set array 7
alter session set events '10046 trace name context forever, level 12';
host sleep 10
select * from emp;
host sleep 12
alter session set events '10046 trace name context off';

--//在client端執行以上指令碼:
--//注先執行1次select * from emp;.避免遞迴語句.

--//跟蹤檔案內容如下:
*** 2017-08-24 11:06:48.940
*** SESSION ID:(16.17) 2017-08-24 11:06:48.940
*** CLIENT ID:() 2017-08-24 11:06:48.940
*** SERVICE NAME:(book) 2017-08-24 11:06:48.940
*** MODULE NAME:(SQL*Plus) 2017-08-24 11:06:48.940
*** ACTION NAME:() 2017-08-24 11:06:48.940

WAIT #140568569128584: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544008940478

*** 2017-08-24 11:06:59.156
WAIT #140568569128584: nam='SQL*Net message from client1' ela= 10215533 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019156541
CLOSE #140568569128584:c=0,e=8,dep=0,type=1,tim=1503544019156653
=====================
PARSING IN CURSOR #140568569243528 len=17 dep=0 uid=83 oct=3 lid=83 tim=1503544019156841 hv=1745700775 ad='61c8f7a0' sqlid='a2dk8bdn0ujx7'
select * from emp
END OF STMT
PARSE #140568569243528:c=0,e=133,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3956160932,tim=1503544019156835
EXEC #140568569243528:c=0,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3956160932,tim=1503544019157002
WAIT #140568569243528: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019157076
FETCH #140568569243528:c=0,e=99,p=0,cr=6,cu=0,mis=0,r=1,dep=0,og=1,plh=3956160932,tim=1503544019157226
WAIT #140568569243528: nam='SQL*Net message from client' ela= 827 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019158106
WAIT #140568569243528: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019158198
FETCH #140568569243528:c=0,e=61,p=0,cr=1,cu=0,mis=0,r=7,dep=0,og=1,plh=3956160932,tim=1503544019158238
WAIT #140568569243528: nam='SQL*Net message from client' ela= 619339 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019777623
WAIT #140568569243528: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019777726
FETCH #140568569243528:c=0,e=86,p=0,cr=1,cu=0,mis=0,r=6,dep=0,og=1,plh=3956160932,tim=1503544019777787
STAT #140568569243528 id=1 cnt=14 pid=0 pos=1 obj=87108 op='TABLE ACCESS FULL EMP (cr=8 pr=0 pw=0 time=89 us cost=3 size=532 card=14)'

*** 2017-08-24 11:07:11.786
WAIT #140568569243528: nam='SQL*Net message from client' ela= 12008336 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544031786313
CLOSE #140568569243528:c=0,e=11,dep=0,type=0,tim=1503544031786439
=====================
PARSING IN CURSOR #140568569243528 len=55 dep=0 uid=83 oct=42 lid=83 tim=1503544031786533 hv=2217940283 ad='0' sqlid='06nvwn223659v'
alter session set events '10046 trace name context off'
END OF STMT
PARSE #140568569243528:c=1000,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=0,tim=1503544031786532
EXEC #140568569243528:c=0,e=419,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=0,tim=1503544031787033

SYS@book> @ &r/ev_name 'SQL*Net message%client%'
EVENT#   EVENT_ID NAME                        PARAMETER1 PARAMETER2 PARAMETER3 WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
------ ---------- --------------------------- ---------- ---------- ---------- ------------- ----------- --------------------
   350 2067390145 SQL*Net message to client   driver id  #bytes                   2000153315           7 Network
   354 1421975091 SQL*Net message from client driver id  #bytes                   2723168908           6 Idle

--//oracle在分類上SQL*Net message to client是network型別事件.SQL*Net message from client是Idle型別事件.
--//我以前的例子說明如果網路很慢,要特別注意SQL*Net message from client ,不是沒有必要不關注這個事件.
--//官方文件: Wait Time: The actual time the send takes

--//簡單分析:
$ grep "SQL\*Net message to client" /u01/app/oracle/diag/rdbms/book/book/trace/book_ora_23505.trc
WAIT #140568569128584: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544008940478
WAIT #140568569243528: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019157076
WAIT #140568569243528: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019158198
WAIT #140568569243528: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019777726

--//即使在這樣的丟包率下,ela也是很小,也就是根本不能透過這個事件定位網路問題.而且 #bytes=1很小.

$ grep "SQL\*Net message from client" /u01/app/oracle/diag/rdbms/book/book/trace/book_ora_23505.trc
WAIT #140568569128584: nam='SQL*Net message from client' ela= 10215533 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019156541
WAIT #140568569243528: nam='SQL*Net message from client' ela= 827 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019158106
WAIT #140568569243528: nam='SQL*Net message from client' ela= 619339 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544019777623
WAIT #140568569243528: nam='SQL*Net message from client' ela= 12008336 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1503544031786313

--//注意第1,4次出現的ela= 10215533,ela= 12008336,正好與我指令碼host sleep命令有關,也就是出現這個等待事件大部分時間是等待用
--//戶輸入命令的事件.
--//而中間2個就非常值得關注.
--//官方手冊: Wait Time: The time it took for a message to arrive from the client since the last message was sent to the client
--//這個實際上前一個訊息與後一個訊息的時間差.

總結:
1.出現網路緩慢或者丟包的情況,應該關注的是SQL*Net message from client.而SQL*Net message to client應該不考慮.
2.另外千萬不要在生產系統做這樣的測試,我當時差點不能遠端操作伺服器.很慢....

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2144051/,如需轉載,請註明出處,否則將追究法律責任。

相關文章