[20180123]測試SQLNET.EXPIRE_TIME引數.txt

lfree發表於2018-01-23

[20180123]測試SQLNET.EXPIRE_TIME引數.txt

--//曾經寫過一篇linux核心網路引數測試tcp_keepalive,連結http://blog.itpub.net/267265/viewspace-2138391/
--//測試服務端會定時發起連線監測與client的連線狀態.

引數解析:
/proc/sys/net/ipv4/tcp_keepalive_time    當keepalive起用的時候,TCP傳送keepalive訊息的頻度。預設是2小時。
/proc/sys/net/ipv4/tcp_keepalive_intvl   當探測沒有確認時,keepalive探測包的傳送間隔。預設是75秒。
/proc/sys/net/ipv4/tcp_keepalive_probes  如果對方不予應答,keepalive探測包的傳送次數。預設值是9。

--//我發現我遺漏的一種情況的測試,使用SQLNET.EXPIRE_TIME引數,這個引數與前面的核心引數意思差不多.定時監測服務端與客戶端連
--//接情況.但是注意這種測試是有服務端發起,一些狀態防火牆不允許這樣的包透過,我們的分院與總院正好有這樣的環境,不知道是否網管
--//配置錯誤,正好利用這個環境測試看看.主要目的測試設定SQLNET.EXPIRE_TIME情況下,如果無法連上client,是否探測多次,間隔時間是多少.
--//知道這些數值對於理解我遇到的一個問題幫助很大,另發文寫一篇blog.

1.環境:
--//服務端IP:192.168.100.106
--//客戶端IP:192.168.90.14
--//服務端:
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 75
/proc/sys/net/ipv4/tcp_keepalive_probes: 9
/proc/sys/net/ipv4/tcp_keepalive_time: 7200

--//客戶端
# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f
/proc/sys/net/ipv4/tcp_keepalive_intvl: 10
/proc/sys/net/ipv4/tcp_keepalive_probes: 4
/proc/sys/net/ipv4/tcp_keepalive_time: 200

--//在sqlnet.ora檔案中配置
sqlnet.expire_time = 5

--//更正說明:實際上這個引數要修改oracle使用者的$ORACLE_HOME/network/admin/sqlnet.ora,而裡面的定義如下:
$  cat $ORACLE_HOME/network/admin/sqlnet.ora
SQLNET.EXPIRE_TIME=10

--//這樣結果基本一致.加上我當時執行一些命令檢視埠號,消耗一點時間.注意為什麼不是讀取grid使用者的
--//$ORACLE_HOME/network/admin/sqlnet.ora檔案的SQLNET.EXPIRE_TIME = 5引數,另外寫blog探究.

2.測試:
--//客戶端使用sqlplus連線資料庫:
xxxxx>
SELECT s.sid, s.serial#, s.process, s.server, p.spid, p.pid, p.serial# p_serial#,'alter system kill session
       '''||s.sid||','||s.serial#||''''||' immediate;' c50
  FROM v$session s, v$process p
WHERE s.sid in (
        SELECT sid
          FROM v$mystat
WHERE rownum  = 1)
   AND s.paddr = p.addr;

       SID    SERIAL# PROCESS                  SERVER    SPID       PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
      6656      50071 22214                    DEDICATED 6980        69        200 alter system kill session '6656,50071' immediate;

--//確定服務端程式號=6980
--//服務端:
# lsof -i -P -n | grep "6980 oracle"
oracle      6980 oracle    8u  IPv4 3826220323      0t0  UDP 127.0.0.1:58395
oracle      6980 oracle   14u  IPv4 3826216194      0t0  UDP 127.0.0.1:58118
oracle      6980 oracle   18u  IPv4 3826173598      0t0  TCP 192.168.100.106:1521->192.168.90.14:41203 (ESTABLISHED)

--//確定client的埠號41203.

# tcpdump -i bondeth0 host 192.168.90.15 and port 41203 -nn -vv
tcpdump: listening on bondeth0, link-type EN10MB (Ethernet), capture size 96 bytes

--//客戶端執行一些sql語句以後:
xxxxx> select sysdate from dual ;
SYSDATE
-------------------
2018-01-23 11:52:58

--//客戶端開啟監測:(注意網路介面不要選擇錯誤!!我測試環境是是bond1):
# tcpdump -i bond1 host 192.168.90.14 and port 41203 -nn -vv
tcpdump: listening on bond1, link-type EN10MB (Ethernet), capture size 65535 bytes
...

--//服務端開啟監測:
# tcpdump -i bondeth0 host 192.168.90.14 and port 41203 -nn -vv
tcpdump: listening on bondeth0, link-type EN10MB (Ethernet), capture size 96 bytes
11:52:58.146050 IP (tos 0x0, ttl  61, id 37349, offset 0, flags [DF], proto: TCP (6), length: 346) 192.168.90.14.41203 > 192.168.100.106.1521: P 64396777:64397083(306) ack 1905633603 win 368
11:52:58.146419 IP (tos 0x0, ttl  64, id 25005, offset 0, flags [DF], proto: TCP (6), length: 418) 192.168.100.106.1521 > 192.168.90.14.41203: P 1:379(378) ack 306 win 330
11:52:58.147109 IP (tos 0x0, ttl  61, id 37350, offset 0, flags [DF], proto: TCP (6), length: 40) 192.168.90.14.41203 > 192.168.100.106.1521: ., cksum 0x0860 (correct), 306:306(0) ack 379 win 386
11:52:58.148255 IP (tos 0x0, ttl  61, id 37351, offset 0, flags [DF], proto: TCP (6), length: 61) 192.168.90.14.41203 > 192.168.100.106.1521: P, cksum 0xe25b (correct), 306:327(21) ack 379 win 386
11:52:58.148361 IP (tos 0x0, ttl  64, id 25006, offset 0, flags [DF], proto: TCP (6), length: 212) 192.168.100.106.1521 > 192.168.90.14.41203: P 379:551(172) ack 327 win 330
11:52:58.188314 IP (tos 0x0, ttl  61, id 37352, offset 0, flags [DF], proto: TCP (6), length: 40) 192.168.90.14.41203 > 192.168.100.106.1521: ., cksum 0x078d (correct), 327:327(0) ack 551 win 404
---//等待看看:
--//因為閘道器配置成狀態防火牆的形式,從服務端發起的連線都會被拒接..等看看.
--//已經過了200秒,說明client的核心引數tcp_keepalive_*對於連線的服務端無用.也就是client不會主動發起連線測試.
12:07:20.096941 IP (tos 0x0, ttl  64, id 25007, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:07:20.311391 IP (tos 0x0, ttl  64, id 25008, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:07:20.742391 IP (tos 0x0, ttl  64, id 25009, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:07:21.604392 IP (tos 0x0, ttl  64, id 25010, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:07:23.328394 IP (tos 0x0, ttl  64, id 25011, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:07:26.777395 IP (tos 0x0, ttl  64, id 25012, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:07:33.672394 IP (tos 0x0, ttl  64, id 25013, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:07:47.448449 IP (tos 0x0, ttl  64, id 25014, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:08:15.032451 IP (tos 0x0, ttl  64, id 25015, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:09:10.136405 IP (tos 0x0, ttl  64, id 25016, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:11:00.472405 IP (tos 0x0, ttl  64, id 25017, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:13:00.792400 IP (tos 0x0, ttl  64, id 25018, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:15:01.112403 IP (tos 0x0, ttl  64, id 25019, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:17:01.432397 IP (tos 0x0, ttl  64, id 25020, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:19:01.752399 IP (tos 0x0, ttl  64, id 25021, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330
12:21:02.072411 IP (tos 0x0, ttl  64, id 25022, offset 0, flags [DF], proto: TCP (6), length: 50) 192.168.100.106.1521 > 192.168.90.14.41203: P, cksum 0x3fee (incorrect (-> 0x01ab), 551:561(10) ack 327 win 330

--//在出現以上資料包過程中,在服務端執行如下:
--//伺服器:
# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0     10 192.168.100.106:1521        192.168.90.14:41203         ESTABLISHED on (52.68/10/0)
..
# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0     10 192.168.100.106:1521        192.168.90.14:41203         ESTABLISHED on (103.27/12/0)

# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0     10 192.168.100.106:1521        192.168.90.14:41203         ESTABLISHED on (90.45/12/0)

# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0     20 192.168.100.106:1521        192.168.90.14:41203         ESTABLISHED on (18446744073.51/14/0)
--//中間不知道為什麼冒出這個大的值.
# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0     20 192.168.100.106:1521        192.168.90.14:41203         ESTABLISHED on (72.29/15/0)

...
# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
--//這時連線已經斷開在服務端.

--//說明關於netstat -o 引數的解析參考連線:http://blog.itpub.net/267265/viewspace-2138396/
--//The interpretation of field is, first field = timer countdown value, second field = no. of retransmissions,
--//third field = number of keepalive probes sent.
--//可以發現前面的第1欄位逐漸減少,而第2欄位增加後,第1欄位重新記錄.

--//客戶端:
# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0      0 192.168.90.14:41203         192.168.100.106:1521        ESTABLISHED off (0.00/0/0)

# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0      0 192.168.90.14:41203         192.168.100.106:1521        ESTABLISHED off (0.00/0/0)

# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0      0 192.168.90.14:41203         192.168.100.106:1521        ESTABLISHED off (0.00/0/0)
--//而客戶端的連線還在.

--//客戶端,在客戶端監測:
tcpdump: listening on bond1, link-type EN10MB (Ethernet), capture size 65535 bytes
...
12:27:17.961544 IP (tos 0x0, ttl 64, id 37353, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 64397104:64397410, ack 1905634153, win 404, length 306
12:27:18.166180 IP (tos 0x0, ttl 64, id 37354, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:27:18.576173 IP (tos 0x0, ttl 64, id 37355, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:27:19.396154 IP (tos 0x0, ttl 64, id 37356, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:27:21.036150 IP (tos 0x0, ttl 64, id 37357, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:27:24.316166 IP (tos 0x0, ttl 64, id 37358, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:27:30.876176 IP (tos 0x0, ttl 64, id 37359, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:27:43.996189 IP (tos 0x0, ttl 64, id 37360, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:28:10.236177 IP (tos 0x0, ttl 64, id 37361, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:29:02.716433 IP (tos 0x0, ttl 64, id 37362, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:30:47.676172 IP (tos 0x0, ttl 64, id 37363, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:32:47.677461 IP (tos 0x0, ttl 64, id 37364, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:34:47.676459 IP (tos 0x0, ttl 64, id 37365, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:36:47.676446 IP (tos 0x0, ttl 64, id 37366, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:38:47.676172 IP (tos 0x0, ttl 64, id 37367, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306
12:40:47.676464 IP (tos 0x0, ttl 64, id 37368, offset 0, flags [DF], proto TCP (6), length 346) 192.168.90.14.41203 > 192.168.100.106.1521: Flags [P.], cksum 0x4116 (incorrect -> 0xfb4b), seq 0:306, ack 1, win 404, length 306

--//同時在客戶端執行sql命令:
xxxxx> select sysdate from dual ;
--//掛起..由於服務端連線已經不存在了,最後報如下錯誤:
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 6980
Session ID: 6656 Serial number: 50071


--//客戶端:
# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0    306 192.168.90.14:41203         192.168.100.106:1521        ESTABLISHED on (6.85/7/0)

# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer
tcp        0    306 192.168.90.14:41203         192.168.100.106:1521        ESTABLISHED on (13.09/9/0)

# netstat -otun | egrep 'Active|Proto|192.168.90.14:41203'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       Timer

--//可以看到timer列後面的變化..看到的情況與伺服器前面的情況一樣.

3.分析時間變化:

12:07:20.096941
11:52:58.188314
--//間隔15*60+20-58 = 862秒,為什麼不是SQLNET.EXPIRE_TIME*2分鐘=5*2*60 = 600秒呢?

create table t ( a timestamp);
insert into t values ('2018-01-23 12:07:20.096941');
insert into t values ('2018-01-23 12:07:20.311391');
insert into t values ('2018-01-23 12:07:20.742391');
insert into t values ('2018-01-23 12:07:21.604392');
insert into t values ('2018-01-23 12:07:23.328394');
insert into t values ('2018-01-23 12:07:26.777395');
insert into t values ('2018-01-23 12:07:33.672394');
insert into t values ('2018-01-23 12:07:47.448449');
insert into t values ('2018-01-23 12:08:15.032451');
insert into t values ('2018-01-23 12:09:10.136405');
insert into t values ('2018-01-23 12:11:00.472405');
insert into t values ('2018-01-23 12:13:00.792400');
insert into t values ('2018-01-23 12:15:01.112403');
insert into t values ('2018-01-23 12:17:01.432397');
insert into t values ('2018-01-23 12:19:01.752399');
insert into t values ('2018-01-23 12:21:02.072411');
commit;

column interval format 00000.999999

WITH x
     AS (SELECT a - n_a interval
           FROM (  SELECT a, LAG (a) OVER (ORDER BY a) n_a
                     FROM t
                 ORDER BY a)
          WHERE n_a IS NOT NULL)
SELECT   EXTRACT (DAY FROM interval) * 86400
       + EXTRACT (HOUR FROM interval) * 3600
       + EXTRACT (MINUTE FROM interval) * 60
       + EXTRACT (SECOND FROM interval)
          interval
  FROM x;
 
     INTERVAL
-------------
00000.214450
00000.431000
00000.862001
00001.724002
00003.449001
00006.894999
00013.776055
00027.584002
00055.103954
00110.336000   
00120.319995
00120.320003
00120.319994
00120.320002
00120.320012
15 rows selected.

--//開始間隔0.21秒,以後間隔2倍遞增.到110秒後,按照120秒間隔.另外一組資料測試如下,結果驚人的相似.

     INTERVAL
-------------
00000.204636
00000.409993
00000.819981
00001.639996
00003.280016
00006.560010
00013.120013
00026.239988
00052.480256
00104.959739
00120.001289
00119.998998
00119.999987
00119.999726
00120.000292
15 rows selected.

--//最大與最小的間隔
WITH x AS (SELECT MAX (a) - MIN (a) interval FROM t)
SELECT SUM
       (
            EXTRACT (DAY FROM interval) * 86400
          + EXTRACT (HOUR FROM interval) * 3600
          + EXTRACT (MINUTE FROM interval) * 60
          + EXTRACT (SECOND FROM interval)
       )
          interval_sum
  FROM x;

INTERVAL_SUM
------------
   809.71492   <== client測試的結果   

INTERVAL_SUM
------------
   821.97547   <== 服務端測試結果.

--//換一句話講,如果client端不kill掉,大約需要81X 秒才退出出現提示符號.

--//我這裡測試服務端請求認證後的情況,這時伺服器連線已經斷開,如果在確認期間,客戶端執行命令呢? 會發生什麼情況呢?看下一篇blog.
--//也許你抱怨無法模擬以上測試,實際上很簡單在服務端使用tmux,screen開啟會話(這樣網路斷開,命令可以一直在執行),再執行以上命
--//令,在客戶端執行sql語句後,拔掉網線,等待足夠的時間就能模擬我的測試結果.

--//補充說明:
--//我少算了最後1次120秒,這樣實際上大約809+120 = 929秒才會退出.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2150431/,如需轉載,請註明出處,否則將追究法律責任。

相關文章