Oracle 11g RAC: kkjcre1p: unable to spawn jobq slave process

Posted by tolywang on 2011-05-31

Environment: Oracle 11g RAC, ASM, Linux AS 5.3 64-bit.

The instance on node 2 of this Oracle 11g system went down while the OS itself stayed healthy. The alert log shows the following errors:


Process J001 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:

Tue May 31 07:39:27 2011
Process m000 died, see its trace file
Tue May 31 07:41:25 2011
Process J001 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 07:56:28 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 07:58:15 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 07:59:29 2011
Process m000 died, see its trace file
Tue May 31 08:00:08 2011
Process m000 died, see its trace file
Tue May 31 08:00:31 2011
Process m000 died, see its trace file
Tue May 31 08:00:32 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 08:01:49 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 08:02:49 2011
Process m000 died, see its trace file
Tue May 31 08:04:36 2011
Process PZ99 died, see its trace file
Tue May 31 08:05:01 2011
Process PZ99 died, see its trace file
Process PZ99 died, see its trace file
Process PZ99 died, see its trace file
Tue May 31 08:06:41 2011
Process PZ99 died, see its trace file
Process PZ99 died, see its trace file
Tue May 31 08:07:37 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 08:07:40 2011
Starting ORACLE instance (normal)
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file system to be mounted for at least 26239565824 bytes. /dev/shm is either not mounted or is mounted with available space less than this size. Please fix this so that MEMORY_TARGET can work as expected. Current available is 23652892672 and used is 9901539328 bytes. Ensure that the mount point is /dev/shm for this directory.
memory_target needs larger /dev/shm
Tue May 31 08:08:11 2011
Starting ORACLE instance (normal)
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file system to be mounted for at least 26239565824 bytes. /dev/shm is either not mounted or is mounted with available space less than this size. Please fix this so that MEMORY_TARGET can work as expected. Current available is 23652892672 and used is 9901539328 bytes. Ensure that the mount point is /dev/shm for this directory.
memory_target needs larger /dev/shm
Tue May 31 08:08:41 2011
Process PZ99 died, see its trace file
Tue May 31 08:09:03 2011
Process O000 died, see its trace file

Tue May 31 08:22:31 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:22:43 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:22:55 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:07 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:19 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:31 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:43 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file

Tue May 31 08:23:55 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:24:07 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:24:36 2011
Error 29746: Cluster Synchronization Service is shutting down
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_lmon_14499.trc:
ORA-29746: Cluster Synchronization Service is being shut down.
LMON (ospid: 14499): terminating the instance due to error 29746
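
To see at a glance which processes failed and how often, an alert log like the one above can be tallied by process name. The following is only a minimal sketch, not an Oracle tool: the function name count_died is made up for illustration, and it matches only lines of the form "Process <name> died" as they appear in this log.

```shell
# Tally "Process <name> died" lines from an alert log read on stdin.
# count_died is a hypothetical helper, not part of any Oracle tooling.
count_died() {
  awk '/^Process [A-Za-z0-9]+ died/ { n[$2]++ }
       END { for (p in n) print p, n[p] }' | LC_ALL=C sort
}

# Usage (the file name is an assumption; substitute your own alert log):
#   count_died < alert_ccptdb2.log
```

On the log above, the names that come back are not only job slaves (J000, J001) but also m000, PZ99, O000 and W000, which suggests the failure to spawn new processes was not limited to the job queue.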

 

Trace file:


Trace file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining, Oracle Database Vault and Real Application Testing option
ORACLE_HOME = /u01/product/oracle/11.2.0/db_1
System name:    Linux
Node name:      wmrac02
Release:        2.6.18-128.el5
Version:        #1 SMP Wed Dec 17 11:41:38 EST 2008
Machine:        x86_64
Instance name: ccptdb2
Redo thread mounted by this instance: 2
Oracle process number: 52
Unix process pid: 14800, image:
(CJQ0)


*** 2011-05-27 22:00:00.212
*** SESSION ID:(1249.3) 2011-05-27 22:00:00.212
*** CLIENT ID:() 2011-05-27 22:00:00.212
*** SERVICE NAME:(SYS$BACKGROUND) 2011-05-27 22:00:00.212
*** MODULE NAME:() 2011-05-27 22:00:00.212
*** ACTION NAME:() 2011-05-27 22:00:00.212


*** TRACE FILE RECREATED AFTER BEING REMOVED ***

Setting Resource Manager plan SCHEDULER[0x3007]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Setting Resource Manager plan SCHEDULER[0x3008]:DEFAULT_MAINTENANCE_PLAN via scheduler window

*** 2011-05-28 06:00:00.184
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Setting Resource Manager plan SCHEDULER[0x3009]:DEFAULT_MAINTENANCE_PLAN via scheduler window

*** 2011-05-29 06:00:00.202
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

*** 2011-05-31 00:50:46.038
Process J001 is dead (pid=1752 req_ver=693 cur_ver=693 state=KSOSP_SPAWNED).

*** 2011-05-31 06:28:08.365
Process J002 is dead (pid=8632 req_ver=6101 cur_ver=6101 state=KSOSP_SPAWNED).

*** 2011-05-31 07:39:24.844
Process J001 is dead (pid=16773 req_ver=17167 cur_ver=17167 state=KSOSP_SPAWNED).

*** 2011-05-31 07:41:25.314
Process J001 is dead (pid=17000 req_ver=17168 cur_ver=17168 state=KSOSP_SPAWNED).

*** 2011-05-31 07:56:28.167
Process J000 is dead (pid=18573 req_ver=17132 cur_ver=17132 state=KSOSP_SPAWNED).

*** 2011-05-31 07:56:29.170
Process J000 is dead (pid=18575 req_ver=17170 cur_ver=17170 state=KSOSP_SPAWNED).

 

 

Contents of the other trace file:


Trace file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_lmon_14499.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining, Oracle Database Vault and Real Application Testing option
ORACLE_HOME = /u01/product/oracle/11.2.0/db_1
System name:    Linux
Node name:      wmrac02
Release:        2.6.18-128.el5
Version:        #1 SMP Wed Dec 17 11:41:38 EST 2008
Machine:        x86_64
Instance name: ccptdb2
Redo thread mounted by this instance: 2
Oracle process number: 11
Unix process pid: 14499, image:
(LMON)


*** 2011-05-29 20:02:22.111
*** SESSION ID:(265.1) 2011-05-29 20:02:22.111
*** CLIENT ID:() 2011-05-29 20:02:22.111
*** SERVICE NAME:(SYS$BACKGROUND) 2011-05-29 20:02:22.111
*** MODULE NAME:() 2011-05-29 20:02:22.111
*** ACTION NAME:() 2011-05-29 20:02:22.111


*** TRACE FILE RECREATED AFTER BEING REMOVED ***

kjfc_TaskScheduler_Execute_wTime: timer wraps at 0xffffffe0 max 0xffffffdc
2011-05-31 08:24:36.817: [ CSSCLNT]clssgsGroupGetStatus: CSS shutting down.

*** 2011-05-31 08:24:36.817
2011-05-31 08:24:36.817: [ CSSCLNT]clssgsGroupGetStatus: returning 22
kgxgnpstat: error: CLSS service is shutting down
kjxgmcr: kgxgnpstat return 17
LMON caught an error 29746 in the main loop
error 29746 detected in background process
ORA-29746: Cluster Synchronization Service is being shut down.

*** 2011-05-31 08:24:36.818
LMON (ospid: 14499): terminating the instance due to error 29746
ksuitm: waiting up to [5] seconds before killing DIAG(14487)

 

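The message "kkjcre1p: unable to spawn jobq slave process" tells us that the error occurred because the system could not spawn job-related processes. There are roughly four possible causes:

1. The job_queue_processes parameter (set too low)

2. The sessions and processes parameters (the configured session and process limits cannot meet the workload)

3. The pga_aggregate_target parameter (PGA memory exhausted)

4. OS resources exhausted, for example virtual memory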
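
The first three causes are database parameters and can be inspected from SQL*Plus, as done below. The fourth has to be checked on the OS side. The following is a rough sketch of such a check, assuming Linux (/proc/meminfo) and an OS user named oracle; the helper names meminfo_kb and os_resource_summary are hypothetical.

```shell
# Read one field (in kB) from a /proc/meminfo-style file.
# meminfo_kb is a hypothetical helper; the file argument defaults to /proc/meminfo.
meminfo_kb() {
  awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print a short OS-side summary for the given OS user (default: oracle).
# ulimit -u reports the limit of the current shell, so run this as that user.
os_resource_summary() {
  user="${1:-oracle}"
  echo "MemFree kB:  $(meminfo_kb MemFree)"
  echo "SwapFree kB: $(meminfo_kb SwapFree)"
  echo "max user processes (ulimit -u): $(ulimit -u)"
  echo "processes owned by $user: $(ps -u "$user" -o pid= 2>/dev/null | wc -l)"
}

# Usage:
#   os_resource_summary oracle
```

Very low MemFree and SwapFree values, or a process count close to the ulimit, would point toward cause 4 rather than a database parameter.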

 

SQL> show  parameter process

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes                      integer     0
cell_offload_processing              boolean     TRUE
db_writer_processes                  integer     8
gcs_server_processes                 integer     4
global_txn_processes                 integer     1
job_queue_processes                  integer     1000
log_archive_max_processes            integer     4
processes                            integer     1000

 

Note: Oracle 11g can manage the SGA and PGA as one shared pool of memory, sized through memory_target. With that in effect, pga_aggregate_target shows 0.


SQL> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 0

 

 


SQL> show parameter memo

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 25024M
memory_target                        big integer 25024M
shared_memory_address                integer     0
SQL>
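
The memory_target value shown here and the figure in the MEMORY_TARGET warning in the alert log are the same number: 25024M × 1048576 = 26239565824 bytes. The alert log reports only 23652892672 bytes available on /dev/shm at startup. The sketch below works out the shortfall from those two figures; the helper names shm_shortfall and shm_available_bytes are hypothetical, and the df-based helper assumes a Linux host where /dev/shm is mounted.

```shell
# Shortfall between the /dev/shm space MEMORY_TARGET needs and what is available.
# Both arguments are in bytes; prints the shortfall in bytes (0 if there is none).
# shm_shortfall is a hypothetical helper name.
shm_shortfall() {
  required=$1
  available=$2
  if [ "$available" -ge "$required" ]; then
    echo 0
  else
    echo $((required - available))
  fi
}

# Bytes currently available on /dev/shm, from POSIX-format df output (1K blocks).
# The multiplication is done in the shell to avoid awk printing large numbers
# in exponent notation.
shm_available_bytes() {
  kb=$(df -Pk /dev/shm | awk 'NR == 2 { print $4 }')
  echo $((kb * 1024))
}

# Figures taken from the alert log above.
shm_shortfall 26239565824 23652892672   # prints 2586673152
```

The result, 2586673152 bytes, is about 2.4 GiB. To check a live system instead, one could call shm_shortfall with the output of shm_available_bytes as the second argument.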

 

In addition:

$  ps -ef | grep ora_ | grep -v grep
  ... ...
 oracle  712918        1   0   Dec 28      -  2:47 ora_cjq0_CRMDB1
 oracle 13230162        1   0 16:29:18      -  0:04 ora_j000_CRMDB1
 oracle  3182624        1   0 16:30:28      -  0:00 ora_j001_CRMDB1
  ... ...

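Some of the Oracle background processes are omitted above. Among those shown, ora_j000_xxx and ora_j001_xxx are slave processes spawned by the background process ora_cjq0. These ora_jNNN processes are the job processes, and the initialization parameter job_queue_processes controls their maximum number.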
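
To compare the number of running job slaves with job_queue_processes, the ps listing can be counted per instance. This is a small sketch under the assumption that the process names follow the ora_jNNN_<instance> pattern seen above; count_job_slaves is a hypothetical helper name.

```shell
# Count ora_jNNN job slave processes per instance from `ps -ef` output on stdin.
# count_job_slaves is a hypothetical helper name.
count_job_slaves() {
  awk '$NF ~ /^ora_j[0-9][0-9][0-9]_/ {
         inst = $NF
         sub(/^ora_j[0-9][0-9][0-9]_/, "", inst)   # keep only the instance name
         n[inst]++
       }
       END { for (i in n) print i, n[i] }' | LC_ALL=C sort
}

# Usage:
#   ps -ef | count_job_slaves
```

For the three lines listed above this prints "CRMDB1 2": the coordinator ora_cjq0 is not counted, only the two slaves it spawned.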

 

From the "ITPUB blog", link: http://blog.itpub.net/35489/viewspace-696765/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
