常見corruption block總結

myownstars發表於2012-02-22

常見的corruption型別

(a) ORA-01578 –物理壞塊:資料塊頭(cache header)損壞;資料塊incomplete(/尾值不匹配)checksum校驗失敗;block misreplaced

(b) ORA-08103 – 邏輯壞塊

(c) ORA-00600 [2662] – 資料塊的SCN高於資料庫的SCN導致,可使用內部事件修復alter session set events 'IMMEDIATE trace name ADJUST_SCN level 1'

如何檢查corruption

1

DBV(僅限於unix平臺),用於檢查塊結構損壞以及checksum驗證失敗等,檢查不出邏輯壞塊

2

Rman

使用命令backup validate database/datafile 1 或者 backup validate check logical database然後檢視v$database_block_corruption

注:check logical選項同時檢查物理/邏輯壞塊,輸出結果如下:

SQL> select * from v$database_block_corruption;

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
--------------- --------------- --------------- ------------------ ---------
6 10 1 8183236781662 LOGICAL
6 42 1 0 FRACTURED
6 34 2 0 CHECKSUM
6 50 1 8183236781952 LOGICAL
6 26 4 0 FRACTURED

3.

Export/select

使用exp/export工具嘗試匯出損壞的object; 全表掃描或者CTAS可能出現壞塊的表

4

Analyze 可用於檢查表或索引

Analyze table table_name validate structure (cascade)/ analyze index index_name validate structure

 

相關引數

1

DB_BLOCK_CHECKING

設定為true時,對所有資料塊進行校驗;若為false時,則只校驗system表空間

Oracle透過遍歷塊內資料以確保該塊是self-consistent,會佔有1%-10%的開銷

2

DB_BLOCK_CHECKSUM

設定trues時,對所有資料塊生效;false則只針對system表空間

DBWn或者direct loader往磁碟寫資料時,會計算塊的checksum值並寫入cache header,下次讀取時會計算checksum並與cache header中的比較

會佔有1%-2%的開銷

 

 

相應alert錯誤

Incomplete塊損壞

 

Corrupt block relative dba: 0x0380e573 (file 14, block 58739)
Fractured block found during buffer read
Data in bad block -
type: 6 format: 2 rdba: 0x0380e573
last change scn: 0x0288.8e5
a2f78 seq: 0x1 flg: 0x04
consistency value in
tail: 0x00780601
check value in block header: 0x8739, computed block checksum: 0x2f00
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0380e573 (file 14, block 58739) found same corrupted data

 

 

Checksum校驗失敗

 

Corrupt block relative dba: 0x0380a58f (file 14, block 42383)
Bad check value found during buffer read
Data in bad block -
type: 6 format: 2 rdba: 0x0380a58f
last change scn: 0x0288.7784c5ee seq: 0x1 flg: 0x06
consistency value in tail: 0xc5ee0601
check value in block header: 0x68a7,
computed block checksum: 0x2f00
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0380a58f (file 14, block 42383) found same corrupted data

 

Checksum無誤,但是讀取到的塊內容屬於其它的塊

 

Corrupt block relative dba: 0x0d805a89 (file 54, block 23177)
Bad header found during buffer read
Data in bad block -
type: 6 format: 2
rdba: 0x0d805b08----&gt Block is different than expected 0x0d805a89
last change scn: 0x0692.86dc08e3 seq: 0x1 flg: 0x04
consistency value in tail: 0x08e30601
check value in block header: 0x2a6e,
computed block checksum: 0x0
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

 

修復

 

使用dbms_repair修復0ra-1578

http://space.itpub.net/15480802/viewspace-703504

blockrecover修復壞塊

http://space.itpub.net/15480802/viewspace-688357

 

BLOCK CORRUPTIONS ON ORACLE AND UNIX [ID 77587.1]


 

Modified 22-FEB-2012     Type BULLETIN     Status PUBLISHED

 

 

Checked for relevance on 22-FEB-2012

 

PURPOSE

  This article discusses block corruptions in Oracle and how they are related

  to the underlying operating system and hardware.  To better illustrate the

  discussion, Unix is taken as the operating system of reference, although similar

  situations can be observed on other operating systems as well.

 

SCOPE & APPLICATION

  For users requiring further understanding as to how a block could become

  corrupted.

 

 

Block corruption has been a common occurrence on most UNIX based systems and

relational databases for many years.  It is one of the most frequent ways to

lose data and cause serious business impact.  Through a survey of literary

technical sources, this document will discuss several ways that block

corruptions can occur, provide conclusions and possible solutions.

 

To fully comprehend all the reasons for block corruptions, it is necessary to

understand how I/O device subsystems work, how memory buffers are used to

support the reading and writing of data blocks, how blocks are sized on both

UNIX and Oracle, and how these three objects work together to maintain data

consistency.

 

I/O devices are designed specifically for host machines and there have been

few attempts to standardize a particular interface across the industry.  Most

software, including Oracle, on UNIX machines uses standard C program calls that

in turn perform. system calls to support the reading and writing of data to

disk.  These system calls access I/O device software that retrieves or writes

data on disk.

 

The UNIX system contains two types of devices, block devices and raw or

character devices.  Block devices look like random access storage devices to

the rest of the system while character devices include all other devices such

as terminals and network media. (Bach, 1990 314).  These device types are

important to understand because different combinations can increase corruptions.

 

Device drivers are configured by the operating system and the configuration

procedure generates or populates tables that form. part of the code of the

kernel.  This kernel to device driver interface is described by the block

device switch table and the character device switch table.  Each device type

has entries in these tables that direct the kernel to the appropriate driver

interfaces for the system calls.  The open and close system calls of a device

file funnel through the two device switch tables, according to file type.  The

mount and umount system calls also invoke the device open and close procedures

for block devices. Read and write system calls of character special files pass

through the respective procedures in the character device switch tables.  Read

and write system calls of block devices and of files on mounted file systems

invokes the algorithms of the buffer cache, which invoke the device strategy

procedure. (Bach, 1990 314).  This buffer cache plays an important role in

block corruptions since it is the location where data blocks are the most

vulnerable.

 

The difference between the two disk interfaces is whether they deal with the

buffer cache.  When accessing the block device interface, the UNIX kernel

follows the same algorithm as for regular files, except that after converting

the logical byte offset into a logical block offset, it treats the logical

block offset as a physical block number in the file system.  It then accesses

the data via the buffer cache and, ultimately, the driver strategy interface.

However, when accessing the disk via the raw interface, the kernel does not

convert the byte offset into the file but passes the offset immediately to the

driver.  The driver's read or write routine converts the byte offset to a

block offset and copies the data directly to the user address space, bypassing

kernel buffers.

塊裝置和裸裝置(字元裝置)區別:是否使用了buffer cache(前者使用)

兩個程式分別使用不同方式訪問同一地址時,可能會造成資料不一致;

Thus, if one process writes a block device and a second process then reads a

raw device at the same address, the second process may not read the data that

the first process had written, because the data may still be in the buffer

cache and not on disk.  However, if the second process had read the block

device, it would automatically pick up the new data, as it exists in the

buffer cache.  (Bach, 1990 328).

 

Use of the raw interface may also introduce strange behavior.  If a process

reads or writes a raw device in units smaller than the block size, results are

driver-dependent.  For instance, when issuing 1-byte writes to a tape drive,

each byte may appear in different tape blocks. (Bach 1990)

 

The advantage of using the raw interface is speed, assuming there is no--裸裝置的優勢

advantage to caching data for later access.  Processes accessing block devices

transfer blocks of data whose size are constrained by the file system logical

block size. Furthermore, use of the block interface entails an extra copy of

data between user address space and kernel buffers, which is avoided in the

raw interface.  For example, if a file system has a logical block size 1K

bytes, at most 1K bytes are transferred per I/O operation.  However, processes

accessing the disk as a raw device can transfer many disk blocks during a disk

operation, subject to the capabilities of the disk controller.

 

Disk controllers are hardware devices that control the I/O actions of one or

more disks.  These controllers can also create a bottleneck in a system.

(Corey, Abbey, Dechichio 1995). Controllers are the most frequent piece of

hardware to have and cause problems on many systems.  When a system has

multiple disks controlled by one controller, the results can be fatal.  The

bottleneck on controllers is a common cause of write error.

 

It is important to remember that Oracle and other products use these device

access methods to perform. their work.  It is also important to note the added

complexity that the Oracle kernel adds to the I/O game.

 

The Oracle Relational Database Management System (RDBMS) keeps its

information, including data, in block format.  However, the Oracle data block

can be, and in most cases is, composed of several operating system blocks.

 

An Oracle database block is the physical unit of storage in which all Oracle

database data are stored in files.  The Oracle database block size is

determined by setting a parameter called db_block_size when the

database is created. (Millsap, 1995).

 

The most common UNIX block is 512 bytes but the Oracle block size can range-- Oracle資料塊與OS資料塊區別

from 512 to 32K.  The difference in block sizing between the operating system

and the Oracle kernel are beneficial for Oracle; boosting performance gains

while allowing UNIX to maintain small files with minimal wasted space.  The

Oracle block can be considered a superset of the UNIX file system block size.

 

Each block of an Oracle data file is formatted with a fixed header that

contains information about the particular block.  This information provides a

means to ensure the integrity for each block and in turn, the entire Oracle

database.  One component of the fixed header of a data block is called a Relative

Data Block Address (DBA).  This DBA is a 4 bytes that stores the relative file -- RDBA的用途

number of the Oracle database file and the Oracle block number offset relative

to the beginning of the file. (Presley, 1993).

 

Whenever there is a problem with the RDBA, Oracle may signal an Oracle error

ORA-1578: Data block corrupted in file #  block #.  This error provide information that point to where the

corruption exists.

 

Oracle uses the standard C system function calls to read and write blocks to

its database files.  Once the block has been read it is mapped to shared

memory by the operating system,  After the block has been read into shared

memory, the Oracle kernel does verification checks on the block to ensure the

integrity of the fixed header.  The RDBA check is the first verification made

on the fixed header.  So why do RDBAs become corrupt and how can we identify

and correct them?

 

Case One-- oracle資料塊被清空

--------

The first case of block corruption occurs when the block has been zeroed out. If the Oracle block

is completely zeroed out, sql statements may generate an ORA-8103 as the block type=0 is invalid

and it is not formatted as an empty block. In  this case the dbverify utility (dbv) can detect it

and will produce an error message.  Dbv output example:

 

DBVERIFY - Verification starting : FILE = /oradata/data_01.dbf

Page 307161 is marked corrupt

***

Corrupt block relative dba: 0x0644afd9 (file 0, block 307161)

Completely zero block found during dbv:

 

Usually the first operating system block of an Oracle block is zeroed out when -- Oracle資料塊的第一個OS塊因為軟體錯誤被清零

there was a software error on disk and the operating system attempted to repair

its block.  In addition, disk repair utility programs have caused this zeroing out effect.

 

Programs that read from and write to the disk directly can destroy the

consistency of file system data.  The file system algorithms coordinate disk

I/O operation to maintain a consistent view of disk data structures, including

linked lists of free disk blocks and pointer from inodes to direct and

indirect data blocks.  Processes that access the disk directly bypass these if

they run while other file system activity is going on.  For this reason, these

programs should not be run on an active file system. (Bach, 1990 328).

 

 

Case Two RDBA地址無效

--------

The RDBA in the physical block on disk is incorrect.  It can generate an error ORA-1578

and a message in the alert.log with message "Data in bad block" as next:

 

***

Corrupt block relative dba: 0x56c07ac1 (file 347, block 31425)

Bad header found during buffer read

Data in bad block -

type: 6 format: 2 rdba: 0x06407ac1

last change scn: 0x0000.00a02808 seq: 0x1 flg: 0x02

consistency value in tail: 0x28080601

check value in block header: 0x0, block  checksum disabled

spare1: 0x0, spare2: 0x0, spare3: 0x0

***

Reread of rdba: 0x56c07ac1 (file 347, block 31425) found same corrupted data

 

Blocks are sometimes written into the wrong places in the data file.  This is--資料塊被寫入到資料檔案的錯誤位置,通常由lseek()觸發

called "write blocks out of sequence."  This typically happens when the operating system

I/O device driver fails to write the block in the proper location that Oracle

requested via the lseek() system call.

 

The lseek() system call is one of the most important calls related to block

corruption.  The calculations that lseek() performs are often the cause of

block problems.  To understand lseek() a brief discussion of byte positioning

is necessary.

 

Every open file has a "current byte position" associated with it.  This is

measured as the number of bytes from the start of the file.  The create system

call sets the file's position to the beginning of the file, as does the open

system call.  The read and write system calls update the file's position by

the number of bytes read or written.  Before a read or write, an open file can

be positioned using lseek().  The format is:

 

lseek(int fildes, long offset, int whence);

 

The offset and whence arguments are interpreted as follows: If whence is 0,

the file's position is set to offset bytes from the beginning of the file.  If

whence is 1, the file's position is set to its current position plus the

offset.  If whence is 2, the file's position is set to the size of the file

plus the offset.  The file's offset can be greater than the file's current

size, in which case the next write to the file will extend the file.  Lseek()

returns a long integer byte offset of the file.  (Stevens, 1990 40).

 

There is great opportunity for miscalculation of an offset based on the

lseek() system call.  Though lseek is not the only system call culprit in the

block corruption problem, it is a major contributor.

 

 

This may also happen if the block was corrupted in memory but was written to disk. 

This situation is quite rare and in most cases it is usually caused by memory

faults that go undetected.  The RDBA found in the block is usually garbage and

not a valid RDBA. 

 

If there is a possibility of memory problems on the system, the database

administrator can enable further sanity block checking by placing the

following parameters in the database instance init.ora parameter file:

 

db_block_checking=TRUE--預防措施

db_block_checksum=TRUE / FULL (10.2+)

_db_block_cache_protect= true

 

db_block_checking force the Oracle RDBMS kernel to call functions that check

the block. Oracle checks a block by going through the data on the block, making

sure it is self-consistent. Block checking can often prevent memory and data corruption

 

db_block_checksum determines whether DBWn and the direct loader will calculate

a checksum (a number calculated from all the bytes stored in the block) and

store it in the cache header of every data block when writing it to disk.

Checksums are verified when a block is read only if this parameter is true and the

last write of the block stored a checksum.  If set to FULL, DB_BLOCK_CHECKSUM also

catches in-memory corruptions and stops them from making it to the disk.

 

The _db_block_cache_protect=true protects the cache layer from becoming corrupted.

This parameter will prevent certain corruption from getting to disk, although

it may crash the foreground of the database instance.  It will help catch

stray writes in the cache. When a process tries to write past the buffer size

in the SGA, it will fail first with a stack violation.

 

If the database writer process detects a corrupted block in cache prior to—一旦DBWR檢測到cache中的壞塊,會觸發錯誤並終止oracle例項

writing the block to disk, it will signal an error and will crash the

database instance.  The block that is corrupted is  never written to disk.

After receiving such an error, simply attempt to restart the database instance.

There is no doubt that this can be a costly workaround to avoid block

corruptions.  However, the workaround once a corruption has occurred can be

even costlier.

 

 

 

 

Case 3

------

 

A third cause for block corruption is the requested I/O  not being serviced by

the operating system.  The calls that Oracle makes to lseek() and read() are checked for

return error codes. In addition, Oracle checks to see the number of bytes read in by the read()

system call to ensure that the block size or a multiple of the block size was

read.  Since these checks appeared to have been successful, Oracle assumes

that the direct read succeeded.  Upon sanity checking, the RDBA is incorrect

and the database operation request fails.  Therefore, the I/O read request

really never took place.  In this case, the RDBA found can point to a block of

a different file.

 

Case 4

------

 

Another reason for block corruption is reading the wrong block from the same

device.  Typically, this is caused by a very busy disk.  In some cases, the

block read was off by 1 block but can range into several hundreds of blocks.

Since this occurs when the disk is very busy and under lots of

stress, try spreading datafiles across multiple disks and ensure that the disk

drive can support the load.

 

 

In the third and fourth situations, the database files will not be physically

corrupted and the operation can be tried again with success.  Most diagnostics

testing will not reveal anything wrong with either the operating system or the

hardware.  However, the problem is due to operating system or hardware related

problems. (Velpuri, 1995).

 

So what causes the operating system calls to behave the way they do and how

can companies try to minimize their risk?  To evaluate these questions,

another look into how UNIX works is required.

 

UNIX vendors, in a attempt to speed performance, have implemented many

features into the filesystem.  The filesystem manages a large cache of I/O

buffers, called the buffer cache. This cache allows UNIX to optimize read and

write operations.  When a program writes data, the filesystem stores the data

in a buffer rather that writing it to disk immediately.  At some later point

in time, the system will send this data to the disk driver, together with

other data that has accumulated in the cache.  In other words, the buffer

cache lets the disk driver schedule disk operations in batches.  It can make

larger transfers and use techniques such as seek optimization to make disk

access more efficient.  This is called write-behind.

 

When a program reads data, the system first checks the buffer cache to see if

the desired data is already there.  If the data is already in the buffer

cache, the filesystem does not need to access the disk for those blocks.  It

just gives the user the data it found in its buffer, eliminating the need to

wait for a disk drive.  The filesystem only needs to read the disk if the data

isn't already in the cache.  To increase efficiency even further, the

filesystem assumes the program will read the file consecutively and read

several blocks from the disk at once.  This increases the likelihood that the

data for future read operations will already be in the cache. (Loukides, M.,

1990)  This also increases the chance of block corruption.

 

As a filesystem gets busy and buffers are being read, modified, written, and

aged out of the cache the chance of the kernel reading or writing the wrong

block increases.  Also, the more complex the scheme to read from and write to

disk, the greater the likelihood of function failure.

 

The UNIX kernel uses the strategy interface to transmit data between the

buffer cache and a device, although the read and write procedures of character

devices sometime use their block counterpart strategy procedure to transfer

data directly between the device and the user address space.  The strategy

procedure may queue I/O jobs for a device on a work list or do more

sophisticated processing to schedule I/O jobs.  Drivers can set up data

transmission for one physical address or many, as appropriate.  The UNIX

kernel passes a buffer header address to the driver strategy procedure.  The

header contains a list of addresses and sizes for transmission of data to or

from the device.  This is also how the swapping operations work.  For the

buffer cache, the kernel transmits data from one address; when swapping, the

kernel transmits data from many data addresses.  If data is being copied to or

from the user's address space, the driver must lock the process in memory

until the I/O transfer is complete.

 

The kernel loses control over a buffer only when it waits for the completion

of I/O between the buffer and the disk.  It is conceivable that a disk drive

is corrupt so that it cannot interrupt the CPU, preventing the kernel from

ever releasing the buffer. There are processes that monitor the hardware for

such cases and zero out the block and return an error to the kernel for a bad

disk job. (Bach, 1990 52).

 

On the UNIX level there are several utilities that will check for  bad disk

blocks and zero out any blocks they find corrupted.  These utilities do not

realize that the block in question may be an Oracle RDBMS block and zero out

the block by mistake.

 

In (Silberschatz, Galvin, 1994), the authors consider the possible effect of aOS崩潰對資料塊的潛在影響

computer crash.  In this case, the table of opened files is generally lost,

and with it any changes in the directories of opened files.  This event can

leave the file system in an inconsistent structure.  Frequently, a special

program is run at reboot time to check for and correct disk inconsistencies.

 

The consistency checker compares the data in the directory structure with the

data blocks on disk, and tries to fix and inconsistencies it finds.

(Silberschatz, Galvin, 1994)  This will often result in the reformatting of

blocks which will cause the Oracle block information to be removed.  This will

definitely cause Oracle corruption.

 

It is important to realize that monitoring of hardware is required for all

operating systems.  Hardware monitors can sense electrical signals on the

busses and can accurately record them even at high speed.  A hardware monitor

keeps observing the system even when it is malfunctioning, and thus, it can be

used to debug the system. (Jain, 1991 99)  These tools can help determine the

cause of the problem and detect problems like controller error and media

faulting which are frequent corruption contributors.

 

In any case, there are many opportunities for blocks, either on disk or in the

buffer cache, to become corrupt.  Fixing the corruption can sometimes provide

even greater opportunities.

 

 

Conclusion

----------

 

Data block corruption is an ongoing problem on all operating systems,

especially UNIX.  There are many types and causes of corruptions to consider.

Advanced system configurations can increase the chance and hardware problems

are a common source of corruptions. When receiving block corruption errors,

remember that a couple of them are not physical corruptions but memory

corruptions that are never written to disk.

 

Oracle Customer Support provides a number of bulletins on block corruption

problems that help recover what is left of the data once corruption has

occurred.  If block corruption occurs on a machine, be sure to identify the

type of corruption and establish a plan for its correction.

 

 

[1]  Bach, M. (1990). The Design of the UNIX Operating System.

                      The I/O Subsystem  328.

[2]  Corey, M., Abbey, M., Dechichio, D. (1995).  Tuning Oracle 52.

[3]  Jain, R. (1991). The Art of Computer Systems Performance Analysis. 99

[4]  Loney, K. (1994).  Oracle DBA Handbook.   23.

[5]  Loukides, M., (1990) System Performance Tuning. 161-162.

[6]  Millsap, C. (1995). Oracle7 Server Space Management. 1-2.

[7]  Presley, D. (1993).  Data Block Corruption Detection. Oracle Corporation.

[8]  Silberschatz A., Galvin P. (1994)  Operating System Concepts. 404.

[9]  Stevens, W. (1990). UNIX Network Programming. 163.

[10] Velpuri, R. (1995).  Oracle Backup and Recovery Handbook. 286

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/15480802/viewspace-716936/,如需轉載,請註明出處,否則將追究法律責任。

相關文章