Wednesday, February 16, 2011

BLOCK CORRUPTION!

We have been running into a block corruption error on a 10.2.0.4 Oracle instance running on RedHat Linux (kddummy_blkchk], [129], [5850]). The other day it hit us on an insert again due to a bad block error. This was the second week in a row we encountered this for the same process. The insert had ran for 4 hours and died on the commit! After some root-cause-analysis (RCA) it seems that this issue is pointing to a table logical error. There seems to be a problem with that table not the physical structure. We were thinking that this table may have gotten created with invalid indexe or something. This problem has now occurred in 3 different environments (dev, test, prod) for the same table object. The table is inserted into every weekend after a truncate is issued. However, if we do a CTAS and then an insert using only the temp table created in the CTAS it works fine without encountering the bad block error.

We opened an SR with Oracle and they said pretty much the same thing for the workaround; here is what they said about it:

ODM Action Plan Oracle Support
=== ODM Action Plan ===

In order to recreate the table you need to disable the DB_BLOCK_CHECKINg because in other case, it will fail with the ORA-600 error. We are removing 1 check layer in order to be able to recreate the table. We know there is a block corruption, but we know as well it is not affected any data on the table, so we can do it safely.

Process should be:
1.-Disable DB_BLOCK_CHECKING
2.-Recreate the table
3.-Re-enable DB_BLOCK_CHECKING

On the other hand, the steps you can follow in order to fix the corruption are
DROP TABLE .. PURGE;
or
ALTER TABLE .. MOVE ..;
or
Create table as select(CTAS)
or
export/import,etc

You don't need to follow all steps but only one.

Since we have been using the CTAS workaround we have no longer been receiving the error. Here are the steps we are using now for this object:

Drop table tabone;
CTAS tmp_tabone;
Rename tmp_tabone to tabone;

Here is a nice little piece of code if you just want to find out what object is corrupt (replace the file and blockid):

select tablespace_name, segment_type, owner, segment_name
from dba_extents
where file_id =
and between block_id AND block_id + blocks-1;

No comments: