Thursday, December 09, 2010

RAW ASM

Since we seem to be on the theme of "fellow DBA's" here is another offering:

A fellow DBA has put together the below documentation to help with setting up ASM. He also did a great job on compiling this information:


ASM
1. When using ASM (as of 9/25/08), raw storage is used. Validate the following to make sure that raw storage has been properly configured.
1.1. Raw slices need to exist under /dev/raw.
1.2. Permissions for each raw slice should be set to "oracle:dba".
1.3. The file /etc/sysconfig/rawdevices should contain entries for each raw slice.
1.4. The file /etc/rc.d/rc.local should contain commands to set ownership and permission on each raw slice to be "oracle:dba" and 644, respectively.
1.4.1. There is a bug in Linux startup procedures where /etc/rc.d/rc.local may not be run soon enough, relative to ASM being started. The safest method around this is to update /etc/udev/permissions.d/50-udev.permissions and under "# raw devices" change the line:
raw/*:root:disk:0660

to
raw/*:oracle:dba:0660

2. Before ASM can be used on a server, having someone as "root" run the following:
/u01/app/oracle/product/10.2.0/asm/bin> ./localconfig add

This is for css to be started and added as an auto start service on server reboot, as the last line to /etc/inittab.
3. ASM installation is performed under the "Advanced" installation method in Oracle's installer or, if creating an ASM instance after the software install is done, use "dbca" under ASM's $ORACLE_HOME/bin and choose "Configure Automatic Storage Management" for the operation to perform.
4. NOTE: when using ASM, make sure to set up the server's Oracle listener to run out of ASM's $ORACLE_HOME.
5. For 10g ASM, the install should be directed to /u01/app/oracle/product/10.2.0/asm.
6. After the ASM instance has been created, drop any disk groups created as part of the ASM instance creation, then change the following database parameters:
ALTER SYSTEM SET "_asm_ausize" = 16777216 SCOPE = spfile;
ALTER SYSTEM SET "_asm_stripesize" = 1048576 SCOPE = spfile;

These parameters are needed to be set before any disk groups are created to surpass the limit of 10 TB for any single disk group.
Restart the ASM instance and recreate the 1 disk group using all available disks (external redundancy).

Thx

Tuesday, December 07, 2010

ASM At Risk

A fellow DBA has put together the below documentation to help those of us that have ASM instances to protect. He did a great job on compiling this information:

"ASM metadata corruption is a clear and present danger that we as DBAs have to recognize and prepare for. It is nothing to be embarrassed about, and nothing to fear, as long as you are ready for it. “But Mr. DBA,” you ask, in a piping voice, “What can I do to hedge myself against future corruption of my ASM metadata headers?!” Settle down now, I’m here to help.

A situation popped up recently for one team where a drive went into predictive failure, and a controller hung, corrupting the ASM headers upon reboot. This resulted them having to completely rebuild the disk groups and restore the database from tape. If the team had had a backup of their ASM headers, all of that wouldn’t have been necessary.

Under 11g:

md_backup
md_backup is ran from the ASMCMD> prompt. It very simply creates a file that is a backup of your metadata, for all of your disk groups, or just a few if you want to specify. By default, it writes to the current directory and backs up all of the mounted disk groups’s header files.

Arguments: -b and –g
-b Specifies a location to store the backup file
-g Specifies the disk group to back up

Example: ASMCMD> md_backup –b /mydir/somebackupfile –g DG1 –g DG3
This will backup disk groups DG1 and DG3 to the file /mydir/somebackupfile.


md_restore
md_restore is a bit more involved than md_backup. As you may have guessed, it restores previously backed up metadata to disk groups you specifiy.

Arguments: -b –i –t –f –g –o
-b Specifies the backup file to read
-i Ignores errors. (Normally, if md_restore hits an error, it stops. This argument suppressed that.
-t Specifies the type of disk group to be created/restored. “full” creates a disk group and restores the metadata. “nodg” restores metadata only, and “newdg” creates a new disk group with a different name using in conjunction with the –o argument.
-o Renames the disk group
-f writes the SQL commands to be performed in the restore to a file instead of executing them (sql_script_file).
-g selects the disk groups to be restored. If none are specified, it restores all of them.

Example: ASMCMD> md_restore –t full –g DG1 –i backup_file
This will restore disk group DG1 using the specified file.

Example: ASMCMD> md_restore –t nodg –g DG1 –i backup_file
This will restore an existing disk group’s metadata.

Under 10g:

Things become different for 10g, as Oracle hadn’t integrated ASM header backup yet. You can, however, backup the ASM header info using the following UNIX command:

dd if= of= bs=4096 count=1

That will capture the first 4k block of the ASM header file. You would do this for each file, creating a separate backup for each.

A simple dd will write it back in to the header, if you need to restore from your backup.

Example: dd if=/dev/rdisk/asmvol1 of=/u01/ora_backup/asmvo1_header.bak bs=4096 count=1
This would capture the first 4k of the given ASM file.

Example: dd if=/u01/ora_backup/asmvol1_header.bak of=/dev/rdisk/asmvol1
This would write in the backed up header back into the file it was backed up from.

It is always a good idea to keep backups of your disk group metadata, because no one can know when the corruption gremlin might strike. If you have any questions, do what most people do, and Google up an answer for yourself."

Hope you find this as valuable as I.

SPFile Problems with ASM Instance

Early this morning... very early, I had to bounce a 10.2.0.4 instance on a Linux server to accommodated the addition of 1 terabyte of additional space to latter be presented to ASM. Everything was business as usual; that is, until I tried to restart the ASM instance. At which time I encountered the following two errors:

ORA-01078: failure in processing system parameters
LRM-00109: could not open parameter file '/u01/app/oracle/.../initsid.ora'

The file it was path'ing in the error message was not there.

So here is how I got the instance to start:

At first, I thought I was going to have to go to the alert log and pull out the parameters (and work them over to fit the proper syntax) and put them into a replacement init file. I didn't really want to do that.

So I looked around some and found the spfile binary under the asm dbs directory. I issued the following commands to create myself a new pfile:

sqlplus / as sysdba

Connected to an idle instance.

create pfile = 'path to where I wanted the pfile created'
from spfile = 'path to where the spfile binary was that I had located';

This worked like a charm creating the replacement pfile. I was then able to issue the startup pfile command and got the ASM instance up and running again.

Monday, December 06, 2010

Issue with the silent installation for 11.1.0.7 on Linux

Well, I hit a new error message. I have been doing silent patch installs for years and until today had never seen the following errors:

SEVERE:Values for the following variables could not be obtained from the command line or response file(s): METALINK_USERNAME(MetaLinkUsername) Silent install cannot continue

I got past this one by adding the following to the kickoff of the response command:

./runInstaller -silent -responseFile /path/to/Disk1/response/patchset.rsp METALINK_USERNAME="" METALINK_PASSWORD=""

Then I hit this one:

You may not have enough space on your drive for a successful install.
Do you still want to continue?

I got past this one by adding the following to the command line execution:

-ignoreDiskWarning

So then the command line looked like this:

./runInstaller -silent -responseFile /path/to/Disk1/response/patchset.rsp METALINK_USERNAME="" METALINK_PASSWORD="" -ignoreDiskWarning

This in-turn resulted with this:

The installation of Oracle Database 11g Patch Set 1 was successful.

Thanks and have a great day!

My Steps for Patching an Oracle 10R2 Instance to 10.2.0.4

0. Rman level 0 backup

1. Blackout db and the server (Grid Control) --- done

2. Export invalid objects list --- done

3. Check the current version of the Oracle time zone definitions
SQL> SELECT version FROM v$timezone_file; --- was 3; did note 553812.1
If this query reports version 4, no action is required; in this case, continue with steps
If this reports a version lower or higher then 4, see OracleMetalink document 553812.1 Actions for the DSTv4 update in the Release 10.2.0.4 patchset.

4. chmod 640 /home/userid/.Xauthority; become oracle; export XAUTHORITY=/home/rboron/.Xauthority --- done

5. Shutdown the instance (normal or immediate) --- done

6. tar -cvf /u02/rman_backup/before_10204upgrade/oracle_home.tar $ORACLE_HOME --- done

7 ./runInstaller -silent -responsefile /home/oracle/oracle_patches/10204patchset/Disk1/patchset.rsp --- done

8. $ORACLE_HOME/root.sh script as the root user --- done

9. start up the listener (lsnrclt start)--- done

10. $ dbua -silent -dbname $ORACLE_SID -oracleHome $ORACLE_HOME -sysDBAUserName SYS -sysDBAPassword xxxxxxxx -recompile_invalid_objects true --- done

11. SQL> SELECT COMP_NAME, VERSION, STATUS FROM SYS.DBA_REGISTRY; --- done (all 12 were valid and showing 10.2.0.4.0)

12. Db bounce (shutdown normal or immediate and restart) --- done (looks great!)



*** Apr 2010 CPU - NO ASM
1. Shutdown db,agent,listener --- done

2. Make sure I have the OPatch 10.2 version 10.2.0.4.2 or later
/u01/app/oracle/product/10.2.0/db_1/OPatch/opatch version --- done (OPatch 10.2.0.4.2)

3. cd /u01/app/oracle/oracle_patches/apr2010cpu/9352191/ --- done

4. /u01/app/oracle/product/10.2.0/db_1/OPatch/opatch napply -skip_subset -skip_duplicate --- done

5. Listener startup (lsnrctl start) --- done

6. Load modified SQL

cd $ORACLE_HOME/rdbms/admin
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> STARTUP
SQL> @catbundle.sql cpu apply
SQL> -- Execute the next statement only if this is the first 10.2.0.4 CPU applied in the Oracle home.
SQL> @utlrp.sql
SQL> QUIT
--- done

7. Check whether view recompilation has already been performed for the database
SQL> SELECT * FROM registry$history where ID = '6452863'; --- done (1 row returned)

If no rows returned, go to pt.8

If rows returned:

cd $ORACLE_HOME/cpu/view_recompile
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> @recompile_precheck_jan2008cpu.sql
SQL> QUIT
---done (Procedure completed successfuly - 3528 objects to be recompiled)

If precheck orders to recompile:

cd $ORACLE_HOME/cpu/view_recompile
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP UPGRADE
SQL> @view_recompile_jan2008cpu.sql
SQL> SHUTDOWN;
SQL> STARTUP;
SQL> QUIT
--- done (number of invalid objects 26)

If any invalid objects were reported, run the utlrp.sql script as follows:

cd $ORACLE_HOME/rdbms/admin
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> @utlrp.sql
--- done

8. DB bounce (normal or immediate shutdown and restart instance) -- done

9. Agent startup -- done

10. Log files check --- done

11. Remote connection attempt --- done

12. Compare invalid objects lists --- done (7 less)

13. Opatch inventory check:
/u01/app/oracle/product/10.2.0/db_1/OPatch/opatch lsinventory
Compare CPU folder content [9352191] with the opatch result --- done

14. Disable server and db blackout (Grid Control) --- done

15. Complete a level 0 backup --- done

Trust you are having a great day!
Thx