Friday, October 07, 2011

Opening Your Wallet


1. Connect sys/<password> as sysdba
2. alter system set encryption wallet open identified by "<wallet_password>";
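
For reference, here is a minimal SQL*Plus sketch of the same two steps with a status check added; the password values are placeholders, not the real ones:

-- Connect with SYSDBA privileges (password is a placeholder).
connect sys/<password> as sysdba

-- Open the TDE wallet using the password chosen when the wallet was created.
alter system set encryption wallet open identified by "<wallet_password>";

-- Confirm the wallet is actually open.
select wrl_type, wrl_parameter, status from v$encryption_wallet;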

Wednesday, October 05, 2011

RMAN TDE BACKUPS

One of my peers tested the RMAN TDE backup and got the following list of results:

1. RMAN Transparent Encryption can create and restore encrypted backups with no DBA intervention. The wallet must be open.
2. RMAN backup created as a backupset can be encrypted. RMAN image copies cannot be encrypted.
3. RMAN encrypted backups can be taken to disk. Advanced Security Option licensing is required.
4. Encrypted backups to tape are only possible with Oracle Secure Backup as the media manager. Advanced Security Option licensing is not required.
5. Oracle Secure Backup is a separate Oracle backup product. It cannot be integrated with any 3rd-party backup tools (for example, Veritas NBU);
Oracle Secure Backup is an alternative to those product offerings (just like Veritas NBU).
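
As a rough sketch (not my peer's actual test script), transparent-mode encrypted backupset backups to disk can be configured and taken along these lines, with the wallet already open:

RMAN> CONFIGURE ENCRYPTION FOR DATABASE ON;      # turn on transparent backup encryption
RMAN> CONFIGURE ENCRYPTION ALGORITHM 'AES128';   # optional; AES128 is the default
RMAN> BACKUP AS BACKUPSET DATABASE;              # encrypted, because it is a backupset
RMAN> BACKUP AS COPY DATABASE;                   # image copies are not encrypted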

Tuesday, October 04, 2011

ORION (ORacle IO Numbers)


Here is what I did to run an ORION test on a new server I am currently installing and configuring Oracle on. The orion executable (11gR2 is the first release to include this utility) is stored in the $ORACLE_HOME/bin directory.

user1a@serv01[dpprod]:/u01/app/oracle/product/11.2.0/db_1/bin> ll *orion*
-rwxr-xr-x 1 oracle dba 8698817 Sep  4  2010 orionO*
-rwxr-x--x 1 oracle dba 8700950 Jun 29 21:34 orion*

(Note that this is on Linux 64 bit).
I created a directory under /u01/app/oracle/ called orion. 
I then created the following file in that same orion directory: asm.vols.lun
I used the information from /dev/oracleasm/disks to populate it:
user1a@sevr2:/u01/app/oracle/orion> ls -l /dev/oracleasm/disks/
total 0
brw-rw---- 1 oracle dba 8,  33 Sep 29 10:12 VOL1
brw-rw---- 1 oracle dba 8,  49 Sep 29 10:12 VOL2
brw-rw---- 1 oracle dba 8,  65 Sep 29 10:12 VOL3
brw-rw---- 1 oracle dba 8,  81 Sep 29 10:12 VOL4
brw-rw---- 1 oracle dba 8,  97 Sep 29 10:12 VOL5
brw-rw---- 1 oracle dba 8, 113 Sep 29 10:12 VOL6

Here is the content of the asm.vols.lun file:
user1a@sevr2:/u01/app/oracle/orion> cat asm.vols.lun
/dev/oracleasm/disks/VOL1
/dev/oracleasm/disks/VOL2
/dev/oracleasm/disks/VOL3
/dev/oracleasm/disks/VOL4
/dev/oracleasm/disks/VOL5
/dev/oracleasm/disks/VOL6

Note that this server is using RDAC (Redundant Disk Array Controller).  I was able to determine this by noting that there were no files in the /dev/mapper/ directory (the storage is direct attached):
user1a@sevr2:/u01/app/oracle/orion> ll /dev/mapper
total 0
crw-------  1 root root 10, 63 Sep 29 10:12 control
drwxr-xr-x  2 root root     60 Sep 29 10:12 ./
drwxr-xr-x 12 root root   4380 Oct  4 11:38 ../

If this were multipathed, as it is on sevr1 (esan), the mapper directory would be populated:
user1a@sevr1[dbprod]:/u01/app/oracle/product/11.2.0/db_1/bin> ll /dev/mapper
total 0
crw-------  1 root root  10, 63 Sep 23 22:29 control
brw-rw----  1 root disk 253,  0 Sep 23 22:29 mpath9
brw-rw----  1 root disk 253, 16 Sep 23 22:29 mpath8
brw-rw----  1 root disk 253, 15 Sep 23 22:29 mpath7
brw-rw----  1 root disk 253, 14 Sep 23 22:29 mpath5
brw-rw----  1 root disk 253, 13 Sep 23 22:29 mpath4
brw-rw----  1 root disk 253, 12 Sep 23 22:29 mpath3
brw-rw----  1 root disk 253, 11 Sep 23 22:29 mpath2
brw-rw----  1 root disk 253,  8 Sep 23 22:29 mpath17
brw-rw----  1 root disk 253,  7 Sep 23 22:29 mpath16
brw-rw----  1 root disk 253,  6 Sep 23 22:29 mpath15
brw-rw----  1 root disk 253,  5 Sep 23 22:29 mpath14
brw-rw----  1 root disk 253,  4 Sep 23 22:29 mpath13
brw-rw----  1 root disk 253,  3 Sep 23 22:29 mpath12
brw-rw----  1 root disk 253,  2 Sep 23 22:29 mpath11
brw-rw----  1 root disk 253,  1 Sep 23 22:29 mpath10
brw-rw----  1 root disk 253, 10 Sep 23 22:29 mpath1
brw-rw----  1 root disk 253,  9 Sep 23 22:29 mpath0
brw-rw----  1 root disk 253, 18 Sep 23 22:29 mpath8p1
brw-rw----  1 root disk 253, 17 Sep 23 22:29 mpath7p1
brw-rw----  1 root disk 253, 19 Sep 23 22:29 mpath2p1
brw-rw----  1 root disk 253, 20 Sep 23 22:29 mpath1p1
brw-rw----  1 root disk 253, 21 Sep 23 22:29 mpath0p1
brw-rw----  1 root disk 253, 22 Sep 23 22:29 mpath17p1
brw-rw----  1 root disk 253, 23 Sep 23 22:29 mpath16p1
brw-rw----  1 root disk 253, 24 Sep 23 22:29 mpath9p1
brw-rw----  1 root disk 253, 25 Sep 23 22:29 mpath15p1
brw-rw----  1 root disk 253, 26 Sep 23 22:29 mpath14p1
brw-rw----  1 root disk 253, 27 Sep 23 22:29 mpath13p1
brw-rw----  1 root disk 253, 28 Sep 23 22:29 mpath12p1
brw-rw----  1 root disk 253, 29 Sep 23 22:29 mpath5p1
brw-rw----  1 root disk 253, 30 Sep 23 22:29 mpath11p1
brw-rw----  1 root disk 253, 31 Sep 23 22:29 mpath4p1
brw-rw----  1 root disk 253, 32 Sep 23 22:29 mpath10p1
brw-rw----  1 root disk 253, 33 Sep 23 22:29 mpath3p1
drwxr-xr-x  2 root root     740 Sep 23 22:29 ./
drwxr-xr-x 14 root root   14840 Sep 23 22:31 ../

Here is what sevr1’s /dev/oracleasm/disks directory looks like:
user1a@serv2[dbprod]:/u01/app/oracle/product/11.2.0/db_1/bin> ll /dev/oracleasm/disks
total 0
drwxr-xr-x 4 root   root       0 Sep 23 22:30 ../
drwxr-xr-x 1 root   root       0 Sep 23 22:30 ./
brw-rw---- 1 oracle dba  253, 28 Oct  4 16:55 VOL9
brw-rw---- 1 oracle dba  253, 27 Oct  4 16:55 VOL10
brw-rw---- 1 oracle dba  253, 20 Oct  4 16:55 VOL1
brw-rw---- 1 oracle dba  253, 30 Oct  4 16:55 VOL8
brw-rw---- 1 oracle dba  253, 32 Oct  4 16:55 VOL7
brw-rw---- 1 oracle dba  253, 24 Oct  4 16:55 VOL6
brw-rw---- 1 oracle dba  253, 29 Oct  4 16:55 VOL5
brw-rw---- 1 oracle dba  253, 31 Oct  4 16:55 VOL4
brw-rw---- 1 oracle dba  253, 33 Oct  4 16:55 VOL3
brw-rw---- 1 oracle dba  253, 19 Oct  4 16:55 VOL2
brw-rw---- 1 oracle dba  253, 22 Oct  4 16:55 VOL14
brw-rw---- 1 oracle dba  253, 23 Oct  4 16:55 VOL13
brw-rw---- 1 oracle dba  253, 25 Oct  4 16:55 VOL12
brw-rw---- 1 oracle dba  253, 26 Oct  4 16:55 VOL11

If that were the case, then my asm.vols.lun file contents would have looked like this:
/dev/mapper/mpath1p1   
/dev/mapper/mpath13p1  
/dev/mapper/mpath14p1  
/dev/mapper/mpath15p1  
/dev/mapper/mpath16p1  
/dev/mapper/mpath17p1  
/dev/mapper/mpath2p1   
/dev/mapper/mpath3p1   
/dev/mapper/mpath4p1   
/dev/mapper/mpath5p1   
/dev/mapper/mpath9p1   
/dev/mapper/mpath10p1  
/dev/mapper/mpath11p1  
/dev/mapper/mpath12p1

Here are the command-line entries I used to execute the ORION test:
$ORACLE_HOME/bin/orion -run advanced -testname asm.vols -type seq -matrix point -num_large 6 -num_small 0 -duration 120  -write 0
Note: -num_large is typically set to the number of LUNs you have.
Here is the output from the command execution:
ORION: ORacle IO Numbers -- Version 11.1.0.7.0
asm.vols_20111004_1538
Test will take approximately 5 minutes
Larger caches may take longer

rwbase_run_test: rwbase_reap_req failed
rwbase_run_process: rwbase_run_test failed
rwbase_rwluns: rwbase_run_process failed
orion_warm_cache: Warming cache failed. Continuing

Note: (The failures are expected in this case.)
The following files were created:
oracle@serv1:/u01/app/oracle/orion> ll
total 45296
-rwxrwxrwx 1 oracle dba 46350238 Oct  4 14:31 orion_linux_x86-64*
drwxrwxr-x 3 oracle dba     4096 Oct  4 14:43 ../
-rw-r--r-- 1 oracle dba      157 Oct  4 15:22 asm.vols.lun
drwxrwxr-x 2 oracle dba     4096 Oct  4 15:38 ./
-rw-r--r-- 1 oracle dba     2651 Oct  4 15:40 asm.vols_20111004_1538_trace.txt
-rw-r--r-- 1 oracle dba      907 Oct  4 15:40 asm.vols_20111004_1538_summary.txt
-rw-r--r-- 1 oracle dba       41 Oct  4 15:40 asm.vols_20111004_1538_mbps.csv
-rw-r--r-- 1 oracle dba       24 Oct  4 15:40 asm.vols_20111004_1538_lat.csv
-rw-r--r-- 1 oracle dba       24 Oct  4 15:40 asm.vols_20111004_1538_iops.csv

The “summary.txt” file contains the overall i/o speed results:
user1a@serv1:/u01/app/oracle/orion> cat *summary*
ORION VERSION 11.2.0.2.0

Commandline:
-run advanced -testname asm.vols -type seq -matrix point -num_large 6 -num_small 0 -duration 120 -write 0

This maps to this test:
Test: asm.vols
Small IO size: 8 KB
Large IO size: 1024 KB
IO Types: Small Random IOs, Large Sequential Streams
Number of Concurrent IOs Per Stream: 4
Force streams to separate disks: No
Simulated Array Type: CONCAT
Write: 0%
Cache Size: Not Entered
Duration for each Data Point: 120 seconds
Small Columns:,      0
Large Columns:,      6
Total Data Points: 1

Name: /dev/oracleasm/disks/VOL1 Size: 1170202328064
Name: /dev/oracleasm/disks/VOL2 Size: 1170202328064
Name: /dev/oracleasm/disks/VOL3 Size: 1170202328064
Name: /dev/oracleasm/disks/VOL4 Size: 1170202328064
Name: /dev/oracleasm/disks/VOL5 Size: 1170202328064
Name: /dev/oracleasm/disks/VOL6 Size: 1170202328064
6 FILEs found.

Maximum Large MBPS=1574.57 @ Small=0 and Large=6

As you can see, I am getting about 1.5 GB/s.
The rule of thumb for these I/O subsystems is around 250 MB/s per LUN; therefore, for six LUNs that works out to 1.5 GB/s.

And I am happy, as that is what I got.
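
For completeness, if you also want small random-IO (IOPS) numbers rather than just sequential throughput, a run along these lines should work (the flag values here are illustrative, not ones I actually ran):

$ORACLE_HOME/bin/orion -run advanced -testname asm.vols -type rand \
  -matrix point -num_small 8 -num_large 0 -duration 120 -write 0

The results land in the same set of _summary.txt and .csv files, with the _iops.csv being the interesting one for a random-IO run.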
Thx

Monday, September 26, 2011

JUST SEEMS A LITTLE SLOW TO ME

Over the past week or two I just had a gut feeling that one of our databases' I/O subsystems was performing slower than it should have been. This is Oracle 11gR2 on Linux 64-bit using ASM with SAN storage (8 HBAs). The DASD is RAID 5 sliced into 1.1 TB LUNs. There are 15 of these LUNs presented to ASM as volumes (ASMLib was used).

When I found a window of lower utilization on the server we ran some "read" tests against the disks. We were getting 2.5 GB/s and peaked at 2.9 GB/s. That told me that raw throughput was not the source of my gut feeling.

Some digging revealed the following:
/dev/oracleasm/disks> ls -ltrfa
brw-rw---- 1 oracle dba 253, 28 Sep 26 10:04 VOL9
brw-rw---- 1 oracle dba 253, 30 Sep 26 10:04 VOL8
brw-rw---- 1 oracle dba 253, 32 Sep 26 10:04 VOL7
 brw-rw---- 1 oracle dba 253, 24 Sep 26 10:04 VOL6
brw-rw---- 1 oracle dba      8, 29 Sep 26 10:04 VOL5
brw-rw---- 1 oracle dba      8, 31 Sep 26 10:04 VOL4
brw-rw---- 1 oracle dba      8, 33 Sep 26 10:04 VOL3
brw-rw---- 1 oracle dba      8, 19 Sep 26 10:04 VOL2
brw-rw---- 1 oracle dba 253, 22 Sep 26 10:04 VOL14
brw-rw---- 1 oracle dba 253, 23 Sep 26 10:04 VOL13
brw-rw---- 1 oracle dba 253, 25 Sep 26 10:04 VOL12
brw-rw---- 1 oracle dba 253, 26 Sep 26 10:04 VOL11
brw-rw---- 1 oracle dba 253, 27 Sep 26 10:04 VOL10
brw-rw---- 1 oracle dba     8, 20 Sep 26 10:04 VOL1

Take note of the column with the numbers 253 and 8 in it. This is the device major number. A 253 (device-mapper) is a multipathed device; however, an 8 (plain SCSI disk) is a single-path device. We changed these all to 253s and now the world is a little better place.
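
A quick way to sanity-check this on similar boxes (a sketch; the awk field position assumes the usual ls -l layout for block devices) is to compare each ASMLib disk's major number against the device-mapper major registered in /proc/devices:

# Print the major number (5th field) and name of each ASMLib disk.
ls -l /dev/oracleasm/disks | awk '/^b/ {print $5, $NF}'

# Show which major number belongs to device-mapper (typically 253).
grep device-mapper /proc/devices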

Thursday, September 15, 2011

Beware if you are running RAC & ASM and are planning on upgrading to 11gR2

We are starting to use ACFS some, and while doing some reading I came across the following information in a Burleson comment and wanted to pass it along. Thx

ASM Cluster Filesystem (ACFS)

There is a serious vulnerability for Oracle RAC databases noted in the Oracle 11g Release 2 RAC documentation. This note indicates a scenario where it's possible for an entire RAC cluster to fail. This exposure is only for RAC databases that are using Automatic Storage Management (ASM).

When using RAC, Oracle has announced that the key clusterware files, the OCR and voting disks, can now be stored in Oracle Automatic Storage Management (ASM). However, research by expert DBAs indicates that it may well be premature to use this feature. According to Oracle's documentation, if a single node is lost, then the entire cluster may go down if that node happens to be the master node. The entire cluster only fails if the Oracle ASM instance on the OCR master node fails. If the majority of the OCR locations are in Oracle ASM and an OCR read or write access fails, then the crsd stops and the node becomes inoperative.

In order to identify the OCR master node, search the crsd.log file for the line -- "I AM THE NEW OCR MASTER" -- with the most recent time stamp.

Oracle has announced that new installations of 11g Release 2 (as opposed to upgrades) will not allow the installation of the OCR and voting disks on RAW. So the best solution at this time is to install these files on a shared, non-ASM file system such as OCFS2. Clusterware upgraded to 11g Release 2 may still be on RAW.

Monday, July 25, 2011

OUR FIRST BUG PATCH (11R2)

Well, it was bound to happen "sooner" rather than "later"; we hit our first bug with a newly installed 11.2.0.2 instance. She is installed on 64-bit Linux. This one is a wrong-results issue in the optimizer.
The bug report from Oracle is as follows:

Bug 10269193 - Wrong results with outer join and CASE expression optimization [ID 10269193.8]

Modified 27-MAY-2011 Type PATCH Status PUBLISHED

Bug 10269193 Wrong results with outer join and CASE expression optimization

This note gives a brief overview of bug 10269193.
The content was last updated on: 27-MAY-2011

Affects:
Product (Component) Oracle Server (Rdbms)
Range of versions believed to be affected Versions BELOW 12.1
Versions confirmed as being affected 11.2.0.2

Platforms affected Generic (all / most platforms affected)


Wrong results are possible with outer join and case expression optimization containing a ROWID column.

Rediscovery Notes:
1. Wrong Results OR an ORA-600 in Parallel Query
2. Outer Join
3. CASE Expression
4. ROWID column is involved in CASE Expression

This bug is possible even when ROWID column or CASE expression is not explicitly present in the query, because
* A ROWID column can be added during query transformation
(eg. non-native full outer join)
* CASE and ROWID column can be added during view merge

A 10053 trace may be needed to identify it.

Here is an example of what we saw:

If you make a query that (1) doesn’t select a case statement (2) from a left join with no matches, then the result is correct - no matches from the right table.
If you make a query that (1) selects a case statement (2) from a left join with no matches, then the result behaves as if there were actually matches in the right table.

On other, more reasonable databases:
If you write the same query (left join to a table with no matches), then you get the correct result regardless of whether or not there was a case statement in the select.

SELECT
CASE WHEN 1=1 THEN 'comment this out' END case_statement,
VAL_A should_be_A,
VAL_B should_be_null
FROM
(SELECT 'A' VAL_A from dual),
(SELECT 'B' VAL_B from dual)
WHERE
VAL_A = VAL_B(+);

SELECT
--CASE WHEN 1=1 THEN 'comment this out' END case_statement,
VAL_A should_be_A,
VAL_B should_be_null
FROM
(SELECT 'A' VAL_A from dual),
(SELECT 'B' VAL_B from dual)
WHERE
VAL_A = VAL_B(+);

The corrective patch set we applied was:
p10269193_112020_Linux-x86-64.zip

*** Since this was posted we have found that there is a workaround for this as well.
It appears that if you substitute a regular table where you have dual, it works correctly.
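
Here is a rough sketch of that workaround applied to the example above, using a hypothetical one-row table in place of DUAL (the table name is made up for illustration):

-- One-row helper table standing in for DUAL.
CREATE TABLE one_row_tab AS SELECT 1 AS dummy FROM dual;

SELECT
CASE WHEN 1=1 THEN 'comment this out' END case_statement,
VAL_A should_be_A,
VAL_B should_be_null
FROM
(SELECT 'A' VAL_A from one_row_tab),
(SELECT 'B' VAL_B from one_row_tab)
WHERE
VAL_A = VAL_B(+);

-- With the regular table in place of DUAL, should_be_null comes back NULL as expected.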
Thx

Wednesday, June 08, 2011

OUCH: Fixing Invalid Objects

One of our DBAs hit an interrupt today when trying to recompile invalid objects on one of the instances we support.

He said, “Please be careful when using the database upgrade assistant. After I ran:
dbua -silent -dbname $ORACLE_SID -oracleHome $ORACLE_HOME -sysDBAUserName SYS -sysDBAPassword $PASS -recompile_invalid_objects true

It put the database into RESTRICTED and UPGRADE modes, disconnecting all sessions, unfortunately. After I took the db out of restricted mode it still stayed in UPGRADE mode; the only remedy seemed to be to bounce the database." Which we did, and everything went back to "normal" (FYI, "normal" is just a setting on your clothes dryer) and all production jobs were restarted.

Thx

Thursday, March 24, 2011

Oracle Stops All Software Development For Intel Itanium Microprocessor

REDWOOD SHORES, Calif., March 22, 2011

After multiple conversations with Intel senior management Oracle has decided to discontinue all software development on the Intel Itanium microprocessor. Intel management made it clear that their strategic focus is on their x86 microprocessor and that Itanium was nearing the end of its life.

Both Microsoft and RedHat have already stopped developing software for Itanium. HP CEO Leo Apotheker made no mention of Itanium in his long and detailed presentation on the future strategic direction of HP.

Oracle will continue to provide customers with support for existing versions of Oracle software products that already run on Itanium.

The following URL is the response from HP.

http://www.hp.com/hpinfo/newsroom/press/2011/110323c.html?mtxs=rss-corp-news

Thursday, March 17, 2011

INACTIVE TOAD

We have a new policy regarding 3rd-party GUI tools due to resource usage. Here is the policy note:


EFFECTIVE IMMEDIATELY: All users who are using TOAD or some other tool to query the warehouse: if your session is NOT active, please LOG OFF. Do not let your session or sessions sit idle.

Over the last few days we’ve been running all kinds of stats to try to better understand how we can tune our system to support the increased volume being pumped through. We’ve learned that although your session may be idle and not doing anything, system resources are not being released and are staying engaged.

Please turn this process into a habit.

We can use ALL the system resources we can get.


Thanks in advance.

Tuesday, March 15, 2011

Bug 7581347

I currently support 44 Oracle instances on 40 different servers and that number is growing as new projects come along. The majority of these are 10gR2. A few are 11gR1 and one is 11gR2. We will be moving most of them if not all of them to 11g this calendar year. Yesterday one of the 10.2.0.4 instances residing on 64bit Red Hat Linux apparently hit a combination that opened a can of worms that I identified as Bug 7581347.

Here is the summary of the event:

An end user was attempting to create a temp table in a process.

Step 15: SQL> create table tmp_somename_tb_12345 as select ...
Step 15: create table tmp_somename_tb_12345 as select
Step 15: *
Step 15: ERROR at line 1:
Step 15: ORA-00955: name is already used by an existing object

ERROR at line 1:
Step 15: ORA-38301: can not perform DDL/DML over objects in Recycle Bin

First I had them try this statement:

drop table <table_name> purge;

Logged in as the table owner in question, and of course replacing <table_name> with the name of the table that was dropped and is now generating the error.

Result: Same errors generated.

Then I had the table owner issue a:

purge user_recyclebin;

Note: if this is issued, all the tables in that user's recycle bin will be purged and are therefore no longer recoverable with FLASHBACK TABLE ... TO BEFORE DROP.

After purging the user recycle bin and reissuing the statements they got the following errors:

ORA-00600: internal error code, arguments: [ktagetg0-1], [], [], [], [], [], [], []

This ORA-600 is pointing (as you might expect) toward space/spcmgmt support for DML locks and temporary tables associated with table access. And it stands for "Kernel Transaction Access Get Generic DML lock".

This is a bug (7581347). It is fixed in 10.2.0.5. The only workaround given by Oracle was: "Do not drop objects whilst DML is using them." Not too impressed with that one.

The row cache entry for the object seems to have been corrupted. There were only two things short of an upgrade that I could think of at that time:

1) Flushing the shared pool. (Note that there may be somewhat of a temporary negative impact on performance when you flush all the SQL from the shared pool.)

And if that does not work then:

2) Restart the database.

The idea here being to try and clear out the row cache.
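
Put together, the recovery sequence looks roughly like this (a sketch of the steps discussed above, not a script we ran verbatim):

-- As the table owner: clear the dropped objects that trigger the ORA-38301.
purge user_recyclebin;

-- As SYSDBA: try to clear the suspect row cache / shared pool entries.
alter system flush shared_pool;

-- If the ORA-600 [ktagetg0-1] persists, bounce the instance.
shutdown immediate
startup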

We ended up having to bounce the instance. And, keeping our fingers crossed, we haven't seen it again… yet.

Thx

Friday, March 11, 2011

A GREAT VIEW

The V$PWFILE_USERS view lists all users for an instance who have been granted the SYSDBA, SYSOPER, and/or SYSASM privileges. This information is derived from the password file. The view's contents are:
Column    Datatype       Description
USERNAME  VARCHAR2(30)   Name of the user that is contained in the password file
SYSDBA    VARCHAR2(5)    Indicates whether the user can connect with SYSDBA privileges (TRUE) or not (FALSE)
SYSOPER   VARCHAR2(5)    Indicates whether the user can connect with SYSOPER privileges (TRUE) or not (FALSE)
SYSASM    VARCHAR2(5)    Indicates whether the user can connect with SYSASM privileges (TRUE) or not (FALSE)

If you want to grant a user the sysdba privilege the following items need to be in place:
1) The remote_login_passwordfile parameter must be set to either "shared" or "exclusive". ("shared" means the password file can be used by multiple instances; "exclusive" means it can only be used by that single instance.) You can check the parameter setting by issuing the following statement:
SQL> show parameter password
NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
remote_login_passwordfile            string      EXCLUSIVE

If it is not set to "shared" or "exclusive", you can add/change the parameter in the init.ora file:
remote_login_passwordfile=EXCLUSIVE
2) You must have or create a password file using the orapwd utility:
$ orapwd file=<filename> password=<password> nosysdba=<y/n> entries=<users> force=<y/n> ignorecase=<y/n>

The file parameter is, as you would expect, the name of the file that will hold the password information. Unless you give it a path, the file location will default to your current directory.

The password is the password for the SYS user.

The nosysdba parameter identifies whether or not to disable SYSDBA logins.

The entries parameter is the maximum number of users that can be granted the SYSDBA or SYSOPER privilege.

The force parameter identifies whether or not to overwrite an existing password file.

The ignorecase parameter identifies whether or not the passwords are case-sensitive.

Note: The contents of the orapwd file are encrypted and are therefore unreadable. Also, if there is no orapwd password file and you attempt to grant sysdba to a user, it will fail and generate an ORA-01994 error (GRANT failed: cannot add user to public password file). An important change to be aware of is that prior to 11g passwords were not case-sensitive; since the introduction of 11g, passwords can be either case-sensitive or case-insensitive. The values are either required or optional as follows: file=required; password=optional; nosysdba=optional; entries=required; force=optional; ignorecase=optional. It is also important to remember that there can be no spaces on either side of the '=' (equals) sign.

One last thing: you can issue orapwd help=y to get the usage:
Usage: orapwd file=<fname> password=<password> entries=<users> force=<y/n>
ignorecase=<y/n> nosysdba=<y/n>

3) Grant sysdba to <username>; Grant succeeded.
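
Tying it together, a minimal end-to-end sketch might look like the following; the file name follows the usual $ORACLE_HOME/dbs/orapw<SID> convention and the grantee "appdba" is an illustrative placeholder:

$ orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=<sys_password> entries=10 force=y ignorecase=n

SQL> show parameter remote_login_passwordfile
SQL> grant sysdba to appdba;
SQL> select * from v$pwfile_users;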

Thursday, March 10, 2011

HISTORY – SYSDBA

In life it is part of the natural process for things to come and go; to age out and change over time. And so it is with "connect internal" (I kind of miss that one), and so shall it be with SYSDBA, at least in regard to ASM. Beginning with Oracle 11g, for ASM there is a new sheriff in town, and his name is SYSASM. This new role is a specialized version of SYSDBA for everything related to the ASM instance.

A couple of things to keep in mind. One: you can still use SYSDBA for starting up the ASM instance; however, when you do, a warning message is written to the alert log that says:

WARNING: Deprecated privilege SYSDBA for command ‘STARTUP’

Two: you cannot use the SYSASM role on an RDBMS instance. If you do, you will generate an ORA-01031: insufficient privileges error message.

Well for now you can still get away with using the SYSDBA role for ASM startup but in future versions it will go the way of ‘connect internal’.

Wednesday, March 09, 2011

ALL OR NONE

The night before last night we encountered the following on one of our instances (Oracle 11.1.0.7 on RedHat Linux 64bit OS):

sqlplus / as sysdba resulted in the following error message:
ORA-01031: insufficient privileges

After some troubleshooting I found that I could connect as follows:
sqlplus /nolog
Connect sys/<password> as sysdba

I brought down the rdbms instance and then I brought down the ASM instance and the server was rebooted. Post reboot I brought up the ASM instance with no issues and then proceeded to attempt to startup the rdbms instance. However, this failed with the following errors:

ORA-00210: cannot open the specified control file
ORA-00202: control file: '+DGRP_RAID5_01_0000/control03.ctl'
ORA-17503: ksfdopn:2 Failed to open file +DGRP_RAID5_01_0000/control03.ctl
ORA-15001: diskgroup "DGRP_RAID5_01_0000" does not exist or is not mounted
ORA-15055: unable to connect to ASM instance
ORA-01031: insufficient privileges

I connected to the ASM instance and confirmed that the diskgroup was mounted and that the controlfiles were really there:

select group_number,name,state from v$asm_diskgroup;

GROUP_NUMBER NAME STATE
------------------ ------------------------------ -----------
1 DGRP_RAID5_01_0000 MOUNTED

select a.name FILE_ALIAS, b.name DISKGROUP_NAME
from v$asm_alias a, v$asm_diskgroup b
where a.group_number=b.group_number
order by a.name;

FILE_ALIAS DISKGROUP_NAME
---------------------------------------------------------------------- ------------------------
CONTROLFILE DGRP_RAID5_01_0000
CONTROLFILE DGRP_RAID5_01_0000
CONTROLFILE DGRP_RAID5_01_0000


I then traced it down to an entry in the sqlnet.ora file:
SQLNET.AUTHENTICATION_SERVICES= (NONE)

I changed this to:
SQLNET.AUTHENTICATION_SERVICES= (ALL)

I was then able to log in with sqlplus / as sysdba as expected and start the rdbms instance with no issues.

Tuesday, March 08, 2011

QUIESCE

Apparently this morning, as part of a rebuild of the OEM repository (11.1.0.7.0 on RedHat Linux 64-bit), a queue table was dropped (…DROP_QUEUE_TABLE(queue_table => 'SYSMAN.MGMT_NOTIFY_QTABLE', force => TRUE);) so that the SYSMAN account and management objects could be dropped and rebuilt. The database went into quiesce restricted mode and then came out as expected; however, for some reason the alert log shows that it went back into the quiesce state:

Tue Mar 08 08:59:13 2011
Database in quiesce mode
Database out of quiesce mode
Tue Mar 08 09:14:51 2011
Database in quiesce mode

This of course caused all sqlplus session connection attempts to hang. After issuing an ALTER SYSTEM UNQUIESCE; the database went back to normal as expected.

The ALTER SYSTEM QUIESCE RESTRICTED SQL statement is used to quiesce the database. While the instance is in this mode of operation, only SYS and SYSTEM can keep working; all other users' connection attempts will hang and wait. If another SQL*Plus session is already connected, it will wait (hang) as well until the instance is unquiesced. Therefore no other users besides SYS and SYSTEM can start any new transactions, new queries, etc. Only the SYS and SYSTEM users can execute the ALTER SYSTEM QUIESCE RESTRICTED SQL statement.

Also, while the instance is in the quiesced state, no changes to the Resource Manager plan(s) can be made for the instance. And as noted earlier, the ALTER SYSTEM UNQUIESCE SQL statement can be issued to put the instance back into its normal mode of operation, thus releasing all blocked activities.

A handy view to be aware of is the v$blocking_quiesce view. You can use it to help determine which sessions are blocking a quiesce from finishing.
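
For example, something along these lines (a sketch; v$blocking_quiesce exposes just the SID, so it is joined to v$session for the details) shows who is holding the quiesce up:

select bq.sid, s.username, s.status, s.program
from v$blocking_quiesce bq, v$session s
where bq.sid = s.sid;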

The following link is to an Oracle document that gives more detail about quiesce:
http://download.oracle.com/docs/cd/B28359_01/server.111/b28310/start004.htm

Thursday, March 03, 2011

Oracle Client Character MisMatch

One of my end users just installed an Oracle 10gR2 client on their laptop and got the following error when trying to connect to one of my instances (Oracle 11.1.0.7.0 running on Red Hat Linux; the database has a UTF8 character set to support the metadata repository for the Unica Campaign tool): "Database character set (UTF8) and Client character set (WE8MSWIN1252) are different. Character set conversion may cause unexpected results."
So I told her to do the following:

Setting NLS_LANG
You can set the NLS_LANG environment variable to control globalization features.
Example 12-1 Configuring UTF8 Support in SQL*Plus on Windows
1. Ensure you have exited your current SQL*Plus session.
2. Open System from Start > Settings > Control Panel.
3. Click the Advanced tab and select Environment Variables.
4. Create a new environment variable, NLS_LANG, with a value of AMERICAN_AMERICA.UTF8.
5. You may need to restart Windows for this setting to take effect.
Thx

Thursday, February 24, 2011

SOME ALIASES I USE

Here are some aliases I like to set up in my .profile or .bash_profile files:

alias oal='tail -500f $ORACLE_BASE/admin/$ORACLE_SID/bdump/alert_$ORACLE_SID.log' --Last 500 Lines from the Oracle Alert Log

alias otrace='cd $ORACLE_BASE/admin/$ORACLE_SID/bdump' --Shortcut to CD to the Oracle Trace file directory.

alias init='cd $ORACLE_BASE/admin/$ORACLE_SID/pfile' --Shortcut to CD to the pfile directory for the Oracle Initialization file.

alias dba='cd /home/<username>/dba' --Shortcut to CD to my DBA Scripts directory.

alias tns='cd $ORACLE_HOME/network/admin' --Shortcut to CD to the tnsnames.ora file directory.

alias psg='ps -ef |grep' --Shortcut to look at certain system processes. Use: psg <process_name>

alias ll='ls -ltarF' --List out in timestamp order (oldest to newest), including hidden files, and '/' on directories.

alias oadmin='cd $ORACLE_HOME/rdbms/admin' --Shortcut to CD to the Oracle rdbms/admin directory.

alias lsd='ls -ltr | grep drw' --List only directories. Very handy with directories containing a lot of files.

Wednesday, February 23, 2011

DBUA from 10gR2 to 11gR2 Issue

One of my fellow DBAs hit this snag recently upgrading from 10gR2 to 11gR2: "I used Xming and Putty and was able to install everything for Oracle 11gR2 and upgrade ASM to the new Grid Infrastructure just fine. Installed Oracle 11gR2. The problem was when I started to use the DBUA to upgrade my database from 10gR2 to 11gR2: DBUA showed me all the panels until I got to the very last panel where you would actually launch the upgrade process, and it stopped working. No processes were running on the server and there was no information on the screen to indicate the steps were processing. I let it stay this way for over an hour. Oracle (SR) said I had to use another X-windows tool.

I used my trusty Tru64 server I have at my desk and it worked fine. The progress bar started displaying within one second."

A good FYI.
Thx

Thursday, February 17, 2011

A HIERARCHICAL APPROACH TO FILE-SYSTEM PRIVILEGE MANAGEMENT

Whenever I am setting up a new Unix or Linux server, I like to take into consideration the types of users that will be utilizing the box and divide them up into appropriate groups by function. Once this is complete, I then determine the level of access each of these user groups should have in regard to the others. For example:

I have identified the following types of user categories:
Customer/Client, Testers, Production executors, Developers, Database Administrators, System Administrators and Backup Support Team Members.

Now I will break them down into a privilege hierarchical arrangement (this arrangement is managed by using Unix groups). The groups would be as follows:

1. cust = customer
2. testr = testers
3. prod = production executors
4. dev = developers
5. dba = database administrators
6. sysa = system administrators
7. bkup = backup team members

Therefore each user would receive as their primary group the group which they are a part of. That is to say, as a DBA I would have dba set as my user’s primary group.

The next step is to assign the secondary groups:
1. A customer would have a primary group of cust and no secondary groups.
2. A tester would have a primary group of testr and a secondary group of cust.
3. A production executor would have a primary group of prod and secondary groups of testr and cust.
4. A developer would have a primary group of dev and secondary groups of prod, testr and cust.
5. A Database Administrator would have a primary group of dba and secondary groups of dev, prod, testr and cust.
6. A system administrator would have a primary group of sysa and root-level access.
7. A backup team member would have primary group of bkup and would have no secondary groups.
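
As a rough sketch of how this maps onto Linux commands (the user names are made up; the group names are from the list above):

# Create the groups.
groupadd cust; groupadd testr; groupadd prod; groupadd dev; groupadd dba; groupadd sysa; groupadd bkup

# A developer: primary group dev, secondary groups prod, testr, and cust.
useradd -g dev -G prod,testr,cust jdoe_dev

# A DBA: primary group dba, secondary groups dev, prod, testr, and cust.
useradd -g dba -G dev,prod,testr,cust jdoe_dba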

Wednesday, February 16, 2011

BLOCK CORRUPTION!

We have been running into a block corruption error on a 10.2.0.4 Oracle instance running on RedHat Linux (ORA-600 [kddummy_blkchk], [129], [5850]). The other day it hit us on an insert again due to a bad block error. This was the second week in a row we encountered this for the same process. The insert had run for 4 hours and died on the commit! After some root-cause-analysis (RCA) it seems that this issue is pointing to a table logical error. There seems to be a problem with that table, not the physical structure. We were thinking that this table may have gotten created with an invalid index or something. This problem has now occurred in 3 different environments (dev, test, prod) for the same table object. The table is inserted into every weekend after a truncate is issued. However, if we do a CTAS and then an insert using only the temp table created in the CTAS, it works fine without encountering the bad block error.

We opened an SR with Oracle and they said pretty much the same thing for the workaround; here is what they said about it:

ODM Action Plan Oracle Support
=== ODM Action Plan ===

In order to recreate the table you need to disable DB_BLOCK_CHECKING because otherwise it will fail with the ORA-600 error. We are removing one check layer in order to be able to recreate the table. We know there is a block corruption, but we also know it has not affected any data in the table, so we can do it safely.

Process should be:
1.-Disable DB_BLOCK_CHECKING
2.-Recreate the table
3.-Re-enable DB_BLOCK_CHECKING

On the other hand, the steps you can follow in order to fix the corruption are
DROP TABLE .. PURGE;
or
ALTER TABLE .. MOVE ..;
or
Create table as select(CTAS)
or
export/import,etc

You don't need to follow all steps but only one.

Since we started using the CTAS workaround we are no longer receiving the error. Here are the steps we are using now for this object:

Drop table tabone;
CTAS tmp_tabone;
Rename tmp_tabone to tabone;

Here is a nice little piece of code if you just want to find out which object is corrupt (replace <file_id> and <block_id> with the values from the error):

select tablespace_name, segment_type, owner, segment_name
from dba_extents
where file_id = <file_id>
and <block_id> between block_id and block_id + blocks - 1;
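
If you would rather sweep the whole database for corrupt blocks instead of working backwards from a single error, RMAN plus V$DATABASE_BLOCK_CORRUPTION is another route (a sketch, not something from this particular incident):

RMAN> backup validate check logical database;

SQL> select file#, block#, blocks, corruption_type
     from v$database_block_corruption;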

Tuesday, February 15, 2011

RMAN 11G Issue: ORA-20446 and ORA-06512

Ran into this one today when trying to set up a new tablespace backup via RMAN in a newly installed Oracle 11.1.0.7 instance on RedHat Linux:
ORA-20446: The owner of the job is not registered
ORA-06512: at "SYSMAN.MGMT_JOBS", line 168
ORA-06512: at "SYSMAN.MGMT_JOBS", line 86
ORA-06512: at line 1

It turns out that this is a known bug in 11g, with a simple workaround:
Login as SYSMAN and execute the following command for the users in question:

EXECUTE MGMT_USER.MAKE_EM_USER('<username>');

Wednesday, February 09, 2011

NetBackup S/W Upgrade Required for Upgrade to Oracle 11R2

Some versions of NetBackup will not run with Oracle 11gR2 unless you update to at least 6.5.6 or 7.

From Oracle ORA-07445 [Strcpy()+48] during RMAN backup or restore to tape using Oracle 11.2 [ID 959015.1]

Metalink note 959015.1

The latest version of NetBackup is not compatible with 11gR2; an upgrade to NetBackup will be provided by Veritas when they have tested and released the newer version. This issue can be seen during either a backup or a restore.

Information available on Veritas website regarding this issue:

Document ID: 337527
RMAN-10038 error when attempting to backup Oracle version 11.2 using NetBackup 6.x.
URL: http://support.veritas.com/docs/337527

Monday, February 07, 2011

GENERAL SQL PREFERRED USAGE

1. If it is unnecessary to read an entire table and a suitable index is available, ensure that it is being used.
a. Typically speaking a concatenated index on last-name and first-name for example is faster than separate indexes on last-name and first-name.
i. Generally, the more columns that are specified in a concatenated index, the more selective it becomes, which tends to improve its performance. For instance you could add dob to the example concatenated index.
b. Oracle cannot use an index to search for NULL values; however it can use an index to find values which are NOT NULL.
c. An index which contains the columns in the ORDER BY clause and the columns in the SELECT list will provide pretty good performance for ‘first row’ and ‘all rows’. However, with an index on the ORDER BY columns alone, the speed will be pretty good for ‘first row’, but it will often be slower than a full table scan when retrieving ‘all rows’.

2. When using the LIKE operator with a wildcard ‘%’, the placement of that wildcard can make a difference in the amount of I/O required to answer the query.
a. With a full table scan there is no difference in the I/O required to satisfy (LIKE ‘TIM%’) versus (LIKE ‘%ME’).
b. If (LIKE ‘TIM%’) is used with an index it will be much faster than a full table scan.
c. If (LIKE ‘%ME’) is used, the leading wildcard prevents an index range scan, so it will generally be much slower than (LIKE ‘TIM%’) used with an index.

3. When considering JOIN performance, HASH JOINs tend to be fastest, then SORT-MERGE joins, and lastly NESTED LOOPS, at least for large row sources.

4. A JOIN is usually faster than an IN as a JOIN gives the optimizer more choices for access paths.

5. If applicable, the START WITH clause is much faster than a WHERE clause in eliminating rows in a hierarchical query.

6. A sub-query with an EXISTS will beat a sub-query with an IN which will beat an equivalent JOIN.
a. When using an EXISTS with a Correlated Sub-query it is best if the sub-query can be satisfied using an index lookup only.

7. When you use COUNT(*) it performs a full table scan; if the table has an index on a NOT NULL column then you can use COUNT(<indexed_column>) to reduce the count time substantially on large tables.
a. If a NOT NULL indexed column is not available then it is better to use COUNT(*) rather than COUNT(<column>) in regard to speed.

8. You can use an index to resolve a GROUP BY if the index contains the columns in the GROUP BY list, and all the columns are aggregated in the SELECT list.

9. Using WHERE instead of HAVING (where possible of course) reduces I/O requirements by eliminating rows before they are grouped.

10. If you don’t need to get rid of duplicate rows in a UNION operation that you are performing, you should use UNION ALL instead of UNION to avoid sorts.

11. Hash joins tend to perform much better than INTERSECTs; therefore you should consider recoding INTERSECTs into HASH JOINs.

12. In the same fashion you should consider recoding a MINUS into an ANTI-JOIN using the HASH_AJ hint.

13. Where applicable (removing all rows) TRUNCATE TABLE is preferred over a DELETE.

14. Remember that as great as indexes can be, they also add overhead to INSERT, DELETE and some UPDATE’s. Therefore avoid over-indexing, especially on columns which are frequently changed.

15. Parallel SQL can be utilized to achieve great performance gains. The following are types of statements that can be parallelized:
a. SQL queries (especially if they contain large full table scans).
b. Building or the rebuilding of indexes.
c. Creating Tables as SELECT (CTAS).
d. An UPDATE, DELETE or INSERT statement.

16. Trying to accomplish a lot within one complex query tends to be much slower in Oracle than breaking it down via temporary tables that are cleaned up once the result set is achieved.

17. When DELETE’s are required on large tables, if possible use a CTAS to create the table without the rows which would have been deleted, thereby greatly reducing the overall time that would have been incurred by the DELETE.
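
To illustrate item 17, a CTAS-based "delete" might be sketched like this (the table, column, and value are hypothetical):

-- Keep only the rows you want, instead of deleting the rest.
create table big_tab_keep as
  select * from big_tab where status <> 'OBSOLETE';

drop table big_tab purge;
rename big_tab_keep to big_tab;

-- Remember to re-create indexes, constraints, and grants on the new table.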

Friday, February 04, 2011

OUI May Deinstall Unintended Files

One of my co-workers encountered the following issue earlier today:

This is just a heads up that Oracle OUI may deinstall unintended files. I used 11.1.0.7 OUI to deinstall Oracle 10.1 & 10.2 client and server installations today, and soon after these errors appeared –
ORA-12154: TNS:could not resolve the connect identifier specified

The deinstaller removed the tnsnames.ora, sqlnet.ora & listener.ora files in the active $ORACLE_HOME/network/admin directory even though TNS_ADMIN was set correctly. The missing files had to be extracted from the tar backup and restored to the correct location. I did not find this specific issue in My Oracle Support, but did find some similar problems reported for Oracle 11.2. Support request was opened.

Thx

Tuesday, January 25, 2011

How To Tell If My Linux O/S is 64bit or 32bit

From the command prompt, issue the "uname -m" command. If the output is i386 or i686 then it's 32-bit; if it's x86_64 then it's 64-bit.


$>uname -m
i686

Monday, January 24, 2011

A Tar Gzip Split

The issue:

We had a 350 GB text file that was gzip’ed down to 115 GB that needed to be moved from one of our Linux servers to a client’s Linux server. However, as usual, they wanted it there yesterday. So to speed this movement up I wanted to split the file so that it could be sftp’ed in parallel. Of course unzipping, splitting, and re-zipping would require a lot of time, not to mention space that was not available. What to do…

The answer:

Here is what I did.

First I split the gzip’ed file using the following command (split's default output names are xaa, xab, and so on):

split -b 100m /u1/directoryname/somefilename.gz

This split the zipped source file into 11 pieces, allowing them to be sftp’ed in parallel and thereby greatly reducing the overall transfer time. Once the 11 files were on the target server, they were put back together and then uncompressed.

cat x* > somefilename.gz (note: it is important to use x* here; if you try to hand-order the reconstruction of the files, the resulting gzip’ed file can end up corrupt).

Then this new file must be unzipped:

gunzip somefilename.gz

I got no errors, and a quick head and wc -l on the file showed everything to be fine.
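
For the record, the whole round trip, with a checksum added to prove the reassembled file matches, looks roughly like this (the paths and chunk size are illustrative):

# Source server: split the gzip'ed file into chunks (default piece names xaa, xab, ...).
split -b 100m /u1/directoryname/somefilename.gz
md5sum /u1/directoryname/somefilename.gz     # note the checksum before transfer

# sftp the x* pieces in parallel, then on the target server:
cat x* > somefilename.gz
md5sum somefilename.gz                       # should match the source checksum
gunzip somefilename.gz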
Thx

Friday, January 21, 2011

Fatal NI connect error 12170.

I recently installed an 11gR1 instance on a Linux box and after creating the database instance started getting the following error a lot in the alert log:

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.7.0 - Production
Time: 21-JAN-2011 09:26:53
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=xx.xxx.x.xx)(PORT=xxxx))
WARNING: inbound connection timed out (ORA-3136)

This message is easily corrected by following the instruction in the Oracle document on Metalink:
Net Diagnostic Data Found in 11g Alert Log [ID 745167.1]

The boiled-down solution is to add the following parameter to your sqlnet.ora file to disable a new 11g feature that defaults to ON:

DIAG_ADR_ENABLED=OFF

Tuesday, January 11, 2011

FLUSH BEFORE TESTING

When testing, it can be important to ensure that you are not running your SQL from cache, as this may give you false results in regard to timing. You wouldn’t want to find yourself in production saying, “well, it ran fast in test”. Anytime you are performing any type of performance test it is important to try to duplicate your production environment as closely as you can. This is even more important when you are talking about disk reads (very time-consuming in relation to memory reads).

So you are doing testing and want to know how you can flush the buffer cache within Oracle without having to bounce the database instance?

The answer is to Flush the data buffer cache.

alter system flush buffer_cache;

This is for testing; it wouldn’t be a very good idea to execute on a production system, as it could have an adverse effect on performance.

A QUICK LOOK AT WHY

The other day I got the following email:

***********
Sam,
Can you please take a look at this query? It’s been running twice as long as expected:

SID
738 aname bname SQL*Plus @@corvette 15,051 776.06 08-JAN-11 02:40:23 32 02:55:15 136
insert /*+ append */ into tmp_somename_tb select scheme. Id…

Thanks!
***********

Here is the quick pass I performed to see what was going on.

First the summary I gave:
As expected, the insert is doing heavy direct path reads to temp and then, once those slave buckets are full, it does heavy direct path writes to temp. During the reads the box load is light (9 and 10), the CPU is light (51% used: 47% user and 4% system), the run queue is not bad (4, 5); however, the CPUs are waiting on I/O (25, 28). During the writes the box load is very high (126 and 152), the CPU is heavy (99% used: 94% user and 5% system), the run queue is very high (116, 122); the CPUs are not waiting on I/O (0).

Now, where I got it from:
Direct path reads/writes to temp:
Within Grid Control, under Top Activity, I selected the SQL ID in question; in the details section, the summary block's color codes show the direct path reads/writes to temp.

Box Load:
At the top of ‘top’: load average: 9.03, 10.93, 14.60

CPU: Also from ‘top’

Run Queue:
sfulke@corvette[gmdwhp]:/home/sfulke> vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
11 18 900 14836684 276388 4334588 0 0 37171 4913 0 0 32 7 49 11
9 22 900 14893188 276412 4334564 0 0 671667 54523 10845 18386 59 11 3 27
28 11 900 14684604 276424 4334552 0 0 624320 68223 7881 12968 48 8 14 30
33 13 900 14683636 276444 4334532 0 0 680810 320344 12198 14958 66 13 9 12
20 14 900 14682612 276492 4334484 0 0 739581 278236 10658 15601 57 12 14 17
sfulke@corvette[gmdwhp]:/home/sfulke>

The run queue is the first column (‘r’) of the vmstat output (remember to disregard the first line of output, as it is cumulative since boot).

The last column of the vmstat output, ‘wa’, is the CPU time spent waiting on I/O.


Here is a little more detail for each of the columns from the vmstat output:
Processes:
r: Processes waiting in line for some time on the CPU
b: Uninterruptible sleeping processes

Memory:
swpd: Virtual memory usage
free: Idle memory usage
buff: Buffer Memory
cache: Cache Memory

Swap:
si: Memory swapped in from disk
so: Memory swapped out to disk

IO:
bi: Blocks received from hdd
bo: Blocks sent to hdd

System:
in: Number of interrupts per second
cs: Number of context switches per second

CPU:
us: Time used executing non-kernel code. (For example: User Time items like ‘nice’ – actual work)
sy: Time used executing kernel code. (For example: System Time items like ‘IO interrupts’- work management)
id: Time in idle state.
wa: Time spent waiting for IO.

Have a great day!