CRS and 10g Real Application Clusters



PURPOSE
-------------

This document is to provide additional information on CRS (Cluster Ready Services)
in 10g Real Application Clusters.


SCOPE & APPLICATION
---------------------------------

This document is intended for RAC Database Administrators and Oracle support
enginneers.


CRS and 10g REAL APPLICATION CLUSTERS
-------------------------------------------------------------

CRS (Cluster Ready Services) is a new feature for 10g Real Application Clusters
that provides a standard cluster interface on all platforms and performs
new high availability operations not available in previous versions. 

CRS KEY FACTS
-------------

Prior to installing CRS and 10g RAC, there are some key points to remember about
CRS and 10g RAC:

- CRS is REQUIRED to be installed and running prior to installing 10g RAC.

- CRS can either run on top of the vendor clusterware (such as Sun Cluster,
  HP Serviceguard, IBM HACMP, TruCluster, Veritas Cluster, Fujitsu Primecluster,
  etc...) or can run without the vendor clusterware.  The vendor clusterware
  was required in 9i RAC but is optional in 10g RAC.

- The CRS HOME and ORACLE_HOME must be installed in DIFFERENT locations.

- Shared Location(s) or devices for the Voting File and OCR (Oracle
  Configuration Repository) file must be available PRIOR to installing CRS.  The
  voting file should be at least 20MB and the OCR file should be at least 100MB.

- CRS and RAC require that the following network interfaces be configured prior
  to installing CRS or RAC:
  - Public Interface
  - Private Interface
- The root.sh script at the end of the CRS installation starts the CRS stack.
  If your CRS stack does not start, see Note 240001.1.

- Only one set of CRS daemons can be running per RAC node.

- On Unix, the CRS stack is run from entries in /etc/inittab with "respawn".

- If there is a network split (nodes loose communication with each other).  One
  or more nodes may reboot automatically to prevent data corruption.

- The supported method to start CRS is booting the machine.  MANUAL STARTUP OF
  THE CRS STACK IS NOT SUPPORTED UNTIL 10.1.0.4 OR HIGHER.
- The supported method to stop is shutdown the machine or use "init.crs stop".

- Killing CRS daemons is not supported unless you are removing the CRS
  installation via Note 239998.1 because flag files can become mismatched.

- For maintenance, go to single user mode at the OS.

Once the stack is started, you should be able to see all of the daemon processes
with a ps -ef command:

[rac1]/u01/home/beta> ps -ef | grep crs

oracle  1363   999  0 11:23:21 ?  0:00 /u01/crs_home/bin/evmlogger.bin -o /u01
oracle   999     1  0 11:21:39 ?  0:01 /u01/crs_home/bin/evmd.bin
root    1003     1  0 11:21:39 ?  0:01 /u01/crs_home/bin/crsd.bin
oracle  1002     1  0 11:21:39 ?  0:01 /u01/crs_home/bin/ocssd.bin


CRS DAEMON FUNCTIONALITY
--------------------------------------------

Here is a short description of each of the CRS daemon processes:

CRSD:
- Engine for HA operation
- Manages 'application resources'
- Starts, stops, and fails 'application resources' over
- Spawns separate 'actions' to start/stop/check application resources
- Maintains configuration profiles in the OCR (Oracle Configuration Repository)
- Stores current known state in the OCR.
- Runs as root
- Is restarted automatically on failure

OCSSD:
- OCSSD is part of RAC and Single Instance with ASM
- Provides access to node membership
- Provides group services
- Provides basic cluster locking
- Integrates with existing vendor clusteware, when present
- Can also runs without integration to vendor clustware
- Runs as Oracle.
- Failure exit causes machine reboot. 
--- This is a feature to prevent data corruption in event of a split brain.

EVMD:
- Generates events when things happen
- Spawns a permanent child evmlogger
- Evmlogger, on demand, spawns children
- Scans callout directory and invokes callouts.
- Runs as Oracle.
- Restarted automatically on failure

CRS LOG DIRECTORIES
--------------------------------

When troubleshooting CRS problems, it is important to review the directories
under the CRS Home.

$ORA_CRS_HOME/crs/log - This directory includes traces for CRS resources that are
joining, leaving, restarting, and relocating as identified by CRS.

$ORA_CRS_HOME/crs/init - Any core dumps for the crsd.bin daemon should be written
here.  Note 1812.1 could be used to debug these.

$ORA_CRS_HOME/css/log - The css logs indicate all actions such as
reconfigurations, missed checkins , connects, and disconnects from the client
CSS listener . In some cases the logger logs messages with the category of
(auth.crit) for the reboots done by oracle. This could be used for checking the
exact time when the reboot occured.

$ORA_CRS_HOME/css/init - Core dumps from the ocssd primarily and the pid for the
css daemon whose death is treated as fatal are located here. If there are
abnormal restarts for css then the core files will have the formats of
core..  Note 1812.1 could be used to debug these.

$ORA_CRS_HOME/evm/log - Log files for the evm and evmlogger daemons.  Not used
as often for debugging as the CRS and CSS directories.

$ORA_CRS_HOME/evm/init - Pid and lock files for EVM.  Core files for EVM should
also be written here.  Note 1812.1 could be used to debug these.

$ORA_CRS_HOME/srvm/log - Log files for OCR.


STATUS FOR CRS RESOURCES
------------------------

After installing RAC and running the VIPCA (Virtual IP Configuration Assistant)
launched with the RAC root.sh, you should be able to see all of your CRS
resources with crs_stat.  Example:

cd $ORA_CRS_HOME/bin
./crs_stat

NAME=ora.rac1.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac1.oem
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac1.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac1.vip
TYPE=application
TARGET=ONLINE          
STATE=ONLINE

NAME=ora.rac2.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac2.oem
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac2.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE

NAME=ora.rac2.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE

There is also a script available to view CRS resources in a format that is
easier to read.  Just create a shell script with:

--------------------------- Begin Shell Script -------------------------------

#!/usr/bin/ksh
#
# Sample 10g CRS resource status query script
#
# Description:
#    - Returns formatted version of crs_stat -t, in tabular
#      format, with the complete rsc names and filtering keywords
#   - The argument, $RSC_KEY, is optional and if passed to the script, will
#     limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
#   - $ORA_CRS_HOME should be set in your environment

RSC_KEY=$1
QSTAT=-u
AWK=/usr/xpg4/bin/awk    # if not available use /usr/bin/awk

# Table header:echo ""
$AWK \
  'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
          printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'

# Table body:
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
'BEGIN { FS="="; state = 0; }
  $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
  state == 0 {next;}
  $1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
  $1~/STATE/ && state == 2 {appstate = $2; state=3;}
  state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'

--------------------------- End Shell Script -------------------------------

Example output:

[opcbsol1]/u01/home/usupport> ./crsstat

HA Resource            Target     State            
-----------            ------     -----            
ora.V10SN.V10SN1.inst  ONLINE     ONLINE on opcbsol1
ora.V10SN.V10SN2.inst  ONLINE     ONLINE on opcbsol2        
ora.V10SN.db           ONLINE     ONLINE on opcbsol2
ora.opcbsol1.ASM1.asm  ONLINE     ONLINE on opcbsol1
LISTENER_OPCBSOL1.lsnr ONLINE     ONLINE on opcbsol1
ora.opcbsol1.gsd       ONLINE     ONLINE on opcbsol1
ora.opcbsol1.ons       ONLINE     ONLINE on opcbsol1
ora.opcbsol1.vip       ONLINE     ONLINE on opcbsol1
ora.opcbsol2.ASM2.asm  ONLINE     ONLINE on opcbsol2
LISTENER_OPCBSOL2.lsnr ONLINE     ONLINE on opcbsol2
ora.opcbsol2.gsd       ONLINE     ONLINE on opcbsol2
ora.opcbsol2.ons       ONLINE     ONLINE on opcbsol2
ora.opcbsol2.vip       ONLINE     ONLINE on opcbsol2


CRS RESOURCE ADMINISTRATION
---------------------------

You can use srvctl to manage these resources.  Below are syntax and examples.

-------------------------------------------------------------------------------

CRS RESOURCE STATUS

srvctl status database -d [-f] [-v] [-S ]
srvctl status instance -d -i >[,]
       [-f] [-v] [-S ]
srvctl status service -d -s [,]
       [-f] [-v] [-S ]
srvctl status nodeapps [-n ]
srvctl status asm -n

EXAMPLES:

Status of the database, all instances and all services. 
srvctl status database -d ORACLE -v
Status of named instances with their current services. 
srvctl status instance  -d ORACLE -i RAC01, RAC02 -v
Status of a named services.
srvctl status service -d ORACLE -s ERP  -v
Status of all nodes supporting database applications.
srvctl status node

-------------------------------------------------------------------------------

START CRS RESOURCES

srvctl start database -d [-o < start-options>]
       [-c | -q]
srvctl start instance -d -i
       [,] [-o ] [-c | -q]
srvctl start service -d [-s [,]]
       [-i ]  [-o ] [-c | -q]
srvctl start nodeapps -n
srvctl start asm -n [-i ] [-o ]

EXAMPLES:

Start the database with all enabled instances. 
srvctl start database -d ORACLE
Start named instances. 
srvctl start instance  -d ORACLE -i RAC03, RAC04
Start named services.  Dependent instances are started as needed.
srvctl start service -d ORACLE -s CRM
Start a service at the named instance.
srvctl start  service -d ORACLE -s CRM -i RAC04
Start node applications.
srvctl start  nodeapps -n myclust-4

-------------------------------------------------------------------------------

STOP CRS RESOURCES

srvctl stop database -d [-o ] 
       [-c | -q]
srvctl stop instance -d -i [,]
       [-o ][-c | -q]
srvctl stop service -d [-s [,]]
       [-i ][-c | -q] [-f]
srvctl stop nodeapps -n
srvctl stop asm -n [-i ] [-o ]

EXAMPLES:

Stop the database, all instances and all services. 
srvctl stop database -d ORACLE
Stop named instances, first relocating all existing services. 
srvctl stop instance  -d ORACLE -i RAC03,RAC04
Stop the service.
srvctl stop service -d ORACLE -s CRM
Stop the service at the named instances. 
srvctl stop  service -d ORACLE -s CRM -i RAC04
Stop node applications.  Note that instances and services also stop.
srvctl stop  nodeapps -n myclust-4

-------------------------------------------------------------------------------

ADD CRS RESOURCES

srvctl add database -d -o [-m ] [-p ]
       [-A /netmask] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY}]
       [-s ] [-n ]
srvctl add instance -d -i -n
srvctl add service -d -s -r
       [-a ] [-P ] [-u]
srvctl add nodeapps -n -o
       [-A /netmask[/if1[|if2|...]]]
srvctl add asm -n -i -o

OPTIONS:

-A vip range, node, and database, address specification. The format of
        address string is:
[]//[/
        host interface2 |..]>] [,] []//
        [/]
-a for services, list of available instances, this list cannot include
        preferred instances
-m domain name with the format "us.mydomain.com"
-n node name that will support one or more instances
-o $ORACLE_HOME to locate Oracle binaries
-P for services, TAF preconnect policy - NONE, PRECONNECT
-r for services, list of preferred instances, this list cannot include
        available instances.
-s spfile name
-u updates the preferred or available list for the service to support the
        specified instance. Only one instance may be specified with the -u
        switch.  Instances that already support the service should not be
        included.

EXAMPLES:

Add a new node:
srvctl add nodeapps -n myclust-1 -o $ORACLE_HOME  –A 
        139.184.201.1/255.255.255.0/hme0
Add a new database. 
srvctl add  database  -d ORACLE -o $ORACLE_HOME
Add named instances to an existing database. 
        srvctl add instance -d ORACLE -i RAC01 -n myclust-1
        srvctl add instance -d ORACLE -i RAC02 -n myclust-2
        srvctl add instance -d ORACLE -i RAC03 -n myclust-3
Add a service to an existing database with preferred instances (-r) and
available instances (-a).  Use basic failover to the available instances.
srvctl add service -d ORACLE -s STD_BATCH -r RAC01,RAC02 -a RAC03,RAC04
Add a service to an existing database with preferred instances in list one and
available instances in list two. Use preconnect at the available instances.
srvctl add service -d ORACLE -s STD_BATCH -r RAC01,RAC02 -a RAC03,RAC04  -P PRECONNECT

-------------------------------------------------------------------------------

REMOVE CRS RESOURCES

srvctl remove database -d  
srvctl remove instance  -d [-i ]
srvctl remove service -d -s [-i ] 
srvctl remove nodeapps -n

EXAMPLES:

Remove the applications for a database. 
srvctl remove database  -d ORACLE
Remove the applications for named instances of an existing database. 
srvctl remove instance -d ORACLE -i  RAC03
srvctl remove instance -d ORACLE -i  RAC04
Remove the service.
srvctl remove  service -d ORACLE -s STD_BATCH
Remove the service from the instances.
srvctl remove  service  -d ORACLE -s STD_BATCH -i RAC03,RAC04
Remove all node applications from a node.
srvctl remove  nodeapps -n myclust-4

-------------------------------------------------------------------------------

MODIFY CRS RESOURCES

srvctl modify database -d [-n ] [-m ]
       [-p ]  [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY}]
       [-s ]
srvctl modify instance -d -i -n
srvctl modify instance -d -i {-s | -r}
srvctl modify service -d -s -i
       -t [-f]
srvctl modify service -d -s -i
       -r  [-f]
srvctl modify nodeapps -n [-A ] [-x]

OPTIONS:

-i -t   the instance name (-i) is replaced by the
   instance name (-t)
-i -r the named instance is modified to be a preferred instance
-A address-list for VIP application, at node level
-s add or remove ASM dependency

EXAMPLES:

Modify an instance to execute on another node.
srvctl modify  instance  -d ORACLE  -n myclust-4
Modify a service to execute on another node.
srvctl modify service -d ORACLE  -s HOT_BATCH -i  RAC01 -t RAC02
Modify an instance to be a preferred instance for a service.
srvctl modify service -d ORACLE  -s HOT_BATCH -i  RAC02 –r

-------------------------------------------------------------------------------

RELOCATE SERVICES

srvctl relocate service -d -s [-i ]-t [-f]

EXAMPLES:

Relocate a service from one instance to another
srvctl relocate  service -d ORACLE -s CRM -i RAC04 -t RAC01

-------------------------------------------------------------------------------

ENABLE CRS RESOURCES (The resource may be up or down to use this function)

srvctl enable database -d
srvctl enable instance -d -i [,]
srvctl enable service -d -s ] [, ] [-i ] 

EXAMPLES:

Enable the database. 
srvctl enable database -d ORACLE
Enable the named instances. 
srvctl enable instance  -d ORACLE -i RAC01, RAC02
Enable the service. 
srvctl enable  service -d ORACLE -s ERP,CRM
Enable the service at the named instance.
srvctl enable  service -d ORACLE -s CRM -i RAC03

-------------------------------------------------------------------------------

DISABLE CRS RESOURCES (The resource must be down to use this function)

srvctl disable database -d
srvctl disable instance -d -i [,]
srvctl disable service -d -s ] [,] [-i ] 

EXAMPLES:

Disable the database globally. 
srvctl disable database -d ORACLE
Disable the named instances. 
srvctl disable instance  -d ORACLE -i RAC01, RAC02
Disable the service globally. 
srvctl disable  service -d ORACLE -s ERP,CRM
Disable the service at the named instance.
srvctl disable  service -d ORACLE  -s CRM -i RAC03,RAC04

Comments

Popular posts from this blog

AWR Reports

ORA-01565: error in identifying file '?/dbs/spfile@.ora'

My Fav Song in Telugu: Aa challani samudra garbham