CRS and 10g Real Application Clusters
PURPOSE
-------------
This document is to provide additional information on CRS
(Cluster Ready Services)
in 10g Real Application Clusters.
SCOPE & APPLICATION
---------------------------------
This document is intended for RAC Database Administrators
and Oracle support
enginneers.
CRS and 10g REAL APPLICATION CLUSTERS
-------------------------------------------------------------
CRS (Cluster Ready Services) is a new feature for 10g Real
Application Clusters
that provides a standard cluster interface on all platforms
and performs
new high availability operations not available in previous
versions.
CRS KEY FACTS
-------------
Prior to installing CRS and 10g RAC, there are some key
points to remember about
CRS and 10g RAC:
- CRS is REQUIRED to be installed and running prior to
installing 10g RAC.
- CRS can either run on top of the vendor clusterware (such
as Sun Cluster,
HP Serviceguard, IBM
HACMP, TruCluster, Veritas Cluster, Fujitsu Primecluster,
etc...) or can run
without the vendor clusterware. The
vendor clusterware
was required in 9i
RAC but is optional in 10g RAC.
- The CRS HOME and ORACLE_HOME must be installed in
DIFFERENT locations.
- Shared Location(s) or devices for the Voting File and OCR
(Oracle
Configuration
Repository) file must be available PRIOR to installing CRS. The
voting file should
be at least 20MB and the OCR file should be at least 100MB.
- CRS and RAC require that the following network interfaces
be configured prior
to installing CRS or
RAC:
- Public Interface
- Private Interface
- The root.sh script at the end of the CRS installation
starts the CRS stack.
If your CRS stack
does not start, see Note 240001.1.
- Only one set of CRS daemons can be running per RAC node.
- On Unix, the CRS stack is run from entries in /etc/inittab
with "respawn".
- If there is a network split (nodes loose communication
with each other). One
or more nodes may
reboot automatically to prevent data corruption.
- The supported method to start CRS is booting the
machine. MANUAL STARTUP OF
THE CRS STACK IS NOT
SUPPORTED UNTIL 10.1.0.4 OR HIGHER.
- The supported method to stop is shutdown the machine or
use "init.crs stop".
- Killing CRS daemons is not supported unless you are
removing the CRS
installation via
Note 239998.1 because flag files can become mismatched.
- For maintenance, go to single user mode at the OS.
Once the stack is started, you should be able to see all of
the daemon processes
with a ps -ef command:
[rac1]/u01/home/beta> ps -ef | grep crs
oracle 1363
999 0 11:23:21 ? 0:00 /u01/crs_home/bin/evmlogger.bin -o /u01
oracle 999
1 0 11:21:39 ? 0:01 /u01/crs_home/bin/evmd.bin
root 1003 1 0
11:21:39 ? 0:01
/u01/crs_home/bin/crsd.bin
oracle 1002 1 0
11:21:39 ? 0:01
/u01/crs_home/bin/ocssd.bin
CRS DAEMON FUNCTIONALITY
--------------------------------------------
Here is a short description of each of the CRS daemon
processes:
CRSD:
- Engine for HA operation
- Manages 'application resources'
- Starts, stops, and fails 'application resources' over
- Spawns separate 'actions' to start/stop/check application
resources
- Maintains configuration profiles in the OCR (Oracle
Configuration Repository)
- Stores current known state in the OCR.
- Runs as root
- Is restarted automatically on failure
OCSSD:
- OCSSD is part of RAC and Single Instance with ASM
- Provides access to node membership
- Provides group services
- Provides basic cluster locking
- Integrates with existing vendor clusteware, when present
- Can also runs without integration to vendor clustware
- Runs as Oracle.
- Failure exit causes machine reboot.
--- This is a feature to prevent data corruption in event of
a split brain.
EVMD:
- Generates events when things happen
- Spawns a permanent child evmlogger
- Evmlogger, on demand, spawns children
- Scans callout directory and invokes callouts.
- Runs as Oracle.
- Restarted automatically on failure
CRS LOG DIRECTORIES
--------------------------------
When troubleshooting CRS problems, it is important to review
the directories
under the CRS Home.
$ORA_CRS_HOME/crs/log - This directory includes traces for
CRS resources that are
joining, leaving, restarting, and relocating as identified
by CRS.
$ORA_CRS_HOME/crs/init - Any core dumps for the crsd.bin
daemon should be written
here. Note 1812.1
could be used to debug these.
$ORA_CRS_HOME/css/log - The css logs indicate all actions
such as
reconfigurations, missed checkins , connects, and
disconnects from the client
CSS listener . In some cases the logger logs messages with
the category of
(auth.crit) for the reboots done by oracle. This could be
used for checking the
exact time when the reboot occured.
$ORA_CRS_HOME/css/init - Core dumps from the ocssd primarily
and the pid for the
css daemon whose death is treated as fatal are located here.
If there are
abnormal restarts for css then the core files will have the
formats of
core..
Note 1812.1 could be used to debug these.
$ORA_CRS_HOME/evm/log - Log files for the evm and evmlogger
daemons. Not used
as often for debugging as the CRS and CSS directories.
$ORA_CRS_HOME/evm/init - Pid and lock files for EVM. Core files for EVM should
also be written here.
Note 1812.1 could be used to debug these.
$ORA_CRS_HOME/srvm/log - Log files for OCR.
STATUS FOR CRS RESOURCES
------------------------
After installing RAC and running the VIPCA (Virtual IP
Configuration Assistant)
launched with the RAC root.sh, you should be able to see all
of your CRS
resources with crs_stat.
Example:
cd $ORA_CRS_HOME/bin
./crs_stat
NAME=ora.rac1.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac1.oem
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac1.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac1.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac2.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac2.oem
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac2.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac2.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE
There is also a script available to view CRS resources in a
format that is
easier to read. Just
create a shell script with:
--------------------------- Begin Shell Script
-------------------------------
#!/usr/bin/ksh
#
# Sample 10g CRS resource status query script
#
# Description:
# - Returns
formatted version of crs_stat -t, in tabular
# format,
with the complete rsc names and filtering keywords
# - The
argument, $RSC_KEY, is optional and if passed to the script, will
# limit
the output to HA resources whose names match $RSC_KEY.
# Requirements:
# -
$ORA_CRS_HOME should be set in your environment
RSC_KEY=$1
QSTAT=-u
AWK=/usr/xpg4/bin/awk # if not available use /usr/bin/awk
# Table header:echo ""
$AWK \
'BEGIN
{printf "%-45s %-10s %-18s\n", "HA Resource",
"Target", "State";
printf "%-45s %-10s %-18s\n", "-----------",
"------", "-----";}'
# Table body:
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
'BEGIN { FS="="; state = 0; }
$1~/NAME/
&& $2~/'$RSC_KEY'/ {appname = $2; state=1};
state == 0
{next;}
$1~/TARGET/
&& state == 1 {apptarget = $2; state=2;}
$1~/STATE/
&& state == 2 {appstate = $2; state=3;}
state == 3
{printf "%-45s %-10s %-18s\n", appname, apptarget, appstate;
state=0;}'
--------------------------- End Shell Script
-------------------------------
Example output:
[opcbsol1]/u01/home/usupport> ./crsstat
HA Resource Target State
----------- ------ -----
ora.V10SN.V10SN1.inst ONLINE
ONLINE on opcbsol1
ora.V10SN.V10SN2.inst ONLINE
ONLINE on opcbsol2
ora.V10SN.db ONLINE ONLINE on opcbsol2
ora.opcbsol1.ASM1.asm ONLINE
ONLINE on opcbsol1
LISTENER_OPCBSOL1.lsnr ONLINE ONLINE on opcbsol1
ora.opcbsol1.gsd ONLINE ONLINE on opcbsol1
ora.opcbsol1.ons ONLINE ONLINE on opcbsol1
ora.opcbsol1.vip ONLINE ONLINE on opcbsol1
ora.opcbsol2.ASM2.asm ONLINE
ONLINE on opcbsol2
LISTENER_OPCBSOL2.lsnr ONLINE ONLINE on opcbsol2
ora.opcbsol2.gsd ONLINE ONLINE on opcbsol2
ora.opcbsol2.ons ONLINE ONLINE on opcbsol2
ora.opcbsol2.vip ONLINE ONLINE on opcbsol2
CRS RESOURCE ADMINISTRATION
---------------------------
You can use srvctl to manage these resources. Below are syntax and examples.
-------------------------------------------------------------------------------
CRS RESOURCE STATUS
srvctl status database -d [-f]
[-v] [-S ]
srvctl status instance -d -i
>[,]
[-f]
[-v] [-S ]
srvctl status service -d -s
[,]
[-f]
[-v] [-S ]
srvctl status nodeapps [-n ]
srvctl status asm -n
EXAMPLES:
Status of the database, all instances and all services.
srvctl status database -d ORACLE -v
Status of named instances with their current services.
srvctl status instance
-d ORACLE -i RAC01, RAC02 -v
Status of a named services.
srvctl status service -d ORACLE -s ERP -v
Status of all nodes supporting database applications.
srvctl status node
-------------------------------------------------------------------------------
START CRS RESOURCES
srvctl start database -d [-o <
start-options>]
[-c
| -q]
srvctl start instance -d -i
[,] [-o ] [-c
| -q]
srvctl start service -d [-s
[,]]
[-i
] [-o
] [-c | -q]
srvctl start nodeapps -n
srvctl start asm -n [-i
] [-o ]
EXAMPLES:
Start the database with all enabled instances.
srvctl start database -d ORACLE
Start named instances.
srvctl start instance
-d ORACLE -i RAC03, RAC04
Start named services.
Dependent instances are started as needed.
srvctl start service -d ORACLE -s CRM
Start a service at the named instance.
srvctl start service
-d ORACLE -s CRM -i RAC04
Start node applications.
srvctl start nodeapps
-n myclust-4
-------------------------------------------------------------------------------
STOP CRS RESOURCES
srvctl stop database -d [-o
]
[-c
| -q]
srvctl stop instance -d -i
[,]
[-o
][-c | -q]
srvctl stop service -d [-s
[,]]
[-i
][-c | -q] [-f]
srvctl stop nodeapps -n
srvctl stop asm -n [-i
] [-o ]
EXAMPLES:
Stop the database, all instances and all services.
srvctl stop database -d ORACLE
Stop named instances, first relocating all existing
services.
srvctl stop instance
-d ORACLE -i RAC03,RAC04
Stop the service.
srvctl stop service -d ORACLE -s CRM
Stop the service at the named instances.
srvctl stop service
-d ORACLE -s CRM -i RAC04
Stop node applications.
Note that instances and services also stop.
srvctl stop nodeapps
-n myclust-4
-------------------------------------------------------------------------------
ADD CRS RESOURCES
srvctl add database -d -o
[-m ] [-p ]
[-A
/netmask] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY}]
[-s
] [-n ]
srvctl add instance -d -i -n
srvctl add service -d -s
-r
[-a
] [-P ] [-u]
srvctl add nodeapps -n -o
[-A
/netmask[/if1[|if2|...]]]
srvctl add asm -n -i
-o
OPTIONS:
-A vip range, node, and database, address specification. The
format of
address string
is:
[]//[/
host interface2 |..]>] [,]
[]//
[/]
-a for services, list of available instances, this list
cannot include
preferred
instances
-m domain name with the format "us.mydomain.com"
-n node name that will support one or more instances
-o $ORACLE_HOME to locate Oracle binaries
-P for services, TAF preconnect policy - NONE, PRECONNECT
-r for services, list of preferred instances, this list
cannot include
available instances.
-s spfile name
-u updates the preferred or available list for the service
to support the
specified
instance. Only one instance may be specified with the -u
switch. Instances that already support the service
should not be
included.
EXAMPLES:
Add a new node:
srvctl add nodeapps -n myclust-1 -o $ORACLE_HOME –A
139.184.201.1/255.255.255.0/hme0
Add a new database.
srvctl add
database -d ORACLE -o
$ORACLE_HOME
Add named instances to an existing database.
srvctl add
instance -d ORACLE -i RAC01 -n myclust-1
srvctl add
instance -d ORACLE -i RAC02 -n myclust-2
srvctl add
instance -d ORACLE -i RAC03 -n myclust-3
Add a service to an existing database with preferred
instances (-r) and
available instances (-a).
Use basic failover to the available instances.
srvctl add service -d ORACLE -s STD_BATCH -r RAC01,RAC02 -a
RAC03,RAC04
Add a service to an existing database with preferred
instances in list one and
available instances in list two. Use preconnect at the
available instances.
srvctl add service -d ORACLE -s STD_BATCH -r RAC01,RAC02 -a
RAC03,RAC04 -P PRECONNECT
-------------------------------------------------------------------------------
REMOVE CRS RESOURCES
srvctl remove database -d
srvctl remove instance
-d [-i ]
srvctl remove service -d -s
[-i ]
srvctl remove nodeapps -n
EXAMPLES:
Remove the applications for a database.
srvctl remove database
-d ORACLE
Remove the applications for named instances of an existing
database.
srvctl remove instance -d ORACLE -i RAC03
srvctl remove instance -d ORACLE -i RAC04
Remove the service.
srvctl remove service
-d ORACLE -s STD_BATCH
Remove the service from the instances.
srvctl remove
service -d ORACLE -s STD_BATCH -i
RAC03,RAC04
Remove all node applications from a node.
srvctl remove
nodeapps -n myclust-4
-------------------------------------------------------------------------------
MODIFY CRS RESOURCES
srvctl modify database -d [-n ] [-m ]
[-p
] [-r {PRIMARY |
PHYSICAL_STANDBY | LOGICAL_STANDBY}]
[-s
]
srvctl modify instance -d -i
-n
srvctl modify instance -d -i
{-s | -r}
srvctl modify service -d -s
-i
-t
[-f]
srvctl modify service -d -s
-i
-r [-f]
srvctl modify nodeapps -n [-A
] [-x]
OPTIONS:
-i -t the instance name (-i) is replaced by the
instance name (-t)
-i -r the named instance is modified
to be a preferred instance
-A address-list for VIP application, at node level
-s add or remove ASM dependency
EXAMPLES:
Modify an instance to execute on another node.
srvctl modify
instance -d ORACLE -n myclust-4
Modify a service to execute on another node.
srvctl modify service -d ORACLE -s HOT_BATCH -i RAC01 -t RAC02
Modify an instance to be a preferred instance for a service.
srvctl modify service -d ORACLE -s HOT_BATCH -i RAC02 –r
-------------------------------------------------------------------------------
RELOCATE SERVICES
srvctl relocate service -d -s
[-i ]-t [-f]
EXAMPLES:
Relocate a service from one instance to another
srvctl relocate
service -d ORACLE -s CRM -i RAC04 -t RAC01
-------------------------------------------------------------------------------
ENABLE CRS RESOURCES (The resource may be up or down to use
this function)
srvctl enable database -d
srvctl enable instance -d -i
[,]
srvctl enable service -d -s
] [, ] [-i
]
EXAMPLES:
Enable the database.
srvctl enable database -d ORACLE
Enable the named instances.
srvctl enable instance
-d ORACLE -i RAC01, RAC02
Enable the service.
srvctl enable service
-d ORACLE -s ERP,CRM
Enable the service at the named instance.
srvctl enable service
-d ORACLE -s CRM -i RAC03
-------------------------------------------------------------------------------
DISABLE CRS RESOURCES (The resource must be down to use this
function)
srvctl disable database -d
srvctl disable instance -d -i
[,]
srvctl disable service -d -s
] [,] [-i ]
EXAMPLES:
Disable the database globally.
srvctl disable database -d ORACLE
Disable the named instances.
srvctl disable instance
-d ORACLE -i RAC01, RAC02
Disable the service globally.
srvctl disable
service -d ORACLE -s ERP,CRM
Disable the service at the named instance.
srvctl disable
service -d ORACLE -s CRM -i
RAC03,RAC04
Comments
Post a Comment