10g RAC: Lessons Learned
PURPOSE
-------
Summarize experiences and lessons learned from the 10g RAC Early Adopter Program
SCOPE & APPLICATION
-------------------
Support engineers and customer DBAs, novice to intermediate level
10g RAC Lessons Learned
-----------------------------
This note summarizes lessons learned about Oracle 10g RAC during the 10g RAC Early Adopter Program (10.1.0.2, 10.1.0.3). It covers general information, how-tos, tips, and techniques that a support analyst or DBA will find handy when troubleshooting or managing a 10g RAC installation. The environment for the subject installation was Red Hat Enterprise Linux 3 (RHEL3).
10g Infrastructure Components
1) Cluster Ready Services (CRS)
2) Oracle Cluster Registry (OCR)
3) Voting Disk/quorum device
4) Managing CRS Resources (srvctl)
5) Nodeapps Demystified
6) Automatic Storage Management (ASM)
7) RAC Database Management
8) Database/Grid Control
1) Cluster Ready Services (CRS)
New integrated cluster manager
Separate home - $CRS_HOME
How do you start/stop it?
Start - reboot the server; the only supported method
Stop - as root:
$ /etc/init.d/init.crs stop (stops CRS now, but it will start again on reboot)
$ /etc/init.d/init.crs disable (doesn't stop CRS, but prevents start on reboot)
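For example, a node can be taken out of the cluster for maintenance with a sequence like the following (a sketch; init.crs also accepts an enable option to undo disable):
$ /etc/init.d/init.crs stop
$ /etc/init.d/init.crs disable
... perform maintenance; reboots will not restart CRS ...
$ /etc/init.d/init.crs enable
... reboot the server to start CRS again ...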
Is CRS running?
$ ps -ef|grep d.bin
root 1318 1 0 Oct21 ? 00:06:40 /home/oracle/product/crs/bin/crsd.bin
oracle 1510 1316 0 Oct21 ? 00:03:41 /home/oracle/product/crs/bin/evmd.bin
oracle 1558 1484 0 Oct21 ? 00:12:42 /home/oracle/product/crs/bin/ocssd.bin
$ ps -efm|grep crsd.bin
root 1318 1 0 Oct21 ? 00:00:00 /home/oracle/product/crs/bin/crsd.bin
root 1602 1318 0 Oct21 ? 00:00:00 /home/oracle/product/crs/bin/crsd.bin
root 1620 1318 0 Oct21 ? 00:00:00 /home/oracle/product/crs/bin/crsd.bin
...
root 1620 1318 0 Oct21 ? 00:00:00 /home/oracle/product/crs/bin/crsd.bin
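The ps check above can be wrapped in a one-liner to verify all three daemons at once (a minimal sketch using the standard pgrep utility):
$ for d in crsd evmd ocssd; do pgrep -f $d.bin >/dev/null && echo "$d running" || echo "$d NOT running"; done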
2) Oracle Cluster Registry (OCR)
File contained in a raw device on shared storage (100 MB)
$ ls -l
total 4
crw-rw---- 1 oracle dba 162, 107 Feb 18 2004 css
drwxr-xr-x 2 root root 4096 Jul 22 11:24 DB10
crw-r----- 1 root dba 162, 106 Feb 18 2004 ocr
Created as part of CRS install
Contains metadata about highly available resources
Nodeapps (vip, tns listener, ons, gsd)
ASM
Database instances
Services
Backed up automatically by Oracle
Find the most recent backup (could be on any node of the cluster)
$ $CRS_HOME/bin/ocrconfig -showbackup
Restore the OCR from backup (make sure CRS is down on all nodes)
$ $CRS_HOME/bin/ocrconfig -restore <backup_file_name>
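Putting the two commands together, a restore session might look like this (a sketch; the backup file path is hypothetical, -showbackup reports the actual location):
$ $CRS_HOME/bin/ocrconfig -showbackup
$ /etc/init.d/init.crs stop (run on every node)
$ $CRS_HOME/bin/ocrconfig -restore /home/oracle/product/crs/cdata/crs/backup00.ocr
Then reboot the nodes to restart CRS.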
3) Voting Disk (quorum device)
File contained in a raw device on shared storage (100 MB; the css device in this case)
$ ls -l
total 4
crw-rw---- 1 oracle dba 162, 107 Feb 18 2004 css <==== voting disk
drwxr-xr-x 2 root root 4096 Jul 22 11:24 DB10
crw-r----- 1 root dba 162, 106 Feb 18 2004 ocr
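There is no ocrconfig-style tool for the voting disk; a raw copy with dd is the usual backup method (a sketch; the raw device name is hypothetical, and CRS should be down):
$ dd if=/dev/raw/raw2 of=/backup/votedisk.bak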
4) CRS Resources
Resources that are highly available
Nodeapps
vip (virtual IP)
ons (oracle notification service)
gsd (global services daemon)
tns listener
asm instances
database
database instances
services
Monitored by CRS
Restarted in case of failure
Failed over if necessary (VIP)
Managing CRS Resources
srvctl - interface for managing HA resources
Usage
$ srvctl -h
Usage: srvctl [-V]
Usage: srvctl add database -d <name> -o <oracle_home> [-m <domain_name>] [-p <spfile>] [-A <name|ip>/netmask] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY}] [-s <start_options>] [-n <db_name>]
Usage: srvctl add instance -d <name> -i <inst_name> -n <node_name>
Usage: srvctl add service -d <name> -s <service_name> -r "<preferred_list>" [-a "<available_list>"] [-P <TAF_policy>]
... etc
Add, remove, modify resources
Nodeapps
ASM instances
Database instances
Start, stop, status resources
Don't use SQL*Plus; use srvctl
Single point of control
Run from any node in the cluster
Status resources...
nodeapps
$ srvctl status nodeapps -n otldb01
VIP is running on node: otldb01
GSD is running on node: otldb01
Listener is running on node: otldb01
ONS daemon is running on node: otldb01
asm
$ srvctl status asm -n otldb01
ASM instance +ASM1 is running on node otldb01.
database instance
$ srvctl status instance -d engp -i engp1
Instance engp1 is running on node otldb01
database
$ srvctl status database -d engp
Instance engp1 is running on node otldb01
Instance engp2 is running on node otldb02
Start resources...
nodeapps
$ srvctl start nodeapps -n otldb01
$ srvctl status nodeapps -n otldb01
VIP is running on node: otldb01
GSD is running on node: otldb01
Listener is running on node: otldb01
ONS daemon is running on node: otldb01
asm
$ srvctl start asm -n otldb01
$ srvctl status asm -n otldb01
ASM instance +ASM1 is running on node otldb01.
database instance
$ srvctl start instance -d engp -i engp1
$ srvctl status instance -d engp -i engp1
Instance engp1 is running on node otldb01
database
$ srvctl start database -d engp
$ srvctl status database -d engp
Instance engp1 is running on node otldb01
Instance engp2 is running on node otldb02
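Taken together, a full manual startup on one node follows the dependency order nodeapps, then ASM, then database instance (a sketch reusing the commands above; shutdown is simply the reverse):
$ srvctl start nodeapps -n otldb01
$ srvctl start asm -n otldb01
$ srvctl start instance -d engp -i engp1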
CRS Resource Attributes
CRS resources managed by srvctl
Resources can be listed using crs_stat utility
$ $CRS_HOME/bin/crs_stat
NAME=ora.engp.db
TYPE=application
TARGET=ONLINE ==========> OFFLINE if resource stopped
STATE=ONLINE on otldb02 ==========> OFFLINE if resource stopped
NAME=ora.engp.engp1.inst
TYPE=application
TARGET=ONLINE
STATE=ONLINE on otldb01
NAME=ora.otldb01.ASM1.asm
TYPE=application
TARGET=ONLINE
STATE=ONLINE on otldb01
NAME=ora.otldb01.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on otldb01
... etc
Accessing individual resource attributes...
Obtain resource name using crs_stat, then...
$ $CRS_HOME/bin/crs_stat -p ora.engp.engp1.inst
NAME=ora.engp.engp1.inst
TYPE=application
ACTION_SCRIPT=/db/dbhome/oracle/product/10.1.0/bin/racgwrap
ACTIVE_PLACEMENT=0
AUTO_START=1
CHECK_INTERVAL=600
DESCRIPTION=CRS application for Instance
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=otldb01
OPTIONAL_RESOURCES=
PLACEMENT=restricted
REQUIRED_RESOURCES=ora.otldb01.vip ora.otldb01.ASM1.asm <=== note dependency on vip and asm
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=600
USR_ORA_ALERT_NAME=
USR_ORA_CHECK_TIMEOUT=0
USR_ORA_CONNECT_STR=/ as sysdba
... etc
5) Nodeapps demystified
Each node has its own nodeapps, managed individually
$ srvctl start nodeapps -n otldb01
$ srvctl start nodeapps -n otldb02
If things are configured properly, you shouldn't have to worry about nodeapps
Nodeapps should come up automatically after a reboot
If ONS daemon doesn't start, try
$ onsctl start
If you stop the nodeapps be aware:
vip will go down
ASM will go down
Database will go down
Created using vipca, which runs as part of root.sh during the database install
Alternatively, created using the srvctl add nodeapps command
Four individual nodeapps
$ srvctl status nodeapps -n otldb01
VIP is running on node: otldb01
GSD is running on node: otldb01
Listener is running on node: otldb01
ONS daemon is running on node: otldb01
vip - Virtual IP address
Required for each RAC node in cluster before installation
Second IP address in addition to regular static IP address
Should be a public IP address on same subnet as static server IP address
Configured in DNS and /etc/hosts
Shares public interface, typically eth0
$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:06:5B:F4:80:91
inet addr:10.32.3.53 Bcast:10.32.7.255 Mask:255.255.248.0
eth0:1 Link encap:Ethernet HWaddr 00:06:5B:F4:80:91 <== vip
inet addr:10.32.3.14 Bcast:10.255.255.255 Mask:255.255.248.0
Stop the nodeapps and the vip is no longer pingable
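This is easy to verify (a sketch; the VIP hostname otldb01-vip is hypothetical):
$ srvctl stop nodeapps -n otldb01
$ ping -c 3 otldb01-vip (times out)
$ srvctl start nodeapps -n otldb01
$ ping -c 3 otldb01-vip (responds)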
vip configuration - vipca or srvctl
vipca GUI runs as part of database install root.sh (root needs to export DISPLAY for GUI)
Verify LD_ASSUME_KERNEL=2.4.19 on RHEL3
srvctl usage, e.g.:
$ srvctl add nodeapps -n otldb01 -o /db/dbhome/oracle/product/10.1.0 -A 10.32.3.14/255.255.248.0/eth0
GSD - Global Services Daemon
ONS - Oracle Notification Service
Listener - traditional TNS Listener
made highly available within CRS
created/configured via netca
6) Automatic Storage Management (ASM)
Separate instance, about 100 MB of memory
Similar to a database instance but with a smaller footprint
One ASM instance per node
One or many database instances per ASM instance
RMAN is the main interface to manipulate files in ASM
ASM managed one node at a time using srvctl
$ srvctl start asm -n otldb01
$ srvctl stop asm -n otldb01
$ srvctl status asm -n otldb01
ASM is dependent upon the vip; if the vip goes down, so will ASM
If need be, create a dependency between the DB instance and the ASM instance
$ srvctl modify instance -d engp -i engp1 -s +ASM1
ASM does NOT stop automatically when the DB is stopped
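So a clean shutdown must stop ASM explicitly after the database (a sketch reusing the srvctl commands above):
$ srvctl stop database -d engp (stops all engp instances)
$ srvctl stop asm -n otldb01
$ srvctl stop asm -n otldb02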
ASM instance looks like a database instance
$ ps -ef|grep asm
oracle 22085 1 0 14:02 ? 00:00:03 asm_pmon_+ASM1
oracle 22087 1 0 14:02 ? 00:00:17 asm_diag_+ASM1
oracle 22089 1 0 14:02 ? 00:00:11 asm_lmon_+ASM1
oracle 22091 1 0 14:02 ? 00:00:08 asm_lmd0_+ASM1
oracle 22093 1 0 14:02 ? 00:00:13 asm_lms0_+ASM1
oracle 22095 1 0 14:02 ? 00:00:00 asm_mman_+ASM1
oracle 22097 1 0 14:02 ? 00:00:00 asm_dbw0_+ASM1
oracle 22099 1 0 14:02 ? 00:00:00 asm_lgwr_+ASM1
oracle 22101 1 0 14:02 ? 00:00:01 asm_ckpt_+ASM1
oracle 22103 1 0 14:02 ? 00:00:00 asm_smon_+ASM1
oracle 22105 1 0 14:02 ? 00:00:01 asm_rbal_+ASM1
oracle 22107 1 0 14:02 ? 00:00:01 asm_lck0_+ASM1
oracle 22129 1 0 14:02 ? 00:01:13 /db/dbhome/oracle/product/10.1.0/bin/racgimon daemon ora.otldb01.ASM1.asm
oracle 22207 1 0 14:03 ? 00:00:00 ora_asmb_engp1
ASM has dynamic performance views like a DB
V$ASM_ALIAS
V$ASM_CLIENT
V$ASM_DISK
V$ASM_DISKGROUP
V$ASM_FILE
V$ASM_OPERATION
V$ASM_TEMPLATE
Database has same views
However...
Database instance V$ASM_DISK and V$ASM_DISKGROUP do not report FREE_MB accurately
This is a feature, not a bug
The documentation should state this but doesn't
Logged into database instance:
SQL> select total_mb,free_mb from v$asm_diskgroup;
TOTAL_MB FREE_MB
---------- ----------
511192 0
Logged into ASM instance:
SQL> select total_mb,free_mb from v$asm_diskgroup;
TOTAL_MB FREE_MB
---------- ----------
511192 488872
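To get an accurate FREE_MB, connect to the ASM instance itself (a sketch; +ASM1 is the instance from the examples above):
$ export ORACLE_SID=+ASM1
$ sqlplus / as sysdba
SQL> select name, total_mb, free_mb from v$asm_diskgroup;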
7) RAC Database Management
DBCA and SID length
SID limited to 5 characters
Documented in RAC Installation and Configuration Guide, Chapter 9
"SID prefix must be between one and five characters
in length and also begin with an alphabetical character.
DBCA uses the SID prefix to generate a unique value for
the ORACLE_SID for each instance."
Use srvctl to start and stop database, instances
Manage from one node, no need to visit every node
$ srvctl status database -d engp
Instance engp1 is running on node otldb01
Instance engp2 is running on node otldb02
$ srvctl stop instance -d engp -i engp1
$ srvctl status database -d engp
Instance engp1 is not running on node otldb01
Instance engp2 is running on node otldb02
$ srvctl start instance -d engp -i engp1
Validate database CRS configuration
$ srvctl config database -d engp
otldb01 engp1 /db/dbhome/oracle/product/10.1.0
otldb02 engp2 /db/dbhome/oracle/product/10.1.0
If all database instances and homes aren't listed, correct this before proceeding
This should be considered a prerequisite for Grid Control Agent installation.
If any instance is missing from the above output, the Grid Control Agent files
won't propagate properly to the other nodes.
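A missing instance can be registered with srvctl add instance (a sketch; assumes engp2 on otldb02 was the missing entry):
$ srvctl add instance -d engp -i engp2 -n otldb02
$ srvctl config database -d engp (verify both instances now appear)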
RAC Interconnect (Linux)
UDP send and receive buffer sizes
256 KB recommended minimum (applies to 9i as well)
To check:
$ cat /proc/sys/net/core/rmem_default
$ cat /proc/sys/net/core/rmem_max
$ cat /proc/sys/net/core/wmem_default
$ cat /proc/sys/net/core/wmem_max
Should be 262144 (minimum)
To change dynamically (as root)
$ sysctl -w net.core.rmem_max=262144
$ sysctl -w net.core.wmem_max=262144
$ sysctl -w net.core.rmem_default=262144
$ sysctl -w net.core.wmem_default=262144
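To make the settings persist across reboots, add them to /etc/sysctl.conf (the standard Linux mechanism; apply without rebooting via sysctl -p):
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144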
If 'gc cr multiblock request' waits are a problem
Check Bug 3951017
UDP buffer size >= (db_file_multiblock_read_count * db_block_size)
_db_block_prefetch_limit = db_file_multiblock_read_count
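Worked example: with db_block_size=8192 and db_file_multiblock_read_count=16, one multiblock transfer is 16 * 8192 = 131072 bytes, which fits within the 262144-byte buffers above; at db_file_multiblock_read_count=64 the transfer is 524288 bytes and the UDP buffer sizes would have to be raised to match.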
ASYNC IO (Linux)
Bug 3438751 - 10.1.0.2 not linked properly with async IO
To check (verify even with 10.1.0.3):
$ nm $ORACLE_HOME/bin/oracle |grep io_getevent
w io_getevents@@LIBAIO_0.1
Lack of @@LIBAIO_0.1 following 'w io_getevents' indicates Oracle is not properly
linked with async IO
See Note 270213.1 for more information
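If the link is missing, the oracle binary can be relinked with async IO enabled (a sketch; the exact make targets may vary by version, so treat Note 270213.1 as authoritative, and shut down the instance first):
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk async_on
$ make -f ins_rdbms.mk ioracle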
Another check:
$ cat /proc/slabinfo |grep kio
kioctx 690 690 128 23 23 1 : 1008 252
kiocb 58446 65160 128 1971 2172 1 : 1008 252
kiobuf 2244 4260 128 102 142 1 : 1008 252
Look for kioctx and kiocb; if the first two columns are 0, then Oracle isn't using async IO
Make sure the database is running for this check
See Note 237299.1 for more details
Yet another check:
$ ldd $ORACLE_HOME/bin/oracle |grep libaio.so.1
libaio.so.1 => /usr/lib/libaio.so.1 (0xb6cf3000)
If libaio.so.1 is not listed, then Oracle isn't using async IO (ldd inspects the binary itself, so this works whether or not the database is running)
ASYNC IO (Linux best practices)
disk_asynch_io=true
Duh, right? But it won't do any good if Oracle is not linked properly with async IO!
disk_asynch_io=true is the default in 10g
aio-max-size OS parameter
DSS workloads
/proc/sys/fs/aio-max-size defaults to 131072 bytes and should be increased to >= 1 MB
A suggested initial value for aio-max-size is 1048576
Set this value by executing the following command as root:
$ echo 1048576 > /proc/sys/fs/aio-max-size
OLTP workloads
default of 131072 should suffice, as these workloads perform small writes.
See Note 225751.1 for more details
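As with the UDP buffer settings, this value can be persisted in /etc/sysctl.conf (the fs.aio-max-size key maps to the /proc path above):
fs.aio-max-size = 1048576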