51. When I run 10.2 CLUVFY on a system where RAC 10g Release 1 is running, I get the following output:
Package existence check failed for "SUNWscucm:3.1".
Package existence check failed for "SUNWudlmr:3.1".
Package existence check failed for "SUNWudlm:3.1".
Package existence check failed for "ORCLudlm:Dev_Release_06/11/04,_64bit_3.3.4.8_reentrant".
Package existence check failed for "SUNWscr:3.1".
Package existence check failed for "SUNWscu:3.1".
Checking this Solaris system I don't see those packages installed. Can I continue my install?
Note that cluvfy checks all possible prerequisites and tells you whether your system passes the check or not. You can then cross-reference with the install guide to see if the checks that failed are required for your type of installation. In the above case, if you are not planning on using Sun Cluster, then you can continue the install. The checks that failed are for the packages required by Sun Cluster and are not needed on your cluster. As long as everything else checks out successfully, you can continue.
52. Why is validateUserEquiv failing during install (or cluvfy run)?
SSH must be set up as per the pre-installation tasks. It is also necessary to have file permissions set as described below for features such as public key authentication to work. If your permissions are not correct, public key authentication will fail and will fall back to password authentication with no helpful message as to why. The following server configuration files and/or directories must be owned by the account owner or by root, and GROUP and WORLD WRITE permission must be disabled:
$HOME
$HOME/.rhosts
$HOME/.shosts
$HOME/.ssh
$HOME/.ssh/authorized_keys
$HOME/.ssh/authorized_keys2 # OpenSSH-specific for the ssh2 protocol.
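A minimal sketch of tightening these permissions as the Oracle software owner (assuming a default OpenSSH layout):
chmod go-w $HOME
chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys $HOME/.ssh/authorized_keys2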
SSH (from OUI) will also fail if you have not connected to each machine in your cluster as per the note in the installation guide:
The first time you use SSH to connect to a node from a particular system, you may see a message similar to the following:
The authenticity of host 'node1 (140.87.152.153)' can't be established. RSA key fingerprint is 7z:ez:e7:f6:f4:f2:4f:8f:9z:79:85:62:20:90:92:z9. Are you sure you want to continue connecting (yes/no)?
Enter "yes" at the prompt to continue. You should not see this message again when you connect from this system to that node. Answering yes causes an entry to be added to a "known_hosts" file in the .ssh directory, which is why subsequent connection requests do not ask again. This is known to work on Solaris and Linux but may work on other platforms as well.
53. Can I use ASM to mirror Oracle data in an extended RAC environment?
This is supported from 10gR2 onwards, with the following limitations:
1. As in any extended RAC environment, the additional latency induced by distance will affect I/O and cache fusion performance. This effect will vary by distance, and the customer is responsible for ensuring that the impact in their environment is acceptable for their application.
2. The OCR must be mirrored across both sites using Oracle-provided mechanisms.
3. Voting disk redundancy must exist across both sites, and at a third site to act as an arbiter. This third site may be connected via a WAN.
4. Storage at each site must be set up as separate failure groups and use ASM mirroring, to ensure at least one copy of the data at each site.
5. The customer must have a separate and dedicated test cluster, also in an extended configuration, set up using the same software and hardware components (it can have fewer or smaller nodes).
6. The customer must be aware that in 10gR2 ASM does not provide partial resilvering. Should a loss of connectivity between the sites occur, one of the failure groups will be marked invalid. When the site rejoins the cluster, the failure groups will need to be manually dropped and added.
54. How can I register the listener with Oracle Clusterware in RAC 10g Release 2?
NetCA is the only tool that configures the listener, and you should always use it. It will register the listener with Oracle Clusterware. There are no other supported alternatives.
55. Can I use ASM as a mechanism to mirror the data in an Extended RAC cluster?
Yes, but it cannot replicate everything that needs replication. ASM works well to replicate any object you can put in ASM, but you cannot put the OCR or voting disk in ASM.
In 10gR1 they can either be mirrored using a different mechanism (which could then be used instead of ASM), or the OCR needs to be restored from backup and the voting disk recreated. In the future we are looking at providing Oracle redundancy for both.
56. How
should voting disks be implemented in an extended cluster environment? Can I
use standard NFS for the third site voting disk?
Standard NFS is only supported for the tie-breaking voting disk in an extended
cluster environment. See platform and mount option restrictions at:
http://www.oracle.com/technology/products/database/clustering/pdf/thirdvoteonnfs.pdf
Otherwise, just as with database files, we only support voting files on certified NAS devices, with the appropriate mount options. Please refer to Metalink Note 359515.1 for a full description of the required mount options.
For a complete list of supported NAS vendors refer to OTN at:
http://www.oracle.com/technology/deploy/availability/htdocs/vendors_nfs.html
57. What
are the network requirements for an extended RAC cluster?
Interconnect, SAN, and IP Networking need
to be kept on separate channels, each with required redundancy. Redundant
connections must not share the same Dark Fiber (if used), switch, path, or even
building entrances. Keep in mind that cables can be cut.
The SAN and interconnect connections need to be on dedicated point-to-point connections; no WAN or shared connections are allowed. Traditional cables are limited to about 10 km if you are to avoid using repeaters. Dark Fiber networks allow the communication to occur without repeaters. Since latency is limited, Dark Fiber networks allow for a greater separation between the nodes. The disadvantage of Dark Fiber networks is that they can cost hundreds of thousands of dollars, so generally they are only an option if they already exist between the two sites.
If direct connections are used (for short distances), this is generally done by just stringing long cables from a switch. If DWDM or CWDM is used, then these are directly connected via a dedicated switch on either side.
Note of caution: do not run the RAC interconnect over a WAN. This is the same as running it over the public network, which is not supported, and other uses of the network (e.g. large FTPs) can cause performance degradation or even node evictions.
For SAN networks make sure you are using
SAN buffer credits if the distance is over 10km.
If Oracle Clusterware is being used, we
also require that a single subnet be setup for the public connections so we can
fail over VIPs from one side to another.
58. Can a customer use SE RAC to implement an "Extended RAC Cluster"?
Yes. Effective with 11g Release 1, the former restriction that all nodes be co-located in one room when using SE RAC has been lifted. Customers can now use SE RAC clusters in extended environments. However, other SE RAC restrictions still apply (e.g. compulsory usage of ASM; no third-party clusterware or volume manager may be installed).
59. What is the maximum distance between nodes in an extended RAC environment?
The high impact of latency creates practical limitations as to where this architecture can be deployed. While there is no fixed distance limitation, the additional latency on the I/O round trip and on one-way cache fusion traffic will have an effect on performance as distance increases. For example, tests at 100 km showed a 3-4 ms impact on I/O and a 1 ms impact on cache fusion; thus the greater the distance, the greater the impact on performance. This architecture fits best where the two datacenters are relatively close (<~25 km) and the impact is negligible. Most customers implement under this distance, with only a handful above, and the farthest known example is at 100 km. Customers considering larger distances than commonly implemented may want to estimate or measure the performance hit on their application before implementing. Do ensure a proper setup of SAN buffer credits to limit the impact of distance at the I/O layer.
60. How is the voting disk used by
Oracle Clusterware?
The voting disk is accessed exclusively
by CSS (one of the Oracle Clusterware daemons). This is totally different from
a database file. The database looks at the database files and interacts with
the CSS daemon (at a significantly higher level conceptually than any notion of
"voting disk").
"Non-synchronized access" (i.e.
database corruption) is prevented by ensuring that the remote node is down before
reassigning its locks. The voting disk, network, and the control file are used
to determine when a remote node is down, in different, parallel, indepdendent
ways that allow each to provide additional protection compared to the other.
The algorithms used for each of these three things are quite different.
As far as voting disks are concerned, a
node must be able to access strictly more than half of the voting disks at any
time. So if you want to be able to tolerate a failure of n voting disks, you
must have at least 2n+1 configured. (n=1 means 3 voting disks). You
can configure up to 32 voting disks, providing protection against 15
simultaneous disk failures, however it's unlikely that any customer would have
enough disk systems with statistically independent failure characteristics that
such a configuration is meaningful. At any rate, configuring multiple voting
disks increases the system's tolerance of disk failures (i.e. increases
reliability).
Configuring a smaller number of voting
disks on some kind of RAID system can allow a customer to use some other means
of reliability than the CSS's multiple voting disk mechanisms. However there seem
to be quite a few RAID systems that decide that 30-60 second (or 45 minutes in
the case of veritas) IO latencies are acceptable. However we have to wait for
at least the longest IO latency before we can declare a node dead and allow the
database to reassign database blocks. So while using an independent RAID system
for the voting disk may appear appealing, sometimes there are failover latency
consequenecs
61. Does Oracle Clusterware support application VIPs?
Yes, with Oracle Database 10g Release 2, Oracle Clusterware supports an "application" VIP. This is to support putting applications under the control of Oracle Clusterware using the new high availability API, and to allow the user to use the same URL or connection string regardless of which node in the cluster the application is running on. The application VIP is a new resource defined to Oracle Clusterware and is a functional VIP. It is defined as a dependent resource to the application. There can be many VIPs defined, typically one per user application under the control of Oracle Clusterware. You must first create a profile (crs_profile), then register it with Oracle Clusterware (crs_register). The usrvip script must run as root.
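As a rough sketch (the resource name, interface, address, and netmask below are hypothetical; check the crs_profile documentation for your release):
cd $CRS_HOME/bin
./crs_profile -create myapp.vip -t application -a $CRS_HOME/bin/usrvip -o oi=eth0,ov=138.2.238.100,on=255.255.255.0
./crs_register myapp.vip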
62. How do I put my application under the control of Oracle Clusterware to achieve higher availability?
First, write a control agent. It must accept three different parameters: start (the control agent should start the application), check (the control agent should check the application), and stop (the control agent should stop the application). Second, create a profile for your application using crs_profile. Third, register your application as a resource with Oracle Clusterware (crs_register). A minimal sketch of such a control agent follows.
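The following is only a sketch (the application path and commands are hypothetical; adapt them to your application):
#!/bin/sh
# Oracle Clusterware invokes this script with start, check, or stop.
case "$1" in
start) /opt/myapp/bin/myapp start ;;
check) /opt/myapp/bin/myapp status || exit 1 ;;   # non-zero exit signals an unhealthy resource
stop)  /opt/myapp/bin/myapp stop ;;
esac
exit 0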
63. Can I set up failover of the VIP to another card in the same machine, or what do I do if I have different network interfaces on different nodes in my cluster (i.e. eth0 on nodes 1 and 2, and eth1 on nodes 3 and 4)?
With srvctl, you can modify the nodeapps for the VIP to list the NICs it can use. The VIP will then try to start on the eth0 interface and, if that fails, try the eth1 interface:
./srvctl modify nodeapps -n <node> -A <vip_address>/<netmask>/eth0\|eth1
Note how the interfaces are a list separated by the '|' symbol, and how you need to escape it with a '\' character or the Unix shell will interpret the character as a pipe. So on a node called ukdh364, with a VIP address of ukdh364vip and a netmask of (say) 255.255.255.0, we have:
./srvctl modify nodeapps -n ukdh364 -A ukdh364vip/255.255.255.0/eth0\|eth1
To check which interfaces are configured as public or private, use oifcfg getif. Example output:
eth0 138.2.238.0 global public
eth1 138.2.240.0 global public
eth2 138.2.236.0 global cluster_interconnect
An ifconfig on your machine will show the hardware names of the installed interface cards.
64. How do I identify the voting file location?
Run the following command from <CRS_HOME>/bin:
crsctl query css votedisk
65. Is it supported to allow 3rd party clusterware to manage Oracle resources (instances, listeners, etc.) and turn off Oracle Clusterware management of these?
In 10g we do not support using 3rd party clusterware for failover and restart of Oracle resources. Oracle Clusterware resources should not be disabled.
66. What is the High Availability API?
An application programming interface that allows processes to be put under the high availability infrastructure that is part of the Oracle Clusterware distributed with Oracle Database 10g. A user-written script defines how Oracle Clusterware should start, stop, and relocate the process when the cluster node status changes. This extends the high availability services of the cluster to any application running in the cluster. Oracle Database 10g Real Application Clusters (RAC) databases and associated Oracle processes (e.g. the listener) are automatically managed by the clusterware.
67. Is it a requirement to have the public interface linked to ETH0, or does it only need to be on an ETH lower than the private interface (e.g. public on ETH1, private on ETH2)?
There is no requirement for interface name ordering. You could have the public on ETH2 and the private on ETH0. Just make sure you choose the correct public interface in VIPCA and in the installer's interconnect classification screen.
68. Does Oracle Clusterware have to be the same or higher release than all instances running on the cluster?
Yes. Oracle Clusterware must be the same or a higher release with respect to the RDBMS or ASM homes. See Note 337737.1.
69. Can I use Oracle Clusterware to monitor my EM Agent?
There is nothing special about the commands, but you do need to follow the startup/shutdown sequence to avoid any discontinuity of monitoring. The agent starts a watchdog that monitors the health of the actual monitoring process. This is done automatically at agent start. Therefore you could use Oracle Clusterware, but you should not need to.
70. My customer has noticed tons of log files generated under $CRS_HOME/log/<hostname>/client. Is there any automated way, set up through Oracle Clusterware, to prevent/minimize/remove those aggressively generated files?
Check Note 5187351.8. You can either apply the patchset if it is available for your platform, or have a cron job that removes these files until the patch is available.
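Such a cron job might look like this (a sketch; the CRS home path and the seven-day retention are assumptions to adjust for your environment):
0 2 * * * find /u01/app/crs/log/`hostname`/client -name '*.log' -mtime +7 -exec rm {} \;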
71. I am trying to move my voting disks from one diskgroup to another and getting the error "crsctl replace votedisk – not permitted between ASM Disk Groups." Why?
You need to review the ASM and crsctl logs to see why the command is failing. To put your voting disks in ASM, you must have the diskgroup set up properly. There must be enough failure groups to support the redundancy of the voting disks as set by the redundancy on the disk group: for normal redundancy, 3 failure groups are required; for high redundancy, 5 failure groups. Note: by default each disk in a diskgroup is put in its own failure group. The compatible.asm attribute of the diskgroup must be set to 11.2, and you must be using the 11.2 version of Oracle Clusterware and ASM. A sketch of the steps follows.
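A rough sketch of the commands involved (the diskgroup name is hypothetical; run the ALTER as SYSASM):
$ sqlplus / as sysasm
SQL> ALTER DISKGROUP vote SET ATTRIBUTE 'compatible.asm' = '11.2';
SQL> exit
$ crsctl replace votedisk +VOTE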
72. Does the hostname have to match the
public name or can it be anything else?
When there is no vendor clusterware, only
Oracle Clusterware, then the public node name must match the host name. When
vendor clusterware is present, it determines the public node names, and the
installer doesn't present an opportunity to change them. So, when you have a
choice, always choose the hostname.
73. Can I use Oracle Clusterware to provide cold failover of my single-instance Oracle databases?
Oracle does not provide the necessary wrappers to fail over single-instance databases using Oracle Clusterware. However, since it is possible for customers to use Oracle Clusterware to wrap arbitrary applications, it is possible for them to wrap single-instance databases this way. A sample can be found in the demos that are distributed with Oracle Database 11g.
74. Why
does Oracle Clusterware use an additional 'heartbeat' via the voting disk, when
other cluster software products do not?
Oracle uses this implementation because Oracle clusters always have access to a shared disk environment. This is different from classical clustering, which assumes shared-nothing architectures, and it changes the decision of which strategies are optimal compared to other environments. Oracle also supports a wide variety of storage types, instead of limiting itself to a specific storage type (like SCSI), allowing the customer quite a lot of flexibility in configuration.
75. Why does Oracle still use the voting disks when other cluster software is present?
Voting disks are still used when 3rd party vendor clusterware is present, because vendor clusterware is not able to monitor/detect all failures that matter to Oracle Clusterware and the database. For example, one known case is when the vendor clusterware is set to have its heartbeat go over a different network than the RAC traffic. Continuing to use the voting disks allows CSS to resolve situations which would otherwise end up in cluster hangs.
76. In the course of failure testing in an extended RAC environment, we find entries in the cssd logfile which indicate actions like 'diskShortTimeout set to (value)' and 'diskLongTimeout set to (value)'. Can anyone please explain the meaning of these two timeouts in addition to disktimeout?
Having a short and a long disktimeout, rather than just one disktimeout, is due to the patch for bug 4748797 (included in 10.2.0.2). The long disktimeout is 200 seconds by default unless set differently via 'crsctl set css disktimeout', and applies to time outside a reconfiguration. The short disktimeout is in effect during a reconfiguration and is misscount minus 3 seconds. The point is that we can tolerate a long disktimeout when all nodes are just running fine, but have to revert to a short disktimeout if there's a reconfiguration.
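To inspect or change these settings, something like the following can be used (a sketch; choose values carefully and verify the syntax for your release):
$CRS_HOME/bin/crsctl get css misscount
$CRS_HOME/bin/crsctl get css disktimeout
$CRS_HOME/bin/crsctl set css disktimeout 200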
77. During Oracle Clusterware
installation, I am asked to define a private node name, and then on the next
screen asked to define which interfaces should be used as private and public
interfaces.
What information is required to answer these
questions?
The private names on the first screen determine which private interconnect will be used by CSS. Provide exactly one name that maps to a private IP address, or just the IP address itself. If a logical name is used, then the IP address it maps to can be changed subsequently, but if an IP address is specified, CSS will always use that IP address. CSS cannot use multiple private interconnects for its communication, hence only one name or IP address can be specified.
The private interconnect enforcement page
determines which private interconnect will be used by the RAC instances.
It's equivalent to setting the
CLUSTER_INTERCONNECTS init.ora parameter, but is more convenient because it is
a cluster-wide setting that does not have to be adjusted every time you add
nodes or instances.
RAC will use all of the interconnects listed as private in this screen, and they all have to be up, just as their IP addresses have to be when specified in the init.ora parameter. RAC does not fail over between cluster interconnects; if one is down, then the instances using them won't start.
78. I am trying to install Oracle Clusterware (10.2) and when I run the OUI, at the Specify Cluster Configuration screen, the Add, Edit and Remove buttons are grayed out, and nothing comes up in the cluster nodes either. Why?
Check for 3rd party vendor clusterware (such as Sun Cluster or Veritas Cluster) that was not completely removed, e.g. look for the /opt/ORCLcluster directory; it should be removed.
79. What happens if I lose my voting
disk(s)?
If you lose half or more of all of your voting disks, then nodes get evicted from the cluster, or nodes kick themselves out of the cluster. It doesn't threaten database corruption. Alternatively, you can use external redundancy, which means you are providing redundancy at the storage level using RAID.
For this reason, when using Oracle for the redundancy of your voting disks, Oracle recommends that customers use 3 or more voting disks in Oracle RAC 10g Release 2. Note: for best availability, the 3 voting files should be on physically separate disks. It is recommended to use an odd number, as 4 disks will not be any more highly available than 3 disks: half of 3 is 1.5, rounded up to 2, and half of 4 is 2, so once we lose 2 disks the cluster will fail whether we have 4 voting disks or 3.
Restoring corrupted voting disks is easy
since there isn't any significant persistent data stored in the voting disk.
80. How should I test the failure of the public network (i.e. Oracle VIP failover) in my Oracle RAC environment?
Prior to 10.2.0.3, it was possible to test VIP failover by simply running ifconfig <interface_name> down. The intended behaviour was that the VIP would fail over to another node. In 10.2.0.3 this is still the behaviour on Linux; however, on other operating systems the VIP will NOT fail over, instead the interface will be plumbed again. To test VIP failover on platforms other than Linux, the switch can be turned off or the physical cable pulled; this is the best way to test. NOTE: if you have other databases that share the same IPs, then they will be affected. Your tests should simulate production failures, which are generally switch errors or interface errors.
81. What is the voting disk used for?
A voting disk is a backup communications mechanism that allows CSS daemons to negotiate which sub-cluster will survive. These voting disks keep a status of who is currently alive and count votes in case of a cluster reconfiguration. It works as follows:
a) It ensures that you cannot join the cluster if you cannot access the voting disk(s).
b) A node leaves the cluster if it cannot communicate with the voting disk (to ensure we do not have aberrant nodes).
c) Should multiple sub-clusters form, it will only allow one to continue. It prefers the greater number of nodes, and secondly the node with the lowest incarnation number.
d) It is kept redundant by Oracle in 10g Release 2 (you need access to a majority of the existing voting disks). At most one sub-cluster will continue and a split brain will be avoided.
82. I am installing Oracle Clusterware with a 3rd party vendor clusterware, however on the "Specify Cluster Configuration" page the Oracle Clusterware installer doesn't show the existing nodes. Why?
This shows that Oracle Clusterware does not detect that the 3rd party clusterware is installed. Make sure you have followed the installation instructions provided by the vendor for integrating with Oracle RAC. Make sure LD_LIBRARY_PATH is not set.
For example, with Sun Cluster, make sure the libskgxn* files have been copied to the /opt/ORCLcluster directory. Check that lsnodes returns the correct list of nodes in the Sun Cluster.
83. Can I run the fixup script generated by the 11.2 OUI or CVU on a running system?
It depends on what problems were listed to be fixed. The fixup scripts can change system parameters, so you should not change system parameters while applications are running. However, if an earlier version of Oracle Database is already running on the system, there should not be any need to change the system parameters.
84. What should the permissions be set to for the voting disk and OCR when doing an Oracle RAC install?
The Oracle Real Application Clusters install guide is correct. It describes the PRE-INSTALL ownership/permission requirements for the OCR and voting disk. This step is needed to make sure that the Oracle Clusterware install succeeds. Please don't use those values to determine what the ownership/permissions should be POST INSTALL. The root script will change the ownership/permissions of the OCR and voting disk as part of the install. The POST INSTALL permissions will end up being: OCR - root:oinstall - 640; Voting Disk - oracle:oinstall - 644.
85. Oracle
Clusterware fails to start after a reboot due to permissions on raw devices
reverting to default values. How do I fix this?
After a successful installation of Oracle Clusterware, a simple reboot occurs and Oracle Clusterware fails to start. This is because the permissions on the raw devices for the OCR and voting disks, e.g. /dev/raw/raw{x}, revert to their default values (root:disk) and are inaccessible to Oracle. This change of behavior started with the 2.6 kernel: in RHEL4, OEL4, RHEL5, OEL5, SLES9 and SLES10. In RHEL3 the raw devices maintained their permissions across reboots, so this symptom was not seen.
The way to fix this on RHEL4, OEL4 and SLES9 is to create /etc/udev/permissions.d/40-udev.permissions (you must choose a number that's lower than 50). You can do this by copying /etc/udev/permissions.d/50-udev.permissions and removing the lines that are not needed (50-udev.permissions gets replaced with upgrades, so you do not want to edit it directly; also, a typo in 50-udev.permissions can render the system unusable). Example permissions file:
# raw devices
raw/raw[1-2]:root:oinstall:0640
raw/raw[3-5]:oracle:oinstall:0660
Note that this applies to all raw device files; here just the voting and OCR devices were specified.
On RHEL5, OEL5 and SLES10 a different file is used, /etc/udev/rules.d/99-raw.rules; notice that now the number must be higher than 50. Also, the syntax of the rules is different from that of the permissions file. Here's an example:
KERNEL=="raw[1-2]*", GROUP="oinstall", MODE="640"
KERNEL=="raw[3-5]*", OWNER="oracle", GROUP="oinstall", MODE="660"
86. Can the Network Interface Card (NIC)
device names be different on the nodes in a cluster, for both public and
private?
All public NICs must have the same name
on all nodes in the cluster. Similarly, all private NICs must also have the
same names on all nodes. Do not mix NICs with different interface types
(infiniband, ethernet, hyperfabric, etc.) for the same subnet/network.
87. What
are the Best Practices for using a clustered file system with Oracle RAC?
Can I use
a cluster file system for OCR, Voting Disk, Binaries as well as database files?
Oracle Best
Practice for using Cluster File Systems (CFS) with Oracle RAC
* Oracle Clusterware binaries should not be placed on a CFS, as this reduces cluster functionality while the CFS is recovering, and also limits the ability to perform rolling upgrades of Oracle Clusterware.
* Oracle Clusterware voting disks and the Oracle Cluster Registry (OCR) should not be placed on a CFS, as the I/O freeze during CFS reconfiguration can lead to node eviction, or cause cluster management activities to fail (i.e. start, stop, or check of a resource).
* Oracle Database 10g binaries are supported on a CFS for Oracle RAC 10g and for Oracle Database. The system should be configured to support multiple ORACLE_HOMEs in order to maintain the ability to perform a rolling patch application.
* Oracle Database 10g database files (e.g. datafiles, trace files, and archive log files) are supported on a CFS.
Check Certify for
certified cluster file systems.
Rolling Upgrades with
Cluster File Systems in General
It is not recommended to use a cluster
file system (CFS) for the Oracle Clusterware binaries. Oracle Clusterware
supports in-place rolling upgrades. Using a shared Oracle Clusterware home
results in a global outage during patch application and upgrades. A workaround
is available to clone the Oracle Clusterware home for each upgrade. This is not
common practice.
If a patch is marked for rolling upgrade, then it can be applied to an Oracle RAC database in a rolling fashion. Oracle supports rolling upgrades for Oracle Database Automatic Storage Management (ASM) after you have upgraded to Oracle Database 11g. When using a CFS for the database and ASM Oracle homes, the CFS should be configured to use context dependent links (CDSLs) or equivalent, and these should be configured to work in conjunction with rolling upgrades and downgrades. This includes updating the database and ASM homes in the OCR to point to the current home.
88. Do I need to have user equivalence
(ssh, etc...) set up after GRID/RAC is already installed?
Yes. Many assistants and scripts depend
on user equivalence being set up.
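For example, a quick sanity check of user equivalence from one node (node names here are hypothetical):
for node in node1 node2; do ssh $node date; done
If each node returns the date without prompting for a password or passphrase, user equivalence is working.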
89. Is Sun QFS supported with Oracle RAC? What about Sun GFS?
Check Certify for the latest details.
Sun Cluster - Sun StorEdge QFS (9.2.0.5 and higher, 10g and 10gR2): no restrictions on placement of files on QFS. Sun StorEdge QFS is supported for Oracle binary executables, database data files, archive logs, the Oracle Cluster Registry (OCR), and the Oracle Cluster Ready Services voting disk; the recovery area can also be placed on QFS.
Solaris Volume Manager for Sun Cluster can be used for host-based mirroring. It supports up to 8 nodes.
90. With GNS, do ALL public addresses have to be DHCP managed (public IP, public VIP, public SCAN VIP)?
No. The choice to use DHCP for the public IPs is made outside of Oracle. Oracle Clusterware and Oracle RAC will work with both static and DHCP-assigned IPs for the hostnames. When using GNS, Oracle Clusterware will use DHCP for all VIPs in the cluster, which means node VIPs and SCAN VIPs.
91. How is the Oracle Cluster Registry (OCR) stored when I use ASM?
The OCR is stored similarly to how Oracle Database files are stored. The extents are spread across all the disks in the diskgroup, and the redundancy (which is at the extent level) is based on the redundancy of the disk group. You can only have one OCR in a diskgroup. Best practice for ASM is to have 2 diskgroups; best practice for the OCR in ASM is to have a copy of the OCR in each diskgroup.
92. When
does the Oracle node VIP fail over to another node and subsequently return to
its home node?
The handling of the VIP with respect to a
failover to another node and subsequent return to its home node is handled
differently depending on the Oracle Clusterware version. In general, one can
distinguish between Oracle Clusterware 10g & 11g Release 1 and Oracle
Clusterware 11g Release 2 behavior.
For Oracle Clusterware 10g & 11g
Release 1 the VIP will fail over to another node either after a network or a node
failure. However, the VIP will automatically return to its home node only after
a node failure and a subsequent restart of the node. Since the network is not
constantly monitored in this Oracle Clusterware version, there is no way that
Oracle Clusterware can detect the recovery of the network and initiate an automatic
return of the node VIP to its home node.
Exception: With Oracle Patch Set
10.2.0.3 a new behavior was introduced that allowed the node VIP to return to
its home node after the network recovered. The required network check was part
of the database instance check. However, this new check introduced quite some
side effects and hence, was disabled with subsequent bundle patches and the
Oracle Patch Set 10.2.0.4
Starting
with 10.2.0.4 and for Oracle Clusterware 11g Release 1 the default behavior is
to avoid an automatic return of the node VIP to its home node after the network
recovered. This
behavior can be activated, if required, using the "ORA_RACG_VIP_FAILBACK"
parameter. This parameter should only be used after reviewing support note
805969.1 (VIP does not relocate back to the original node starting from 10.2.0.4
and 11.1 even after the public network problem is resolved.)
With Oracle Clusterware 11g
Release 2 the
default behavior is to automatically initiate a return of the node VIP to its
home node as soon as the network recovered after a failure. It needs to be
noted that this behavior is not based on the parameter mentioned above and therefore
does not induce the same side effects.
Instead, a new network resource is used
in Oracle Clusterware 11g Release 2, which monitors the network constantly,
even after the network failed and the resource became "OFFLINE". This
feature is called "OFFLINE resource monitoring" and is per default
enabled for the network resource.
93. How do I protect the OCR and voting disk in case of media failure?
In Oracle Database 10g Release 1, the OCR and voting device are not mirrored within Oracle; hence both must be mirrored via a storage vendor method, like RAID 1.
Starting with Oracle Database 10g Release 2, Oracle Clusterware will multiplex the OCR and voting disk (two copies of the OCR and three voting disks).
94. How do
I use multiple network interfaces to provide High Availability and/or Load
Balancing for my interconnect with Oracle Clusterware?
This needs to be done externally to Oracle Clusterware, usually by some OS-provided NIC bonding, which gives Oracle Clusterware a single IP address for the interconnect but provides failover (high availability) and/or load balancing across multiple NIC cards. These solutions are provided externally to Oracle at a much lower level than Oracle Clusterware, hence Oracle supports using them; the solutions are OS dependent, and therefore the best source of information is your OS vendor. However, there are several articles on Metalink on how to do this. For example, for Sun Solaris search for IPMP (IP network MultiPathing).
Note: customers should pay close attention to the bonding setup/configuration/features and ensure their objectives are met, since some solutions provide only failover, some only load balancing, and still others claim to provide both. It's always important to test your setup to ensure it does what it was designed to do.
When bonding with network interfaces that connect to separate switches (for redundancy), you must test whether the NICs are configured for active/active mode. The most reliable configuration for this architecture is to configure the NICs for active/passive, as in the sketch below.
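As a sketch, a RHEL-style active-backup bonding configuration might look like the following (device names, addresses, and file paths are illustrative; consult your OS vendor's documentation for the supported procedure):
# /etc/modprobe.conf
alias bond0 bonding
options bond0 mode=active-backup miimon=100
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.10.1
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
# /etc/sysconfig/network-scripts/ifcfg-eth1 (and similarly for the second NIC)
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none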
95. Is server-side load balancing supported/recommended/proven technology in Oracle E-Business Suite?
Yes, customers are using it successfully today. It is recommended to set up both client- and server-side load balancing. Note that for the pieces coming from the 8.0.6 home (Forms and CCM), connections are directed to a RAC instance based on the sequence in which it is listed in the TNS entry description list and may not get load balanced optimally. For Oracle RAC 10.2 or higher, do not set PREFER_LEAST_LOADED_NODE = OFF in your listener.ora; instead, set the CLB_GOAL on the service.
96. What is the maximum number of nodes under OCFS on Linux?
Oracle 9i RAC on Linux, using OCFS for datafiles, can scale to a maximum of 32 nodes. According to the OCFS2 User's Guide, OCFS2 can support up to 255 nodes.
97. Can I use OCFS with SE Oracle RAC?
It is not supported to use OCFS with Standard Edition Oracle RAC. All database files must use ASM (redo logs, recovery area, datafiles, control files, etc.). You cannot place binaries on OCFS under the SE Oracle RAC terms. We recommend that the binaries and trace files (non-ASM-supported files) be replicated on all nodes. This is done automatically by the installer.
98. Can I use TAF with e-Business in a RAC environment?
TAF itself does not work with the e-Business Suite due to Forms/TAF limitations, but you can configure the TNS failover clause. On instance failure, when the user logs back into the system, their session will be directed to a surviving instance, and the user will be taken to the navigator tab. Their committed work will be available; any uncommitted work must be restarted.
We also recommend you configure the Forms error URL to identify a fallback middle-tier server for Forms processes, if no router is available to accomplish switching across servers.
99. How do I configure the concurrent manager in a RAC environment?
Large clients commonly put the concurrent manager on a separate server now (in the middle tier) to reduce the load on the database server. The concurrent manager programs can be tied to a specific middle tier (e.g., you can have CMs running on more than one middle-tier box). It is advisable to use specialized CMs. CM middle tiers are set up to point to the appropriate database instance based on the product module being used.
100. Should
functional partitioning be used with Oracle Applications?
We do not recommend functional
partitioning unless throughput on your server architecture demands it. Cache
fusion has been optimized to scale well with non-partitioned workload.
If your processing requirements are extreme and your testing proves you must partition your workload in order to reduce internode communications, you can use Profile Options to designate that sessions for certain application Responsibilities are created on a specific middle-tier server. That middle-tier server would then be configured to connect to a specific database instance.
To determine the correct partitioning for
your installation you would need to consider several factors like number of
concurrent users, batch users, modules used, workload characteristics etc.
101. Can I use Automatic Undo Management
with Oracle Applications?
Yes. In a RAC environment we highly
recommend it.
102. What is the optimal migration path to be used while migrating the E-Business Suite to Oracle RAC?
The following is the recommended and most optimal path to migrate your E-Business Suite to an Oracle RAC environment:
1. Migrate the existing application to new hardware (if applicable).
2. Use a clustered file system (ASM recommended) for all database files, or migrate all database files to raw devices. (Use dd for Unix or ocopy for NT.)
3. Install/upgrade to the latest available e-Business Suite.
4. Ensure the database version is supported with Oracle RAC.
5. In step 4, install the Oracle RAC option and use the installer to perform the install for all the nodes.
6. Clone the Oracle Application code tree.
103. How do I gather all relevant Oracle
and OS log/trace files in an Oracle RAC cluster to provide to
Support?
Use RAC-DDT (RAC Diagnostic Data Tool); the User Guide is in Note 301138.1. Quote from the User Guide:
RACDDT is a data collection tool designed and configured specifically for gathering diagnostic data related to Oracle's Real Application Cluster (RAC) technology. RACDDT is a set of scripts and configuration files that is run on one or more nodes of an Oracle RAC cluster. The main script is written in Perl, while a number of proxy scripts are written using Korn shell. RACDDT will run on all supported Unix and Linux platforms, but is not supported on any Windows platforms.
Newer versions of RDA (Remote Diagnostic
Agent) have the RAC-DDT functionality, so going forward RDA is the tool of
choice.
104. My customer wants to understand what type of disk caching they can use with their Windows RAC cluster; the install guide tells them to disable disk caching?
If the write cache identified is local to the node, then that is bad for RAC. If the cache is visible to all nodes as a 'single cache', typically in the storage array, and is also 'battery backed', then that is OK.
105. My
customer has a failsafe cluster installed, what are the benefits of moving
their system to RAC?
Fail Safe development is continuing. Most work on the product will be around accommodating changes in the supported resources (new releases of the RDBMS, AS, etc.) and the underlying Microsoft Cluster Services and Windows operating system.
A Fail Safe protected instance is an active/passive instance and, as such, does not benefit much at all from adding more nodes to a cluster. Microsoft has a limit on the number of nodes in an MSCS cluster (typically 8 nodes, but it does vary). RAC is active/active, so you get the dual benefits of increased scalability and availability every time you add a node to a cluster. We have a limit of 100 nodes in a RAC cluster (we don't use MSCS). Your customer should really consider more than 2 nodes (because of the aggregate computing power available on node failure). If the choice is two 4-CPU nodes or four 2-CPU nodes, I would go for the 2-CPU nodes. Customers are using both Windows Itanium RAC and Windows x64 RAC; Windows x64 seems more popular.
Keep in mind, though, that for Fail Safe,
if the server is 64-Bit, regardless of flavor, Fail Safe Manager must be
installed on a 32-Bit client, which will complicate things just a bit. There is
no such restriction for RAC, as all management for RAC can be done via Grid
Control or Database Control. For EE RAC you can implement an 'extended cluster'
where there is a distance between the nodes in the cluster (usually less than
20 KM).
106. Do I need HACMP/GPFS to store my OCR/voting file on a shared device?
The prerequisites doc for AIX clearly says: "If you are not using HACMP, you must use a GPFS file system to store the Oracle CRS files."
==> This is a documentation bug and will be fixed with 10.1.0.3.
Note also that on AIX it is important to use reserve_lock=no / reserve_policy=no_reserve per shared, concurrent device in order to allow AIX to access the devices from more than one node simultaneously. Check the current setting using: "/usr/sbin/lsattr -El hdiskn | grep reserve".
Depending on the type of storage used, the command should return "no_reserve" or a similar value for all disks meant to be used for Oracle RAC. If required, use the /dev/rhdisk devices (character special) for the CRS and voting disks, and change the attribute with the following command:
chdev -l hdiskn -a reserve_lock=no
(For ESS, EMC, HDS, CLARiiON, and MPIO-capable devices you have to do chdev -l hdiskn -a reserve_policy=no_reserve.)
107. Is
VIO supported with RAC on IBM AIX?
VIO is supported on IBM AIX.
108. Is HACMP needed for RAC on AIX 5.2 using the GPFS file system?
The newest version of GPFS can be used without HACMP; if it is available for AIX 5.2, then you do not need HACMP.
109. Can I run Oracle RAC 10g on my IBM Mainframe Sysplex environment (z/OS)?
YES! There is no separate documentation for RAC on z/OS. What you would call "clusterware" is built into the OS, and the native file systems are global. IBM z/OS documentation explains how to set up a Sysplex Cluster; once the customer has done that, it is trivial to set up a RAC database. The few steps involved are covered in Chapter 14 of the Oracle for z/OS System Admin Guide, which you can read here. There is also an Install Guide for Oracle on z/OS (here), but I don't think there are any RAC-specific steps in the installation. By the way, RAC on z/OS does not use Oracle's clusterware (CSS/CRS/OCR).
110. Can I
use Oracle Clusterware for failover of the SAP Enqueue and VIP services when
running SAP in a RAC environment?
Oracle has created sapctl to do this, and it is available for certain platforms. SAPCTL will be available for download on the SAP Services Marketplace for AIX and Linux. For Solaris, it will not be available in 2007; use Veritas or Sun Cluster.
111. Does
the Oracle Cluster File System (OCFS) support network access through NFS or
Windows Network Shares?
No, in the current release the Oracle
Cluster File System (OCFS) is not supported for use by network access approaches
like NFS or Windows Network Shares.
112. Why
should I use RAC One Node instead of Oracle Fail Safe on Windows?
Oracle RAC One Node provides better high
availability than Oracle Fail Safe. RAC One Node's ability to online relocate a
database offers protection from both unplanned failures and maintenance
outages. Fail Safe only protects from failures and cannot online relocate a
database. RAC One Node supports online maintenance operations such as online
database patches, online OS patches and upgrades, online database relocation
for load balancing, online server migrations, and online upgrade to full RAC.
In an environment where it is difficult to get windows of downtime for maintenance, this is a big advantage. Also, whereas Fail Safe is only available on Windows, RAC One Node is available on all platforms. A customer with a mixed-platform environment would benefit from having a standard HA solution across all their platforms.
113. Can I configure HP's Auto Port Aggregation for NIC bonding after the install (i.e. not present beforehand)?
You are able to add NIC bonding after the installation, although this is more complicated than the other way round. There are several notes on WebIV regarding this; see Note 276434.1, "Modifying the VIP of a Cluster Node". Regarding the private interconnect, use oifcfg delif / setif to modify it, as sketched after the reference below.
Configure Redundant Network Cards /
Switches for Oracle Database 10g Release 1 Real Application Cluster on Linux
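A sketch of changing the interconnect registration with oifcfg (the interface name and subnet below are hypothetical):
oifcfg getif
oifcfg delif -global eth1
oifcfg setif -global bond0/192.168.10.0:cluster_interconnect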
114. What do I do when I get an ORA-01031 error logging into the ASM instance?
This sounds like the ORA_DBA group on Node2 is empty, or else does not have the correct username in it. Double-check what user account you are using to log on to Node2 (a 'set' command will show you the USERNAME and USERDOMAIN values) and then make sure that this account is part of ORA_DBA.
The other thing to check is that SQLNET.AUTHENTICATION_SERVICES=(NTS) is set in the SQLNET.ORA.
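For example, from a Windows command prompt (standard Windows tools; output varies by environment):
C:\> set USERNAME
C:\> set USERDOMAIN
C:\> net localgroup ORA_DBA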
115. The OracleCRService does not start with my Windows Oracle RAC implementation; what do I do?
If OracleCRService doesn't start, that's quite a different issue than, say, OracleCSService not starting, because, due to dependencies, this is the last of the three Oracle Clusterware services that we expect to start.
This could be caused by a few different things. It could be caused by a change to auto-negotiate instead of 100/full on the interconnect; once set back to 100/full on all NICs, as well as on the network switch associated with the interconnect, the problem is resolved. This could also be: an inability to access the shared disk housing your OCR; a permissions issue; or Bug 4537790, which introduced OPMD to begin with, which for reference's sake was logged against 9.2.0.8 and is still relevant in 10.2.0.3 times. For OPMD, see Metalink Note 358156.1.
116. How do I verify that Host Bus Adapter node-local caching has been disabled for the disks I will be using in my RAC cluster?
Disabling write caching is a standard practice when the volume managers/file systems are shared. Go to My Computer -> Manage -> Storage -> Disk Management -> Disk -> Properties -> Policies and uncheck "Enable Write Caching on Disk". This will disable the write caching.
3rd party HBAs may have their own management tools to modify these settings. Just remember that centralized, shared cache is generally OK; it's the node-local cache that you need to turn off. How exactly you do this will vary from HBA vendor to HBA vendor.
117. Can I run my Oracle 9i RAC and Oracle RAC 10g on the same Windows cluster?
Yes, but the Oracle 9i RAC database must have the 9i Cluster Manager, and you must run Oracle Clusterware for the Oracle Database 10g. 9i Cluster Manager can coexist with Oracle Clusterware 10g. Be sure to use the same 'cluster name' in the appropriate OUI field for both 9i and 10g when you install both together in the same cluster.
The OracleCMService9i service will remain intact during the Oracle Clusterware 10g install; as an Oracle 9i RAC database requires the 9i OracleCMService9i, it should be left running. The information for the 9i database will get migrated to the OCR during the Oracle Clusterware installation. Then, for future database management, you would use the 9i srvctl to manage the 9i database, and the 10g srvctl to manage any new 10g databases. Both srvctl commands will use the OCR. The same applies for Oracle RAC 11g.
118. When
using MS VSS on Windows with Oracle RAC, do I need to run the VSS on each node
where I have an Oracle RAC instance?
There is no need to run an Oracle VSS writer instance on each Oracle RAC node (even though it is installed and enabled by default on all nodes), and the documentation in the Windows Platform Doc for the Oracle VSS writer is applicable to Oracle RAC as well.
The ability of the clustered file system to create a Windows shadow copy is a MUST for backing up an Oracle RAC database using the Oracle VSS writer. The only other requirement is that all the archived logs generated by the database must be accessible on the node where the backup is initiated using the Oracle VSS writer.
VSS coordinates the storage snapshot of database files: the VSS writer places the database in hot backup mode so that the VSS provider can initiate the snapshot, so RMAN is not backing up anything in this case. When a VSS restore of a database is issued, the writer automatically invokes RMAN to perform the needed recovery actions after the snapshot is restored by the provider; that is the real value-add of the writer.
119. How do I configure raw devices in order to install Oracle Clusterware 10g on RHEL5 or OEL5?
The raw devices OS support scripts like /etc/sysconfig/rawdevices are not shipped on RHEL5 or OEL5; this is because raw devices are being deprecated on Linux. This means that in order to install Oracle Clusterware 10g you have to manually bind the raw devices to the block devices for the OCR and voting disks so that the 10g installer will proceed without error.
Refer to Note 465001.1 for exact details
on how to do the above.
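A rough sketch of the manual binding (block device names here are hypothetical; Note 465001.1 has the exact procedure, including persisting the bindings across reboots):
raw /dev/raw/raw1 /dev/sdb1
raw /dev/raw/raw2 /dev/sdc1
chown root:oinstall /dev/raw/raw1
chmod 640 /dev/raw/raw1
chown oracle:oinstall /dev/raw/raw2
chmod 660 /dev/raw/raw2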
Oracle Clusterware 11g doesn't require
this configuration since the installer can handle block devices directly.
120. How do I reorder or rename logical network interface (NIC) names in Linux?
Although this is rarely needed, since most hardware will detect the cards in the correct order on all nodes, if you still need to change or control the ordering, this can be done with udev rules; see your distribution's documentation for help on writing udev rules.
121. Can I configure IPMP in Active/Active to increase the bandwidth of my interconnect?
For IPMP active/active configurations, please follow the Sun doc instructions: http://docs.sun.com/app/docs/doc/816-4554/6maoq027i?a=view
IPMP active/active is known to load balance on transmit but serialize on a single interface for receive, so you are likely not to get the throughput you might have expected. Unless you experience explicit bandwidth limitations that require active/active, it is a recommended best practice to configure for maximum availability, as described in WebIV Note 283107.1.
Please note too that debugging active/active interfaces at the network layer is cumbersome and time consuming. In an active/active configuration, if the switch-side link fails, you are likely to lose both interconnect connections, whereas with active/standby you would fail over.
122. Does Sun Solaris have a multipathing solution?
Sun Solaris includes an inherent multipathing tool, MPxIO, which is part of Solaris. You need to have the SAN Foundation Kit installed (newest version). Please be aware that the machines should be installed following the EIS standard. This is a quality assurance standard introduced by Sun that mainly ensures you always have the newest patches.
MPxIO is free of charge and comes with Solaris 8, 9, and 10. By the way, if you have a Sun LVM, it would use this feature indirectly. Sun has therefore confirmed that MPxIO will work with raw devices.
123. Are Red Hat GFS and GULM certified for DLM?
Both are part of Red Hat RHEL 5. Oracle Database 10g Release 2 on Linux x86 and Linux x86-64 is certified on OEL5 and RHEL5 as per Certify. GFS is not certified yet; certification is in progress by Red Hat. OCFS2 is certified and is the preferred choice for Oracle. ASM is the recommended storage for the database. Since GFS is part of the RHEL5 distribution and Oracle fully supports RHEL under the Unbreakable Linux Program, Oracle will support GFS as part of RHEL5 for customers buying Unbreakable Linux Support. This only applies to RHEL5 and not to RHEL4, where GFS is distributed for an additional fee.
124. In Solaris 10, do we need Sun Cluster to provide redundancy for the interconnect and multiple switches?
Link Aggregation (GLDv3) is bundled in the OS as of Solaris 10. IPMP is available for Solaris 10 and Solaris 9. Neither requires Sun Cluster to be installed. For interconnect and switch redundancy, as a best practice, avoid VLAN trunking across the switches. We can configure stand-alone redundant switches that do not require the VLAN to be trunked between them, nor the need for an inter-switch link (ISL). If the interconnect VLAN is trunked with other VLANs between the redundant switches, ensure that the interconnect VLAN is pruned from the trunk to avoid unnecessary traffic propagation through the corporate network. For ease of configuration (e.g. fewer IP address requirements), use IPMP with link-mode failure detection in a primary/standby configuration. This will give you a single failover IP which you will define in the cluster_interconnects init.ora parameter. Remove any interfaces for the interconnect from the OCR using `oifcfg delif`, and test this rigorously. For now, as Link Aggregation (GLDv3) cannot span multiple switches from a single host, you will need to configure the switch redundancy and the host NICs with IPMP.
When configuring IPMP for the interconnect with multiple switches available, configure IPMP as active/standby and *not* active/active. This is to avoid potential latencies in switch failure detection/failover which may impact the availability of the RDBMS. Note that IPMP spreads/load balances outbound packets on the bonded interfaces, but inbound packets are received on a single interface; in an active/active configuration this makes send/receive problems difficult to diagnose. Both Link Aggregation (GLDv3) and IPMP are core OS packages (SUNWcsu and SUNWcsr respectively) and do not require Sun Clusterware.
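A sketch of an active/standby IPMP setup on Solaris (interface names, group name, and address are hypothetical):
# /etc/hostname.ce0 (primary)
192.168.10.1 netmask + broadcast + group interconnect_grp up
# /etc/hostname.ce1 (standby)
group interconnect_grp standby up
The resulting failover address would then be listed in the cluster_interconnects init.ora parameter, as described above.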
125. Is
OCFS2 certified with Oracle RAC 10g?
Yes. See Certify to find out which
platforms are currently certified.
126. How do I configure my RAC Cluster to use RDS over Infiniband?
The configuration takes place below Oracle; you need to talk to your Infiniband vendor. Check Certify for what is currently available, as this will change as vendors adopt the technology. The database must be at least 10.2.0.3. If you want to switch a database running with IP over IB to RDS, you will need to relink Oracle:
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_rds ioracle
You can check your interconnect through
the alert log at startup. Check for the string “cluster interconnect IPC
version:Oracle RDS/IP (generic)” in the alert.log file. See Note: 751343.1 for
more details.
127. Can different releases of Oracle RAC be installed and run on the same physical Linux cluster?
Yes. However, Oracle Clusterware (CRS) will not support an Oracle 9i RAC database, so you will have to leave the current configuration in place. You can install Oracle Clusterware and Oracle RAC 10g or 11g into the same cluster. On Windows and Linux, you must run the 9i Cluster Manager for the 9i database and Oracle Clusterware for the 10g database. When you install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Oracle 9i RAC, Oracle RAC 10g, and Oracle RAC 11g will use the OCR. Do not restart the 9i gsd after you have installed Oracle Clusterware. Remember to check Certify for details of what vendor clusterware can be run with Oracle Clusterware. Oracle Clusterware must be the highest level (down to the patchset): e.g. Oracle Clusterware 11g Release 2 will support Oracle RAC 10g and Oracle RAC 11g databases, while Oracle Clusterware 10g can only support Oracle RAC 10g databases.
128. Is
3rd Party Clusterware supported on Linux such as Veritas or Redhat?
No, Oracle RAC 10g and Oracle RAC 11g do
not support 3rd Party clusterware on Linux. This means that if a cluster file
system requires a 3rd party clusterware, the cluster file system is not
supported.
129. Can
the Oracle Database Configuration Assistant (DBCA) be used to create a database
with Veritas DBE / AC 3.5?
DBCA can be used to create databases on
raw devices in 9i RAC Release 1 and 9i Release 2. Standard database creation
scripts using SQL commands will work with file system and raw. DBCA cannot be
used to create databases on file systems on Oracle 9i Release 1. The user can
choose to set up a database on raw devices, and have DBCA output a script. The
script can then be modified to use cluster file systems instead.
With Oracle 9i RAC Release 2 (Oracle
9.2), DBCA can be used to create databases on a cluster filesystem. If the
ORACLE_HOME is stored on the cluster filesystem, the tool will work directly.
If ORACLE_HOME is on local drives on each system, and the customer wishes to
place database files onto a cluster file system, they must invoke DBCA as
follows: dbca -datafileDestination /oradata where /oradata is on the CFS
filesystem. See 9iR2 README and bug 2300874 for more info.
130. Is Oracle Database on VMware supported? Is Oracle RAC on VMware supported?
Oracle Database support on VMware is outlined in Metalink Note 249212.1. Effectively, for most customers, this means they are not willing to run production Oracle databases on VMware. Regarding Oracle RAC: the explicit mention not to run RAC on VMware was removed in 11.2.0.2 (November 2010).