Monday, 11 June 2018

Interview Q and A for Oracle Performance Tuning Part - 7

381. Does AWR report provide any PQ stats/usage?
Yes. It reports PX wait events so you can see time spent waiting for them. This can help you assess whether this is reasonable or too long is being spent. There are also statisitcs displayed under "Instance Activity Stats" related to Parallel Query for example "PX local messages recv'd", "PX local messages sent" , "Parallel operations downgraded to ..."

382. How accurate are the advisory sections in the AWR ?
The advisory sections are based upon data collected in a particular snapshot period that the report was run against. As such they provide good advice based upon actual collected evidence.
If you choose to act upon the advice, as with any changes, verification, testing and commons sense should be used.

383. What is the reference time/values to consider in the AWR for checking any problem?
There is not really a hard limit for all cases.It is better to look at what the top waits are overall on the system and then examine these to see what is causing them. By looking at the top waits you are looking at a combination of wait time and frequency. A small number of longer waits may not be an issue overall. A larger number of medium waits may have a far greater impact on the system.

384. If we have a deadlock, can we find out in an ADDM, AWR or ASH report?
No. These reports are not useful for that purpose. If deadlocks are occurring, you may see more waits for enqueues than normal, but if only a couple of sessions are involved this would likely be insignificant in the context of the whole system. If you are encountering deadlock please refer to the following article:
Document 60.1 Troubleshooting Assistant: Oracle Database ORA-00060 Errors on Single Instance (Non-RAC) Diagnosing Using Deadlock Graphs in ORA-00060 Trace Files

385. Is it possible to report the activity/access for a certain table using AWR/ASH/ADDM?
AWR records database wide information, so other than showing the top SQL generally, that may happen to be against the table you are interested in, it does not specifically report on them.
ASH data reports on the session history so you may be able to retrieve the information that you want from there. If you use the Enterprise Manager interface to the AWR data then this provides more flexibility than can be provided by the AWR report itself.

386. How can you obtain the CPU consumption by a user for each hour from AWR?
The AWR report itself is showing you information from the whole database at once. It does not provide you with information at the individual user level. You could see session information from an ASH report. If you use the Enterprise Manager interface to the AWR data then this provides more flexibility than can be provided by the AWR report itself.

387. Is there a way to find all the memory used the by the database including all the user processes within a certain time frame?
Not with AWR, no. The memory statistics reported by AWR are a comparison between the start and the end of a period.
It is possible for memory to have been allocated and de-allocated within a period and for the statistics to remain the same making it look as though nothing has changed in the period.
Additionally, for a user process, the AWR report only reports PGA statistics and that process may be taking up additional server memory at the operating system level that is not covered or reported. This means that you might not get the total usage for the database from AWR. You can get a full picture of memory usage at the O/S level using OSWatcher.

388. How is ADDM/ASH related to AWR?
The AWR is the repository where workload information is automatically recorded. This workload information contains information about the database as a whole and also about individual sessions. It can be reported on in different fashions.
The AWR report generates output showing the base statistics from the AWR for a given period for the whole database.
The ADDM report uses AWR information for a given period for the whole database in order to present the findings (in the form of a report) in the form of more readable advice.
The ASH report generates output based upon the Active Session History rather than the whole database
389. I have found the addmrpt.sql goes directly to the problem. But it seems like awrrpt.sql gives more detail, but I have found addmrpt.sql more useful. Any comments on this?
AWR contains a detailed set of information that ADDM analyses and reports on. ADDM uses the information from AWR to formulate actions. ADDM does not necessarily use all the information
in every case since AWR is comprehensive in its coverage and the majority of information in AWR is not applicable all the time for performance improvement.

390. In terms of ADDM and SGA sizing the ADDM always seems to suggest SGA is undersized, how reliable it this as a counter? Is there a better way to determine the optimal SGA size?
The ADDM report is working from the basis that you are looking at it because you have a performance problem. This means it is somewhat more strict than it might be if it sees that there is insufficient memory to store everything. There is a misconception that it "always suggests the SGA is undersized". It does not. If you are encountering memory issues then this information would be useful. If you are not encountering memory issues but have other problems then this advice may be less important. It is difficult to differentiate between cases where the SGA is making a difference or not based on automated information so a shortage is reported if there is one and then this can be dealt with as necessary.
The SGA Target Advisory in the AWR report provides further information regarding the difference that changes to the SGA size is predicted to make.
If you consistently feel that you want ADDM to report this less frequently, there is an example of how to achieve this with DBMS_ADDM.INSERT_FINDING_DIRECTIVE

391. How can confirm if my database has enough memory allocated to it?
The Advisory Statistics section of the AWR report is a good place to start. But, generally, if you do not have much in the way of waits for buffers (data blocks in memory) or reloads (of statement text), you have enough memory.
ADDM will also provide advice for a related AWR report period.

392. Can I reduce cache waits by adding RAM to the database?
There is no direct answer to that, you should investigate *why* there are cache waits first. If a shortage of RAM is not the problem then adding more will make no difference

393. What is the best way to determine that the hardware is not enough and more needs to be added?
The ADDM report will provide useful information in this area. You can also use OSWatcher to get information about how the operating system is performing.

394. Would adding new HW make all these events go away, for example adding cpu or memory?
Not necessarily, no. Adding hardware would only help if the hardware was constraining the activity that you are trying to perform. If you are contending with other processes for a particular resource then adding more unrelated hardware will not help.

395. What does swapping mean?
Swapping is when memory is written to swap space on disk (either by pages or whole processes memory space).

396. Are you more likely to encounter swapping is you do not have sufficent RAM, especially in the case of multiple instances on the same machine?
Not necessarily - some operating systems will actively swap pages of memory. This is usually not a problem. Only when whole (key) processes are swapped does it becomes an issue.

397. Should I get worried if I see that the system is swapping, does that indicate a need to add more physical RAM?
It may mean that. This is usually the case, but not always. Your memory capacity should be planned for and swapping should not happen but because you probably do not have infinite capacity on your system you will occassionally encounter swapping in a normal environment.

398. What is the performance impact of running ASH 4 times in an hour?
The concept of "running" ASH is little different than AWR. AWR captures "pictures" (snapshots) of the system every N times, ASH just flushes some memory info to disk. Thus ASH is a much more lightweight activity event and does not place any considerable load on the system.

399. Can an ASH report generate the same level of detail as a 10046 trace?
No - that is not possible. It is insufficiently granular and would be misleading (it does not sample enough data). Remember that the ASH report is sampling data from the outside. Trace is writing everything that happens as it progresses through the code.

400. At what time interval is ASH captured from memory?
The data stored in memory is sampled every second. When the data is stored in the DBA_HIST_ACTIVE_SESS_HISTORY one in 10 of these is recorded (i.e. it is recorded every 10 seconds by default)

401. How can you reduce the 10 second time for ASH samples stored in dba_hist_active_sess_history ?
There is a hidden parameter that you can change to alter from the default 10 (one in 10 - ie every 10 seconds) to another value. The parameter is _ash_disk_filter_ratio.  Note that all underscore parameter changes should be cleared through Support first so make sure that you have good reason to modify this parameter.
For example setting this to 5 would collect information every 5 seconds. Please be aware that changing this parameter will affect the amount of session history data that you are storing and so you may need to alter your storage accordingly.

402. What is the minimum interval possible by AWR snapshot?
There is no minimum. But you need to make a call on how scalable and useful collecting at that frequency is based upon your application and storage limitations.

403. The interval for AWR reports are normally generated using the snapshots created automatically (eg. every 30 mins). Is there a way to manually create snapshots so I can generate AWR with any interval?
Yes. You can create manual snapshots whenever you like. Please look into DBMS_WORKLOAD_REPOSITORY package

404. If we had 4 AWR reports of 15 minutes each, then does that mean we do not need ASH reports?
No. ASH and AWR reprots cover different areas. AWR reports deal with the database as a whole. ASH focuses on individual sessions.

405. Can we use a sql tuning task on AWR history information?
Yes, you can supply a SQLID and snap_ids to DBMS_SQLTUNE.CREATE_TUNING_TASK()

406. Can the AWR run timeframe be modified to run for periods in the past?
You can only report on periods where you have already collected information (in the form of snapshots). So you can report on historic snapshots but you cannot change the times of them.

407. Is there any way to generate multiple hourly AWR reports at once instead of generating them one at a time interactively?
There is no direct routine to generate reports for every snapshot taken in a range of snapshots. You could however create a pl/sql loop and run the report for each snapshot pair. i.e. in pseudo code:
for i in begin_day_snap .. end_day_snap
loop
exec dbms_workload_repository.awr_report(..)
end loop

408. Is there a way we can automatically send old snapshot of all prodcution DB's to a remote DB before they get purged based on retention?
There is no automated mechanism like this available now.

410. If it is not automatic, is there a way to transfer snapshots to Remote DB's which can store snap of all production DB's manually? How can we implement it?
To archive AWR data in an archive DB - one option could probably be to replicate the relevant AWR tables in the AWR repository from the main database to the archive DB (via STREAMS / Goldengate). Remember that you would need to filter and delete operations if you were purging from the source database so as not to also purge the target.
Alternatively, you can export awr data from one database and import it into another database. One usage of this would be to allow you to do comparison reporting. If these suggestions do not help you then you probably will need to ask in the Community or file a Service Request.

411. If a database has been created as a copy of another database it may contains 'cloned' AWR information. Is it possible to clear the old cloned instance AWR/ASH report history from database.
To clear up space, have a look at articles like :
Document 552880.1 General Guidelines for SYSAUX Space Issues
Document 1292724.1 Suggestions if your SYSAUX Tablespace grows rapidly or too large
If these suggestions do not help you then you probably will need to ask in the Community or file a Service Request.

412. Is there a best way to cleanup the AWR data after DBID changes?
Document 1251795.1 How To remove Workload Repository information related to an old DBID from workload repository

413. What products/feature packs need to be licensed to use AWR/ASH/ADDM ?
AWR/ASH and ADDM reports are all based upon data collected in the Automatic Workload Repository. Any use or interpretation of that data needs a diagnostic pack license, see:Document 1490798.1 AWR Reporting - Licensing Requirements Clarification

414. Can AWR/ASH/ADDM be used on Oracle Standard Edition?
No. Automatic Workload Repository (AWR) is an Enterprise Edition feature. To find out about this and other features see:
Document 1361401.1 Where to Find Information About Performance Related Features

415. Do we need to have a separate license to use AWR/ASH/ADDM from OEM ?
No. It needs the diagnostic pack license regardless of how it is accessed
416. Do we have any healthcheck reports to check overall DB performance issue besides running ADDM/AWR reports for bad period time?
There is a general database health monitor to check the database. Details can be found in the following article:
Document 466920.1 Health monitor
There is no specific performance health check report. An ADDM report should provide good recommendations for this and AWR is the best way to get a baseline for good performance.

417.Are there any special considerations for the Automatic Workload Repository (AWR) when migrating databases to 11g?
No. There are no AWR specific considerations

418. AWR does not give recommendations on things found in the report. How do I get recommendations?
Use the ADDM report to assist you to interpret the information. This report does numerous comparisons to determine improvement actions.

419. Is it possible to baseline AWR reports and is this worthwhile?
Yes, you can use any meaningful set of snapshot as a baseline and then use this to compare that with other reports. You can use the "good" period from your system to compare with the "bad" one, in this way you have a reliable expectation about what your healthy system looks like. The easiest way to do this is via Enterprise Manager. Basic commands are also mentioned in the following article:
Document 748642.1 How to Generate an AWR Report and Create Baselines
A good period could be defined as when the performance is considered "good enough" to meet business requirements

420. Why does generating an AWR report in HTML format takes much more time than AWR in TXT format?
Generating a HTML report is more complex since more formatting is required to produce the tags and internal links. If it takes MUCH longer than a TXT report then please log a Service Request.

421. Where can we get more information on specific events and system calls ?
The most common wait events are documented in the reference manual and in MOS you can find documents with additional details. For example, you can find information about 'library cache load lock' wait event in detail in :
Document 444560.1 Troubleshooting Library Cache: Lock, Pin and Load Lock
Note that there are no specific "bad" wait events. Whatever your top consumer is, it needs investigation even if it is just CPU (for example, you may be wasting CPU cycles with poorly tuned SQLs)

422. What is the difference between latch and lock?
A latch is an internal mechanism used by Oracle to protect internal structures from modification during concurrent activity.
A lock is a mechanism provided by Oracle to protect user data from concurrent activity while it is being changed.
The goal of both is to make sure that only a single process can modify a resource at the same time. Both are serialisation mechanisms. Document 22908.1 FAQ: What are Latches and What Causes Latch Contention

423. Is oradebug still being used?
Yes. The ORADEBUG utility is used primarily by Oracle Support to diagnose problems that may arise with an Oracle database. You may use it for generating various different types of trace.
The AWR records performance data across the whole database by taking intermittent snapshots that can then be analysed and compared with other snapshots.

424. Is there a "performance" parameter in Oracle?
No. You need to look at the issue that is causing the performance issue and address that directly.

425. What is is difference between TKProf and AWR reports?
TKProf produces a report from a single trace file generated by a single session.
AWR produces a report based upon the whole database activity as recorded in the AWR repository.
Document 29012.1 QREF: TKPROF Usage - Quick Reference
Document 760786.1 TKProf Interpretation (9i and above)

426. How do I gather Performance Related Diagnostics?
AWR reports/Statspack reports   1363422.1 AWR Reports - Information Center
AWR/Statspack reports provide a method for evaluating the relative performance of a database.Assuming you have the appropriate licenses for AWR, please gather Automatic Workload Repository (AWR) reports for the system. Without the specific licenses please gather statspack snapshots.
Oracle 10g+ :From 10g onwards, to check for general performance issues, use the Automatic Workload Repository (AWR) and specifically the Automatic Database Diagnostic Monitor (ADDM) tool for assistance.
276103.1 PERFORMANCE TUNING USING 10g ADVISORS AND MANAGEABILITY FEATURES
Prior to 10g :For 9i and 8i, statspack, rather than AWR, reports should be gathered.
To gather a statspack report, please refer to: Document 94224.1 FAQ- Statspack Complete Reference.
10046 Trace (SQL_TRACE)  10046 trace gathers tracing information about a session.
Example usage in a session is as follows:
alter session set timed_statistics = true;
alter session set statistics_level=all;
alter session set max_dump_file_size = unlimited;
alter session set events '10046 trace name context forever,level 12';
-- run the statement(s) to be traced --
select * from dual;
exit;
These settings ensure that all the relevant information for tracing is collected and that the cursor involved is successfully closed.
376442.1 Recommended Method for Obtaining 10046 trace for Tuning.
Querying Waits from V$Session_wait  The view V$Session_wait can show useful information about what a session is waiting for.
Multiple selects from this view can indicate if a session is moving or not. Document When wait_time=0 the session is waiting, any other value indicates CPU activity. The 'state' column also helps to confirm the wait state of the session.
The following is an example formatted select:
set lines 132 pages 999
column event format a30
SELECT sid,state,event,seq#,p1,p2,p3,wait_time FROM V$session_wait WHERE sid = &&SID;
SELECT sid,state,event,seq#,p1,p2,p3,wait_time FROM V$session_wait WHERE sid = &&SID;
SELECT sid,state,event,seq#,p1,p2,p3,wait_time FROM V$session_wait WHERE sid = &&SID;
Document 43718.1 VIEW "V$SESSION_WAIT" Reference  
** IMPORTANT **   v$session_wait is often misinterpreted.
Apart from the wait_time=0 confusion, users will often assume a session is waiting because an event is shown in v$session_wait and the seconds_in_wait column is rising. It should be remembered that seconds_in_wait only applies to a current wait if wait_time =0 , otherwise it is actually "seconds since the last wait completed". So if the wait has completed (i.e. non-zero wait_time) then this would be expected to change. The "state" column can be used to clear up this misinterpretation: If the session is waiting it says "WAITING" and "WAITED%" if we are no longer waiting.
Finding Current Session ID - This select is useful for finding the current session information for tracing later:
SELECT p.pid ,p.SPID,s.SID FROM v$process p,v$session s WHERE s.paddr = p.addr AND s.audsid = userenv('SESSIONID') /
System State Dumps  If the database is hung then we need to gather systemstate dumps to try to determine what is happening. At
least 3 dumps should be taken as follows:
Login to sqlplus as the internal user:
10g and Later:
(a) Non-Rac
sqlplus "/ as sysdba"
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
wait 90 seconds
oradebug dump systemstate 266
wait 90 seconds
oradebug dump systemstate 266
quit
(b) RAC
$ sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimits
oradebug setinst all
oradebug -g all hanganalyze 4
oradebug -g all dump systemstate 266
quit
Prior to 10g:
sqlplus "/ as sysdba"
rem -- set trace file size to unlimited:
alter session set max_dump_file_size = unlimited;
alter session set events '10998 trace name context forever, level 1';
alter session set events 'immediate trace name systemstate level 10';
alter session set events 'immediate trace name systemstate level 10';
alter session set events 'immediate trace name systemstate level 10';
If no connection is possible at all then please refer to the following article which describes how to collect systemstates in that situation:
Document 121779.1 Taking a SYSTEMSTATE dump when you cannot CONNECT to Oracle
Errorstack  Errorstack traces are Oracle Call Stack dumps that can be used to gather stack information for a process. They are useful for determination of what a process is doing at a particular time. Attach to the process and gather at least 3 errorstacks. For details see: Document 1364257.1 How to Collect Errorstacks for use in Diagnosing Performance Issues.
PSTACK    Pstack is an operating system tool that can be used to gather stack information on some unix platforms. Attach to the process and gather about 10 pstacks while the job is running.
% script pstacks.txt
% /usr/proc/bin/pstack pid
--repeat 10x
% /usr/proc/bin/pstack pid
% exit
The PID is the o/s process id of the process to be traced. Repeat the pstack command about 10 times to capture possible stack changes. Further details of pstack are in :
Document 1386530.1 Using truss and pstack to Debug Hang Type Problems
Document 70609.1 How To Display Information About Processes on SUN Solaris
PL/SQL Profiler   The PL/SQL profiler provides information about PL/SQL code with regard to CPU usage and other resource
usage information. See Document 243755.1 Implementing and Using the PL/SQL Profiler.
Hanganalyze  Hanganalyze is often gathered for hang situations. The following describes how to gather hanganalyze dumps:
Document 175006.1 Steps to generate HANGANALYZE trace files.
OS Watcher  OS Watcher (OSW) is a collection of UNIX shell scripts intended to collect and archive operating system and
network metrics to aid support in diagnosing performance issues. Document 301137.1 OS Watcher User Guide
LTOM The Lite Onboard Monitor (LTOM) is a java program designed as a proactive, real-time diagnostic platform and
provides real-time automatic problem detection and data collection.
Document 352363.1 LTOM - The On-Board Monitor User Guide
Document 461050.1 The LTOM Graph (LTOMg) User Guide
Document 461228.1 The LTOM Graph FAQ
Document 461052.1 LTOM System Profiler - Sample Output
Document 977975.1 Diagnose LTOM Connection Problems

427. How do I Interpret the Results of Various Traces
Statspack Reports Look at the Top 5 waiters section and work to reduce the time spent in the top waiter first. Once that has been addressed, then regather a statspack report and see what effect the changes have made.The following assumptions hold true:-
Top waiter is IO/CPU -> Main issue is likely to be SQL tuning
Top waiter is any other event -> Database wide performance issue
For more information on interpretation of statspacks output refer to:
http://www.oracle.com/technology/deploy/performance/pdf/statspack_tuning_otn_new.pdf
10046 traces  In order to find what is causing a session to run slowly with a 10046 trace of that session, do the following:
Run the 10046 trace through TKProf and look at the total time spent in SQL. search back through the TKProf report looking for a SQL Statement which takes up the largest proportion of the total.
Look at the breakdown of time and wait events for that SQL.
Always remember that the 'number of executions' is important as although the time for a statement may be high this may be accompanied by an equally high execution count. Assume the following:-
If most of the time is spent in parsing there may be a parsing issue
If the number of physical IOs is high then look at changing the access path of the query to do less work or increasing the buffer cache to get buffers from memory rather than blocks from disk.
If the wait events are enqueue related then generally this is an application design issue.
Determine the enqueue which is being waited for and address appropriately.
Document 21154.1 EVENT 10046 "enable SQL statement tracing (including binds/waits)"
Document 39817.1 Interpreting Raw SQL_TRACE and DBMS_SUPPORT.START_TRACE output
Document 199081.1 SQL_TRACE (10046), TKProf and Explain Plan - Overview Reference
Systemstates  These should be sent to Oracle Support Services to interpret.
Hanganalyze  These should be sent to Oracle Support Services to interpret.
Errorstacks  These should be sent to Oracle Support Services to interpret (Some of the calls on the stack are generic and as
a result of how an errorstack works so , if searched for on MyOracleSupport, can lead to incorrect analysis).

428. Top Database Performance Issues/Problems and How To Resolve Them
Library Cache/Shared Pool Latch waits  Typically Library Cache/Shared Pool Latch waits is a contention problem caused by unshared SQL (in the case of  the library cache latch), or exhaustion of space in the shared pool (for the shared pool latch).
With the shared pool latch specifically, while new space allocations will require the latch, it is typically the freeing AND allocation of space in a shared pool that is too small which causes problem. See: 62143.1 Understanding and Tuning the Shared Pool
High Version Counts High version counts occur when there are multiple copies of the 'same' statement in the shared pool, but some
factor prevents them from being shared. This wastes space and causing latch contention.
Document 296377.1 Handling and resolving unshared cursors/large version_counts
Document 438755.1 Formated V$SQL_SHARED_CURSOR Report by SQLID or Hash Value (Doc ID 438755.1)
Log File Sync waits Log file sync waits occur when sessions wait for redo data to be written to disk. Typically this is caused by slow writes or committing too frequently in the application.
Document 34592.1 WAITEVENT: "log file sync" Reference Note.
Document 857576.1 How to Tune Log File Sync?
Document 1064487.1 Script to Collect Log File Sync Diagnostic Information (lfsdiag.sql)
Buffer Busy waits Buffer Busy waits occur when a session wants to access a database block in the buffer cache but it cannot as
the buffer is "busy" .
Typical solutions are:- Look for SQL that accesses the blocks in question and determine if the repeated reads are necessary.
Check for suboptimal SQL (this is the most common cause of the events) - look at the execution plan for the SQL being run and try to reduce the gets per executions which will minimise the number of blocks being accessed and therefore reduce the chances of multiple sessions contending for the same block.
Document 34405.1 WAITEVENT: "buffer busy waits" Reference Note
Document 155971.1 Resolving Intense and "Random" Buffer Busy Wait Performance Problems:
Document 163424.1 How To Identify a Hot Block Within The Database Buffer Cache.
Cache Buffers Chains Latch waits
Document 1342917.1 Troubleshooting 'latch: cache buffers chains' Wait Contention (Doc ID 1342917.1)
WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!  This Issue occurs when the database detects that a waiter had waited for a resource for longer than a particular threshold. The message "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" appears in the alert log and trace and systemstates are dumped. Typically this is caused by two (or more) incompatible operations being run simultaneously.
Document 278316.1 Potential reasons for "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! "
ORA-60 DEADLOCK DETECTED/enqueue hash chains latch  ORA-60 is an application error which usually occurs because a consistent locking strategy has not been followed throughout an application.
Refer to: hash chains latch waits' are referenced here is that, typically, during deadlock detection (ie the routine Oracle uses to determine if a deadlock actually exists), there is a heavy need for the latch which can cause issues for other sessions. If there is a problem with this latch, check if a trace file is generated for the ORA-60 and resolve that issue.
TX - Document 62354.1 TX Transaction locks - Example wait scenarios
TM - Document 33453.1 REFERENTIAL INTEGRITY AND LOCKING
Procwatcher/Cluster Health Monitor These tools can be used to monitor various aspects of database health. Refer to:
Document 459694.1 Procwatcher: Script to Monitor and Examine Oracle DB and Clusterware Processes
Document 736752.1 Introducing Cluster Health Monitor

429. What If I cannot Find an Answer to my Issue?
Performance issues come in all sorts of guises, but you might find that someone in the community at large has seen something similar and can help. A community has been set up for My Oracle Support (MOS) users moderated by Oracle. The goal of this community is to exchange database related Tuning knowledge and concepts. The community can be found here: Database Tuning

430. Pro-Active Problem Avoidance and Diagnostic Collection
Although some problems may be unforeseen, in many cases problems may be avoidable if signs are detected early enough. Additionally, if an issue does occur, it is no use collecting information about that issue after the event. AWR reports are one of the support recommend methods for collecting such diagnostics. For information on suggested uses, other proactive preparations and diagnostics, see: Document 1482811.1 Best Practices: Proactively Avoiding Database and Query Performance Issues
Document 1477599.1 Best Practices Around Data Collection For Performance Issues
AWR reports are an extremely useful diagnostic tool for the determination of the potential cause of database wide performance issues. Typically when a performance issue is detected you would collect an AWR report covering the period of the poor  performance. It is best to use a reporting period no longer than 1 hour as otherwise specifics can be lost.
It is also prudent to Gather AWR reports during times when performance is acceptable to provide baselines for comparison when there is a problem. Ensure that the baseline snapshot duration is the same as the problem duration to facilitate like with like comparison      Document 1363422.1 Automatic Workload Repository (AWR) Reports - Start Point
NOTE: It is often prudent to use a matched ADDM report initially to give a pointer to the main issues. Reading the corresponding ADDM report as a first step to tuning can save a lot of time because it immediately points at the main user as compared to trying to understand what an AWR report is presenting.
Interpretation
Since we are looking at a performance issue, our primary concern is what the database is waiting for. When processes wait, they are being prevented from doing an activity because of some other factor. High waits provide the highest benefit when wait times are reduced and as such are a good focus. The Top Wait information provides such information and allows us to focus on the main problem areas without wasting time investigating areas that are not causing significant delay.
Top 5 Timed Events
As mentioned, the Top waits section is the most important single section in the whole report being as it quatifies and allows comparison of the primary diagnostic: what each session is waiting for. An example output is provided below:
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
db file scattered read 10,152,564 81,327 8 29.6 User I/O
db file sequential read 10,327,231 75,878 7 27.6 User I/O
CPU time 56,207 20.5
read by other session 4,397,330 33,455 8 12.2 User I/O
PX Deq Credit: send blkd 31,398 26,576 846 9.7 Other
-------------------------------------------------------------
The Top 5 Waits section reports on a number of useful topics related to Events. It records the number of waits encountered in the period and the total time spent waiting together with the average time waited for each event. The section is ordered by the %age of the total call time that each Event is responsible for.
Dependent on what is seen in this section, other report sections may need to be referenced in order to quantify or check the findings. For example, the wait count for a particular event needs to be assessed based upon the duration of the reporting period and also the number of users on the database at the time; 10 Million waits in 10 minutes is far more significant than 10 Million in 10 hours, or if shared among 10 users as opposed to 10,000.
In this example report, almost 60% of the time is spent waiting for I/O related reads. Event 'db file scattered read ' is typically used when fetching blocks for a full tablescan index fast full scan and performs multiblock IO.
Event 'db file sequential read' is a single block read and is typically engaged for any activity where multiblock io is unavailable (for example index reads).
◦ Another 20% of the time is spent waiting for or using CPU time. High CPU usage is often a symptom of poorly tuned SQL (or at least SQL which has potential to take less resource) of which excessive I/O can also be a symptom.
Based on this we would investigate whether these waits indicate a problem or not. If so, resolve the problem, if not, move on to the next wait to determine if that is a potential cause. There are 2 main reasons why I/O related waits are going to be top of the waits:
The database is
 ◦ doing lots of reads
◦ The individual reads are slow
The Top 5 events show us information that helps us here :
Is the database doing lots of reads?: The section shows > 10 Million reads for each of these events in the period.
Whether this is a lot depends on whether the report duration is 1 hour or 1 minute. Check the report duration to asses this. If the reads do seem excessive, then why would the database do a lot of reads?
The database only reads data because the execution of SQL statements has instructed it to do so.
◦Are the individual reads slow? The section shows waits of <=8 ms for the 2 I/O related events.
Whether this is fast or slow is dependent on the hardware underlying the I/O subsystem, but typically anything under 20 ms is acceptable. If the I/O was slow then you can get further information from the 'Tablespace IO Stats ' section:
Tablespace IO Stats DB/Inst: VMWREP/VMWREP Snaps: 1-15
-> ordered by IOs (Reads + Writes) desc
Tablespace
------------------------------
Av Av Av Av Buffer Av Buf
Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
TS_TX_DATA
14,246,367 283 7.6 4.6 145,263,880 2,883 3,844,161 8.3
USER
204,834 4 10.7 1.0 17,849,021 354 15,249 9.8
UNDOTS1
19,725 0 3.0 1.0 10,064,086 200 1,964 4.9
AE_TS
4,287,567 85 5.4 6.7 932 0 465,793 3.7
TEMP
2,022,883 40 0.0 5.8 878,049 17 0 0.0
UNDOTS3
1,310,493 26 4.6 1.0 941,675 19 43 0.0
TS_TX_IDX
1,884,478 37 7.3 1.0 23,695 0 73,703 8.3
>SYSAUX
346,094 7 5.6 3.9 112,744 2 0 0.0
SYSTEM
101,771 2 7.9 3.5 25,098 0 653 2.7
Specifically, look for the timing under Rd(ms). If it is higher than 20 milliseconds per read and reads are high, then you may want to start investigating a potential I/O bottleneck from the os.
NOTE: You should ignore relatively idle tablespaces/files as you can get high values due to disk spinup etc. which are not relevant. If you have an issue with 10 million reads being slow it is unlikely that a tablespace/file with 10 reads has caused the problem!
Note:223117.1 Troubleshooting I/O-related waits
Although high waits for 'db file scattered read' and 'db file sequential read' can be I/O related, it is actually more common to find that these waits are relatively 'normal' based on the SQL that the database is being asked to run. In fact, on a well tuned database, you would want these events to be top of the waits, since that would mean that no 'problem' events were there instead!
The trick is being able to assess whether the high waits is indicative of some SQL statements are not using optimal paths (as mentioned earlier) or otherwise. If there are high waits for 'db file scattered read', then SQL may not be using optimal access paths and so are tending to do Full Table Scans as opposed to indexes (or there may be missing indexes or not optimal indexes).
Furthermore, high waits for 'db file sequential read' may indicate SQL statements are using unselective indexes and there for reading more index blocks than necessary or using the wronmg indexes. So these waits may point to poor execution plans for sqls.
In either case the next step would be to check the top resource consuming sqls from the AWR report to determine whether these look excessive or whether improvements can be made.
To do this look at the SQL Statistics Section.
As mentioned 20% of the time is spent waiting for or using CPU time. This should also be looked at when looking at the SQL Statistics.
Remember that the next step to take following the Top 5 Waits is dependent upon the findings within that section. In the example above, 3 of the waits point towards potentially Sub-optimal SQL so that should be the section investigated next.
Equally, if you do not see any latch waits, then latches are not causing a significant problem on your instance and so you do not need to investigate latch waits further.
Generally, if the database is slow, and the Top 5 timed events include "CPU" and "db file sequential read" and "db file scattered read" in any order, then it is usually worth jumping to the Top SQL (by logical and physical reads) section of an AWR report and calling the SQL Tuning Advisor on them (or tune them manually) just to make sure that they are running efficiently.
• SQL Statistics
AWR Reports show a number of different SQL statistics: The different SQL statistic sub sections should be examined based upon the Top Wait events seen in the Top 5 Section.
In our example we saw top waits as 'db file scattered read' , 'db file sequential read' and CPU. For these we are most interested in SQL ordered by CPU Time, Gets and Reads. These sections actually duplicate some information adding other specifics as appropriate to the topic.
Often looking at 'SQL ordered by gets' is a convenient stating point as statements with high buffer gets are usually good candidates for tuning :
SQL ordered by Gets
-> Resources reported for PL/SQL code includes the resources used by all SQL
statements called by the code.
-> Total Buffer Gets: 4,745,943,815
-> Captured SQL account for 122.2% of Total
Gets CPU Elapsed
Buffer Gets Executions per Exec %Total Time (s) Time (s) SQL Id
-------------- ------------ ------------ ------ -------- --------- -------------
1,228,753,877 168 7,314,011.2 25.9 8022.46 8404.73 5t1y1nvmwp2
SELECT ADDRESSID",CURRENT$."ADDRESSTYPEID",CURRENT$URRENT$."ADDRESS3",
CURRENT$."CITY",CURRENT$."ZIP",CURRENT$."STATE",CURRENT$."PHONECOUNTRYCODE",
CURRENT$."PHONENUMBER",CURRENT$."PHONEEXTENSION",CURRENT$."FAXCOU
1,039,875,759 62,959,363 16.5 21.9 5320.27 5618.96 grr4mg7ms81
Module: DBMS_SCHEDULER
INSERT INTO "ADDRESS_RDONLY" ("ADDRESSID","ADDRESSTYPEID","CUSTOMERID","
ADDRESS1","ADDRESS2","ADDRESS3","CITY","ZIP","STATE","PHONECOUNTRYCODE","PHONENU
854,035,223 168 5,083,543.0 18.0 5713.50 7458.95 4at7cbx8hnz
SELECT "CUSTOMERID",CURRENT$."ISACTIVE",CURRENT$."FIRSTNAME",CURRENT$."LASTNAME",CU<
RRENT$."ORGANIZATION",CURRENT$."DATEREGISTERED",CURRENT$."CUSTOMERSTATUSID",CURR
ENT$."LASTMODIFIEDDATE",CURRENT$."SOURCE",CURRENT$."EMPLOYEEDEPT",CURRENT$.
Tuning can either be performed either manually or by calling the SQL Tuning Advisor on them:
Document 271196.1 Automatic SQL Tuning - SQL Profiles.
Document 262687.1 How to use the Sql Tuning Advisor.
Document 276103.1 PERFORMANCE TUNING USING ADVISORS AND MANAGEABILITY FEATURES: AWR, ASH, and ADDM and Sql Tuning Advisor.
Analysis:
-> Total Buffer Gets: 4,745,943,815   : On the assumption that this is an hour long report, this is a significant number of gets and as
such this confirms that it is worth investigating the top SQL statements to make sure they are taking optimal paths.
◦ Individual Buffer Gets : The buffer gets for the individual statements shown are very high with the lowest being 850 Million. These 3 statements actually point towards 2 different reasons for the large number of buffers:
◦ Excessive Buffer Gets/Execution : SQL_IDs '5t1y1nvmwp2' and '4at7cbx8hnz' are only executed 168 times, but each execution reads over 5 Million buffers. This SQL statement is a prime candidate for tuning since the number of buffers read in each execution is so high.
■ Excessive Executions : On the other hand SQL_ID 'grr4mg7ms81' only reads 16 buffers for each execution. Tuning the individual statement may not be able to reduce that significantly. However, the issue with this statement is caused by the number of times it is executed - 65 Million. Changing the way in which the statement is called is likely to have the largest impact here - it is likely that the statement is called in a loop, once per record, if it could be called so as to process multiple records at once then there is potential for significant economies of scale.
■ Remember that these numbers may be 'normal' for this environment (since some are very busy). By comparing this report against a baseline, you can see whether these SQL statements also read this much data when the database performs well. If they do then they are not the cause of the issue and can be ignored (although there may be benefit generally in improving them).
Other SQL Statistic Sections
As mentioned previously there are a number of different report sections that help for specific causes. If you do not have the particular cause then there is likely to be little benefit in looking at these. The following section outlines some potential causes and uses:
◦ Waits for 'Cursor: mutex/pin'
If there are mutex waits such such as 'Cursor: pin S wait on X' or 'Cursor: mutex X' etc , then these are indicative of parsing issues. On this basis look for statements with high parse counts or high version counts under 'SQL ordered by Parse Calls' and 'SQL ordered by Version Count' as these are most likely to be the causes of problems. The following notes can assist further:
Document 1356828.1 FAQ: 'cursor: mutex ..' / 'cursor: pin ..' / 'library cache: mutex ..' Type Wait Events
Note:1349387.1 Troubleshooting 'cursor: pin S wait on X' waits.
•Load Profile
Dependent on the waits, the load profile section either provides useful general background information or specific details related to potential issues.
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 4,585,414.80 3,165,883.14
Logical reads: 94,185.63 65,028.07
Block changes: 40,028.57 27,636.71
Physical reads: 2,206.12 1,523.16
Physical writes: 3,939.97 2,720.25
User calls: 50.08 34.58
Parses: 26.96 18.61
Hard parses: 1.49 1.03
Sorts: 18.36 12.68
Logons: 0.13 0.09
Executes: 4,925.89 3,400.96
Transactions: 1.45
% Blocks changed per Read: 42.50 Recursive Call %: 99.19
Rollback per transaction %: 59.69 Rows per Sort: 1922.64
In the example, the waits section shows potential for issues with the execution of SQL so the load profile can be checked for details in this area, although it is not the primary source of such information.If you were looking at the AWR report for general tuning you might pick up that the load section shows relatively high redo activity with high physical writes. There are more writes than reads on this load with 42% block changes.
Furthermore, there is less hard parsing compared the soft parses. If there was a mutex wait as top wait such as 'library cache: mutex X', then statistics such as the overall parse rate would be more relevant. Again, comparing to a baseline will provide the best information, for example, checking to see if the load has changed by comparing redo size, users calls, and parsing.
•Instance Efficiency
Again, instance efficiency stats are more use for general tuning as opposed to addressing specific issues (unless waits point at these).
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.91 Redo NoWait %: 100.00
Buffer Hit %: 98.14 In-memory Sort %: 99.98
Library Hit %: 99.91 Soft Parse %: 94.48
Execute to Parse %: 99.45 Latch Hit %: 99.97
Parse CPU to Parse Elapsd %: 71.23 % Non-Parse CPU: 99.00
The most important Statistic presented here from the point of view of our example is the '% Non-Parse CPU' because this indicates that almost all the CPU time that we see in the Top Waits section is attributable to Execution and not parse, which means that tuning SQL may help to improve this.
If we were tuning then 94.48% soft parse rate would show a small proportion of hard parsing which is desirable. The high execute to parse % indicates good usage of cursors. Generally, we want the statistics here close to 100%, but remember that a few percent may not be relevant dependent on the application. For example, in a data warehouse environment, hard parsing may be higher due to usage of materialized views and, or histograms. So again comparing to baseline report when performance was good is important.
• Latch Activity
In the example we are not seeing significant waits for latches so this section could be ignored. However if latch waits were significant, then we would be looking for high latch sleeps under Latch Sleep Breakdown for latch free waits:
Latch Sleep Breakdown
* ordered by misses desc
Latch Name
----------------------------------------
Get Requests Misses Sleeps Spin Gets Sleep1 Sleep2 Sleep3
-------------- ----------- ----------- ---------- -------- -------- --------
cache buffers chains
2,881,936,948 3,070,271 41,336 3,031,456 0 0 0
row cache objects
941,375,571 1,215,395 852 1,214,606 0 0 0
object queue header operation
763,607,977 949,376 30,484 919,782 0 0 0
cache buffers lru chain
376,874,990 705,162 3,192 702,090 0 0 0
Here the top latch is cache buffers chains. Cache Buffers Chains latches protect the buffers in the buffer cache that hold data that we have retrieved from disk. This is a perfectly normal latch to see when data is being read. When this becomes stressed, the sleeps figure tends to rise as sessions start to wait to get the buffers they require. Contention can be caused by poorly tuned SQL reading the same buffers. In our example, although the gets are high at 2.8 billion buffer gets, the sleeps at 41,336 is low.
Average number of sleeps per miss ratio (Avg Slps/Miss) is low. The reason for this is that the server is able to deal with this volume of data and so there is no significant contention on Cache Buffers Chains  latches at this point.
Note:413942.1 How to Identify Which Latch is Associated with a "latch free" wait
Notable timed and wait events:
•CPU time events
Just because CPU comes as top timed event in AWR may not indicate a problem. However, if performance is slow with high CPU usage, then start investigating the wait. First, check to see if a sql is taking most CPU under SQL ordered by CPU Time in AWR:
SQL ordered by CPU Time
-> Resources reported for PL/SQL code includes the resources used by all SQL
statements called by the code.
-> % Total is the CPU Time divided into the Total CPU Time times 100
-> Total CPU Time (s): 56,207
-> Captured SQL account for 114.6% of Total
CPU Elapsed CPU per % Total
Time (s) Time (s) Executions Exec (s) % Total DB Time SQL Id
---------- ---------- ------------ ----------- ------- ------- -------------
20,349 24,884 168 121.12 36.2 9.1 7bbhgqykv3cm9
Module: DBMS_SCHEDULER
DECLARE job BINARY_INTEGER := :job; next_date TIMESTAMP WITH TIME ZONE := :myda
te; broken BOOLEAN := FALSE; job_name VARCHAR2(30) := :job_name; job_subname
VARCHAR2(30) := :job_subname; job_owner VARCHAR2(30) := :job_owner; job_start
TIMESTAMP WITH TIME ZONE := :job_start; job_scheduled_start TIMESTAMP WITH TIME
Analysis:
-> Total CPU Time (s): 56,207   This represents 15 minutes of CPU time in total. Whether this is significant depends on the report
duration.
◦The top CPU using SQL uses 20,349 second ◦ (around 5 minutes),
◦ Total DB of time this represents is 9.1%.
Executions is 168 - being as this execution count is the same as 2 of the 3 SQLs identified earlier, these may be related and this task may well be the scheduling job that runs the SQLs.
Other Potential CPU related Issues:
◦Check to see if other waits follow the high CPU timed event.
For example, cursor: pin S waits may cause the high CPU with following known issue:
Note:6904068.8 Bug 6904068 - High CPU usage when there are "cursor: pin S" waits
◦High External CPU usage
If a process outside of the database is taking high CPU then this could be preventing database processes from getting the CPU they require and affecting the database performance. In this case, run oswatcher or other os diagnostic tools to find which process is taking high CPU. Note:433472.1 OS Watcher For Windows (OSWFW) User Guide
◦ Troubleshooting CPU usage
The following note outlines how to further diagnose high CPU usage:
Note:164768.1 Troubleshooting: High CPU Utilization
'Log file sync' waits
When a user session commits or rolls back, the log writer flushes the redo from log buffer to the redo logs. AWR reports are very useful for determination if this is a problem and whether the cause of the problem is I/O or in some other area. The following articles deal specifically with this symptom: Document 1376916.1 Troubleshooting: "Log File Sync" Waits
Note:34592.1WAITEVENT: "log file sync"
•Buffer busy waits
This is the event waited on when a session is trying to get a buffer from the buffer cache but the buffer is busy - either being read by another session or another session is holding it in incompatible mode. In order to find which block is busy and why, use the following notes: Document 155971.1 Resolving Intense and "Random" Buffer Busy Wait Performance
Problems:Note:34405.1 WAITEVENT: "buffer busy waits"
Troubleshooting Other Issues
Use of ADDM Reports alongside AWR
ADDM reports can be reviewed along with AWR to assist in diagnosis since they provide specific recommendations which can help point at potential problems. The following is a sample ADDM report taken from:
Note:250655.1How to use the Automatic Database Diagnostic Monitor:  Example Output:
DETAILED ADDM REPORT FOR TASK 'SCOTT_ADDM' WITH ID 5
----------------------------------------------------
Analysis Period: 17-NOV-2003 from 09:50:21 to 10:35:47
Database ID/Instance: 494687018/1
Snapshot Range: from 1 to 3
Database Time: 4215 seconds
Average Database Load: 1.5 active sessions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
FINDING 1: 65% impact (2734 seconds)
------------------------------------
PL/SQL execution consumed significant database time.
RECOMMENDATION 1: SQL Tuning, 65% benefit (2734 seconds)
ACTION: Tune the PL/SQL block with SQL_ID fjxa1vp3yhtmr. Refer to
the "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL
User's Guide and Reference"
RELEVANT OBJECT: SQL statement with SQL_ID fjxa1vp3yhtmr
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;
FINDING 2: 35% impact (1456 seconds)
------------------------------------
SQL statements consuming significant database time were found.
RECOMMENDATION 1: SQL Tuning, 35% benefit (1456 seconds)
ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
gt9ahqgd5fmm2.
RELEVANT OBJECT: SQL statement with SQL_ID gt9ahqgd5fmm2 and
PLAN_HASH 547793521
UPDATE bigemp SET empno = ROWNUM
FINDING 3: 20% impact (836 seconds)
-----------------------------------
The throughput of the I/O subsystem was significantly lower than expected.
RECOMMENDATION 1: Host Configuration, 20% benefit (836 seconds)
ACTION: Consider increasing the throughput of the I/O subsystem.
Oracle's recommended solution is to stripe all data file using
the SAME methodology. You might also need to increase the
number of disks for better performance.
RECOMMENDATION 2: Host Configuration, 14% benefit (584 seconds)
ACTION: The performance of file
D:\ORACLE\ORADATA\V1010\UNDOTBS01.DBF was significantly worse
than other files. If striping all files using the SAME
methodology is not possible, consider striping this file over
multiple disks.
RELEVANT OBJECT: database file
"D:\ORACLE\ORADATA\V1010\UNDOTBS01.DBF"
SYMPTOMS THAT LED TO THE FINDING:
Wait class "User I/O" was consuming significant database time.
(34% impact [1450 seconds])
FINDING 4: 11% impact (447 seconds)
-----------------------------------
Undo I/O was a significant portion (33%) of the total database I/O.
NO RECOMMENDATIONS AVAILABLE
SYMPTOMS THAT LED TO THE FINDING:
The throughput of the I/O subsystem was significantly lower than
expected. (20% impact [836 seconds])
Wait class "User I/O" was consuming significant database time.
(34% impact [1450 seconds])
FINDING 5: 9.9% impact (416 seconds)
------------------------------------
Buffer cache writes due to small log files were consuming significant
database time.
RECOMMENDATION 1: DB Configuration, 9.9% benefit (416 seconds)
ACTION: Increase the size of the log files to 796 M to hold at
least 20 minutes of redo information.
ADDM report gives possible recommendations in more readable format than AWR. However, ADDM should be interpreted along with AWR statistics for accurate diagnostics.
Document 786554.1 How to Read PGA Memory Advisory Section in AWR and Statspack Reports
Document 754639.1 How to Read Buffer Cache Advisory Section in AWR and Statspack Reports
Document 1301503.1 Troubleshooting: AWR Snapshot Collection issues
Document 1363422.1 Automatic Workload Repository (AWR) Reports - Start Point
Statspack
AWR reports supercede legacy reports such as statspack and bstat/estat. For reference, the following is a link to and article outlining how to read statspack reports:
http://www.oracle.com/technetwork/database/focus-areas/performance/statspack-opm4-134117.pdf
Document 94224.1 FAQ- Statspack Complete Reference
Document 394937.1 Statistics Package (STATSPACK) Guide

431. Why and when should one tune?
One of the biggest responsibilities of a DBA is to ensure that the Oracle database is tuned properly. The Oracle RDBMS is highly tunable and allows the database to be monitored and adjusted to increase its performance.
One should do performance tuning for the following reasons:
The speed of computing might be wasting valuable human time (users waiting for response);
Enable your system to keep-up with the speed business is conducted; and
Optimize hardware usage to save money (companies are spending millions on hardware). 

432. What database aspects should be monitored?
One should implement a monitoring system to constantly monitor the following aspects of a database. Writing custom scripts, implementing Oracle’s Enterprise Manager, or buying a third-party monitoring product can achieve this. If an alarm is triggered, the system should automatically notify the DBA (e-mail, page, etc.) to take appropriate action.
Infrastructure availability:
• Is the database up and responding to requests
• Are the listeners up and responding to requests
• Are the Oracle Names and LDAP Servers up and responding to requests
• Are the Web Listeners up and responding to requests
Things that can cause service outages:
• Is the archive log destination filling up?
• Objects getting close to their max extents
• Tablespaces running low on free space/ Objects what would not be able to extend
• User and process limits reached

433. Where should the tuning effort be directed?
Consider the following areas for tuning. The order in which steps are listed needs to be maintained to prevent tuning side effects. For example, it is no good increasing the buffer cache if you can reduce I/O by rewriting a SQL statement.
Database Design (if it’s not too late):
Poor system performance usually results from a poor database design. One should generally normalize to the 3NF. Selective denormalization can provide valuable performance improvements. When designing, always keep the “data access path” in mind. Also look at proper data partitioning, data replication, aggregation tables for decision support systems, etc.
Application Tuning:
Experience showed that approximately 80% of all Oracle system performance problems are resolved by coding optimal SQL. Also consider proper scheduling of batch tasks after peak working hours.
Memory Tuning:
Properly size your database buffers (shared pool, buffer cache, log buffer, etc) by looking at your buffer hit ratios. Pin large objects into memory to prevent frequent reloads.
Disk I/O Tuning:
Database files needs to be properly sized and placed to provide maximum disk subsystem throughput. Also look for frequent disk sorts, full table scans, missing indexes, row chaining, data fragmentation, etc.
Eliminate Database Contention:
Study database locks, latches and wait events carefully and eliminate where possible.
Tune the Operating System:
Monitor and tune operating system CPU, I/O and memory utilization. For more information, read the related Oracle FAQ dealing with your specific operating system.

434. What tuning indicators can one use?
The following high-level tuning indicators can be used to establish if a database is performing optimally or not:
• Buffer Cache Hit Ratio
Formula: Hit Ratio = (Logical Reads – Physical Reads) / Logical Reads
Action: Increase DB_CACHE_SIZE (DB_BLOCK_BUFFERS prior to 9i) to increase hit ratio
• Library Cache Hit Ratio
Action: Increase the SHARED_POOL_SIZE to increase hit ratio

435. What tools/utilities does Oracle provide to assist with performance tuning?
Oracle provide the following tools/ utilities to assist with performance monitoring and tuning:
• TKProf
• UTLBSTAT.SQL and UTLESTAT.SQL – Begin and end stats monitoring
• Statspack
• Oracle Enterprise Manager – Tuning Pack

436. What is STATSPACK and how does one use it?
Statspack is a set of performance monitoring and reporting utilities provided by Oracle from Oracle8i and above. Statspack provides improved BSTAT/ESTAT functionality, though the old BSTAT/ESTAT scripts are still available. For more information about STATSPACK, read the documentation in file $ORACLE_HOME/rdbms/admin/spdoc.txt.
Install Statspack:
cd $ORACLE_HOME/rdbms/admin
sqlplus “/ as sysdba” @spdrop.sql — Install Statspack -
sqlplus “/ as sysdba” @spcreate.sql– Enter tablespace names when prompted
Use Statspack:
sqlplusperfstat/perfstat
exec statspack.snap; — Take a performance snapshots
exec statspack.snap;
• Get a list of snapshots
select SNAP_ID, SNAP_TIME from STATS$SNAPSHOT;
@spreport.sql — Enter two snapshot id’s for difference report
Other Statspack Scripts:
• sppurge.sql – Purge a range of Snapshot Id’s between the specified begin and end Snap Id’s
• spauto.sql – Schedule a dbms_job to automate the collection of STATPACK statistics
• spcreate.sql – Installs the STATSPACK user, tables and package on a database (Run as SYS).
• spdrop.sql – Deinstall STATSPACK from database (Run as SYS)
• sppurge.sql – Delete a range of Snapshot Id’s from the database
• spreport.sql – Report on differences between values recorded in two snapshots
• sptrunc.sql – Truncates all data in Statspack tables

437. When is cost based optimization triggered?
It’s important to have statistics on all tables for the CBO (Cost Based Optimizer) to work correctly. If one table involved in a statement does not have statistics, Oracle has to revert to rule-based optimization for that statement. So you really want for all tables to have statistics right away; it won’t help much to just have the larger tables analyzed.
Generally, the CBO can change the execution plan when you:
1. Change statistics of objects by doing an ANALYZE;
2. Change some initialization parameters (for example: hash_join_enabled, sort_area_size, db_file_multiblock_read_count).

438. How can one optimize %XYZ% queries?
It is possible to improve %XYZ% queries by forcing the optimizer to scan all the entries from the index instead of the table. This can be done by specifying hints.  If the index is physically smaller than the table (which is usually the case) it will take less time to scan the entire index than to scan the entire table.

439. Where can one find I/O statistics per table?

The UTLESTAT report shows I/O per tablespace but one cannot see what tables in the tablespace has the most I/O.
The $ORACLE_HOME/rdbms/admin/catio.sql script creates a sample_io procedure and table to gather the required information. After executing the procedure, one can do a simple SELECT * FROM io_per_object; to extract the required information.
For more details, look at the header comments in the $ORACLE_HOME/rdbms/admin/catio.sql script.