1. Application user is complaining the database is slow. How would you find the performance issue of SQL queries?
High performance is a common expectation for end users. In fact, the database itself is rarely slow or fast; in most cases, sessions connected to the database slow down when they run into unexpected waits. So to solve this issue you need to find those unexpected waits. To know exactly what a session is doing, join v$session with v$session_wait.
SELECT NVL(s.username,'(oracle)') AS username, s.sid, s.serial#, sw.event, sw.wait_time, sw.seconds_in_wait, sw.state FROM v$session_wait sw, v$session s WHERE s.sid = sw.sid AND s.username = '&username' ORDER BY sw.seconds_in_wait DESC;
1. Check the wait events to see what the session is waiting for.
2. Try to find out any object locks affecting that particular session.
3. Locking is not the only thing that affects performance; disk I/O contention is another common cause. When a session retrieves data from the datafiles on disk into the buffer cache, it has to wait until the disk returns the data. The wait shows up for the session as "db file sequential read" (for an index scan) or "db file scattered read" (for a full table scan). When you see these events, you know the session is waiting for disk I/O to complete. To improve session performance, you have to reduce that waiting period. The exact step depends on the specific situation, but one technique almost always works: reduce the number of blocks retrieved by the SQL statement. Examine the statement to see if it is doing a full table scan when it should be using an index, if it is using the wrong index, or if it can be rewritten to reduce the amount of data it retrieves.
4. Run SQL Tuning Advisor (@$ORACLE_HOME/rdbms/admin/sqltrpt.sql), providing the SQL_ID as input, to generate findings and recommendations.
SQL Tuning Advisor mainly performs logical optimization by checking your SQL structure and statistics. It can suggest:
- indexes that might be very useful
- query rewrites
- a SQL profile
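As an alternative to the sqltrpt.sql script, the advisor can be driven directly through the DBMS_SQLTUNE package. A minimal sketch (the task name and SQL_ID below are placeholders):
DECLARE
  l_task VARCHAR2(30);
BEGIN
  -- create and run a tuning task for one captured statement
  l_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(sql_id => 'abc123xyz0001', task_name => 'tune_demo');
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'tune_demo');
END;
/
-- then read the findings and recommendations
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('tune_demo') FROM dual;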
More:
1. Run the top command in Linux to check CPU usage.
2. Run vmstat, sar, or prstat to get more information on CPU and memory usage and possible blocking.
3. Enable tracing before running your queries, then format the resulting trace file with tkprof to create an output file. Check the elapsed time for each query in the output and tune them accordingly (see the sketch below).
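For example, tracing could be enabled for the current session like this (a sketch; the trace and output file names are placeholders):
ALTER SESSION SET timed_statistics = TRUE;
ALTER SESSION SET sql_trace = TRUE;
-- run the problem queries here, then:
ALTER SESSION SET sql_trace = FALSE;
-- format the raw trace file from the trace directory, e.g.:
-- tkprof orcl_ora_12345.trc output.txt sort=prsela,exeela,fchela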
2. What is the use of the iostat/vmstat/netstat commands in Linux?
iostat – reports on terminal, disk, and tape I/O activity.
vmstat – reports on virtual memory statistics for processes, disk, tape, and CPU activity.
netstat – reports on the contents of network data structures.
3. If you are getting high "buffer busy waits", how can you find the reason behind it?
A buffer busy wait means a session wants a block that is already in the buffer cache but is currently busy, so the session has to wait for it. The wait may be on an undo block, a data block, or a segment header.
Run the two queries below: the first finds the P1, P2, and P3 values of a session reporting buffer busy waits; the second identifies the segment by plugging in those values.
SELECT p1 "File #", p2 "Block #", p3 "Reason Code" FROM v$session_wait WHERE event = 'buffer busy waits';
SELECT owner, segment_name, segment_type FROM dba_extents WHERE file_id = &P1 AND &P2 BETWEEN block_id AND block_id + blocks - 1;
4. What to Look for in AWR Report and STATSPACK Report?
Many DBAs already know how to use STATSPACK but are not always sure what to check regularly. Remember to separate OLTP and batch activity when you run STATSPACK, since they usually generate different types of waits. The SQL script spauto.sql can be used to run STATSPACK every hour on the hour; see $ORACLE_HOME/rdbms/admin/spauto.sql for more information (note that JOB_QUEUE_PROCESSES must be set > 0). Since every system is different, this is only a general list of things you should regularly check in your STATSPACK output:
- Top 5 wait events (timed events)
- Load profile
- Instance efficiency hit ratios
- Wait events
- Latch waits
- Top SQL
- Instance activity
- File I/O and segment statistics
- Memory allocation
- Buffer waits
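For AWR, the equivalent report is usually generated with @$ORACLE_HOME/rdbms/admin/awrrpt.sql, but it can also be produced directly in SQL. A sketch (the dbid and snapshot IDs below are placeholders; real values come from v$database and dba_hist_snapshot):
SELECT output
FROM TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
       l_dbid     => 1234567890,  -- from v$database.dbid
       l_inst_num => 1,
       l_bid      => 100,         -- begin snapshot id
       l_eid      => 101));       -- end snapshot id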
5. What is the difference between db file sequential read and db file scattered read?
db file sequential read is associated with index reads, whereas db file scattered read has to do with full table scans. A db file sequential read reads a block into contiguous memory, while a db file scattered read reads multiple blocks and scatters them into the buffer cache.
6. Which factors should be considered when creating an index on a table? How do you select columns for an index?
Whether to create an index on a table depends on the size of the table and the volume of data. If the table is large and queries or reports select only a small portion of its data, then we need to create an index. The basic reasons for selecting a column for indexing are high cardinality and frequent use in the WHERE clause of SELECT queries. Business rules also force indexes to be created, such as primary keys: configuring a primary key or unique key automatically creates a unique index.
It is important to note that creating too many indexes affects the performance of DML on the table, because a single transaction then has to be applied to several index segments as well as the table.
7. Is it possible to create an index online?
YES. You can create and rebuild indexes online. This enables you to update the base table at the same time you are building or rebuilding indexes on it. You can perform DML operations while the index build is taking place, but DDL operations are not allowed. Parallel execution is not supported when creating or rebuilding an index online.
CREATE INDEX emp_name ON emp (mgr, emp1, emp2, emp3) ONLINE;
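Rebuilding works the same way; for an assumed existing index, for example:
ALTER INDEX emp_name REBUILD ONLINE;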
8. Temp tablespace is 100% full and there is no space available to add datafiles to increase it. What can you do in that case to free up TEMP tablespace?
Closing some of the idle sessions connected to the database will help you free some TEMP space. Otherwise, you can also use 'ALTER TABLESPACE ... PCTINCREASE 1' followed by 'ALTER TABLESPACE ... PCTINCREASE 0'.
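To find which sessions are actually holding temp space before closing anything, something like the following can help (v$tempseg_usage is called v$sort_usage in older releases):
SELECT s.sid, s.serial#, s.username, u.tablespace, u.segtype, u.blocks
FROM v$session s, v$tempseg_usage u
WHERE s.saddr = u.session_addr
ORDER BY u.blocks DESC;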
9. v$sql_bind_capture - when does it sample?
Sampling is comprehensive.
* In 10.2 the bind values used at parse time are available; in 9.2 and 10.1 they are not.
* The values for user binds are available; the values for system binds are not (cursor_sharing).
* Literals replaced by system binds are used for optimization even if _optim_peek_user_binds = false.
To display the values of user bind variables that were peeked when a statement was parsed, use DBMS_XPLAN.DISPLAY_CURSOR with the undocumented 'ADVANCED' (or 'PEEKED_BINDS') format. The peeked bind values are stored in, and extracted from, the OTHER_XML column of v$sql_plan.
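For example (the sql_id below is a placeholder; passing NULL for the child number reports all child cursors):
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('abc123xyz0001', NULL, 'ADVANCED'));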
10. Can bind capture be activated on purpose?
It is activated by setting STATISTICS_LEVEL = TYPICAL and deactivated by setting it to BASIC. It can also be deactivated without affecting other AWR activity using event awr_flush_table_off, level 56.
11. Fragmentation within Tablespaces (Doc ID 186826.1)
A tablespace is fragmented when enough space exists in the tablespace to satisfy a space request, but it is spread across smaller pieces of free space and therefore cannot be used to satisfy the request.
Here are some symptoms that may indicate fragmentation in a database:
- any ORA errors regarding allocation of extents or space (e.g. ORA-1547, ORA-1562)
- DBA_FREE_SPACE shows a lot of free space left in a tablespace, but users receive space errors
By following a simple set of rules, fragmentation at the extent level can be eliminated:
2.1. All the segments in a tablespace should have exactly the same extent size. This ensures that any free extent in a tablespace can always be used for any segment in the tablespace. Set the default PCTINCREASE to 0 if you are on a version prior to 8.1.5.
2.2. All segments should inherit the storage values from the storage clause of the tablespace. The MINEXTENTS clause guarantees the amount of space for one segment at creation time.
2.3. LOCALLY MANAGED tablespaces should be used instead of DICTIONARY MANAGED tablespaces. LOCALLY MANAGED tablespaces were introduced with 8.1.5. The locally managed extents can have a uniform size or variable sizes determined automatically by the system.
2.4. The number of extents should be kept to a minimum. Some DDL operations such as DROP and TRUNCATE are sensitive to the number of extents, so performance can suffer with a high number of extents. This is only applicable to dictionary managed tablespaces; when using locally managed tablespaces with AUTOALLOCATE, Oracle will allocate extent sizes and numbers as appropriate. There is no real advantage in having all the space for a segment allocated in one extent. Most segments are accessed through indexes and not by scan operations. In the case of scan operations, the real advantage is gained by using large extents and issuing large contiguous disk reads.
2.5. Very large segments (bigger than 4 GB) should be partitioned into multiple segments. They should be placed in different tablespaces that do not contain any other data.
2.6. Temporary segments created by sort operations should be placed in TEMPORARY tablespaces. No permanent objects can be created in these tablespaces. TEMPORARY tablespaces use special, efficient space management algorithms that avoid fragmentation.
2.7. Rollback segments should have the OPTIMAL parameter set and should be placed in dedicated tablespaces for OLTP applications.
2.8. User data should not be placed in the SYSTEM tablespace. The extent sizes for segments in the SYSTEM tablespace are managed by Oracle on the assumption that these segments are never dropped or truncated.
2.9. The size of a file should be a multiple of the extent size for the tablespace, plus one block for the header.
2.10. The segments in a uniform extent size tablespace should not be reorganized. Even though free space may be spread throughout the tablespace, it is no less usable than space at the end of the file.
2.11. A non-uniform extent size tablespace should be defragmented via export/import.
Oracle provides views to determine the space available in the database: DBA_FREE_SPACE and DBA_FREE_SPACE_COALESCED.
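A quick way to spot this kind of fragmentation is to compare total free space with the largest single free chunk per tablespace, for example:
SELECT tablespace_name,
       COUNT(*)             AS free_chunks,
       SUM(bytes)/1024/1024 AS total_free_mb,
       MAX(bytes)/1024/1024 AS largest_chunk_mb
FROM dba_free_space
GROUP BY tablespace_name
ORDER BY free_chunks DESC;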
12. Fragmentation within Heap Tables (Doc ID 186826.1)
Fragmentation within heap segments occurs over time as rows get deleted, updated or inserted. A segment is fragmented when some free space exists in its blocks, but not enough in any one of them for Oracle to be able to insert a new row. This causes the segment to grow.
When a row that originally fitted into one data block is updated so that the overall row length increases, and the block's free space is already completely filled, Oracle migrates the data for the row to a new data block; Oracle must then scan more than one data block to retrieve the information for that row.
Also, even if some blocks are empty after deletes, the high water mark (HWM) is not moved. Full table scans typically read up to the high water mark.
3.1. Tuning the settings of the PCTFREE and PCTUSED parameters can help heap space utilization. For example, a 'mostly insert' application would require a small PCTFREE, while for a 'mostly update' application PCTFREE should be based on how much the row size is expected to change. For applications where rows are inserted and deleted, PCTFREE should be small and PCTUSED should be very small.
3.2. Starting with Oracle9i, free and used space within segment blocks is tracked via bitmaps rather than free lists. DBAs no longer have to compute the PCTUSED, FREELISTS and FREELIST GROUPS parameters. Within each segment, BMBs (BitMap Blocks) show the space utilization of the blocks.
3.3. Starting with Oracle8, partitions provide a means of deleting old data while leaving the remaining rows tightly packed with no fragmentation.
In the following cases, it may be worthwhile to:
a. rebuild a table, when:
--> CHAIN_CNT is getting worse due to migrated rows
--> AVG_SPACE is getting worse because of deletes
b. retrieve unused blocks, when the HWM (high water mark) remains far below the size of the segment: EMPTY_BLOCKS stays high
The following views provide statistics after statistics collection is performed: DBA_TABLES, DBA_TAB_PARTITIONS.
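As an illustration (the table and index names are placeholders; ANALYZE is used here because CHAIN_CNT is not populated by DBMS_STATS):
ANALYZE TABLE emp COMPUTE STATISTICS;
SELECT table_name, chain_cnt, avg_space, empty_blocks, blocks
FROM user_tables
WHERE table_name = 'EMP';
-- if the figures look bad, rebuild the table; MOVE invalidates indexes, so rebuild them too:
ALTER TABLE emp MOVE;
ALTER INDEX emp_pk REBUILD;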
13. Why is a particular query slow?
If a new query has been created that is slow on 10g and above, then Oracle Support advises users to utilise the 10g SQL Access Advisor to suggest some suitable modifications. Note that the 10g SQL Access Advisor is part of the Oracle Tuning Pack, accessed through licensed Enterprise Manager links.
14. Why is a particular query slower than it used to be?
There are a number of factors that can change a query's response time. If a query's performance has changed, then it follows that something in the environment has changed to initiate this. There could be more data involved, incomplete or inaccurate statistics, I/O performance changes, or access path changes, among others. Occasionally, changes in Oracle software through upgrading or patching can cause suboptimal execution plans.
15. Why is a particular query slower on one machine than another?
In this case, again, there are a number of factors that can affect a query's response time: there could be more data involved, incomplete or inaccurate statistics, I/O performance changes, or access path changes, among others. Assuming that the query access paths are the same in the different environments, differences in parameters, specification, memory, disk I/O speed, and CPU speed can all have an effect on the performance of queries.
16. Why is a particular query slower since upgrading?
As Oracle continues to develop the database software, changes are introduced in the optimizer that are designed to give better performance. Occasionally, changes that provide improved performance for many can have an adverse effect for a small number of others. New versions can also require different approaches in system management to maintain, or achieve, better performance.
Bug fixes or security patches may also mean a previously fast execution plan is no longer possible. For example, queries may have been using a bugged access method that could cause security issues or other problems in specific circumstances. In resolving the bug, and removing the issue, the access path that was previously used may no longer be available.
17. Why does a particular query's runtime vary?
Given the same inputs, a statement will always return the same outputs. For performance to vary, one of the inputs must have changed. There is a very large number of variables that can affect the performance of a query. See:
Note:372431.1 TROUBLESHOOTING: Tuning a New Query
Note:179668.1 TROUBLESHOOTING: Tuning Slow Running Queries
18. Why does the execution plan for a particular query change?
The explain plan for a query can change if one of the factors used by the CBO has changed. These can include:
Initialisation parameters - certain initialisation parameter changes can result in access path changes due to perceived statistical differences. For example, changes to DB_FILE_MULTIBLOCK_READ_COUNT may make multi-block I/O requests look more or less attractive to the optimizer. At a certain threshold, this may switch access paths towards different access methods that make use of these changes. If the system is unable to service the I/O rate that the parameter indicates, this can cause poor plan selections.
Object statistics - the Cost Based Optimizer (CBO) uses statistics to determine the execution plan for a particular query. Differences in statistics can cause the CBO to generate different execution plans, which in turn can lead to differences in performance of the same query. Potentially, with reduced sample sizes, sampling could produce different statistics due to chance groupings of data that may be the result of differing loading methods etc. On 10g and above it is recommended to use an estimate sample size of 100% (if it is possible for this to fit within the maintenance window), even if that means that statistics are gathered at a reduced frequency. If 100% is not feasible, try using an estimate of at least 30%. Generally, the accuracy of the statistics overall outweighs the day-to-day changes in most applications. Note that the defaults for statistics gathering are not necessarily the same across versions; see the ESTIMATE_PERCENT and METHOD_OPT defaults listed at the end of this answer.
Differing system statistics - system statistics gathered using the DBMS_STATS.GATHER_SYSTEM_STATS procedure and stored in the SYS.AUX_STATS$ table will be used by default when the "_OPTIMIZER_COST_MODEL" (hidden) parameter (which defaults to CHOOSE) is set to CHOOSE or CPU. These statistics can cause access paths for queries to change dependent upon the recorded capability of a system. Systems with different capabilities may use different access paths for the same query.
Dynamic sampling - if dynamic sampling is enabled then the optimizer may sample information at runtime to determine a better plan for a query. If data distribution or volume is different then this sampling could return different statistics, and this could clearly make a difference to the chosen execution plan.
Predicate value changes
Bind variable value differences - see: Why does a particular query's runtime vary with bind variables rather than literals?
* ESTIMATE_PERCENT defaults:
o 9i: 100%
o 10g: DBMS_STATS.AUTO_SAMPLE_SIZE (using a very small estimate percentage)
o 11g: DBMS_STATS.AUTO_SAMPLE_SIZE (using a larger estimate percentage)
* METHOD_OPT defaults:
o 9i: "FOR ALL COLUMNS SIZE 1" - effectively no detailed column statistics
o 10g and 11g: AUTO - DBMS_STATS decides for which columns a histogram may help to produce a better plan
19. Which optimizer is a query using?
Often it is important to be able to determine which optimizer is being used by a query. On 10g and above this is somewhat redundant as there is only one supported optimizer, but on earlier releases the following article helps with that:
Note:66484.1 Which Optimizer is Being Used?
20. Can optimizer parameters be hinted?
From 10g Release 2, optimizer parameters can be set with a hint. See: Note:377333.1 OPT_PARAM Hint in 10g R2
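For example (a sketch; the table and the parameter value are illustrative only):
SELECT /*+ OPT_PARAM('optimizer_index_cost_adj' 20) */ *
FROM emp
WHERE deptno = 10;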
21. Why is partition elimination not occurring?
Partition elimination, or pruning, occurs when the predicates and join order for a query provide sufficient information to allow a set of partitions to be eliminated from the result set without accessing the data. Details of different pruning methods can be found in the following articles:
Note:179518.1 Partition Pruning and Joins
Note:166118.1 Partition Pruning/Elimination
Note:209070.1 Partition Pruning based on Joins to Partitioning Criteria Stored in Dimension Tables
22. What is the importance of global statistics?
It is important to collect global statistics as well as partition level statistics when gathering statistics against partitioned tables. Oracle Corporation recommends setting the GRANULARITY parameter to AUTO to gather both types of partition statistics.
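For instance (the schema and table names are placeholders):
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'SCOTT',
    tabname     => 'SALES',
    granularity => 'AUTO',   -- gather global and partition-level statistics
    cascade     => TRUE);    -- include the indexes
END;
/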
23. A CREATE TABLE statement on one specific tablespace is performing much slower than on others.
A tablespace full condition was encountered, causing a DELETE of the recyclebin objects to occur before the CREATE. Purge the recyclebin before the CREATE.
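For example:
PURGE RECYCLEBIN;          -- purge the current user's recyclebin
-- or, as a DBA, for one specific tablespace:
PURGE TABLESPACE users;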
24. Database Performance Degradation - Recommended Actions
There are a number of reasons why the performance of a database may have degraded. The cause needs to be pinpointed and then resolved in order for the database to return to normal. Since database activity can be affected by a single process or a group of processes, causes can be very broad and can range from resource contention, to heavy SQL taking the majority of server resources, to application or software changes.
The following diagnostics, at a minimum, are required for issue diagnosis. If possible, gather diagnostics that illustrate both the good and the bad case; this will speed up diagnosis.
1. When did the poor performance start? Were there any changes that may have taken you from a situation of good to bad performance? If the problems started immediately after a database upgrade, then have you tried setting the following initialization parameter:
OPTIMIZER_FEATURES_ENABLE = <previous version>
2. Please check for any general performance issues on your system. To do this, use the Automatic Workload Repository (AWR) and specifically the Automatic Database Diagnostic Monitor (ADDM) tool to assist you.
25. Diagnostics for Query Tuning Problems
1. Query text - the full text of the problematic query
2. Gather explain plan
3. Trace of query execution
4. Which OPTIMIZER_GOAL / OPTIMIZER_MODE is being used?
SELECT name, value FROM v$parameter WHERE name = 'optimizer_mode';
5. Table/view definitions, analyze level and parallelism information
6. Initialisation parameters. The following select will extract the majority of parameters relevant to query optimization; the alert log is also a good source of non-default parameter settings:
SELECT name, value FROM v$parameter WHERE name LIKE 'optimizer%' OR name LIKE 'parallel%' OR name IN ('cursor_sharing', 'db_file_multiblock_read_count', 'hash_area_size', 'hash_join_enabled', 'query_rewrite_enabled', 'query_rewrite_integrity', 'sort_area_size', 'star_transformation_enabled', 'bitmap_merge_area_size', 'partition_view_enabled') ORDER BY name;
The output contains useful CBO parameter information.
7. Row count and timing information. If possible, supply the number of rows in each table and also the expected proportion of rows returned from each table after the predicates have been applied. Collecting session statistics (especially timings, with TIMED_STATISTICS=TRUE) can be very useful, especially in conjunction with autotrace or tkprof.
8. Relevant application details
26. A query that previously worked efficiently now performs badly, and the execution plans are the same before and after the performance change.
To diagnose the cause of the problem, use the SQL_TRACE parameter to trace the session running the query and then generate TKPROF output. Compare the figures displayed in the TKPROF output for the particular query:
Have the figures for "cpu" and "elapsed" changed, and do they reflect the performance difference that you have encountered?
Has the value for "query" increased? Is this change attributable to increased data volumes?
Has the value for "disk" increased but "query" remained static?
27. Top Nine Performance Mistakes Found in Oracle Systems [ID 858539.1]
Bad Connection Management
Bad Use of Cursors and the Shared Pool
Bad SQL
Use of Nonstandard Initialization Parameters
Getting Database I/O Wrong
Redo Log Setup Problems
Long Full Table Scans
High Amounts of Recursive (SYS) SQL
Deployment and Migration Errors
28. Tuning Queries: 'Quick and Dirty' Solutions
OPTIMIZER_DYNAMIC_SAMPLING
OPTIMIZER_INDEX_COST_ADJ
DB_FILE_MULTIBLOCK_READ_COUNT
PGA_AGGREGATE_TARGET
OPTIMIZER_FEATURES_ENABLE
29. What is the best way to gather stats in 11g?
There is no universal "optimal" automatic and out-of-the-box way to gather stats. Automatic statistics gathering is a good start and works fine for the majority of systems, but you need to understand the lifecycle and scale of the data to determine what is 'best' for you. The version-specific recommendations in the following document are also a good reference:
Document 1226841.1 How To: Gather Statistics for the Cost Based Optimizer
30. What can I do if data is extremely changeable?
Dynamic sampling can prove useful in this case. During the compilation of a SQL statement, the optimizer decides if the available statistics are sufficient to generate a good execution plan or if it should consider using dynamic sampling. Dynamic sampling is used to compensate for missing or insufficient statistics that would otherwise lead to a very bad plan. Additionally, in 12c, among other technologies, adaptive plans have been introduced that can detect if cardinality estimates differ greatly from the actual number of rows seen by the operations in the plan. If there is a significant difference, then the plan, or a portion of it, can be automatically adapted to avoid suboptimal performance on the first execution of a SQL statement.
For more details see the whitepaper "What to expect from the Optimizer in Oracle Database 12c", which you can find in the following article: Document 1337116.1 White Papers and Blog Entries for Oracle Optimizer
Gathering Frequency
31. Is there a way to decide how frequently to gather statistics by analyzing changes in the data for a particular table?
Yes. There is a view called DBA_TAB_MODIFICATIONS that you can use to do that. This view is also used by DBMS_STATS to identify tables with stale statistics. You can also use the 'LIST STALE' option of DBMS_STATS to show stale statistics. For an example see:
Document 457666.1 How to list the objects having stale statistics (use dbms_stats.gather_schema_stats options=>'LIST STALE')
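A minimal sketch of the LIST STALE call (the schema name is a placeholder; nothing is gathered, the stale objects are only listed):
SET SERVEROUTPUT ON
DECLARE
  l_objs DBMS_STATS.ObjectTab;
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'SCOTT',
                                 options => 'LIST STALE',
                                 objlist => l_objs);
  FOR i IN 1 .. l_objs.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(l_objs(i).objname);
  END LOOP;
END;
/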
32. How can you tell whether statistics gathering should be run more or less frequently?
Statistics need to be gathered as often as the data requires. If the data changes a lot, then gathering has to happen often. If the data is mostly static, then you can gather less often. In general, if the data changes more than 10% (insert/update/delete), then re-gather. Essentially this is what the gather stale procedure of DBMS_STATS does by default.
33. Can you tell whether statistics are "too old"?
As above, statistics need to be gathered as often as the data requires. By default we define 'STALE' data as a change of around 10% of the data. If the data changes more than 10%, then we recommend that you re-gather. With your data and application you may find that a smaller percentage is enough to make a difference, or that you can get away with a lot more, depending on the data distribution.
34. What does "data changing a lot" mean? Is it a higher number of inserts and updates, or the distinct/null/not-null values in individual columns changing frequently?
The changes could be any or all of these. All of these things can affect the statistics.
35. What could be regarded as frequent gathering?
There is no standard definition of frequent or infrequent gathering; it totally depends on your data. You should tailor your gathering to collect as frequently as your data needs it without producing poor plans. This is what the automatic statistics gathering attempts to achieve.
36. What is the best sample size to use?
The optimizer uses statistics to produce accurate and timely execution plans. If statistics are inaccurate, then poor plans can result. From that point of view you would not want to do anything other than use a 100% sample for the utmost accuracy. However, we are constrained by time and resources, and it may be that we can get a 'good enough' set of statistics with a smaller sample.
Support suggests using automatic statistics gathering, allowing Oracle to decide when and what level of statistics to gather, as the best policy. We also have specific recommendations based on best practices for each version in the following documents:
Document 1226841.1 How To: Gather Statistics for the Cost Based Optimizer
Document 749227.1 Recommendations for Gathering Optimizer Statistics on 11g
Document 605439.1 Recommendations for Gathering Optimizer Statistics on 10g
Document 388474.1 Recommendations for Gathering Optimizer Statistics on 9i
37. What is the default auto_sample_size?
There is no specific default. Oracle determines this based on a number of factors, such as the object itself, its size, etc.
38. What would be a good sampling size, 5 or 10 percent?
The sample size you need really depends on your data and how uniform it is across the whole table. Without using a 100% sample, any data collection is a balance between gathering performance and accuracy. With any incomplete sample, it is possible that you won't sample representative blocks, and thus the statistics produced will not reflect the actual data accurately. This may produce poor plans. You may be able to avoid this by sampling more or fewer blocks.
If you run with 5 percent, then it will be quicker to collect, but there is the possibility of having worse statistics than with 10 percent. On 11g we recommend that you use auto sample size. On 10g try to use a larger sample (for example 30% or larger if possible).
39. What is a good starting value for sample size on 10g?
On 10g, if you can, use 100%. If that takes too long, try 30%. Only if 30% takes too long, use 10%. These recommendations are in place because in some cases 10g 30% sampling does not produce statistics that are close enough to a 100% sample to give good plans. In 11g the sampling accuracy of smaller samples has been improved.
40. Would you count 10% as a small sample size, or something less such as 2%?
Assuming you cannot do a 100% sample, there is no hard and fast way of knowing what the 'correct' sample size is. For some situations 10% could be too small; in others it could give the same results as a 100% sample, simply because of the distribution of the data in the table. If the sampled blocks happen to reflect the distribution of the whole table, then you will get good results. If you happen to pick mainly empty blocks or unrepresentative ones, you may not. The only way to know would be to gather stats with different sample sizes and compare the resulting stats against a 100% sample.