41. Does
the recommendation to not use AUTO_SAMPLE_SIZE on 10g also apply on a 11g
database with the optimizer_features_enable initialisation parameter set to
10g?
No. It does not apply
because optimizer_features_enable only applies to the optimizer and not to the
DBMS_STATS job used to collect the statistics. AUTO_SAMPLE_SIZE is not recommended
on 10g because it produces an extremely small sample.
42. How can I tell the difference in accuracy between AUTO_SAMPLE_SIZE and a fixed sample size such as 10% or 60%?
In 11g you can compare
statistics using DBMS_STATS. The DIFF_TABLE_STATS_* statistics can be used to
compare statistics for a table from two different sources. There are 3
different functions available:
DIFF_TABLE_STATS_IN_HISTORY Function:
Compares the statistics for a table from two timestamps in the past
DIFF_TABLE_STATS_IN_PENDING Function:
Compares pending statistics with statistics as of a timestamp, or with the current statistics in the dictionary
DIFF_TABLE_STATS_IN_STATTAB Function:
Compares statistics for a table from two different sources (user statistics tables and/or the dictionary)
Alternatively you can use
SQLTXPLAIN to compare them using a SQLT Compare report based on a particular
statement.
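As a sketch of the history variant (schema, table name, and timestamps here are examples, not from the original answer), the DIFF_TABLE_STATS_* functions are queried as table functions; pctthreshold limits the report to statistics that differ by more than the given percentage:

```sql
-- Hypothetical example: compare SCOTT.EMP statistics from a week ago
-- with the current statistics (time2 => NULL means "current").
SELECT report, maxdiffpct
FROM   TABLE(DBMS_STATS.DIFF_TABLE_STATS_IN_HISTORY(
               ownname      => 'SCOTT',
               tabname      => 'EMP',
               time1        => SYSTIMESTAMP - INTERVAL '7' DAY,
               time2        => NULL,
               pctthreshold => 10));
```

The REPORT column is a CLOB describing each differing statistic, and MAXDIFFPCT gives the largest percentage difference found.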
43. What
do you recommend regarding statistics for new partitions?
If your partitions are
generally similar in their distribution we would recommend that you copy the
statistics from an existing partition since this will tend to reflect the final
data and give a suitable plan for that. This is better than collecting at the
start since very low data volumes, such as with an empty partition, may well
generate plans that are sub-optimal when the true volume of data has been
added. If the partition sizes are not similar, it is better to leave the
partition with no statistics at first since then Oracle will use dynamic
sampling, then collect when you have a representative data set.
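The copy approach described above can be sketched with DBMS_STATS.COPY_TABLE_STATS (the SH.SALES names and partition names below are illustrative assumptions):

```sql
-- Hypothetical example: seed a newly added, still-empty partition
-- (SALES_2014Q2) with the statistics of an existing, similar
-- partition (SALES_2014Q1). For range-partitioned tables Oracle also
-- adjusts the partitioning-column min/max values for the new partition.
BEGIN
  DBMS_STATS.COPY_TABLE_STATS(
    ownname     => 'SH',
    tabname     => 'SALES',
    srcpartname => 'SALES_2014Q1',
    dstpartname => 'SALES_2014Q2');
END;
/
```

Once the partition holds representative data, gather real statistics on it.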
44. Do you recommend gathering statistics only at partition level and letting Oracle use aggregated stats for the global ones?
In all cases you want the
best statistics that you can get, constrained by the resources and time you have to
gather them. From that point of view we recommend that you collect global
statistics rather than aggregating them, because you can lose important
information in the aggregation process. Incremental statistics can help a
lot in achieving this because of the reduced resource requirement. See: Document
236935.1 Global statistics - An Explanation
45. Can
you gather statistics just for a range of partitions?
With partitions you can
gather statistics for individual partitions and you can collect global
statistics covering all the partitions in the table. If you are using incremental
statistics, global statistics are updated incrementally by scanning only the
partitions that have changed.
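Enabling incremental statistics is a table-level preference; the sketch below uses assumed names (SH.SALES) and the settings incremental gathering requires:

```sql
-- Hypothetical example: enable incremental statistics on SH.SALES,
-- then gather. Only partitions whose data has changed are re-scanned,
-- and the global statistics are derived from the stored synopses.
-- Incremental mode requires AUTO_SAMPLE_SIZE and GRANULARITY 'AUTO'.
BEGIN
  DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES', 'INCREMENTAL', 'TRUE');
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SH',
    tabname          => 'SALES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    granularity      => 'AUTO');
END;
/
```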
46. What
are the best practices for running incremental stats on partitioned data?
What's the recommended frequency ? Monthly or weekly?
Advice about the frequency
of gathering statistics is not time based. Gathering frequency should be driven by
how often the data changes. For more information specifically about incremental
statistics see:
https://blogs.oracle.com/optimizer/entry/maintaining_statistics_on_large_partitioned_tables
https://blogs.oracle.com/optimizer/entry/incremental_statistics_maintenance_what_statistics
47. Are
Incremental stats gathered automatically after a change in the data?
No, incremental statistics
are not gathered automatically when data is changed. Gathering is done by the
automatic statistics task or manually; at that point Oracle checks for
gathering candidates based upon which partitions have been modified.
48. Do we
need to include cascade=true with incremental Statistics?
The default for CASCADE in
DBMS_STATS is DBMS_STATS.AUTO_CASCADE, which allows Oracle to determine whether
index statistics should be collected or not. This is no different for incremental
statistics. If you do not want to cascade to indexes then you will need to
specify that in the command used to collect the statistics.
49. How do incremental stats work on partitions that are empty at month end but have data added during the first week? Do we need to run incremental stats again?
You should gather statistics
on any partition where data has changed significantly (for example more than
10%). The timing of the change is not relevant.
50. How
does Oracle handle NDVs in partition stats vs global stats? For example, if
your query accesses only a couple of partitions from hundreds and the NDV in
those 2 partitions is radically different to the global NDV, does Oracle change
access paths to reflect this?
The optimizer has the
capability to look at partition (and, in later versions, sub-partition)
statistics, so the access path can change, but it depends on the actual
predicates used.
51. Does 11g collect histograms when
gathering the default statistics ?
The default for the METHOD_OPT
parameter to DBMS_STATS is "FOR ALL COLUMNS SIZE AUTO"
which will gather column statistics if the data is skewed. To avoid this you
either need to change the default or manually specify another option that
precludes column histograms such as: "FOR ALL COLUMNS SIZE 1" .
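The two METHOD_OPT choices mentioned above can be sketched as follows (SCOTT.EMP is an example table; both the one-off parameter and the stored preference are shown):

```sql
-- Hypothetical example: gather without histograms for one run, or make
-- "no histograms" the stored default for future gathers on this table.
BEGIN
  -- one-off gather with histograms suppressed
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'SCOTT',
    tabname    => 'EMP',
    method_opt => 'FOR ALL COLUMNS SIZE 1');

  -- change the table preference so later gathers also skip histograms
  DBMS_STATS.SET_TABLE_PREFS('SCOTT', 'EMP',
    'METHOD_OPT', 'FOR ALL COLUMNS SIZE 1');
END;
/
```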
52. Are
there any tools that I can use to help decide whether a column has skewed data
or conforms to a normal distribution?
The easiest way to do this
is to let Oracle decide by using the default for the METHOD_OPT parameter to DBMS_STATS,
"FOR ALL COLUMNS SIZE AUTO", which gathers column statistics
if the data is skewed. See: Document 1445372.1 Histograms: An Overview (10g and Above)
53. Do we
need histograms for non-indexed columns ?
It depends on whether those
columns are referenced in predicates or not. Histograms are designed to improve
column cardinality estimates, which affect the choice of plan over the whole query.
Histograms are not specifically tied to indexes. If you have a
predicate on a non-indexed column and the data in that column is skewed, then a
histogram will help.
54. Is
there a size recommendation for gathering histograms?
If column data is skewed it
is likely that you will need column statistics to reflect that otherwise you
can get poor plans. There is no rule of thumb however for the number of
buckets; typically we would suggest that you let Oracle decide by using the
default of "FOR ALL COLUMNS SIZE AUTO". To set specific values you
need to understand the data and the application to know whether histograms are
needed or not and their parameters.
55. How can I identify whether a column
has a histogram?
The HISTOGRAM column in DBA_TAB_COL_STATISTICS provides that information. For example:

SQL> SELECT table_name, histogram
       FROM dba_tab_col_statistics
      WHERE histogram IS NOT NULL
        AND owner = 'SCOTT'
        AND table_name = 'EMP';

TABLE_NAME                     HISTOGRAM
------------------------------ ---------------
EMP                            NONE
EMP                            HEIGHT BALANCED
EMP                            FREQUENCY
EMP                            FREQUENCY
EMP                            FREQUENCY
EMP                            FREQUENCY
EMP                            NONE
EMP                            FREQUENCY
EMP                            HEIGHT BALANCED

9 rows selected.
56. If my
Number of Distinct Values (NDV) is higher than 254, does that mean a histogram
is poor?
Not necessarily, no. Even with a high NDV the histogram may be the best
information available - you can still get benefit from histograms on
high-cardinality columns.
57. If histograms can be beneficial and harmful at the same time, what are the scenarios where using a histogram will be a good thing or a bad thing?
Whether a histogram gives
you an accurate representation of the data or not depends on
that data. Histograms on columns with more than 254 distinct values are more likely
to be inaccurate as the NDV increases, because the cardinality estimates become
less precise as more values are added.
58. Are
Indexed Column statistics a better options than column stats on all columns?
Statistics are used to
calculate cardinality estimates; the better the statistics, the better the
estimates. The key to collecting statistics on the correct columns is to know
which columns queries use for join and filter operations. This is not
necessarily limited to indexed columns. Statistics gathered on columns that are
never referenced in joins or filters will be of no use.
59. What
are system statistics?
System statistics provide
information to the optimizer about the hardware and environment that the
database is running on to allow it to adjust some core figures to reflect the
specifics and loading of that environment. There are 2 types of system
statistics that are collected: No workload and Workload. No workload statistics
are gathered automatically on instance startup and collect basic system
information such as CPU performance. Workload statistics are not automatically
gathered but need to be manually collected during representative workload. For
more details see:
Document 149560.1 How to
Collect and Display System Statistics (CPU and IO) for CBO use
Document 470316.1 Using
Actual System Statistics (Collected CPU and IO information)
Document 153761.1 System
Statistics: Scaling the System to Improve CBO optimizer
Document 427939.1 System
Statistics: How to gather system stats for a particular batch of work
60. When
should I gather system statistics?
In 11g we recommend that you
should always collect these statistics to give you better plans.
61. What
is the best way to gather system stats ?
System statistics record the
activity on a system for a particular period of time. So choose a suitable
period of activity, start gathering the stats, then wait for a period long
enough to accurately capture the activity you are trying to measure (this may
be 2-3 hours or longer or shorter depending on the activity). So if you want to
collect system stats to reflect your normal daytime working activity, collect
stats during normal working hours. If you want to collect them for a batch job,
collect them while that job is running. Always perform a sanity check on the new
stats after they are collected.
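The capture described above can be sketched with DBMS_STATS.GATHER_SYSTEM_STATS; the sanity-check query against SYS.AUX_STATS$ is one common way to inspect the result (the INTERVAL mode mentioned in the comment is an alternative that stops itself):

```sql
-- Sketch: capture Workload system statistics around a representative
-- period. GATHER_SYSTEM_STATS('INTERVAL', interval => n) is an
-- alternative that stops the capture automatically after n minutes.
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS('START');
END;
/
-- ... run the representative workload (e.g. 2-3 hours) ...
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS('STOP');
END;
/
-- sanity check: inspect the recorded values
SELECT pname, pval1 FROM sys.aux_stats$ WHERE sname = 'SYSSTATS_MAIN';
```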
62. What
gives best load information for system statistics? Should we run for single
business day, multiple days, full work week?
As with all statistics,
there is no general recommendation that is suitable for all systems; statistics
are by their nature, very data dependent. System statistics are very flexible
in that they allow you to collect statistics to a very low granularity.
What you need to do is
collect a set of statistics for a period that is applicable and relevant to the
whole job you are going to run with those statistics in place. That may take 5
minutes or several hours. Unless you are looking for a system setting that
covers all eventualities, it is unlikely that you need anything longer than a
business day, since there will probably be some downtime somewhere in that
period which may affect the statistics. If the system is permanently busy on a
consistent basis then a short sample at any time would give you the same sort
of sample as a longer one anyway. If the load is different at different times,
then consider collecting and applying different sets of statistics to reflect
those activities.
63. How
often should we gather system stats?
System statistics record the
activity on a system for a particular period of time. Assuming that activity
does not change for the same period in future, you only need to gather these
statistics once. If the activity changes, then you need to re-gather.
64. After
an upgrade, how important is it to get system stats immediately on new
platform?
It depends. It may work fine
without them. Alternatively it may not. Generally we would recommend you gather
them. You need to gather them for a period that is representative of your normal
activity. A business day would normally be the maximum.
65. Has 11gR2
introduced any automated mechanism to gather system statistics?
No.
66. What is the difference between System Statistics and Dictionary Statistics?
System Statistics record and
represent the performance of the underlying hardware that a database resides
upon. Dictionary statistics are statistics collected on the Oracle data
dictionary tables.
67. Do you recommend gathering SYS schema statistics via the automatic task in 10gR2?
No. In normal circumstances
the data dictionary would not be expected to change significantly in terms of
the distribution of data. Gathering statistics once or a few times over the
year should be enough for most cases.
68. How
often should you gather dictionary statistics?
You would not expect the
dictionary to be particularly volatile so not frequently - maybe once or twice
a year - but again, it depends on its volatility and the nature of the
applications being used.
69. Should
I use DBMS_STATS.GATHER_SCHEMA_STATS('SYS') to gather dictionary statistics?
No, use DBMS_STATS.GATHER_DICTIONARY_STATS
instead since it is specifically designed for that purpose.
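A minimal sketch of the recommended call (an occasional gather, per the guidance above):

```sql
-- Example: gather dictionary statistics directly; this covers SYS,
-- SYSTEM and the other dictionary schemas in one call.
BEGIN
  DBMS_STATS.GATHER_DICTIONARY_STATS;
END;
/
```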
70. What's the difference between GATHER_DICTIONARY_STATS and GATHER_SCHEMA_STATS('SYS')?
These are basically the same
thing.
71. How
often do fixed table stats need to be gathered?
Almost never. Unless you
have a very specific situation where it has been recommended to gather them,
leave them alone. See:
Document 798257.1
GATHER_FIXED_OBJECTS_STATS Considerations
Document 457926.1 How to
gather statistics on SYS objects and fixed_objects?
72. Can I
copy statistics between different databases?
Yes. There are numerous
functions for exporting and copying statistics.
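The usual flow is sketched below (the SCOTT schema and MY_STATS staging-table names are examples): stage the statistics in a user table, move that table between databases (for example with Data Pump), then import on the destination.

```sql
-- On the source database: create a staging table and export into it.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(ownname => 'SCOTT', stattab => 'MY_STATS');
  DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'SCOTT', stattab => 'MY_STATS');
END;
/
-- Transport SCOTT.MY_STATS to the destination database, then there:
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'SCOTT', stattab => 'MY_STATS');
END;
/
```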
73. Can I copy statistics between RAC instances with different numbers of nodes?
Yes. There are no RAC-specific
statistics stored at the table level.
74. Is
11gR2 statistics gathering stats faster than 10gR2?
Essentially the underlying
algorithms are the same for statistics collection. However 11g has a number of
new collection features that can speed up collection such as incremental and concurrent
statistics gathering
75. If table statistics are locked and the environment is changed, should we export and re-import the stats to be sure we have the original statistics in place?
No, this should not be
necessary. When statistics are gathered, the previous versions are backed up in
the data dictionary, so they can be restored if required.
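Restoring from that dictionary history can be sketched as follows (SCOTT.EMP and the timestamp are examples; the default retention is 31 days):

```sql
-- Example: restore the statistics SCOTT.EMP had one day ago.
BEGIN
  DBMS_STATS.RESTORE_TABLE_STATS(
    ownname         => 'SCOTT',
    tabname         => 'EMP',
    as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY);
END;
/
-- The oldest restorable point can be checked with:
SELECT DBMS_STATS.GET_STATS_HISTORY_AVAILABILITY FROM dual;
```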
76. When
we refresh our TEST environment do we have to import Production stats for
optimal performance?
If the data and statistics
in the TEST environment are not the same as production, and you use TEST to
assess the likely production access paths and thus performance, then
you may be fooled into thinking that the performance is good. To mimic
production, you need to have the same data volume and statistics. You could also
use Real Application Testing to test real workloads for better results.
77. What is default value for
"no_invalidate" option for DBMS_STATS.GATHER_SCHEMA_STATS?
The default is to use
DBMS_STATS.AUTO_INVALIDATE. This means that Oracle decides when to invalidate
dependent cursors.
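As a sketch (SCOTT is an example schema), the behaviour can be overridden per gather:

```sql
-- Hypothetical example: force dependent cursors to be invalidated
-- immediately after the gather instead of letting Oracle decide.
-- The default is no_invalidate => DBMS_STATS.AUTO_INVALIDATE.
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname       => 'SCOTT',
    no_invalidate => FALSE);
END;
/
```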
78. Is
there any way to gather statistics for specific portion of the table?
There is no way of gathering
statistics for a section of a non-partitioned table. A non-100% sample
"randomly" samples sections of the table, but you cannot specify which
part.
79. We have found that gathering statistics on sys.syn$ has helped when all_synonyms or user_synonyms was slow
If statistics are needed on
sys.syn$ then this implies that the data in this table is changing more than would
normally be expected, and it is likely that the application creates/updates/deletes
synonyms often. If significant synonym maintenance is going on then this may be more of
an application design issue than an issue with the views themselves. From a
prevention point of view, if gathering statistics on syn$ helps, then you may
want to gather statistics on it frequently to prevent bad plans. As with any
other SQL, you should review execution plans and estimated cardinalities for each
operation. There have been some changes in the views and in the optimizer that
can make queries on those views and other dictionary views slow. If you
encounter slow dictionary view performance generally, Support recommends that
you search for the name of the view in MOS, looking for articles addressing
workarounds and potential fixes. See: Document 1242844.1 Selects Against
ALL_SYNONYMS Very Slow in 11.2.0.1 and 11.2.0.2
80. How
can you tell which columns are candidates for grouping?
There is a procedure in
DBMS_STATS called DBMS_STATS.SEED_COL_USAGE which collects column usage
information that allows such candidates to be determined. 1430317.1 How To Find
Candidate Columns For Creating Extended Statistics
Additionally, you can use
the SQL Tuning Advisor on queries that have known bad cardinality estimations
to see if there is correlation that can be overcome by column groups.
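An 11gR2 sketch of that workflow (SCOTT.EMP and the 300-second window are examples): seed column-usage monitoring, run the workload, then report and create the recommended column groups.

```sql
-- Hypothetical example: monitor column usage database-wide for
-- 300 seconds (NULL, NULL = no SQL tuning set, all schemas).
BEGIN
  DBMS_STATS.SEED_COL_USAGE(NULL, NULL, 300);
END;
/
-- ... run the representative queries during the window ...

-- Report recorded usage, then create the recommended extensions
-- (a NULL extension argument creates all recorded column groups).
SELECT DBMS_STATS.REPORT_COL_USAGE('SCOTT', 'EMP') FROM dual;
SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SCOTT', 'EMP') FROM dual;
```

After creating the extensions, re-gather statistics on the table so the column-group statistics are populated.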
81. I see differences between Exadata statistics and a normal database. Are there any plans to include the storage-level stats/plans for Exadata as well?
We do not have any
specifics, but development is aware of this concern.

Application Specific Recommendations
82. What is the difference between DBMS_STATS and FND_STATS?
FND_STATS is a wrapper built
on top of DBMS_STATS. It adds some functionality for EBS, such as knowing which
columns in EBS need histograms, and provides settings tailored to the EBS
objects.
83. We
are using Peoplesoft and one of the recommendations said to disable histograms
by using columns size 1. Does that sound reasonable?
In general, if an
application has specific recommendations, those would override the general
database recommendations. We would recommend that you honour application
specific settings.
Recommendations applicable
specifically for Peoplesoft can be found in the following document:
Document 1322888.1
pscbo_stats - Improving Statistics in Oracle RDBMS for PeopleSoft Enterprise
For other applications:
For Oracle EBS use
FND_STATS. Troubleshoot any issues with: Document 169935.1 Troubleshooting
Oracle Applications Performance Issues
For SIEBEL use: Document
781927.1 Performance Tuning Guidelines for Siebel CRM Application on Oracle
Database
84. Is incremental mode available with specific applications? For example EBS using FND_STATS?
Incremental is a property of
the table, not of FND_STATS. Specific applications may have restrictions with
regards to table setup.
85. What is SQLTXPLAIN, also known as SQLT?
SQLT is a tool that can be
used to diagnose why a specific SQL statement may be performing poorly. It
takes a single SQL statement as input and outputs a set of diagnostics that can
be used to improve the statement's performance.
SQLT provides a SQL tuner with a large amount
of information on a wide range of topics pertinent to that SQL, including
execution plans, Cost-Based Optimizer (CBO) statistics, schema object metadata,
performance statistics, configuration parameters, and other elements that
influence the performance of the one SQL being analyzed.
86. What
is the origin of SQLT?
SQLTXPLAIN is a tool
provided by the Oracle Server Technologies Center of Expertise (ST CoE). It has
been in use in Support for over 10 years following its creation in 1999. It
has been developed and improved over that time to reflect changes in the Oracle
Database and to add additional functionality. It is constantly evolving and
being updated.
87. Is there a lightweight way of gathering SQLT-like information?
Although it does not have
the depth or scope of SQLT, there is a script called SQLHC which provides a
subset of the SQL used by the SQLT script. SQLHC is shipped along with SQLT and
can be executed using the following 2 scripts:
sqlhc.sql -- Runs from a SQL_ID
sqlhcxec.sql -- Runs from a Script
file
For more details regarding
SQLHC, see: Document 1366133.1 SQL Tuning Health-Check Script (SQLHC)
88. Is SQLHC a complement or a substitute for SQLT?
SQLT is far more advanced
than SQLHC; while it incorporates similar checks to those done in SQLHC, it
offers far wider functionality. SQLHC is simply a quick health check for a statement
and is a subset of the SQL used by the SQLT script but, unlike SQLT, has no
database footprint. It is a relatively short SQL script that generates an HTML
output. It extracts database statistics using selects or standard
APIs (such as DBMS_XPLAN) from the database and combines that with AWR output
if the correct licenses are available. See: Document 1417774.1
FAQ: SQL Health Check (SQLHC) Frequently Asked Questions
89. What
would you use first? SQLHC or SQLT?
When you initially encounter
a problem with a SQL statement, we would advise that you use SQLHC.
If that finds nothing wrong
then follow that up with SQLT. Of course you can start with SQLT but you may
not need all the functionality that it provides and with SQLHC you do not have
to worry about database footprint and all that entails.
90. Can
SQLT be used safely in production?
Yes, but this is something
you would need to validate. If you decide to use it, initially you would
probably want to install it on a test system. If you can reproduce the issue you are
seeing there, then you do not need to install it on production. If the issue only
happens in production, then once you are happy with SQLT in test you can schedule
the installation on production and capture the information from there.
91. What
level of expertise is required to use SQLT?
SQLT assumes that the user
knows about SQL Tuning. It is not designed to take you step by step through an
issue if you do not know anything about tuning SQL. It will not teach you SQL
Tuning. It is similar to reading an AWR report - if you are familiar with
Database Performance concepts then AWR will be familiar to you and SQLT is
similar.
92. How
Do I install / re-install / un-install SQLT?
You can find the install /
re-install / un-install instructions in the file sqlt_instructions.html included in
the downloaded SQLT zip file.
93. How
Much space does SQLT take up?
In the majority of cases,
100MB is plenty of space. However, if the queries that you are analyzing with
SQLT access tables with thousands of partitions, you may need more than that. SQLT
does not store application data; it only stores metadata about the SQL and its
dependent objects.
94. If SQLT is already installed, how do I determine the version?
The following SQL, run as
SYS or the SQLTXPLAIN user, will provide version information:
COL sqlt_version FOR A40;
SELECT 'SQLT version number: '||sqltxadmin.sqlt$a.get_param('tool_version')||CHR(10)||
       'SQLT version date  : '||sqltxadmin.sqlt$a.get_param('tool_date')||CHR(10)||
       'Installation date  : '||sqltxadmin.sqlt$a.get_param('install_date') sqlt_version
FROM DUAL
/

SQLT_VERSION
----------------------------------------
SQLT version number: 12.1.05
SQLT version date  : 2013-12-11
Installation date  : 2014-01-27/10:25:37
95. What
does SQLT do during installation?
SQLT creates a repository in
the database under the SQLTXPLAIN user to record information concerning the
queries that are analyzed by the tool and also creates a number of directories
on disk.
96. Can
SQLT be installed into an existing schema instead of creating a new schema?
No. SQLT requires its own schema
to be installed.
97. Do
you need the SYS password to run SQLT?
In order to install SQLT,
you need to be connected as SYS (internal). To run SQLT once it is installed, no.
98. Can I
reset the password for the SQLTXPLAIN user?
Yes. Although the SQLT
schema (SQLTXPLAIN) is fixed, the password can be changed just like any normal
Oracle password.
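For example (the new password shown is a placeholder):

```sql
-- The SQLTXPLAIN account's password is changed like any other user's.
ALTER USER sqltxplain IDENTIFIED BY new_password;
```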
99. If
the Applications password is unavailable, is SYS/SYSTEM a viable alternative?
You can run SQLT as these users,
but this is NOT the preferred way of running SQLT; you should try to get the
application password if possible. The primary issue is that without the application
password some pieces of information will be missing, such as the 'explain plan
for' output. Although in most cases this is not required, in some cases this
information is important.
100. Which schema should I register as the application schema? For example, we have 3 schemas in the DB with cross permissions - Schema A is the main schema but uses objects from B and C. How does that work with SQLT?
When you install SQLT it
asks who the main application user is. You can pick any schema for this; it
does not matter. When you want to examine a SQL with SQLT, you need to register
the schema that actually runs the SQL. So, once you have installed SQLT,
connect as SYS and grant the SQLT user role to the schema you run the SQL as,
then use those schemas to run SQLT against the SQL that they run.
If you don't grant the role to all the schemas and have only registered one,
SQLT will still be able to pick up the references to tables and objects in the
SQL that belong to other schemas. All the registration does is set up the runtime
environment to run as that particular user.