Sunday, 10 June 2018

Interview Q and A for Oracle Performance Tuning Part - 2

41. Does the recommendation to not use AUTO_SAMPLE_SIZE on 10g also apply on a 11g database with the optimizer_features_enable initialisation parameter set to 10g?
No. It does not apply because optimizer_features_enable only applies to the optimizer and not to the DBMS_STATS job used to collect the statistics. AUTO_SAMPLE_SIZE is not recommended on 10g because it produces an extremely small sample.

42. How can I tell the difference in accuracy between auto_sample or 10% or 60%?
In 11g you can compare statistics using DBMS_STATS. The DIFF_TABLE_STATS_* functions can be used to compare statistics for a table from two different sources. There are 3 different functions available:
DIFF_TABLE_STATS_IN_HISTORY Function: Compares the statistics for a table as of two different timestamps in the past
DIFF_TABLE_STATS_IN_PENDING Function: Compares pending statistics with statistics as of a timestamp, or with the current statistics in the dictionary
DIFF_TABLE_STATS_IN_STATTAB Function: Compares statistics for a table from two different user statistics tables, or from a user statistics table and the dictionary
Alternatively you can use SQLTXPLAIN to compare them using a SQLT Compare report based on a particular statement.
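As a minimal sketch (assuming the SCOTT.EMP table and that historical statistics are still retained in the dictionary), DIFF_TABLE_STATS_IN_HISTORY is a pipelined function and can be queried directly:
SQL> -- Compare EMP statistics as of 7 days ago with the current statistics,
SQL> -- reporting only statistics that differ by more than 10%
SQL> SELECT report, maxdiffpct
  2  FROM   TABLE(DBMS_STATS.DIFF_TABLE_STATS_IN_HISTORY(
  3                 ownname      => 'SCOTT',
  4                 tabname      => 'EMP',
  5                 time1        => SYSTIMESTAMP - INTERVAL '7' DAY,
  6                 time2        => NULL,  -- NULL means current statistics
  7                 pctthreshold => 10));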

43. What do you recommend regarding statistics for new partitions?
If your partitions are generally similar in their data distribution, we would recommend that you copy the statistics from an existing partition, since these will tend to reflect the final data and give a suitable plan for it. This is better than gathering at the start: very low data volumes, such as in an empty partition, may well generate plans that are sub-optimal once the true volume of data has been added. If the partition sizes are not similar, it is better to leave the new partition with no statistics at first, so that Oracle uses dynamic sampling, and then gather once you have a representative data set. A sketch of the copy approach follows.
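For example (a hedged sketch: the SH.SALES table and the partition names are hypothetical), statistics can be copied from an existing populated partition to a newly added one with DBMS_STATS.COPY_TABLE_STATS:
SQL> BEGIN
  2    DBMS_STATS.COPY_TABLE_STATS(
  3      ownname     => 'SH',
  4      tabname     => 'SALES',
  5      srcpartname => 'SALES_Q1_2018',   -- existing, representative partition
  6      dstpartname => 'SALES_Q2_2018');  -- newly added, empty partition
  7  END;
  8  /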

44. Do you recommend gathering statistics only at partition level and letting Oracle use aggregated stats for the global ones?
In all cases you want the best statistics that you can get, constrained by the resources and time you have to gather them. From that point of view we recommend that you collect global statistics rather than aggregating them, because you can lose important information in the aggregation process. Using incremental statistics can help a lot with achieving this because of the reduced resource requirement. See: Document 236935.1 Global statistics - An Explanation

45. Can you gather statistics just for a range of partitions?
With partitions you can gather statistics for individual partitions, and you can collect global statistics across all the partitions in the table. If you use incremental statistics, the global statistics are updated incrementally by scanning only the partitions that have changed, as sketched below.
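As a minimal sketch (the SH.SALES names are placeholders), incremental statistics are enabled as a table preference and then gathered as normal:
SQL> -- Enable incremental statistics maintenance for the table
SQL> EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES', 'INCREMENTAL', 'TRUE');
SQL> -- Gather as usual; only changed partitions are scanned and the global
SQL> -- statistics are derived from the per-partition synopses
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES', granularity => 'AUTO');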

46. What are the best practices for running incremental stats on partitioned data? What's the recommended frequency ? Monthly or weekly?
Advice about the frequency of gathering statistics is not time based; how often you gather depends on how often the data changes. For more information specifically about incremental statistics see:
https://blogs.oracle.com/optimizer/entry/maintaining_statistics_on_large_partitioned_tables
https://blogs.oracle.com/optimizer/entry/incremental_statistics_maintenance_what_statistics

47. Are Incremental stats gathered automatically after a change in the data?
No, incremental statistics are not gathered automatically when the data changes. Gathering is done by the automatic statistics task or manually; at that point, candidates for gathering are identified based on which partitions have been modified.

48. Do we need to include cascade=true with incremental Statistics?
The default for CASCADE in DBMS_STATS is DBMS_STATS.AUTO_CASCADE, which lets Oracle determine whether index statistics should be collected. This is no different for incremental statistics. If you do not want to cascade to the indexes, then you need to specify that in the command used to collect the statistics, as sketched below.
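A hedged example (the SH.SALES names are placeholders) of overriding that default:
SQL> -- Skip index statistics for this gather; the default AUTO_CASCADE
SQL> -- would instead let Oracle decide whether to collect them
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES', cascade => FALSE);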

49. How do incremental stats work on partitions that are empty at the month end but then have data added during the 1st week? Do we need to run incremental stats again?
You should gather statistics on any partition where the data has changed significantly (for example, by more than 10%); the timing of the change is not relevant.
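For instance (the table and partition names are hypothetical), the changed partition can be re-gathered explicitly:
SQL> BEGIN
  2    DBMS_STATS.GATHER_TABLE_STATS(
  3      ownname     => 'SH',
  4      tabname     => 'SALES',
  5      partname    => 'SALES_P_201806',  -- the partition that changed
  6      granularity => 'AUTO');
  7  END;
  8  /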

50. How does Oracle handle NDVs in partition stats vs global stats? For example, if your query accesses only a couple of partitions from hundreds and the NDV in those 2 partitions is radically different to the global NDV, does Oracle change access paths to reflect this?
The optimizer has the capability to use partition-level (and, in later versions, sub-partition-level) statistics. So the access path can change, but it depends on the actual predicates used.

51. Does 11g collect histograms when gathering the default statistics ?
The default for the METHOD_OPT parameter of DBMS_STATS is "FOR ALL COLUMNS SIZE AUTO", which gathers histograms on columns where the data is skewed. To avoid this you either need to change the default or manually specify an option that precludes histograms, such as "FOR ALL COLUMNS SIZE 1".
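A minimal sketch (SCOTT.EMP as a placeholder) of changing the default for a single table via a preference, so that later gathers inherit it:
SQL> -- Persistently disable histograms for this table; subsequent
SQL> -- GATHER_TABLE_STATS calls pick the preference up automatically
SQL> EXEC DBMS_STATS.SET_TABLE_PREFS('SCOTT', 'EMP', 'METHOD_OPT', 'FOR ALL COLUMNS SIZE 1');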

52. Are there any tools that I can use to help decide whether a column has skewed data or conforms to a normal distribution?
The easiest way to do this is to allow Oracle to decide by using the default for the METHOD_OPT parameter of DBMS_STATS, "FOR ALL COLUMNS SIZE AUTO", which gathers histograms on columns where the data is skewed. See: Document 1445372.1 Histograms: An Overview (10g and Above)

53. Do we need histograms for non-indexed columns ?
It depends on whether those columns are referenced in predicates or not. Histograms are designed to improve column cardinality estimates, which have an effect on the choice of plan for the whole query; they are not specifically useful for indexes alone. If you have a predicate on a non-indexed column and the data in that column is skewed, then a histogram will help.

54. Is there a size recommendation for gathering histograms?
If column data is skewed, it is likely that you will need column statistics to reflect that; otherwise you can get poor plans. There is no rule of thumb for the number of buckets, however; typically we would suggest that you let Oracle decide by using the default of "FOR ALL COLUMNS SIZE AUTO". To set specific values you need to understand the data and the application well enough to know whether histograms are needed and with what parameters.

55. How can I identify whether a column has a histogram?
The HISTOGRAM column in DBA_TAB_COL_STATISTICS provides that information, with one row per column; selecting COLUMN_NAME as well shows which column each row refers to. For example:
SQL> SELECT table_name, histogram
  2  FROM   dba_tab_col_statistics
  3  WHERE  histogram IS NOT NULL
  4  AND    owner = 'SCOTT'
  5  AND    table_name = 'EMP';

TABLE_NAME                     HISTOGRAM
------------------------------ ---------------
EMP                            NONE
EMP                            HEIGHT BALANCED
EMP                            FREQUENCY
EMP                            FREQUENCY
EMP                            FREQUENCY
EMP                            FREQUENCY
EMP                            NONE
EMP                            FREQUENCY
EMP                            HEIGHT BALANCED

9 rows selected.

56. If my Number of Distinct Values (NDV) is higher than 254, does that mean a histogram is poor?
Not necessarily. With more than 254 distinct values a histogram cannot represent every value individually, but it is still the only skew information the optimizer has, so you can still get benefit from histograms even with high NDVs.

57. If histograms can be beneficial and harmful at the same time, what are scenarios where using histogram will be a good thing or a bad thing?
Whether a histogram gives you an inaccurate representation of the data or not depends on that data. Histograms on columns with more than 254 distinct values are more likely to be inaccurate as the NDV increases, because the cardinality estimates become less accurate as more values are folded into the same buckets.

58. Are indexed column statistics a better option than column stats on all columns?
Statistics are used to calculate cardinality estimates; the better the statistics, the better the estimates. The key to collecting statistics on the correct columns is to know which columns your queries use for join and filter operations, and this is not necessarily limited to indexed columns. Statistics gathered on columns that are never used in predicates or joins will be of no use.

59. What are system statistics?
System statistics provide information to the optimizer about the hardware and environment that the database is running on, allowing it to adjust some core costing figures to reflect the specifics and loading of that environment. There are 2 types of system statistics: no-workload and workload. No-workload statistics are gathered automatically on instance startup and collect basic system information such as CPU performance. Workload statistics are not gathered automatically but need to be collected manually during a representative workload. For more details see:
Document 149560.1 How to Collect and Display System Statistics (CPU and IO) for CBO use
Document 470316.1 Using Actual System Statistics (Collected CPU and IO information)
Document 153761.1 System Statistics: Scaling the System to Improve CBO optimizer
Document 427939.1 System Statistics: How to gather system stats for a particular batch of work

60. When should I gather system statistics?
In 11g we recommend that you always collect these statistics to give you better plans.

61. What is the best way to gather system stats ?
System statistics record the activity on a system for a particular period of time. So choose a suitable period of activity, start gathering the stats, then wait for a period long enough to accurately capture the activity you are trying to measure (this may be 2-3 hours or longer or shorter depending on the activity). So if you want to collect system stats to reflect your normal daytime working activity, collect stats during normal working hours. If you want to collect them for a batch job, collect them while that job is running. Always perform a sanity check on the new stats after they are collected.
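A minimal sketch of gathering workload system statistics over a chosen window; querying SYS.AUX_STATS$ afterwards is one common way to sanity-check the result:
SQL> -- Start capturing workload system statistics
SQL> EXEC DBMS_STATS.GATHER_SYSTEM_STATS('START');
SQL> -- ... let the representative workload run (e.g. 2-3 hours) ...
SQL> -- Stop capturing; the gathered values are written to the dictionary
SQL> EXEC DBMS_STATS.GATHER_SYSTEM_STATS('STOP');
SQL> -- Sanity check the new values
SQL> SELECT pname, pval1 FROM sys.aux_stats$ WHERE sname = 'SYSSTATS_MAIN';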

62. What gives best load information for system statistics? Should we run for single business day, multiple days, full work week?
As with all statistics, there is no general recommendation that is suitable for all systems; statistics are by their nature very data dependent. System statistics are flexible in that they allow you to collect statistics at a very low granularity. What you need to do is collect a set of statistics for a period that is applicable and relevant to the whole job you are going to run with those statistics in place; that may take 5 minutes or several hours. Unless you are looking for a system setting that covers all eventualities, it is unlikely that you need anything longer than a business day, since there will probably be some downtime somewhere in that period which may affect the statistics. If the system is permanently busy on a consistent basis, then a short sample at any time would give you the same sort of sample as a longer one anyway. If the load is different at different times, then consider collecting and applying different sets of statistics to reflect those activities.

63. How often should we gather system stats?
System statistics record the activity on a system for a particular period of time. Assuming that activity does not change for the same period in future, you only need to gather these statistics once. If the activity changes, then you need to re-gather.

64. After an upgrade, how important is it to get system stats immediately on new platform?
It depends. It may work fine without them. Alternatively it may not. Generally we would recommend you gather them. You need to gather them for a period that is representative of your normal activity. A business day would normally be the maximum.

65. Has 11gR2 introduced any automated mechanism to gather system statistics?
No.

66. What is the difference between System Statistics and Dictionary Statistics?
System Statistics record and represent the performance of the underlying hardware that a database resides upon. Dictionary statistics are statistics collected on the Oracle data dictionary tables.

67. Do you recommend gathering SYS schema statistics via the auto task in 10gR2?
No. In normal circumstances the data dictionary would not be expected to change significantly in terms of the distribution of data. Gathering statistics once or a few times over the year should be enough for most cases.

68. How often should you gather dictionary statistics?
You would not expect the dictionary to be particularly volatile so not frequently - maybe once or twice a year - but again, it depends on its volatility and the nature of the applications being used.

69. Should I use DBMS_STATS.GATHER_SCHEMA_STATS('SYS') to gather dictionary statistics?
No, use DBMS_STATS.GATHER_DICTIONARY_STATS instead since it is specifically designed for that purpose.
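A minimal example:
SQL> -- Gathers statistics for SYS, SYSTEM and the other dictionary schemas
SQL> EXEC DBMS_STATS.GATHER_DICTIONARY_STATS;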

70. What's the difference between GATHER_DICTIONARY_STATS and GATHER_SCHEMA_STATS('SYS')?
These are basically the same thing for the SYS schema, although GATHER_DICTIONARY_STATS also covers the other dictionary schemas (such as SYSTEM and the RDBMS component schemas), not just SYS.

71. How often do fixed table stats need to be gathered?
Almost never. Unless you have a very specific situation where it has been recommended to gather them, leave them alone; if you do need to, the call is sketched after the references. See:
Document 798257.1 GATHER_FIXED_OBJECTS_STATS Considerations
Document 457926.1 How to gather statistics on SYS objects and fixed_objects?
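Where gathering is warranted, the call itself is simple; ideally run it while a representative workload is active, since the X$ fixed objects it covers are dynamic:
SQL> EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;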

72. Can I copy statistics between different databases?
Yes. There are a number of DBMS_STATS procedures for exporting and importing statistics via a user-created statistics table, as sketched below.
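A hedged sketch of the round trip (the schema and statistics table names are hypothetical; moving the staging table between databases is done with your usual export/import or Data Pump mechanism):
SQL> -- On the source database: stage the statistics in a user table
SQL> EXEC DBMS_STATS.CREATE_STAT_TABLE('SCOTT', 'MY_STATS_TAB');
SQL> EXEC DBMS_STATS.EXPORT_SCHEMA_STATS('SCOTT', 'MY_STATS_TAB');
SQL> -- Move SCOTT.MY_STATS_TAB to the target database, then import:
SQL> EXEC DBMS_STATS.IMPORT_SCHEMA_STATS('SCOTT', 'MY_STATS_TAB');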

73. Can I copy statistics between different RAC instances with different numbers of nodes?
Yes. There are no RAC-instance-specific statistics stored at the table level.

74. Is 11gR2 statistics gathering stats faster than 10gR2?
Essentially the underlying algorithms are the same for statistics collection. However, 11g has a number of new collection features that can speed up collection, such as incremental and concurrent statistics gathering.

75. If table statistics are locked, if the environment is changed, should we export stats and import again to be sure we have the original statistics in place?
No, this should not be necessary. When statistics are gathered, the previous statistics are backed up in the data dictionary (and can be restored with DBMS_STATS.RESTORE_TABLE_STATS).

76. When we refresh our TEST environment do we have to import Production stats for optimal performance?
If the data and statistics in the TEST environment are not the same as production, and you use that environment to assess the likely production access paths and thus performance, then you may be fooled into thinking that the performance is good. To mimic production, you need to have the same data size and statistics. You could also use Real Application Testing to test real workloads for better results.

77. What is default value for "no_invalidate" option for DBMS_STATS.GATHER_SCHEMA_STATS?
The default is to use DBMS_STATS.AUTO_INVALIDATE. This means that Oracle decides when to invalidate dependent cursors.
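To make the behaviour explicit, or to force immediate invalidation instead, pass the parameter directly (SCOTT is a placeholder schema):
SQL> -- Invalidate dependent cursors immediately rather than letting
SQL> -- Oracle spread the invalidations over time (AUTO_INVALIDATE)
SQL> EXEC DBMS_STATS.GATHER_SCHEMA_STATS('SCOTT', no_invalidate => FALSE);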

78. Is there any way to gather statistics for specific portion of the table?
There is no way of gathering statistics for just a section of a non-partitioned table. A non-100% sample "randomly" samples sections of the table, but you cannot specify which part.

79. We have found that gathering statistics on SYS.SYN$ has helped when ALL_SYNONYMS or USER_SYNONYMS was slow.
If statistics are needed on SYS.SYN$, then this implies that the data in this table is not growing as expected and it is likely that the application creates/updates/deletes synonyms often. If significant synonym maintenance is going on, then this may be more of an application design issue than an issue with the views themselves. From a prevention point of view, if gathering statistics on SYN$ helps, then you may want to gather statistics on it frequently to prevent bad plans. As with any other SQL, you should review the execution plans and estimated cardinalities for each operation. There have been some changes in the views and in the optimizer that can make queries on those views and other dictionary views slow. If you encounter slow dictionary view performance generally, support recommends that you search for the name of the view in MOS, looking for articles addressing workarounds and potential fixes. See: Document 1242844.1 Selects Against ALL_SYNONYMS Very Slow in 11.2.0.1. and 11.2.0.2.

80. How can you tell which columns are candidates for grouping?
There is a procedure in DBMS_STATS called DBMS_STATS.SEED_COL_USAGE which collects column usage information that allows such candidates to be determined. See: Document 1430317.1 How To Find Candidate Columns For Creating Extended Statistics
Additionally, you can use the SQL Tuning Advisor on queries that have known bad cardinality estimations to see if there is correlation that can be overcome by column groups.
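A hedged sketch of the workflow (SCOTT.EMP and the column group are illustrative; these routines are available from 11.2.0.2 onwards):
SQL> -- Record column usage for the next 300 seconds of workload
SQL> EXEC DBMS_STATS.SEED_COL_USAGE(NULL, NULL, 300);
SQL> -- ... run the representative workload, then report the candidates ...
SQL> SELECT DBMS_STATS.REPORT_COL_USAGE('SCOTT', 'EMP') FROM dual;
SQL> -- Create extended statistics for a suggested column group
SQL> SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SCOTT', 'EMP', '(JOB,DEPTNO)') FROM dual;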

81. I see differences between Exadata statistics and a normal database. Are there any plans to include the storage-level stats/plans for Exadata as well?
We do not have any specifics, but development is aware of this concern.

Application Specific Recommendations

82. What is the difference between DBMS_STATS and FND_STATS?
FND_STATS is a wrapper built on top of DBMS_STATS. It adds some functionality for EBS, such as knowledge of which columns in EBS need histograms, and provides settings tailored to the EBS objects.

83. We are using Peoplesoft and one of the recommendations said to disable histograms by using columns size 1. Does that sound reasonable?
In general, if an application has specific recommendations, those would override the general database recommendations. We would recommend that you honour application specific settings.
Recommendations applicable specifically for Peoplesoft can be found in the following document:
Document 1322888.1 pscbo_stats - Improving Statistics in Oracle RDBMS for PeopleSoft Enterprise
For other applications:
For Oracle EBS use FND_STATS. Troubleshoot any issues with: Document 169935.1 Troubleshooting Oracle Applications Performance Issues
For SIEBEL use: Document 781927.1 Performance Tuning Guidelines for Siebel CRM Application on Oracle Database

84. Is incremental mode available with specific applications? For example EBS using FND_STATS?
Incremental is a property of the table, not of FND_STATS. Specific applications may have restrictions with regards to table setup.

85. What is SQLTXPLAIN, also known as SQLT?
SQLT is a tool that can be used to diagnose why a specific SQL statement may be performing poorly. It takes a single SQL statement as input and outputs a set of diagnostics that can be used to improve the statement's performance.
SQLT provides large amounts of information to a SQL tuner on a wide range of topics pertinent to that SQL, including execution plans, Cost-Based Optimizer (CBO) statistics, schema object metadata, performance statistics, configuration parameters, and other elements that influence the performance of the one SQL being analyzed.

86. What is the origin of SQLT?
SQLTXPLAIN is a tool provided by the Oracle Server Technologies Center of Expertise (ST CoE). It has been in use in Support for over 10 years following its creation in 1999, and has been developed and improved over that time to reflect changes in the Oracle Database and to add additional functionality. It is constantly evolving and being updated.

87. Is there a light weight way of gathering SQLT-like information?
Although it does not have the depth or scope of SQLT, there is a script called SQLHC which provides a subset of the SQL used by the SQLT script. SQLHC is shipped along with SQLT and can be executed using the following 2 scripts:
sqlhc.sql -- Runs from a SQL_ID
sqlhcxec.sql -- Runs from a Script file
For more details regarding SQLHC, see: Document 1366133.1 SQL Tuning Health-Check Script (SQLHC)

88. Is SQLHC a complement or a substitute for SQLT?
SQLT is far more advanced than SQLHC; while it incorporates similar checks to those done in SQLHC, it offers far wider functionality. SQLHC is simply a quick health check for a statement and is a subset of the SQL used by the SQLT script but, unlike SQLT, has no database footprint. It is a relatively short SQL script that generates an HTML output. It extracts database statistics using selects or standard APIs (such as DBMS_XPLAN) from the database and combines that with AWR output if the correct licenses are available. See: Document 1417774.1 FAQ: SQL Health Check (SQLHC) Frequently Asked Questions

89. What would you use first? SQLHC or SQLT?
When you initially encounter a problem with a SQL statement, we would advise that you use SQLHC.
If that finds nothing wrong then follow that up with SQLT. Of course you can start with SQLT but you may not need all the functionality that it provides and with SQLHC you do not have to worry about database footprint and all that entails.

90. Can SQLT be used safely in production?
Yes, but this is something you would need to validate. If you decide to use it, initially you would probably want to install it on a test system. If you can reproduce the issue you are seeing there, then you do not need to install it on production. If the issue only happens in production, then once you are happy with the tool in test, you can schedule the installation on production and capture the information from there.

91. What level of expertise is required to use SQLT?
SQLT assumes that the user knows about SQL Tuning. It is not designed to take you step by step through an issue if you do not know anything about tuning SQL. It will not teach you SQL Tuning. It is similar to reading an AWR report - if you are familiar with Database Performance concepts then AWR will be familiar to you and SQLT is similar.

92. How Do I install / re-install / un-install SQLT?
You can find the install / re-install / un-install instructions in the file sqlt_instructions.html included in the SQLT zip file.

93. How Much space does SQLT take up?
In the majority of cases, 100MB is plenty of space. However, if the queries you are analyzing with SQLT access tables with thousands of partitions, you may need more than that. SQLT does not store application data; it only stores metadata about the SQL and its dependent objects.

94. If SQLT is already installed, how do I determine the version?
The following SQL, run as SYS or the SQLTXPLAIN user, provides version information:
COL sqlt_version FOR A40;
SELECT 'SQLT version number: '||sqltxadmin.sqlt$a.get_param('tool_version')||CHR(10)||
       'SQLT version date : '||sqltxadmin.sqlt$a.get_param('tool_date')||CHR(10)||
       'Installation date : '||sqltxadmin.sqlt$a.get_param('install_date') sqlt_version
FROM DUAL
/

SQLT_VERSION
----------------------------------------
SQLT version number: 12.1.05
SQLT version date : 2013-12-11
Installation date : 2014-01-27/10:25:37

95. What does SQLT do during installation?
SQLT creates a repository in the database under the SQLTXPLAIN user to record information concerning the queries that are analyzed by the tool and also creates a number of directories on disk.

96. Can SQLT be installed into an existing schema instead of creating a new schema?
No; SQLT requires its own schema (SQLTXPLAIN) to be installed into.

97. Do you need the SYS password to run SQLT?
In order to install SQLT, you need to connect as SYS. To run SQLT once it is installed, no.

98. Can I reset the password for the SQLTXPLAIN user?
Yes. Although the SQLT schema (SQLTXPLAIN) is fixed, the password can be changed just like any normal Oracle password.
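For example (new_password is a placeholder):
SQL> ALTER USER sqltxplain IDENTIFIED BY new_password;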

99. If the Applications password is unavailable, is SYS/SYSTEM a viable alternative?
You can run SQLT as these users, but this is NOT the preferred way of running it; you should try to get the application password if possible. The primary issue is that, without the application password, some pieces of information will be missing, such as the 'explain plan for' output. Although in most cases this is not required, in some cases this information is important.

100. Which schema should I register as the application schema? For example, we have 3 schemas in the DB with cross permissions - schema A is the main schema but uses objects from B and C. How does that work with SQLT?

When you install SQLT, it asks who the main application user is. You can pick any schema for this; it does not matter. When you want to examine a SQL statement with SQLT, you need to register the schema that actually runs the SQL. So, once you have installed SQLT, connect as SYS and grant the SQLT user role to the schema you run the SQL as, then use those schemas to run SQLT against the SQL that they run. If you do not grant the role to all the schemas and have only registered one, SQLT will still be able to pick up the references to tables and objects in the SQL that belong to other schemas. All the registration does is set up the runtime environment to run as that particular user.
