Thursday, January 17, 2019

DYNAMIC SAMPLING



Introduction 

Oracle Database introduced Dynamic Sampling to allow the optimizer to gather
additional information at parse time.

If database statistics were missing, stale or insufficient to produce a good execution
plan.

Dynamic sampling was controlled by the OPTIMIZER_DYNAMIC_SAMPLING
initialization parameter.

They can be initialized at instance or session level, or for individual queries using the
DYNAMIC_SAMPLING hint.

with available values ranging between "0" (off) to "10" (aggressive sampling) with
default value of "2".

Dynamic Sampling in 12C 

Dynamic sampling has been renamed to Dynamic Statistics in Oracle Database 12c.

Much of the functionality is the same, but a new sample level of 11 has been added.

The new setting allows the optimizer to decide if dynamic statistics should be sampled
and if so, what sample level to use.

Why Use Dynamic Sampling

Dynamic sampling enables the server to improve performance by :

Estimating single-table predicate selectivities where available statistics are missing or 
may lead to bad estimations.

Estimating statistics for tables and indexes with missing statistics.

Estimating statistics for tables and indexes with out of date statistics.

At compile-time Oracle determines if dynamic sampling would improve query
performance.

If so it issues recursive statements to estimate the necessary statistics.

Dynamic sampling benefits:

Dynamic sampling can be beneficial in the following ways

The sample time is small compared to the overall query execution time.

T he current database statistics alone would not create an optimal plan, so dynamic
sampling results in a better performing query.

The query may be executed multiple times.

In addition to the OPTIMIZER_DYNAMIC_SAMPLING system parameter the dynamic 
sampling level can be set.

We use DYNAMIC_SAMPLING optimizer hint for specific queries like the following.

SELECT /*+ dynamic_sampling(emp 10) */
       empno, ename, job, sal
FROM   emp
WHERE  deptno = 30;

The results of dynamic sampling are repeatable provided no rows are inserted,
updated or deleted from the sampled table.

When to Sample

The following situations will trigger automatic sampling to gather dynamic statistics

Missing Statistics  
Dynamic statistics are sampled if there are missing database statistics.

They may be missing because they are newly created objects, or had their statistics
locked before any statistics were gathered.

Although the dynamic statistics should help the optimizer, these statistics are 
considered low quality compared to conventional database statistics.

Stale Statistics

Statistics are considered stale when 10% or more of the rows in the table have
changed since the statistics were last gathered.

Stale statistic can affect cardinality estimates.

Because of changes to the number of rows in the table and inaccuracies in column
statistics, such as number of distinct values, high and low column values.

Insufficient Statistics

Existing database statistics may not be sufficient to generate an optimal execution
plan.

Dynamic statistics can make up for the absence of extended statistics for column
groups and expressions, as well as missing histograms that would identify data skew.

Even when all the necessary statistics are present it may not be possible to correctly
estimate cardinalities.

For some complex predicates, operations or joins, so dynamic sampling may still be
necessary.

Parallel Execution
Parallel execution is typically used to speed up long running processes.

For a long running process, the time associated with sampling dynamic statistics is
compared to the query execution time.

So it may be worth spending a little more time to make sure the execution plan is
optimal.

SQL Plan Directives

The presence of one or more usable SQL plan directives will trigger the sampling of
dynamic statistics.

SQL plan directives are created when the optimizer identifies misestimates.

The cardinality of operations or degree of parallelism (DOP) from previous executions
of the statement, or other statements using similar query expressions.

Controlling Dynamic Statistics

Dynamic statistics can be controlled by the

OPTIMIZER_DYNAMIC_SAMPLING initialization parameter and the
DYNAMIC_SAMPLING hint.

The following examples show how to control dynamic statistics at system, session and 
statement level.

   --System level. Don't do this!
   ALTER SYSTEM SET OPTIMIZER_DYNAMIC_SAMPLING=11;
   --Session level.
   ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING=11;
   -- Statement level
   SELECT /*+ dynamic_sampling(emp 11) */
   empno, ename, job, sal FROM emp WHERE  deptno = 30;
     
You should not need to change the default value of "2".   

If the optimizer identifies misestimates, it can create SQL plan directives to force
dynamic sampling.

The presence of SQL plan directives influence the way DBMS_STATS gathers
statistics, which potentially fixes the root cause of the problems.

Reuse of Dynamic Statistics

Dynamic statistics can be stored in the SGA as OPT_ESTIMATE hints in the V$SQL_REOPTIMIZATION_HINTS view.

2 comments: