It’s been more than a week since
Oracle Database 12c officially became available. I spent the very
first night on reading the documentation, New Features chapter in
particular.
New version, new features,
improvements, enhancements, some more exciting than others. One
however, intrigued me a bit .That is Session-specific statistics for
global temporary tables. From time to time I am involved in
optimizing queries that use global temporary tables so I am well
aware of the challenges that come with them.
As the name suggest temporary tables
contain temporary data. The data is visible only to the current
session and persist either to the next commit or until the session is
terminated. Either way, two sessions may populate the same temporary
table with different data and concurrently run identical queries against it.
As we all know queries are executed
using an execution plan generated by the optimizer. The efficiency of
the execution plan depends upon the information the optimizer has
about the data at the time the query is parsed. In my opinion, in
order to get good execution plans(I am deliberately avoiding to say
optimal plans) , accurate statistics are the most important piece of
information you want to feed the optimizer with .
However, when temporary tables are involved, it is often a challenge for
the optimizer to generate efficient execution plan(s) that will be appropriate for all use cases.
In some cases when temporary tables are
populated with data having similar characteristics, it is often a
practice to have appropriate statistics collected and fixed (dynamic
sampling may also fit in this case) so the optimizer produces
efficient and stable execution plan(s). (one size fits all)
However, often temporary tables do not have statistics simply because it is difficult to find ones that will cover all use cases.
Consider the following example:
There is table MY_OBJECTS created using
the following CTAS statement:
CREATE TABLE my_objects AS
SELECT * FROM all_objects
, (SELECT * FROM dual CONNECT BY rownum<=10);
A
table MY_TYPES that contains all distinct data types derived from
ALL_OBJECTS
CREATE TABLE my_types AS
SELECT DISTINCT object_type FROM all_objects;
A table T that has the same structure as MY_OBJECTS
CREATE TABLE t AS
SELECT * FROM my_objects WHERE 1=2;
And a global temporary table TEMP_OBJECTS that has only one column
which will accept OBJECT_IDs
CREATE GLOBAL TEMPORARY TABLE temp_objects
(object_id NUMBER NOT NULL)
ON COMMIT PRESERVE ROWS;
Collect statistics for MY_OBJECTS
and MY_TYPES tables
EXEC dbms_stats.gather_table_stats(ownname=>'TEST',tabname=>'MY_OBJECTS',cascade=>true);
EXEC dbms_stats.gather_table_stats(ownname=>'TEST',tabname=>'MY_TYPES',cascade=>true);
Let's suppose there are two sessions that load some data in
TEMP_OBJECTS and then execute the following insert statement:
INSERT INTO t
SELECT /*+ gather_plan_statistics */ o.*
FROM my_objects o
, my_types t
, temp_objects tt
WHERE o.object_type=t.object_type
AND o.object_id=tt.object_id;
The first session (SID 22) inserts one
record in TEMP_OBJECTS and then executes the insert statement from
above by invoking insert_1.sql script.
test[22,39]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum<=1);
1 row created.
test[22,39]> commit;
Commit complete.
test[22,39]> @insert_1
10 rows created.
Let's check the execution plan being used. It is based on Merge
Cartesian Join, which is somewhat acceptable in this case.
test[22,39]> @explain_last
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
SQL_ID 8a1c7phuna9vn, child number 0
-------------------------------------
insert into t select /*+ gather_plan_statistics */ o.* from my_objects
o, my_types t, temp_objects tt where o.object_type=t.object_type and
o.object_id=tt.object_id
Plan hash value: 3579371359
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:02.56 | 14067 | 14028 | | | |
| 1 | LOAD TABLE CONVENTIONAL | | 1 | | 0 |00:00:02.56 | 14067 | 14028 | | | |
|* 2 | HASH JOIN | | 1 | 10 | 10 |00:00:00.04 | 14036 | 14028 | 1519K| 1519K| 1521K (0)|
| 3 | MERGE JOIN CARTESIAN | | 1 | 38 | 38 |00:00:00.02 | 6 | 2 | | | |
| 4 | TABLE ACCESS FULL | TEMP_OBJECTS | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | | | |
| 5 | BUFFER SORT | | 1 | 38 | 38 |00:00:00.02 | 3 | 2 | 73728 | 73728 | |
| 6 | TABLE ACCESS FULL | MY_TYPES | 1 | 38 | 38 |00:00:00.02 | 3 | 2 | | | |
| 7 | TABLE ACCESS FULL | MY_OBJECTS | 1 | 882K| 882K|00:00:02.09 | 14030 | 14026 | | | |
---------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
30 rows selected.
Session with SID (251) kicks in a
moment later loading 1000 rows in TEMP_OBJECTS and runs the same
insert statement. Since the same SQL statement has already been
parsed , this session will reuse the existing cursor. However, in
this case the plan being used is the one that should be avoided since
cartesian join operation may severely impact performances:
test[251,65]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum<=1000);
1000 rows created.
test[251,65]> @insert_1
10000 rows created.
test[251,65]> @explain_last
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8a1c7phuna9vn, child number 0
-------------------------------------
insert into t select /*+ gather_plan_statistics */ o.* from my_objects
o, my_types t, temp_objects tt where o.object_type=t.object_type and
o.object_id=tt.object_id
Plan hash value: 3579371359
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:03.87 | 15495 | 14026 | | | |
| 1 | LOAD TABLE CONVENTIONAL | | 1 | | 0 |00:00:03.87 | 15495 | 14026 | | | |
|* 2 | HASH JOIN | | 1 | 10 | 10000 |00:00:00.22 | 14037 | 14026 | 3162K| 2024K| 3101K (0)|
| 3 | MERGE JOIN CARTESIAN | | 1 | 38 | 38000 |00:00:00.73 | 7 | 0 | | | |
| 4 | TABLE ACCESS FULL | TEMP_OBJECTS | 1 | 1 | 1000 |00:00:00.02 | 4 | 0 | | | |
| 5 | BUFFER SORT | | 1000 | 38 | 38000 |00:00:00.21 | 3 | 0 | 73728 | 73728 | |
| 6 | TABLE ACCESS FULL | MY_TYPES | 1 | 38 | 38 |00:00:00.01 | 3 | 0 | | | |
| 7 | TABLE ACCESS FULL | MY_OBJECTS | 1 | 882K| 882K|00:00:01.97 | 14030 | 14026 | | | |
---------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
30 rows selected.
As
expected, the same cursor was reused by both sessions.
sys[263,185]> select plan_hash_value, child_number, CHILD_ADDRESS, executions, invalidations, is_shareable, is_obsolete from v$sql where sql_id='8a1c7phuna9vn';
PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS EXECUTIONS INVALIDATIONS I I
--------------- ------------ ---------------- ---------- ------------- - -
3579371359 0 00000000618B54B8 2 0 Y N
sys[263,185]>
As you can sense, plan stability is one
of the biggest challenges when dealing with queries that depend on
temporary tables.
There are several
strategies available, that I am aware of, on how to attack this
problem. They include: use of stored outlines; fixed statistics;
baselines; hints (cardinality hint for example); periodically invalidating plans; using separate
workspaces (schemas) for different types of workloads; and couple of
others more or less effective. Each of these comes with its own limitations.
With Oracle 12c users have an ability
to gather session specific statistics for global temporary tables.
This feature seems to nicely address the challenges from above.
When session specific statistics are
used, each session collects “private” statistics and have CBO
generate execution plan/cursor based on them. These cursors are not
shared with other sessions. Also a cursor that’s been generated
using session specific statistics is invalidated when statistics are
re-collected within the same session.
Session specific statistics are enabled
when GLOBAL_TEMP_TABLE_STATS preference is set to SESSION.
exec dbms_stats.set_table_prefs(ownname=>'TEST', tabname=>'TEMP_OBJECTS',pname=>'GLOBAL_TEMP_TABLE_STATS',pvalue=>'SESSION');
Once the preference is set all one
needs to do is to gather stats for session based statistics to be
collected.
And that is all, now let’s see which
plans will be generated for each of the sessions if we repeat the
example from above when session statistics are being used.
Session with SID 251 loads 1000 rows into TEMP_OBJECTS table,
collects stats and executes the same insert statement.
test[251,69]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum<=1000);
1000 rows created.
test[251,69]> exec dbms_stats.gather_table_stats(ownname=>'TEST', tabname=>'TEMP_OBJECTS');
PL/SQL procedure successfully completed.
test[251,69]> @insert_1
10000 rows created.
Now,
the execution plan used is based on a HASH JOIN rather than MERGE
CARTESIAN JOIN operation
test[251,69]> @explain_last
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8a1c7phuna9vn, child number 1
-------------------------------------
insert into t select /*+ gather_plan_statistics */ o.* from my_objects
o, my_types t, temp_objects tt where o.object_type=t.object_type and
o.object_id=tt.object_id
Plan hash value: 4256520316
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:02.71 | 15514 | 14026 | | | |
| 1 | LOAD TABLE CONVENTIONAL | | 1 | | 0 |00:00:02.71 | 15514 | 14026 | | | |
|* 2 | HASH JOIN | | 1 | 9858 | 10000 |00:00:00.07 | 14037 | 14026 | 1696K| 1696K| 1583K (0)|
| 3 | TABLE ACCESS FULL | MY_TYPES | 1 | 38 | 38 |00:00:00.01 | 3 | 0 | | | |
|* 4 | HASH JOIN | | 1 | 9858 | 10000 |00:00:00.04 | 14034 | 14026 | 2293K| 2293K| 1607K (0)|
| 5 | TABLE ACCESS FULL | TEMP_OBJECTS | 1 | 1000 | 1000 |00:00:00.01 | 4 | 0 | | | |
| 6 | TABLE ACCESS FULL | MY_OBJECTS | 1 | 882K| 882K|00:00:01.64 | 14030 | 14026 | | | |
---------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE")
4 - access("O"."OBJECT_ID"="TT"."OBJECT_ID")
Note
-----
- Global temporary table session private statistics used
30 rows selected.
Note
the “Note” - Global temporary table session private statistics
used.
If
you check v$sqlarea or v$sql you will see there are two child cursors
for the same sql id.
sys[263,185]> select plan_hash_value, child_number, CHILD_ADDRESS, executions, invalidations, is_shareable, is_obsolete from v$sql where sql_id='8a1c7phuna9vn';
PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS EXECUTIONS INVALIDATIONS I I
--------------- ------------ ---------------- ---------- ------------- - -
3579371359 0 00000000618B54B8 2 0 Y N
4256520316 1 00000000625A3928 1 0 Y N
sys[263,185]>
It is worth to
mention that sessions that use global statistics will keep using
without interfering with the session specific stats and cursors.
For example, session with SID 26 loads
1000 rows, but doesn't collect stats on TEMP_OBJECTS table. The
insert statement in this case will use the globally available cursors
(child cursor 0)
test[26,145]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum<=1000);
1000 rows created.
test[26,145]> commit;
Commit complete.
test[26,145]> @insert_1
10000 rows created.
test[26,145]> @explain_last
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8a1c7phuna9vn, child number 0
-------------------------------------
insert into t select /*+ gather_plan_statistics */ o.* from my_objects
o, my_types t, temp_objects tt where o.object_type=t.object_type and
o.object_id=tt.object_id
Plan hash value: 3579371359
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:03.71 | 15513 | 14026 | | | |
| 1 | LOAD TABLE CONVENTIONAL | | 1 | | 0 |00:00:03.71 | 15513 | 14026 | | | |
|* 2 | HASH JOIN | | 1 | 10 | 10000 |00:00:00.13 | 14039 | 14026 | 3162K| 2024K| 3114K (0)|
| 3 | MERGE JOIN CARTESIAN | | 1 | 38 | 38000 |00:00:00.64 | 9 | 0 | | | |
| 4 | TABLE ACCESS FULL | TEMP_OBJECTS | 1 | 1 | 1000 |00:00:00.01 | 6 | 0 | | | |
| 5 | BUFFER SORT | | 1000 | 38 | 38000 |00:00:00.19 | 3 | 0 | 73728 | 73728 | |
| 6 | TABLE ACCESS FULL | MY_TYPES | 1 | 38 | 38 |00:00:00.01 | 3 | 0 | | | |
| 7 | TABLE ACCESS FULL | MY_OBJECTS | 1 | 882K| 882K|00:00:01.88 | 14030 | 14026 | | | |
---------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
30 rows selected.
sys[263,185]> select plan_hash_value, child_number, CHILD_ADDRESS, executions, invalidations, is_shareable, is_obsolete from v$sql where sql_id='8a1c7phuna9vn';
PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS EXECUTIONS INVALIDATIONS I I
--------------- ------------ ---------------- ---------- ------------- - -
3579371359 0 00000000618B54B8 3 0 Y N
4256520316 1 00000000625A3928 1 0 Y N
Let's suppose the session with sid 251
modifies the TEMP_OBJECTS table, deletes its content and loads only 1
row followed by refreshing the session specific statistics. In this
case it is expected to see the plan with hash value 3579371359
to be used.
test[251,69]> DELETE FROM temp_objects;
1000 rows deleted.
test[251,69]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum<=1);
1 row created.
test[251,69]> commit;
Commit complete.
test[251,69]> exec dbms_stats.gather_table_stats(ownname=>'TEST', tabname=>'TEMP_OBJECTS');
PL/SQL procedure successfully completed.
test[251,69]>
If you check the cursor state
(valid/invalid) you can see the session private cursor (child cursor
1) was just invalidated.
PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS EXECUTIONS INVALIDATIONS I I
--------------- ------------ ---------------- ---------- ------------- - -
3579371359 0 00000000618B54B8 3 0 Y N
4256520316 1 00000000625A3928 1 1 Y N
test[251,69]> @insert_1
10 rows created.
test[251,69]> @explain_last
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8a1c7phuna9vn, child number 1
-------------------------------------
insert into t select /*+ gather_plan_statistics */ o.* from my_objects
o, my_types t, temp_objects tt where o.object_type=t.object_type and
o.object_id=tt.object_id
Plan hash value: 3579371359
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:02.43 | 14051 | 14026 | | | |
| 1 | LOAD TABLE CONVENTIONAL | | 1 | | 0 |00:00:02.43 | 14051 | 14026 | | | |
|* 2 | HASH JOIN | | 1 | 10 | 10 |00:00:00.01 | 14037 | 14026 | 1519K| 1519K| 1426K (0)|
| 3 | MERGE JOIN CARTESIAN | | 1 | 38 | 38 |00:00:00.01 | 7 | 0 | | | |
| 4 | TABLE ACCESS FULL | TEMP_OBJECTS | 1 | 1 | 1 |00:00:00.01 | 4 | 0 | | | |
| 5 | BUFFER SORT | | 1 | 38 | 38 |00:00:00.01 | 3 | 0 | 73728 | 73728 | |
| 6 | TABLE ACCESS FULL | MY_TYPES | 1 | 38 | 38 |00:00:00.01 | 3 | 0 | | | |
| 7 | TABLE ACCESS FULL | MY_OBJECTS | 1 | 882K| 882K|00:00:02.05 | 14030 | 14026 | | | |
---------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")
Note
-----
- Global temporary table session private statistics used
30 rows selected.
As expected the plan with hash value
3579371359 was used.
PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS EXECUTIONS INVALIDATIONS I I
--------------- ------------ ---------------- ---------- ------------- - -
3579371359 0 00000000618B54B8 3 0 Y N
3579371359 1 00000000625A3928 1 1 Y N
You could also use v$sql_shared_cursor
view to observe how the cursors are created and shared.
In order to track statistics one could
use [DBA|USER|ALL]_TAB_STATISTICS views. These views have SCOPE
column that indicate whether the statistics are shared or session
specific:
test[251,69]> SELECT owner, table_name, num_rows, last_analyzed, scope FROM dba_tab_statistics WHERE table_name='TEMP_OBJECTS';
OWNER TABLE_NAME NUM_ROWS LAST_ANAL SCOPE
--------------- --------------- ---------- --------- -------
TEST TEMP_OBJECTS SHARED
TEST TEMP_OBJECTS 1 30-JUN-13 SESSION
Finally, something to think about is the possibility to increase hard parsing. As you
may have observed from above, each parse made after the stats are gathered is a
hard parse. Therefore, depending upon your workload you may expect
to see more hard parsing.
Resources