Monday, April 07, 2014

_small_table_threshold and direct path reads on partitioned tables in 11.2.0.3


I was troubleshooting a performance problem few days ago. The database the problem was experienced on was recently migrated from Oracle 10.2.0.4 to Oracle 11.2.0.3.

Long story short, the problem was described as performance of a query that scans two or more partitions in a table is much worse compared to combined performances of queries accessing each of the partitions separately.

After a short investigation I narrowed down the problem to “direct path reads” being the culprit of the problem.

As you know, due to the adaptive direct read feature introduced in 11g full table scans may utilize PGA instead of the buffer cache as it was a case in the earlier versions.

There are few good articles on this change in behavior among which I personally favor Tanel’s blogpost and hacking session and the post by Alex Fatkulin. You could also check MOS Note 793845.1.

What I observed in 11.2.0.3.0 was quite surprising and a bit different from what I’ve read so far. I know that there are different parameters/variables that influence the decision whether or not direct part reads should be used. I tried to be careful and not to fall in any of these traps.

Please note all the tests were done in a sandbox environment. I advise against trying these tests in any production environment.

The database version was 11.2.0.3.0.

_serial_direct_read = auto

_direct_read_decision_statistics_driven = TRUE

_small_table_threshold = 117


I used ASSM, a tablespace with uniform extent size(64K)

As you may know _small_table_threshold parameter is set to about 2% of the size of the buffer cache. On my test machine I have pretty small buffer cache, 5892 buffers big (117 is 1.98% of 5892)


SQL> SELECT name,block_size,buffers FROM v$buffer_pool;

NAME                                               BLOCK_SIZE    BUFFERS
-------------------------------------------------- ---------- ----------
DEFAULT                                                  8192       5892


I will try to simplify the problem by using a partitioned table, TEST_PART containing 7 partitions.


CREATE TABLE test_part 
( 
    col1 NUMBER NOT NULL
  , col2 VARCHAR2(100)
 ) 
 PARTITION BY LIST (col1) 
 (  
    PARTITION PART_1  VALUES (1)
  , PARTITION PART_2  VALUES (2)
  , PARTITION PART_3  VALUES (3)
  , PARTITION PART_4  VALUES (4)
  , PARTITION PART_5  VALUES (5)
  , PARTITION PART_6  VALUES (6)
  , PARTITION PART_7  VALUES (7)
 ) ;


Each of the 7 partitions will be populated with 4000 rows (for total of 65 blocks allocated per partition) using the following SQL statement:


INSERT INTO test_part (SELECT mod(rownum,7)+1, rpad('A',100,'A') FROM dual CONNECT BY rownum<=28000);


I will collect stats using the statement below:


exec dbms_stats.gather_table_stats(user,'TEST_PART');


As you can see from the output below, each of the partitions has 65 blocks below the HWM:


SQL> select table_name, partition_name, num_rows, blocks, avg_row_len
from user_tab_partitions
where table_name='TEST_PART';  2    3

TABLE_NAME                     PARTITION_NAME                   NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ------------------------------ ---------- ---------- -----------
TEST_PART                      PART_1                               4000         65         104
TEST_PART                      PART_2                               4000         65         104
TEST_PART                      PART_3                               4000         65         104
TEST_PART                      PART_4                               4000         65         104
TEST_PART                      PART_5                               4000         65         104
TEST_PART                      PART_6                               4000         65         104
TEST_PART                      PART_7                               4000         65         104

7 rows selected.



Observation #1 -
 _small_table_threshold is applied on the total number of blocks expected to be returned by the query (considering all partition segments that will be accessed)

As you can see number of blocks in each of the partitions (65) is lower than _small_table_threshold value (117). Therefore a query that accesses only one of the partitions uses the buffer cache to store the segment blocks.


SQL> select count(1) from test_part where col1 in (1);

  COUNT(1)
----------
      4000


I will use my sese.sql script to check the values for specific session statistics. It simply scans v$sesstat for the current session and a given keyword. If there are statistics that contain the specified keyword and their value is greater than 0 they will be reported. As you can see no direct path reads were performed.


SQL> @sese direct

no rows selected


I expected to see the next query utilizing the buffer cache as well. It scans two partitions. As you know, each of the partitions has 65 blocks which is less than _small_table_threshold value (117), hence I thought I won't see any direct path reads.


SQL> select count(1) from test_part where col1 in (1,2);

  COUNT(1)
----------
      8000

However, direct path reads were performed. Moreover, even though one of the partitions I previously scanned was already in the buffer cache, both partitions were scanned using direct path reads. As shown in the output below, two segments were fully scanned using direct reads for total of 130 direct reads were performed (2x65).


SQL>  @sese direct

       SID         ID NAME                                                    VALUE
---------- ---------- -------------------------------------------------- ----------
         7         76 STAT.consistent gets direct                               130
         7         81 STAT.physical reads direct                                130
         7        380 STAT.table scans (direct read)                              2

Let’s see what happens when I increase _small_table_threshold to 130 and run the last query.


SQL> alter session set "_small_table_threshold"=130;

Session altered.

SQL> select count(1) from test_part where col1 in (1,2);

  COUNT(1)
----------
      8000

SQL>  @sese direct

       SID         ID NAME                                                    VALUE
---------- ---------- -------------------------------------------------- ----------
         7         76 STAT.consistent gets direct                               130
         7         81 STAT.physical reads direct                                130
         7        380 STAT.table scans (direct read)                              2


The number of direct path reads stayed the same, which means no direct path reads were performed.

How about if I we add one more partition to the equation now (_small_table_threshold=130):


SQL> select count(1) from test_part where col1 in (1,2,3);

  COUNT(1)
----------
     12000

SQL> @sese direct

       SID         ID NAME                                                    VALUE
---------- ---------- -------------------------------------------------- ----------
         7         76 STAT.consistent gets direct                               325
         7         81 STAT.physical reads direct                                325
         7        380 STAT.table scans (direct read)                              5


Now since we scan 3 partitions, that is 195 blocks Oracle went back to direct path reads and the statistic numbers went up by 195 (3x65) , 130+195=325 or three new table/segment scans.

Therefore seems the logic behind the decision whether or not to perform direct path reads is:

IF SUM(blocks of all partitions that are accessed)>_small_table_threshold THEN
     perform direct path reads for all partitions that are accessed
ELSE
     utilize buffer cache

Again, just to remind you this behavior is specific to 11.2.0.3.


Observation #2 -
The percentage of cached blocks per partition is not relevant


This brings me to the second observation. If you query X$KCBOQH.NUM_BUF for the partition segments (read Tanel’s blogpost or watch his hacking session ) you can see that even though partitions PART_1 and PART_2 were in the cache, Oracle still performed direct path reads for all three partitions:


SQL> conn /as sysdba
Connected.

SQL> select  o.subobject_name, b.obj#, sum(b.num_buf)
2 from    X$KCBOQH b, dba_objects o
3 where   b.obj#=o.data_object_id
4     and o.object_name='TEST_PART'
5 group by  o.subobject_name, b.obj#
6 order by 1;  

SUBOBJECT_NAME                       OBJ# SUM(B.NUM_BUF)
------------------------------ ---------- --------------
PART_1                             146024             66
PART_2                             146025             66
PART_3                             146026              1 


I ran the output above after the last test. As you can see PART_1 and PART_2 segments are completely in the buffer cache, 66 blocks each (65 blocks for the data and 1 block for the segment header). PART_3 however has only one block in the cache and that is most likely the segment header block.

But, even when all 3 partitions were fully loaded in the buffer cache, Oracle still performed direct path reads:


SQL> conn *****/*****
Connected.
SQL> select count(1) from test_part where col1 in (3);

  COUNT(1)
----------
      4000

SQL> conn /as sysdba
Connected.

SQL> select  o.subobject_name, b.obj#, sum(b.num_buf)
2 from    X$KCBOQH b, dba_objects o
3 where   b.obj#=o.data_object_id
4     and o.object_name='TEST_PART'
5 group by  o.subobject_name, b.obj#
6 order by 1;

SUBOBJECT_NAME                       OBJ# SUM(B.NUM_BUF)
------------------------------ ---------- --------------
PART_1                             146024             66
PART_2                             146025             66
PART_3                             146026             66

SQL> conn *****/*****
Connected.
SQL> @sese direct

no rows selected

SQL> select count(1) from test_part where col1 in (1,2,3);

  COUNT(1)
----------
     12000

SQL> @sese direct

       SID         ID NAME                                                    VALUE
---------- ---------- -------------------------------------------------- ----------
         7         76 STAT.consistent gets direct                               195
         7         81 STAT.physical reads direct                                195
         7        380 STAT.table scans (direct read)                              3

SQL>

I will stop with this post here. Tomorrow I will publish another post where I'll show what difference _direct_read_decision_statistics_driven could make for partitioned tables in 11.2.0.3.
(Update: Part2 - what difference does _direct_read_decision_statistics_driven=FALSE make)

I will repeat the same tests in 11.2.0.4 and 12.1.0.1 and see if the behavior is any different.

Stay tuned.
 

No comments: