Friday, November 15, 2013

Playing with Oracle Spatial - An important consideration when using SDO_NN operator


I've been playing with Oracle Spatial quite a bit lately and ran into some interesting things that I  plan to write about in the next couple of posts.

This post covers an important consideration, I think, one should make when using SDO_NN spatial operator with sdo_num_res parameter.

But first, just briefly on the operator itself. SDO_NN operator is used to identify the nearest neighbors for a particular geometry. The full operator synopsis can be found in the official documentation

Please note, the database I use in the examples below is Oracle 11.2.0.3.0 Enterprise Edition with no additional patches applied and default Oracle Spatial settings.

Let's consider the following example, there is a table RESTAURANTS having only an id and a geometry column (point).

 CREATE TABLE restaurants
(
 id NUMBER NOT NULL,
 point MDSYS.SDO_GEOMETRY
); 

ALTER TABLE restaurants ADD CONSTRAINT restaurants_pk PRIMARY KEY (id);


Let's create a spatial index on restaruants.point column:

 INSERT INTO user_sdo_geom_metadata 
    (   TABLE_NAME
      , COLUMN_NAME
      , DIMINFO
      ,   SRID)
    VALUES 
   (   'restaurants'
      , 'POINT'
      , SDO_DIM_ARRAY(   
           SDO_DIM_ELEMENT('X', 0, 10000000, 0.005),
           SDO_DIM_ELEMENT('Y', 0, 10, 0.005)
        )
      , null   -- SRID
);
 
CREATE INDEX restaurants_sidx ON restaurants(point) INDEXTYPE IS MDSYS.SPATIAL_INDEX; 


For simplicity let's say there are 500,000 restaurants distributed on a single line (x-axis of two dimensional coordinate system, y=0, x=1,2,3,...500,000).

INSERT INTO restaurants 
(SELECT rownum   --id
    ,   SDO_GEOMETRY(
                      2001 -- 2 dimensional point
                   ,  null  -- SDO SRID
                   ,  SDO_POINT_TYPE(rownum,0,null) –-x=rownum, y=0
                   ,  null
                   ,  null
        )
 FROM dual CONNECT BY rownum<=500000
);

In order to find the first 5 nearest neighbors to point (10,0) one could use the following spatial query:

SQL> SELECT p.id, sdo_nn_distance(1) distance
  2  FROM restaurants p
  3  WHERE SDO_NN(p.point,
  4                      SDO_GEOMETRY(
  5                                     2001 -- 2 dimensional point
  6                             ,       null
  7                             ,       SDO_POINT_TYPE(10,0,null)
  8                             ,       null
  9                             ,       null
 10                             ),
 11                      'sdo_num_res=5',
 12                      1
 13                     )='TRUE'
 14* ORDER BY 2;

        ID   DISTANCE
---------- ----------
        10          0
         9          1
        11          1
        12          2
         8          2

SQL>

Now suppose each restaurant has 5 menu items that are stored in MENU_ITEMS table. The table has 4 columns, a unique identifier ID; a reference to RESTAURANTS table, rest_id; an identifier that identifies an item within the restaurant menu, item_number; and lastly a description, 100 characters long column that I use in this example to make this table a bit bigger.

CREATE TABLE menu_items
(id NUMBER NOT NULL,
 rest_id NUMBER NOT NULL,
 item_number NUMBER NOT NULL,
 description varchar2(100)
);

ALTER TABLE menu_items ADD CONSTRAINT menu_items_pk PRIMARY KEY (id); 

ALTER TABLE menu_items ADD CONSTRAINT menu_items_rest_FK FOREIGN KEY (rest_id) REFERENCES restaurants(id); 

CREATE INDEX menu_items_rest_idx ON menu_items(rest_id);

 
INSERT INTO menu_items 
(SELECT rownum
      , b.rn
      , a.rn 
      , rpad(rownum,100,'A')
 FROM 
     (SELECT rownum rn FROM dual CONNECT BY rownum<=5) a, --5 menu items per rest.
     (SELECT rownum rn FROM dual CONNECT BY rownum<=500000) b
);

commit;

Suppose you want to get the menu items of the top 5 nearest restaurants to a particular location (10,0) for an example. One way would be with the following spatial query (but first of course collect the necessary stats):

BEGIN
        DBMS_STATS.GATHER_TABLE_STATS(ownname=>user,tabname=>'restaurants',cascade=>TRUE);
        DBMS_STATS.GATHER_TABLE_STATS(ownname=>user,tabname=>'menu_items',cascade=>TRUE, method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO');
END;

SELECT t1.id
     , t2.id
     , t2.restaurants_id
     , t2.item_number
FROM restaurants t1
   , menu_items t2
WHERE t1.id=t2.restaurants_id
  AND SDO_NN(t1.point,
              SDO_GEOMETRY(
                            2001 -- 2 dimensional point
                         ,  null 
                         ,  SDO_POINT_TYPE(10,0,null)
                         ,  null
                         ,  null
              ),
             'sdo_num_res=5',
             1 
       )='TRUE'
ORDER BY t1.id, t2.item_number;


The query produces the desired result. However, let's look how it performed. I re-run the query, but this time I used gather_plan_statistics hint to get the execution statistics for the query.

SQL>SELECT /*+ gather_plan_statistics */ t1.id, t2.id, t2.restaurants_id, t2.item_number
  2  FROM restaurants t1
  3     , menu_items t2
  4  WHERE t1.id=t2.restaurants_id
  5    AND SDO_NN(t1.point,
  6                      SDO_GEOMETRY(
  7                                     2001 -- 2 dimensional point
  8                             ,       null
  9                             ,       SDO_POINT_TYPE(10,0,null)
 10                             ,       null
 11                             ,       null
 12                             ),
 13                      'sdo_num_res=5',
 14                      1
 15                     )='TRUE'
 16  ORDER BY t1.id, t2.item_number; 

        ID         ID RESTAURANTS_ID ITEM_NUMBER
---------- ---------- -------------- -----------
         8          8              8           1
         8     500008              8           2
         8    1000008              8           3
         8    1500008              8           4
         8    2000008              8           5
         9          9              9           1
         9     500009              9           2
         9    1000009              9           3
         9    1500009              9           4
         9    2000009              9           5
        10         10             10           1
        10     500010             10           2
        10    1000010             10           3
        10    1500010             10           4
        10    2000010             10           5
        11         11             11           1
        11     500011             11           2
        11    1000011             11           3
        11    1500011             11           4
        11    2000011             11           5
        12         12             12           1
        12     500012             12           2
        12    1000012             12           3
        12    1500012             12           4
        12    2000012             12           5

25 rows selected.


The query took 1.32 seconds to complete as you can see from the execution statistics:

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  gwpqub3k0awqm, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ t1.id, t2.id, t2.restaurants_id,
t2.item_number FROM restaurants t1    , menu_items t2 WHERE
t1.id=t2.restaurants_id   AND SDO_NN(t1.point,     SDO_GEOMETRY(
2001 -- 2 dimensional point     ,  null     ,
SDO_POINT_TYPE(10,0,null)     ,  null     ,  null     ),
'sdo_num_res=5',     1    )='TRUE' ORDER BY t1.id, t2.item_number

Plan hash value: 2076547507


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |      1 |        |     25 |00:00:01.32 |   41540 |  41465 |       |       |          |
|   1 |  SORT ORDER BY                |                  |      1 |  24678 |     25 |00:00:01.32 |   41540 |  41465 |  2048 |  2048 | 2048  (0)|
|*  2 |   HASH JOIN                   |                  |      1 |  24678 |     25 |00:00:01.32 |   41540 |  41465 |  1517K|  1517K|  886K (0)|
|   3 |    TABLE ACCESS BY INDEX ROWID| RESTAURANTS      |      1 |   5000 |      5 |00:00:00.01 |      64 |      0 |       |       |          |
|*  4 |     DOMAIN INDEX              | RESTAURANTS_SIDX |      1 |        |      5 |00:00:00.01 |      63 |      0 |       |       |          |
|   5 |    TABLE ACCESS FULL          | MENU_ITEMS       |      1 |   2500K|   2500K|00:00:00.62 |   41476 |  41465 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."ID"="T2"."RESTAURANTS_ID")
   4 - access("MDSYS"."SDO_NN"("T1"."POINT","MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,NULL),NULL,NULL),'sdo_num_res=5',1)='
              TRUE')


29 rows selected.

SQL> 


The problem

Well, one could say that 1.32 seconds is not that bad. However, if you look at the plan more carefully you can notice that CBO used a HASH JOIN method to join RESTAURANTS and MENU_ITEMS tables and as a result it performed full table scan on MENU_ITEMS table. Now imagine if MENU_ITEMS table was quite big (suppose you have stored the items of all restaurants across US)?!

What made CBO to pick a HASH JOIN? Look at the estimated (5,000) vs. actual rows (5) selected from RESTAURANTS table. We were only after the 5 nearest neighbors. We used sdo_num_res=5 in our SDO_NN call. One could expect that CBO would recognize that fact and estimate cardinality of 5. But, no, CBO applied selectivity of 1% (500,000 restaurants x 1% = 5,000)

Why selectivity of 1%?

SDO_NN is an operator that is bound to the implementation of the function NN that is part of PRVT_IDX package owned by MDSYS.

By default, statistics for PRVT_IDX package is defined by SDO_STATISTICS type. That is Extensible Optimizer feature is used to define how cardinality and the cost will be calculated for functions/procedures defined in the package. (if statistics are not associated, CBO uses default selectivity of 1%)

Seems like the logic implemented in ODCIStatsSelectivity() function is not good enough to detect that we are only after the first N rows, as defined with SDO_NUM_RES parameter.

You can clearly see this if you create 10053 trace file in the section where MDSYS.SDO_STATISTICS.ODCIStatsSelectivity procedure is used to calculate selectivity.



  MDSYS.SDO_STATISTICS.ODCIStatsFunctionCost returned:
    CPUCost      : 100000000000000
    IOCost       : 100000000000000
    NetworkCost  : -1
  Calling user-defined selectivity function...
    predicate: "MDSYS"."SDO_NN"("T1"."POINT","MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,NULL),NULL,NULL),'sdo_num_res=5',1)='TRUE'
  declare
     sel number;
     obj0 "MDSYS"."SDO_GEOMETRY" := "MDSYS"."SDO_GEOMETRY"(NULL, NULL, NULL, NULL, NULL);

    begin
      :1 := "MDSYS"."SDO_STATISTICS".ODCIStatsSelectivity(
                     sys.ODCIPREDINFO('MDSYS',
                            'PRVT_IDX',
                            'NN',
                            45),
                     sel,
                     sys.ODCIARGDESCLIST(sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL), sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL), sys.ODCIARGDESC(2, 'RESTAURANTS', 'TEST', '"POINT"', NULL, NULL, NULL), sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL), sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL)),
                     :3,
                     :4
                     , obj0, "MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,NULL),NULL,NULL), :5,
                     sys.ODCIENV(:6,:7,:8,:9));
      if sel IS NULL then
        :2 := -1.0;
      else
        :2 := sel;
      end if;
      exception
        when others then
          raise;
    end;
Bind :3 Value 'TRUE'
Bind :4 Value 'TRUE'
Bind :5 Value 'sdo_num_res=5'
ODCIEnv Bind :6 Value 0
ODCIEnv Bind :7 Value 0
ODCIEnv Bind :8 Value 0
ODCIEnv Bind :9 Value 3
  MDSYS.SDO_STATISTICS.ODCIStatsSelectivity returned selectivity: 1.00000000%
  Table: RESTAURANTS  Alias: T1
    Card: Original: 500000.000000  Rounded: 5000  Computed: 5000.00  Non Adjusted: 5000.00




If statistics are not associated CBO would also use selectivity of 1%. So what is the differences? Under what circumstances one could take advantage of the extensible optimizer feature and generate better selectivity when SDO_NN operator is used along with sdo_num_res parameter? I couldn't find any article or documentation piece that will help me answer these questions.

What I do know however, is that it shouldn't be that difficult to implement a piece of logic that will make SDO_STATISTICS.ODCIStatsSelectivity procedure generate more accurate numbers.

As a matter of fact, I played a bit with this.

Disclaimer: Please be aware the following examples are for demonstration purposes only and shouldn't be used in an actual, supported, database environment. The logic I show below is quite simple and is for illustration only. Please contact Oracle Support if you want to implement something similar in your environment.

I created a simple type named SDO_STATISTICS_CUSTOM under MDSYS schema using the Extensible Optimizer interface (link )

I implemented ODCIStatsSelectivity procedure as shown below (one more time, this implementation is to demonstrate that with appropriate logic ODCIStatsSelectivity can return desired values):



STATIC FUNCTION ODCIStatsSelectivity (
                      pred_info      IN  SYS.ODCIPredInfo,
                      sel          OUT NUMBER,
                      args           IN  SYS.ODCIArgDescList,
                      p_start          IN  VARCHAR2,
                      p_stop           IN  VARCHAR2,
                      geometry1      IN MDSYS.SDO_GEOMETRY,
        geometry2      IN MDSYS.SDO_GEOMETRY,
               param          IN VARCHAR2,
                      env            IN  SYS.ODCIEnv
                      ) RETURN NUMBER IS
      BEGIN
   
    *******************  
         if param=NULL then 
  sel:=1;
  else if lower(substr(param,1,instr(param,'=')-1)) = 'sdo_num_res' then
   SELECT  decode(num_rows,null,1,
     decode(to_number(trim(substr(param,instr(param,'=')+1))),0,1,to_number(trim(substr(param,instr(param,'=')+1)))/num_rows)*100 
    )
           INTO   sel
           FROM   dba_tables where owner=args(3).TABLESCHEMA and table_name=args(3).TABLENAME;

   if sel>100 then 
    sel:=100 ;
   end if;   

  end if;
  end if;
      
****************     
          
         RETURN ODCIConst.success;
      END ODCIStatsSelectivity; 
 
 

The logic I chose is simple, if sdo_num_res parameter is defined, get the number of records from the statistics available for the table and set the selectivity variable (sel) to (sdo_num_res/num_rows)*100.

Let's see how it works:


SQL> EXPLAIN PLAN FOR
  2  SELECT t1.id, t2.id, t2.restaurants_id, t2.item_number
  3  FROM restaurants t1
  4     , menu_items t2
  5  WHERE t1.id=t2.restaurants_id
  6    AND SDO_NN(t1.point,
  7                           SDO_GEOMETRY(
  8                                          2001 -- 2 dimensional point
  9                                  ,       null
 10                                  ,       SDO_POINT_TYPE(10,0,null)
 11                                  ,       null
 12                                  ,       null
 13                                  ),
 14                           'sdo_num_res=5',
 15                           1
 16                          )='TRUE'
 17* ORDER BY t1.id, t2.item_number
SQL> /

Explained.

SQL> set line 200 pagesize 9999
SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2341889131

------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                            |    25 |  1000 |    36   (3)| 00:00:01 |
|   1 |  SORT ORDER BY                |                            |    25 |  1000 |    36   (3)| 00:00:01 |
|   2 |   NESTED LOOPS                |                            |    25 |  1000 |    35   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| RESTAURANTS                |     5 |    80 |     0   (0)| 00:00:01 |
|*  4 |     DOMAIN INDEX              | RESTAURANTS_SIDX           |       |       |     0   (0)| 00:00:01 |
|   5 |    TABLE ACCESS BY INDEX ROWID| MENU_ITEMS                 |     5 |   120 |     7   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN          | MENU_ITEMS_RESTAURANTS_IDX |     5 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("MDSYS"."SDO_NN"("T1"."POINT","MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,N
              ULL),NULL,NULL),'sdo_num_res=5',1)='TRUE')
   6 - access("T1"."ID"="T2"."RESTAURANTS_ID")

20 rows selected.


As you can see the cardinality is correctly calculated which resulted CBO to produce better execution plan. With this plan in place, the query completes in less than 1 cs.

If you look into 10053 trace file you can also see that the calculated selectivity is 0.001%:




  MDSYS.SDO_STATISTICS_CUSTOM.ODCIStatsFunctionCost returned:
    CPUCost      : 10000
    IOCost       : 10000
    NetworkCost  : 10000
  Calling user-defined selectivity function...
    predicate: "MDSYS"."SDO_NN"("T1"."POINT","MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,NULL),NULL,NULL),'sdo_num_res=5',1)='TRUE'
  declare
     sel number;
     obj0 "MDSYS"."SDO_GEOMETRY" := "MDSYS"."SDO_GEOMETRY"(NULL, NULL, NULL, NULL, NULL);

    begin
      :1 := "MDSYS"."SDO_STATISTICS_CUSTOM".ODCIStatsSelectivity(
                     sys.ODCIPREDINFO('MDSYS',
                            'PRVT_IDX',
                            'NN',
                            45),
                     sel,
                     sys.ODCIARGDESCLIST(sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL), sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL), sys.ODCIARGDESC(2, 'RESTAURANTS', 'TEST', '"POINT"', NULL, NULL, NULL), sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL), sys.ODCIARGDESC(3, NULL, NULL, NULL, NULL, NULL, NULL)),
                     :3,
                     :4
                     , obj0, "MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,NULL),NULL,NULL), :5,
                     sys.ODCIENV(:6,:7,:8,:9));
      if sel IS NULL then
        :2 := -1.0;
      else
        :2 := sel;
      end if;
      exception
        when others then
          raise;
    end;
Bind :3 Value 'TRUE'
Bind :4 Value 'TRUE'
Bind :5 Value 'sdo_num_res=5'
ODCIEnv Bind :6 Value 0
ODCIEnv Bind :7 Value 0
ODCIEnv Bind :8 Value 0
ODCIEnv Bind :9 Value 3
  MDSYS.SDO_STATISTICS_CUSTOM.ODCIStatsSelectivity returned selectivity: 0.00100000%


Anyway, since the current implementation of SDO_STATISTICS type returns selectivity of 1%, the next logical question is what one can do in order to workaround the problem.

Workaround

The problem can be addressed by providing CBO that extra bit of information, that is how many rows we expect to get back after applying SDO_NN operator. In order to do this one could consider using the cardinality hint or rewrite the query and add one more predicate ROWNUM<=5 that would tell CBO how many rows we are expecting to select from RESTAURANTS table. (there may be other strategies available too)

Use cardinality hint:

SELECT /*+ cardinality(t1 5) gather_plan_statistics */ t1.id, t2.id, t2.restaurants_id, t2.item_number
FROM restaurants t1
   , menu_items t2
WHERE t1.id=t2.restaurants_id
  AND SDO_NN(t1.point,
              SDO_GEOMETRY(
                            2001 -- 2 dimensional point
                         ,  null 
                         ,  SDO_POINT_TYPE(10,0,null)
                         ,  null
                         ,  null
              ),
             'sdo_num_res=5',
             1 
      )='TRUE'
ORDER BY t1.id, t2.item_number;


The execution statistics in this case look like:

Plan hash value: 2341889131


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                       | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                            |      1 |        |     25 |00:00:00.01 |      92 |      5 |       |       |          |
|   1 |  SORT ORDER BY                |                            |      1 |     25 |     25 |00:00:00.01 |      92 |      5 |  2048 |  2048 | 2048  (0)|
|   2 |   NESTED LOOPS                |                            |      1 |     25 |     25 |00:00:00.01 |      92 |      5 |       |       |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| RESTAURANTS                |      1 |      5 |      5 |00:00:00.01 |      55 |      0 |       |       |          |
|*  4 |     DOMAIN INDEX              | RESTAURANTS_SIDX           |      1 |        |      5 |00:00:00.01 |      54 |      0 |       |       |          |
|   5 |    TABLE ACCESS BY INDEX ROWID| MENU_ITEMS                 |      5 |      5 |     25 |00:00:00.01 |      37 |      5 |       |       |          |
|*  6 |     INDEX RANGE SCAN          | MENU_ITEMS_RESTAURANTS_IDX |      5 |      5 |     25 |00:00:00.01 |      12 |      0 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("MDSYS"."SDO_NN"("T1"."POINT","MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,NULL),NULL,NULL),'sdo_num_res=5',1)='TRUE')
   6 - access("T1"."ID"="T2"."RESTAURANTS_ID")


29 rows selected. 


Rewrite the query and use ROWNUM<=n predicate where n is the number associated with sdo_num_rows parameter:

SELECT /*+ gather_plan_statistics */ t1.id, t2.id, t2.restaurants_id, t2.item_number
FROM (SELECT * 
      FROM restaurants 
      WHERE
           SDO_NN(point,
                  SDO_GEOMETRY(
                                   2001 -- 2 dimensional point
                                ,  null 
                                ,  SDO_POINT_TYPE(10,0,null)
                                ,  null
                                ,  null
                    ),
                 'sdo_num_res=5',
                 1 
           )='TRUE'
           AND rownum<=5
    ) t1
   , menu_items t2
WHERE t1.id=t2.restaurants_id    
ORDER BY t1.id, t2.item_number;


… and the execution statistics :

Plan hash value: 3570932640

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                       | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                            |      1 |        |     25 |00:00:00.01 |     101 |       |       |          |
|   1 |  SORT ORDER BY                   |                            |      1 |     25 |     25 |00:00:00.01 |     101 |  2048 |  2048 | 2048  (0)|
|   2 |   NESTED LOOPS                   |                            |      1 |        |     25 |00:00:00.01 |     101 |       |       |          |
|   3 |    NESTED LOOPS                  |                            |      1 |     25 |     25 |00:00:00.01 |      76 |       |       |          |
|   4 |     VIEW                         |                            |      1 |      5 |      5 |00:00:00.01 |      64 |       |       |          |
|*  5 |      COUNT STOPKEY               |                            |      1 |        |      5 |00:00:00.01 |      64 |       |       |          |
|   6 |       TABLE ACCESS BY INDEX ROWID| RESTAURANTS                |      1 |      5 |      5 |00:00:00.01 |      64 |       |       |          |
|*  7 |        DOMAIN INDEX              | RESTAURANTS_SIDX           |      1 |        |      5 |00:00:00.01 |      63 |       |       |          |
|*  8 |     INDEX RANGE SCAN             | MENU_ITEMS_RESTAURANTS_IDX |      5 |      5 |     25 |00:00:00.01 |      12 |       |       |          |
|   9 |    TABLE ACCESS BY INDEX ROWID   | MENU_ITEMS                 |     25 |      5 |     25 |00:00:00.01 |      25 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter(ROWNUM<=5)
   7 - access("MDSYS"."SDO_NN"("POINT","MDSYS"."SDO_GEOMETRY"(2001,NULL,"SDO_POINT_TYPE"(10,0,NULL),NULL,NULL),'sdo_num_res=5',1)='TRUE')
   8 - access("T1"."ID"="T2"."RESTAURANTS_ID")


34 rows selected.


As you can see, Oracle CBO made correct cardinality estimates in the last two examples and used NESTED LOOPS method to join both tables that resulted with performance improvement.

I hope you got some valuable information out of this post.

Saturday, July 27, 2013

CURSOR_SHARING=SIMILAR available in Oracle 12c, or not ?

It was announced before Oracle Database 12c was released that CURSOR_SHARING=SIMILAR will be deprecated (MOS Note 1169017.1). Moreover, according to the same note the ability to set this value will be removed.

And indeed, when I looked into 12c documentation, I found EXACT and FORCE being the only available values where the former is default value. (link)

I decided to play around a bit and immediately figured that SIMILAR could still be assigned to CURSOR_SHARING.

test[12,133]> ALTER SESSION SET CURSOR_SHARING=SIMILAR;

Session altered. 

When I tried to assign an invalid value to CURSOR_SHARING using SQL*Plus I got an error message that says EXACT, SIMILAR and FORCE are acceptable values for CURSOR_SHARING parameter.

test[12,133]> ALTER SESSION SET CURSOR_SHARING=foo;
ERROR:
ORA-00096: invalid value FOO for parameter cursor_sharing, must be 
from among SIMILAR, EXACT, FORCE 

Couple of reasons I can think of as why SIMILAR is still allowed. The first, it may be some sort of backward compatibility (so that applications that explicitly set this value don't break when migrated to 12c); or it may be that CURSOR_SHARING=SIMILAR is still a valid option if COMPATIBLE parameter is set to an earlier release where the value was supported. (didn't have time to play with COMPATIBLE parameter)

Anyway, my main question was how Oracle will behave if CURSOR_SHARING is set to SIMILAR in 12c.

I ran a quick test. I created one table, named TAB1 and then executed one sub-optimally shareable statement 5 times passing 5 different literal values for each of the executions.

When CURSOR_SHARING is set to SIMILAR and a suboptimal statement using literals is executed, Oracle would not share any of the existing cursors and will perform a hard parse.

CREATE TABLE tab1 AS (SELECT * FROM user_objects);

ALTER SESSION SET CURSOR_SHARING=SIMILAR;

SELECT COUNT(1) FROM tab1 WHERE object_id>1;

SELECT COUNT(1) FROM tab1 WHERE object_id>2;

SELECT COUNT(1) FROM tab1 WHERE object_id>3;

SELECT COUNT(1) FROM tab1 WHERE object_id>4;

SELECT COUNT(1) FROM tab1 WHERE object_id>5; 

For each of these statements the literals were replaced with bind variables hence the SQL that was parsed has sql id c73v21bgp4956 and text like the one below:

SELECT COUNT(:"SYS_B_0") FROM tab1 WHERE object_id>:"SYS_B_1"

After running the statements from above, I observed the content in v$sqlarea, v$sql and v$sql_shared_cursor views to see if Oracle did a hard parsing.
 
set line 200
col sql_text format a80
SELECT sql_id, sql_text, executions, child_number FROM v$sql WHERE sql_id='c73v21bgp4956';

SQL_ID        SQL_TEXT                                                                         EXECUTIONS CHILD_NUMBER
------------- -------------------------------------------------------------------------------- ---------- ------------
c73v21bgp4956 SELECT COUNT(:"SYS_B_0") FROM tab1 WHERE object_id>:"SYS_B_1"                             5            0


As you can see Oracle reused the same child cursor 5 times, which means even though CURSOR_SHARING was set to SIMILAR it behaved as it was set to FORCE.

I ran the same example from above in 11.2.0.2 database. This time Oracle behaved as expected when CURSOR_SHARING is set to SIMILAR and used 5 different child cursors.
 
set line 200
col sql_text format a80
SELECT sql_id, sql_text, executions, child_number FROM v$sql WHERE sql_id='c73v21bgp4956';

SQL_ID        SQL_TEXT                                                                         EXECUTIONS CHILD_NUMBER
------------- -------------------------------------------------------------------------------- ---------- ------------
c73v21bgp4956 SELECT COUNT(:"SYS_B_0") FROM tab1 WHERE object_id>:"SYS_B_1"                             1            0
c73v21bgp4956 SELECT COUNT(:"SYS_B_0") FROM tab1 WHERE object_id>:"SYS_B_1"                             1            1
c73v21bgp4956 SELECT COUNT(:"SYS_B_0") FROM tab1 WHERE object_id>:"SYS_B_1"                             1            2
c73v21bgp4956 SELECT COUNT(:"SYS_B_0") FROM tab1 WHERE object_id>:"SYS_B_1"                             1            3
c73v21bgp4956 SELECT COUNT(:"SYS_B_0") FROM tab1 WHERE object_id>:"SYS_B_1"                             1            4


When I checked v$sql_shared_cursor I found that all of the child cursors had Y for HASH_MATCH_FAILED column which is expected when suboptimal statement is executed with CURSOR_SHARING=SIMILAR.


Conclusion


Even though setting SIMILAR to CURSOR_SHARING initialization parameter is still acceptable value, it looks Oracle ignore it and behave as if FORCE is used. (this may not be true if compatible parameter is set to an earlier release, like 11.2.0.2 for example. I didn't have time to test this scenario)


Hope this helps.

Friday, July 26, 2013

ORA-12514 during switchover using Data Guard Broker (Update)

This is just a short update for an earlier post about getting ORA-12514 while performing switchover using DataGuard broker.

There was a comment on whether or not _DGMGRL static service is still required when performing a switchover in 11.2 and onwards.

In order for the broker to be able to successfully start an instance during a switchover operation, static service needs to be registered with the listener. Starting from 11.2.0.1 this service doesn’t have to be "<db_unique_name>_DGMGRL.<db_domain>". Oracle introduced a new instance-level property, StaticConnectIdentifier. As a value, this property accepts a valid net service name (defined in tnsnames.ora) or full connection identifier. Therefore, starting from 11.2.0.1 you have flexibility to use any service, which still needs to be statically registered with the listener.

Here is an example:

There are two databases db112a (primary) and db112b(standby) hosted on hosta and hostb respectively.

Below is the content of the tnsnames.ora. It is identical for both hosts:



DB112A=
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = hosta)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = DB112A)
    )
  )

DB112B=
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = hostb)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = DB112B)
    )
  )


listener.ora on HOSTA has the following content:


SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = db112a)
      (ORACLE_HOME = /oracle/product/11.2/dbms)
      (SID_NAME = db112a)
    )
  )

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = hosta)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
  )


listener.ora for HOSTB has db112b service statically registered:


SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = db112b)
      (ORACLE_HOME = /oracle/product/11.2/dbms)
      (SID_NAME = db112b)
    )
  )

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = hostb)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
  )


Now the only thing left would be to set the StaticConnectIdentifier property for each of the databases.


DGMGRL> connect sys
Password:
Connected.
DGMGRL> show configuration

Configuration - db112

  Protection Mode: MaxPerformance
  Databases:
    dg112a - Primary database
    dg112b - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

DGMGRL>

DGMGRL> edit database db112a set property staticConnectidentifier='db112a'; 
DGMGRL> edit database db112b set property staticConnectidentifier='db112b'; 


And you should be all set to perform a switchover:


DGMGRL> switchover to dg112b
Performing switchover NOW, please wait...
New primary database "dg112b" is opening...
Operation requires shutdown of instance "dg112a" on database "dg112a"
Shutting down instance "dg112a"...
ORACLE instance shut down.
Operation requires startup of instance "dg112a" on database "dg112a"
Starting instance "dg112a"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "dg112b"

DGMGRL> show configuration

Configuration - db112

  Protection Mode: MaxPerformance
  Databases:
    dg112b - Primary database
    dg112a - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS


Should you not want to deal with staticConnectidentifier you still need to have "<db_unique_name>_DGMGRL.<db_domain>" statically registered in order for the broker to be able to start the instance.

Useful resources:

MOS Note # 1305019.1 - 11.2 Data Guard Physical Standby Switchover Best Practices using the Broker


Hope this helps.

Wednesday, July 03, 2013

Oracle 12c - Session-specific statistics for global temporary tables

It’s been more than a week since Oracle Database 12c officially became available. I spent the very first night on reading the documentation, New Features chapter in particular.

New version, new features, improvements, enhancements, some more exciting than others. One however, intrigued me a bit .That is Session-specific statistics for global temporary tables. From time to time I am involved in optimizing queries that use global temporary tables so I am well aware of the challenges that come with them.

As the name suggest temporary tables contain temporary data. The data is visible only to the current session and persist either to the next commit or until the session is terminated. Either way, two sessions may populate the same temporary table with different data and concurrently run identical queries against it.

As we all know queries are executed using an execution plan generated by the optimizer. The efficiency of the execution plan depends upon the information the optimizer has about the data at the time the query is parsed. In my opinion, in order to get good execution plans(I am deliberately avoiding to say optimal plans) , accurate statistics are the most important piece of information you want to feed the optimizer with . 

However, when temporary tables are involved, it is often a challenge for the optimizer to generate efficient execution plan(s) that will be appropriate for all use cases.

In some cases when temporary tables are populated with data having similar characteristics, it is often a practice to have appropriate statistics collected and fixed (dynamic sampling may also fit in this case) so the optimizer produces efficient and stable execution plan(s). (one size fits all)

However, often temporary tables do not have statistics simply because it is difficult to find ones that will cover all use cases.

Consider the following example:

There is table MY_OBJECTS created using the following CTAS statement:

 CREATE TABLE my_objects AS  
 SELECT * FROM all_objects  
       , (SELECT * FROM dual CONNECT BY rownum&lt;=10);  

A table MY_TYPES that contains all distinct data types derived from ALL_OBJECTS

 CREATE TABLE my_types AS  
 SELECT DISTINCT object_type FROM all_objects;  

A table T that has the same structure as MY_OBJECTS

 CREATE TABLE t AS   
 SELECT * FROM my_objects WHERE 1=2;  

And a global temporary table TEMP_OBJECTS that has only one column which will accept OBJECT_IDs

 CREATE GLOBAL TEMPORARY TABLE temp_objects   
 (object_id NUMBER NOT NULL)   
 ON COMMIT PRESERVE ROWS;  

Collect statistics for MY_OBJECTS and MY_TYPES tables

 EXEC dbms_stats.gather_table_stats(ownname=>'TEST',tabname=>'MY_OBJECTS',cascade=>true);  
 EXEC dbms_stats.gather_table_stats(ownname=>'TEST',tabname=>'MY_TYPES',cascade=>true);  

Let's suppose there are two sessions that load some data in TEMP_OBJECTS and then execute the following insert statement:

 INSERT INTO t  
 SELECT /*+ gather_plan_statistics */ o.*   
 FROM my_objects o  
   , my_types t  
   , temp_objects tt  
 WHERE o.object_type=t.object_type   
  AND o.object_id=tt.object_id;  

The first session (SID 22) inserts one record in TEMP_OBJECTS and then executes the insert statement from above by invoking insert_1.sql script.


 test[22,39]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum&lt;=1);  
   
 1 row created.  
   
 test[22,39]> commit;  
   
 Commit complete.  
   
 test[22,39]> @insert_1  
   
 10 rows created.  
   
   

Let's check the execution plan being used. It is based on Merge Cartesian Join, which is somewhat acceptable in this case.

 test[22,39]> @explain_last  
   
 PLAN_TABLE_OUTPUT  
 ----------------------------------------------------------------------
SQL_ID 8a1c7phuna9vn, child number 0  
 -------------------------------------  
 insert into t select /*+ gather_plan_statistics */ o.* from my_objects  
 o, my_types t, temp_objects tt where o.object_type=t.object_type and  
 o.object_id=tt.object_id  
   
 Plan hash value: 3579371359  
   
 ---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |              |      1 |        |      0 |00:00:02.56 |   14067 |  14028 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |              |      1 |        |      0 |00:00:02.56 |   14067 |  14028 |       |       |          |
|*  2 |   HASH JOIN              |              |      1 |     10 |     10 |00:00:00.04 |   14036 |  14028 |  1519K|  1519K| 1521K (0)|
|   3 |    MERGE JOIN CARTESIAN  |              |      1 |     38 |     38 |00:00:00.02 |       6 |      2 |       |       |          |
|   4 |     TABLE ACCESS FULL    | TEMP_OBJECTS |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |       |       |          |
|   5 |     BUFFER SORT          |              |      1 |     38 |     38 |00:00:00.02 |       3 |      2 | 73728 | 73728 |          |
|   6 |      TABLE ACCESS FULL   | MY_TYPES     |      1 |     38 |     38 |00:00:00.02 |       3 |      2 |       |       |          |
|   7 |    TABLE ACCESS FULL     | MY_OBJECTS   |      1 |    882K|    882K|00:00:02.09 |   14030 |  14026 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------   
 Predicate Information (identified by operation id):  
 ---------------------------------------------------  
   
   2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")  
   
 Note  
 -----  
   - dynamic statistics used: dynamic sampling (level=2)  
   
   
 30 rows selected.  
   


Session with SID (251) kicks in a moment later loading 1000 rows in TEMP_OBJECTS and runs the same insert statement. Since the same SQL statement has already been parsed , this session will reuse the existing cursor. However, in this case the plan being used is the one that should be avoided since cartesian join operation may severely impact performances:

 test[251,65]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum&lt;=1000);  
   
 1000 rows created.  
   
 test[251,65]> @insert_1  
   
 10000 rows created.  
   
 test[251,65]> @explain_last  
   
 PLAN_TABLE_OUTPUT  
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 SQL_ID 8a1c7phuna9vn, child number 0  
 -------------------------------------  
 insert into t select /*+ gather_plan_statistics */ o.* from my_objects  
 o, my_types t, temp_objects tt where o.object_type=t.object_type and  
 o.object_id=tt.object_id  
   
 Plan hash value: 3579371359  
   
 ---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |              |      1 |        |      0 |00:00:03.87 |   15495 |  14026 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |              |      1 |        |      0 |00:00:03.87 |   15495 |  14026 |       |       |          |
|*  2 |   HASH JOIN              |              |      1 |     10 |  10000 |00:00:00.22 |   14037 |  14026 |  3162K|  2024K| 3101K (0)|
|   3 |    MERGE JOIN CARTESIAN  |              |      1 |     38 |  38000 |00:00:00.73 |       7 |      0 |       |       |          |
|   4 |     TABLE ACCESS FULL    | TEMP_OBJECTS |      1 |      1 |   1000 |00:00:00.02 |       4 |      0 |       |       |          |
|   5 |     BUFFER SORT          |              |   1000 |     38 |  38000 |00:00:00.21 |       3 |      0 | 73728 | 73728 |          |
|   6 |      TABLE ACCESS FULL   | MY_TYPES     |      1 |     38 |     38 |00:00:00.01 |       3 |      0 |       |       |          |
|   7 |    TABLE ACCESS FULL     | MY_OBJECTS   |      1 |    882K|    882K|00:00:01.97 |   14030 |  14026 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------
   
 Predicate Information (identified by operation id):  
 ---------------------------------------------------  
   
   2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")  
   
 Note  
 -----  
   - dynamic statistics used: dynamic sampling (level=2)  
   
   
 30 rows selected.  
   

As expected, the same cursor was reused by both sessions.

 sys[263,185]> select plan_hash_value, child_number, CHILD_ADDRESS, executions, invalidations, is_shareable, is_obsolete from v$sql where sql_id='8a1c7phuna9vn';  
   
 PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS  EXECUTIONS INVALIDATIONS I I  
 --------------- ------------ ---------------- ---------- ------------- - -  
    3579371359      0 00000000618B54B8     2       0 Y N  
   
 sys[263,185]>  


As you can sense, plan stability is one of the biggest challenges when dealing with queries that depend on temporary tables.

There are several strategies available, that I am aware of, on how to attack this problem. They include: use of stored outlines; fixed statistics; baselines; hints (cardinality hint for example); periodically invalidating plans; using separate workspaces (schemas) for different types of workloads; and couple of others more or less effective. Each of these comes with its own limitations.

With Oracle 12c users have an ability to gather session specific statistics for global temporary tables. This feature seems to nicely address the challenges from above.

When session specific statistics are used, each session collects “private” statistics and have CBO generate execution plan/cursor based on them. These cursors are not shared with other sessions. Also a cursor that’s been generated using session specific statistics is invalidated when statistics are re-collected within the same session.

Session specific statistics are enabled when GLOBAL_TEMP_TABLE_STATS preference is set to SESSION.

 exec dbms_stats.set_table_prefs(ownname=>'TEST', tabname=>'TEMP_OBJECTS',pname=>'GLOBAL_TEMP_TABLE_STATS',pvalue=>'SESSION');  

Once the preference is set all one needs to do is to gather stats for session based statistics to be collected.

And that is all, now let’s see which plans will be generated for each of the sessions if we repeat the example from above when session statistics are being used.

Session with SID 251 loads 1000 rows into TEMP_OBJECTS table, collects stats and executes the same insert statement.

 test[251,69]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum&lt;=1000);  
   
 1000 rows created.  
   
 test[251,69]> exec dbms_stats.gather_table_stats(ownname=>'TEST', tabname=>'TEMP_OBJECTS');  
   
 PL/SQL procedure successfully completed.  
   
 test[251,69]> @insert_1  
   
 10000 rows created.  

Now, the execution plan used is based on a HASH JOIN rather than MERGE CARTESIAN JOIN operation

 test[251,69]> @explain_last  
   
 PLAN_TABLE_OUTPUT  
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 SQL_ID 8a1c7phuna9vn, child number 1  
 -------------------------------------  
 insert into t select /*+ gather_plan_statistics */ o.* from my_objects  
 o, my_types t, temp_objects tt where o.object_type=t.object_type and  
 o.object_id=tt.object_id  
   
 Plan hash value: 4256520316  
   
 ---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |              |      1 |        |      0 |00:00:02.71 |   15514 |  14026 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |              |      1 |        |      0 |00:00:02.71 |   15514 |  14026 |       |       |          |
|*  2 |   HASH JOIN              |              |      1 |   9858 |  10000 |00:00:00.07 |   14037 |  14026 |  1696K|  1696K| 1583K (0)|
|   3 |    TABLE ACCESS FULL     | MY_TYPES     |      1 |     38 |     38 |00:00:00.01 |       3 |      0 |       |       |          |
|*  4 |    HASH JOIN             |              |      1 |   9858 |  10000 |00:00:00.04 |   14034 |  14026 |  2293K|  2293K| 1607K (0)|
|   5 |     TABLE ACCESS FULL    | TEMP_OBJECTS |      1 |   1000 |   1000 |00:00:00.01 |       4 |      0 |       |       |          |
|   6 |     TABLE ACCESS FULL    | MY_OBJECTS   |      1 |    882K|    882K|00:00:01.64 |   14030 |  14026 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------- 
   
 Predicate Information (identified by operation id):  
 ---------------------------------------------------  
   
   2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE")  
   4 - access("O"."OBJECT_ID"="TT"."OBJECT_ID")  
   
 Note  
 -----  
   - Global temporary table session private statistics used  
   
   
 30 rows selected.  

Note the “Note” - Global temporary table session private statistics used.

If you check v$sqlarea or v$sql you will see there are two child cursors for the same sql id.


 sys[263,185]> select plan_hash_value, child_number, CHILD_ADDRESS, executions, invalidations, is_shareable, is_obsolete from v$sql where sql_id='8a1c7phuna9vn';  
   
 PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS  EXECUTIONS INVALIDATIONS I I  
 --------------- ------------ ---------------- ---------- ------------- - -  
    3579371359      0 00000000618B54B8     2       0 Y N  
    4256520316      1 00000000625A3928     1       0 Y N  
   
 sys[263,185]>   


It is worth to mention that sessions that use global statistics will keep using without interfering with the session specific stats and cursors.

For example, session with SID 26 loads 1000 rows, but doesn't collect stats on TEMP_OBJECTS table. The insert statement in this case will use the globally available cursors (child cursor 0)

   
 test[26,145]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum&lt;=1000);  
   
 1000 rows created.  
   
 test[26,145]> commit;  
   
 Commit complete.  
   
 test[26,145]> @insert_1  
   
 10000 rows created.  
   
 test[26,145]> @explain_last  
   
 PLAN_TABLE_OUTPUT  
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 SQL_ID 8a1c7phuna9vn, child number 0  
 -------------------------------------  
 insert into t select /*+ gather_plan_statistics */ o.* from my_objects  
 o, my_types t, temp_objects tt where o.object_type=t.object_type and  
 o.object_id=tt.object_id  
   
 Plan hash value: 3579371359  
   
 ---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |              |      1 |        |      0 |00:00:03.71 |   15513 |  14026 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |              |      1 |        |      0 |00:00:03.71 |   15513 |  14026 |       |       |          |
|*  2 |   HASH JOIN              |              |      1 |     10 |  10000 |00:00:00.13 |   14039 |  14026 |  3162K|  2024K| 3114K (0)|
|   3 |    MERGE JOIN CARTESIAN  |              |      1 |     38 |  38000 |00:00:00.64 |       9 |      0 |       |       |          |
|   4 |     TABLE ACCESS FULL    | TEMP_OBJECTS |      1 |      1 |   1000 |00:00:00.01 |       6 |      0 |       |       |          |
|   5 |     BUFFER SORT          |              |   1000 |     38 |  38000 |00:00:00.19 |       3 |      0 | 73728 | 73728 |          |
|   6 |      TABLE ACCESS FULL   | MY_TYPES     |      1 |     38 |     38 |00:00:00.01 |       3 |      0 |       |       |          |
|   7 |    TABLE ACCESS FULL     | MY_OBJECTS   |      1 |    882K|    882K|00:00:01.88 |   14030 |  14026 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------  
   
 Predicate Information (identified by operation id):  
 ---------------------------------------------------  
   
   2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")  
   
 Note  
 -----  
   - dynamic statistics used: dynamic sampling (level=2)  
   
   
 30 rows selected.  
   
 sys[263,185]> select plan_hash_value, child_number, CHILD_ADDRESS, executions, invalidations, is_shareable, is_obsolete from v$sql where sql_id='8a1c7phuna9vn';  
   
 PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS  EXECUTIONS INVALIDATIONS I I  
 --------------- ------------ ---------------- ---------- ------------- - -  
    3579371359      0 00000000618B54B8     3       0 Y N  
    4256520316      1 00000000625A3928     1       0 Y N  
   

Let's suppose the session with sid 251 modifies the TEMP_OBJECTS table, deletes its content and loads only 1 row followed by refreshing the session specific statistics. In this case it is expected to see the plan with hash value 3579371359 to be used.

 test[251,69]> DELETE FROM temp_objects;  
   
 1000 rows deleted.  
   
 test[251,69]> INSERT INTO temp_objects (SELECT object_id FROM all_objects WHERE rownum&lt;=1);  
   
 1 row created.  
   
 test[251,69]> commit;  
   
 Commit complete.  
   
 test[251,69]> exec dbms_stats.gather_table_stats(ownname=>'TEST', tabname=>'TEMP_OBJECTS');  
   
 PL/SQL procedure successfully completed.  
   
 test[251,69]>  
   


If you check the cursor state (valid/invalid) you can see the session private cursor (child cursor 1) was just invalidated.

 PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS  EXECUTIONS INVALIDATIONS I I  
 --------------- ------------ ---------------- ---------- ------------- - -  
    3579371359      0 00000000618B54B8     3       0 Y N  
    4256520316      1 00000000625A3928     1       1 Y N  
   
 test[251,69]> @insert_1  
   
 10 rows created.  
   
 test[251,69]> @explain_last  
   
 PLAN_TABLE_OUTPUT  
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 SQL_ID 8a1c7phuna9vn, child number 1  
 -------------------------------------  
 insert into t select /*+ gather_plan_statistics */ o.* from my_objects  
 o, my_types t, temp_objects tt where o.object_type=t.object_type and  
 o.object_id=tt.object_id  
   
 Plan hash value: 3579371359  
   
 ---------------------------------------------------------------------------------------------------------------------------------------  
 | Id | Operation        | Name     | Starts | E-Rows | A-Rows |  A-Time  | Buffers | Reads | OMem | 1Mem | Used-Mem |  
 ---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |              |      1 |        |      0 |00:00:02.43 |   14051 |  14026 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |              |      1 |        |      0 |00:00:02.43 |   14051 |  14026 |       |       |          |
|*  2 |   HASH JOIN              |              |      1 |     10 |     10 |00:00:00.01 |   14037 |  14026 |  1519K|  1519K| 1426K (0)|
|   3 |    MERGE JOIN CARTESIAN  |              |      1 |     38 |     38 |00:00:00.01 |       7 |      0 |       |       |          |
|   4 |     TABLE ACCESS FULL    | TEMP_OBJECTS |      1 |      1 |      1 |00:00:00.01 |       4 |      0 |       |       |          |
|   5 |     BUFFER SORT          |              |      1 |     38 |     38 |00:00:00.01 |       3 |      0 | 73728 | 73728 |          |
|   6 |      TABLE ACCESS FULL   | MY_TYPES     |      1 |     38 |     38 |00:00:00.01 |       3 |      0 |       |       |          |
|   7 |    TABLE ACCESS FULL     | MY_OBJECTS   |      1 |    882K|    882K|00:00:02.05 |   14030 |  14026 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------
   
 Predicate Information (identified by operation id):  
 ---------------------------------------------------  
   
   2 - access("O"."OBJECT_TYPE"="T"."OBJECT_TYPE" AND "O"."OBJECT_ID"="TT"."OBJECT_ID")  
   
 Note  
 -----  
   - Global temporary table session private statistics used  
   
   
 30 rows selected.  
   

As expected the plan with hash value 3579371359 was used.

 PLAN_HASH_VALUE CHILD_NUMBER CHILD_ADDRESS  EXECUTIONS INVALIDATIONS I I  
 --------------- ------------ ---------------- ---------- ------------- - -  
    3579371359      0 00000000618B54B8     3       0 Y N  
    3579371359      1 00000000625A3928     1       1 Y N   
   

You could also use v$sql_shared_cursor view to observe how the cursors are created and shared.
In order to track statistics one could use [DBA|USER|ALL]_TAB_STATISTICS views. These views have SCOPE column that indicate whether the statistics are shared or session specific:

 test[251,69]> SELECT owner, table_name, num_rows, last_analyzed, scope FROM dba_tab_statistics WHERE table_name='TEMP_OBJECTS';  
   
OWNER           TABLE_NAME        NUM_ROWS LAST_ANAL SCOPE
--------------- --------------- ---------- --------- -------
TEST            TEMP_OBJECTS                         SHARED
TEST            TEMP_OBJECTS             1 30-JUN-13 SESSION
   

Finally, something to think about is the possibility to increase hard parsing. As you may have observed from above, each parse made after the stats are gathered is a hard parse. Therefore, depending upon your workload you may expect to see more hard parsing.

Resources